Jack Morris · September 5

The State of AI Research: The Paradox of Exploding Paper Counts and Slow Scientific Progress

 

Despite an influx of talent, abundant funding, and exponential growth in the number of published papers, the pace of actual scientific progress in AI does not appear to be keeping up. The article explores this phenomenon and argues that the root cause is an “ambition problem” among researchers: chasing quantity over solving important problems. When the research space is limited but the number of researchers grows sharply, research tends to interpolate within the gaps of existing knowledge rather than open new territory. Truly groundbreaking research is “expanding”: it opens up more space to explore. The author calls on researchers to raise their ambitions, embrace failure, and focus on the simple, highly scalable directions that genuinely move the field forward, following Hamming’s advice to collaborate with others and tackle the most fundamental and important problems.

📈 **Exploding paper counts, stagnant progress**: AI faces a striking paradox. Although the number of papers submitted to top conferences such as NeurIPS grows exponentially each year, and researchers and funding keep increasing, many researchers feel that actual scientific progress has not accelerated to match and may even have slowed. Raising paper output alone does not equate to faster growth of scientific knowledge; the quality and direction of research deserve reflection.

💡 **The “ambition problem” and a crowded research space**: The article’s central claim is that slow progress stems from a widespread “ambition problem”: researchers tend to choose lower-risk, easier-to-publish “interpolating” research over novel, more challenging “expanding” research. When the number of researchers grows far faster than the explorable research space, the field becomes crowded, researchers struggle to find genuinely valuable unexplored territory, and breakthroughs across the field are limited.

🚀 **The importance and practice of “expanding” research**: The article stresses that truly important research is expanding: it opens up more new space than it consumes and sparks further thinking and follow-up work. Drawing on Hamming, the author encourages researchers to raise their ambitions and try “crazy ideas,” even if that means a higher failure rate. Only by focusing on simple but fundamental problems and talking them through with peers can researchers produce work with lasting influence rather than forgettable “small papers.”

⏳ **What endures: simplicity and scalability**: Analyzing NeurIPS “Test of Time” award winners, the article finds that research which stands the test of time tends to be simple and to scale well. The core breakthroughs behind modern AI systems such as large language models rest on relatively simple principles rather than on piles of complex, long-forgotten earlier work, which suggests that the next AI breakthroughs are also likely to come from simple, scalable innovations.

What are the most important problems in your field? And why aren’t you working on them?

This is the pitch made by Richard Hamming in his famous talk You and Your Research. It’s been nearly forty years since the talk was first given and yet it feels like more people are ignoring this advice than ever before.


This may be especially true in AI.

Our field is full of brilliant people – researchers, engineers, scientists. Students are flocking to major in computer science and learn about the technology from an early age. More people are applying to graduate school than ever before. The AI labs are raising money and publishing all sorts of findings in preprints and blog posts.

And yet it doesn’t feel like the actual science is progressing faster than it was five years ago. We’re publishing more papers than ever, but discovering about the same amount. I don’t think it’s a talent issue, and it’s certainly not a funding issue. I think we have an ambition issue.

More papers are being published than ever

This year there were over 25,000 papers submitted to NeurIPS. This represents 30% annual growth since 2017, when around 3,000 papers were submitted. This growth comes from all sorts of places. More companies are doing AI research. More universities are doing AI research. More people are getting into AI research.
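As a quick sanity check on that growth rate (a back-of-the-envelope sketch; exact submission counts vary slightly by source):

```python
# Compound ~30% annual growth from ~3,000 NeurIPS submissions in 2017.
submissions = 3_000
for year in range(2017, 2026):
    print(year, round(submissions))
    submissions *= 1.30
# 2025 comes out to roughly 24,500 submissions -- consistent with the
# 25,000+ reported this year.
```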

The number of submitted papers to NeurIPS, the most popular AI conference, over the last fifteen years. It’s nearing exponential growth territory. (Data from papercopilot.com)


With more people writing more papers about more different topics we’d expect some explosive growth. And yet I’d argue that the pace of progress feels pretty constant, if not slowing down by a small margin. This is a difficult thing to measure, so I polled a bunch of researchers:

If we take the mean of the researchers’ responses, we find that progress is moving faster – but only a bit. This should be surprising in an area that’s seen explosive growth along just about every dimension.

Expanding the space of everything

If we’re publishing exponentially more, why aren’t we learning exponentially more? One possible explanation is that the space of possible ideas we’re exploring has stayed a constant size, but the number of explorers has grown at a near-exponential pace. If this is an accurate model, then the world of research looks something like this:

If the number of researchers has increased by 10x, but the amount of space is only slightly larger, then the current landscape of research looks something like the bubble on the right – but at a larger scale. And maybe packed tighter.

This mental model explains why many AI researchers report experiencing anxiety and frustration around the current research and publication process. There are so many people doing so many things that it’s simply hard to find any open space to occupy.
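To put a number on the crowding intuition, here is a toy sketch (purely illustrative; the 2D unit square standing in for “idea space” is my own simplification): drop researchers uniformly into a fixed space and measure how much room each one has.

```python
import math
import random

def mean_nearest_neighbor_distance(n_researchers: int, seed: int = 0) -> float:
    """Average distance from each point to its nearest neighbor when
    n_researchers are dropped uniformly into a fixed unit square --
    a stand-in for how much open space each researcher has."""
    random.seed(seed)
    pts = [(random.random(), random.random()) for _ in range(n_researchers)]
    total = 0.0
    for i, p in enumerate(pts):
        total += min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
    return total / n_researchers

for n in (100, 1_000):
    print(n, round(mean_nearest_neighbor_distance(n), 4))
# 10x the researchers in the same space leaves each one roughly 3x less
# room: mean nearest-neighbor distance scales like 1/sqrt(n) in 2D.
```

The exponent changes with dimension, but the qualitative point survives: if the space doesn’t grow, every researcher’s unexplored neighborhood shrinks.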

I think this anxiety also comes from a mindset that the amount of ‘research territory’ is constant, and good research will come from interpolating between existing ideas, rather than pushing forward into the unknown. This is exactly what Hamming warns against.

This also means that the best research expands the territory available for us to explore. And yet clearly most research does something different – most papers aren’t groundbreaking. They’re comfortably nestled somewhere in high-dimensional space, in some unoccupied pigeonhole between things that already exist.

Let’s think of research ideas in physical terms. New research can be territory-expanding, in that it opens up more space for other researchers than it consumes, or territory-consuming, in that it ‘eats’ some of the available territory without giving anything back.

The best research is territory-expanding. It’s generous. It leaves things to the imagination, sparks more ideas than it explores. It’s explorative rather than exploitative. The most important research always has this property: it expands more space than it occupies.
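One way to make the expanding-vs-consuming distinction concrete is a toy branching-process model (entirely illustrative; every parameter here is invented): each paper fills one open ‘slot’ of territory, and an expanding paper opens several new slots for others to work in.

```python
import random

def simulate_field(n_papers: int, p_expand: float,
                   slots_per_expansion: int = 3, seed: int = 0) -> int:
    """Toy model: each paper consumes one open slot of territory; with
    probability p_expand it is territory-expanding and opens
    slots_per_expansion new slots. Returns the open slots remaining."""
    random.seed(seed)
    open_slots = 10  # a small initial frontier
    for _ in range(n_papers):
        if open_slots == 0:
            break  # saturated: nothing left for anyone to occupy
        open_slots -= 1
        if random.random() < p_expand:
            open_slots += slots_per_expansion
    return open_slots

print(simulate_field(25_000, p_expand=0.05))  # mostly consuming: frontier dies
print(simulate_field(25_000, p_expand=0.50))  # enough expanding work: frontier grows
```

In this model the frontier keeps growing only when p_expand × slots_per_expansion exceeds 1; below that threshold, no number of additional papers stops the field from saturating.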

So why aren’t more people doing this kind of research?

Good research takes ambition

My theory for why most research isn’t territory-expanding is that the odds of producing something groundbreaking with a project are directly proportional to its chance of failure. This means that if you set out to change the world, you will probably fail. The converse here is that if you set your sights low, and choose a topic that’s more likely to bear fruit in some way, then your chances of ‘success’ (publishing something) are higher, but the chances of it expanding the space of knowledge are smaller.
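Framed as a simple expected-value comparison (with numbers invented purely for illustration):

```python
# Hypothetical projects: a 'safe' one almost always yields a paper but
# expands little territory; an 'ambitious' one usually fails outright
# but occasionally opens a lot of new ground.
projects = {
    "safe":      {"p_success": 0.9, "territory_if_success": 1},
    "ambitious": {"p_success": 0.1, "territory_if_success": 20},
}

for name, proj in projects.items():
    expected_papers = proj["p_success"]  # one publication per success
    expected_territory = proj["p_success"] * proj["territory_if_success"]
    print(f"{name:>9}: E[papers]={expected_papers:.1f}  "
          f"E[territory]={expected_territory:.1f}")
# The safe project wins on expected publications (0.9 vs 0.1) while the
# ambitious one wins on expected territory (2.0 vs 0.9) -- a community
# optimizing paper count selects against territory-expanding work.
```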

And thus, we end up with 25,000 NeurIPS submissions. This is what happens when the community optimizes too strongly for the number of papers submitted and too loosely for solving big problems.

I also want to recognize that for some groups, external factors are at play. I’ve been told that you need to publish at least one paper to get into a CS graduate program these days – meaning there are many people who just need to publish something, even if it isn’t their best work. And some tech companies prevent employees from publishing findings that feel too important, giving their workers a perverse incentive to make their research sound small so their work has a chance to see the light of day. Let’s ignore both of these edge cases.

Higher ambition means a higher rate of failure

One question every researcher should ask: what are you optimizing for?

If your goal as a researcher is to publish as much as possible and reach the maximum h-index, then logic says to lower your ambition to the minimum level at which you can still publish. After a bit of practice, a good researcher with the singular goal of publishing can often write papers every few months. You’ll see this happen. Some conferences even have PhD students publishing multiple papers in the same conference, indicating that they worked on multiple projects at the same time. (I’m not pointing any fingers, as I have also done this once; it can also happen for timeline-related reasons.)

If your goal is to move the field forward then you should plan differently. Raise your level of ambition. Try crazy things.

A consequence of this is that you will publish less. Deadlines will come around for which you have nothing to show. Perhaps you found out that your idea has already been done, or doesn’t work at scale, or is doing things that you just don’t understand. The most likely explanation is that most great things take more than three months to execute.

(As an aside, I have been doing research for six years or so and have already found that periods without publishing are really just fine. People care most about the one or two projects you’ve ever worked on that are most exciting. They care much less about the six-month gaps where you were quietly trying to make something work. In my own career this has already happened several times: I spent my year in the Google AI Residency working on something that never quite worked. I worked on a music model for six months that never went anywhere. In the third year of my PhD I wanted to build a contextual embedding model, which was a neat idea but took over a year to get the details right. The list does not stop there…)

Most research doesn’t matter much in the long run

If you look back at the conference proceedings from just ten years ago, though, they don’t feel exceptionally relevant:

A screenshot of the proceedings from NeurIPS 2015. Many of these papers did not stand the test of time.

These are interesting and important problems, but have almost nothing to do with the AI systems we use today. This makes one wonder how much of the content from this year’s 25,000 NeurIPS papers might still be relevant in 2035.

You might have your own hypotheses on why academic research tends toward small-but-eventually-forgotten problems. Predicting the future is hard. And most research moves in cycles that last less than a year, so it’s important to choose problems that are tractable within that timeframe.

Research that stands the test of time

For our purposes, we’re talking about research that affects the way we think about or design modern AI systems. This is an exceptionally small sliver of research papers. Under this definition, I’d guess less than 1% of published AI research ‘matters’.

Why is this number so small? One explanation is that modern AI grew out of machine learning and probabilistic modeling, which is a much deeper subfield with many sub-areas of study that have nothing to do with modern AI. Here’s one example: if you took statistics in school, you know that modern statistics is divided into two schools (Bayesian and frequentist) but modern AI is totally frequentist. Even though lots of people have researched Bayesian neural networks, they’ve basically been abandoned in favor of learning everything from data without a prior.

So what makes up this 1%? What research matters? To answer this question, let’s first gather a bit of data. NeurIPS (the biggest AI conference) gives “Test of Time” awards to ten-year-old papers that have made an outsized impact over the preceding decade. I asked Deep Research to gather me a list of the papers that have won the award:

Papers that have won the “Test of Time” award at NeurIPS, prepared by Deep Research.

Out of the ten papers that have won the Test of Time award, a couple of common themes stand out.

And let’s remember that basically all deep learning research changed with the ImageNet paper in 2012, which only won its test-of-time award in 2022. So only the awards from 2022–2024 reflect research from the “modern” (post-AlexNet) era.

We can clearly see that in the long run, the AI research that matters is research that (a) is simple and (b) scales well. In fact, modern AI systems like Claude 4 and Veo 3 are so simple that they use almost none of the decades of published ML research. And yet this is likely what Hamming would have predicted – in the end, most science is forgotten.

This is probably a good thing. Downstream effects are that AI is easy for newcomers to learn, and relatively simple to implement. And it seems quite reasonable to bet that the next big breakthroughs in AI will be simple.

Remind yourself that simple questions are often the most ambitious. They often also take the longest to mentally grapple with, and are the hardest to engineer experiments for.

Putting Hamming’s advice into practice

I suspect that the problem here isn’t a lack of ambitious ideas; it’s more a fear of failure. Most people who have done research for a long time confront big, messy, unanswered questions that continually resurface through day-to-day experimentation.

It seems the best strategy is to do whatever you can to confront those questions directly. There is no question too big to make into a research project. Einstein famously used the thought experiment of riding along on a beam of light to explore the ideas that ended up revolutionizing modern physics. What’s your question like this?

Perhaps you want to imagine your own personal data being stored inside an LLM, or what it would be like to slide down a gradient during training, or some new mechanism for visualizing and manipulating the internals of an LLM as it processes a sequence during inference.

Hamming also notes that the best way to move forward is often by talking to capable people that you trust. He specifically mentions that solo daily brainstorming sessions haven’t proved fruitful in his career. The best way to make progress, he says, is to talk to someone and explain where you’re starting from all the way up to the point where you get stuck.

Maybe the specific question isn’t even that important, since so few people seem to be thinking like this. The bottom line is that we’d all be better off if most people worked on big ambitious projects for a longer time. We would see fewer papers, but each individual paper would have more depth. By publishing less, we would move faster.


Thanks to Ege Erdil and Jeffrey Emanuel for feedback on an early draft of this post.
