Interconnects · October 26, 23:27

High-Intensity Work in AI: The Current State and Reflections

The rapid development of AI, and large language models (LLMs) in particular, is driving practitioners to work extraordinary hours under unprecedented pressure; descriptions ranging from "996" to "002" (midnight to midnight with only a two-hour break) are now commonplace. Some of this intensity is performative flexing on social media, but it also reflects the real state of competition in the industry. The article ties this phenomenon to the urgency created by AI's rapid pace of change and the shrinking window to work at the technical frontier, draws parallels to similar pressures in other industries (such as Apple's engineers in China), and stresses the health risks, up to and including loss of life, that overwork can bring. Drawing on a Wall Street Journal report and researchers' own accounts, it explores the similarities between AI research and elite athletic training: ambitious goals, fine margins, and relentless self-improvement under intense competition. At the same time, it reflects on the downsides of this grind culture, including sacrificed life balance and constrained creativity, and argues that a healthy team culture is the decisive factor in long-term success, more than talent or compute alone.

🚀 **The "involution" of AI and high-intensity work:** The article notes a pervasive culture of extreme hours in AI, especially around LLMs; work schedules described as "996" through "002" (midnight to midnight with only a two-hour break) reflect fierce competition and the pressure of a fast-moving technical frontier. Part of this is social-media performance, but much of it mirrors real working conditions, affecting practitioners and their social circles.

🧠 **Accelerating iteration and a closing window of opportunity:** The rapid pace of the LLM field means the window to work at the cutting edge is closing quickly, pushing practitioners to invest ever more time and effort to stay relevant. This urgency drives a race to raise the quality of technical output, sometimes at the cost of life balance, as the industry's bar keeps rising.

⚖️ **Work intensity versus physical and mental health:** Citing a Wall Street Journal report, the article describes AI researchers putting in 100-hour workweeks to win the new tech arms race. It compares this to historical high-pressure environments (such as Apple's engineers in China) and warns that overwork can damage health and even cost lives, underscoring the importance of rest for maintaining mental acuity.

🏆 **The elite-sports analogy and building culture:** The author likens frontier LLM research to training as an elite athlete: distant, singular goals; tiny margins between success and failure; accumulated daily grind; and infrequent, high-stakes measurement of results. The article argues that a cohesive team culture matters more than individual talent or resources, and that a healthy culture can keep the work from feeling like pure "work."

💡 **Balancing innovation and burnout:** Overwork can narrow thinking and crowd out creative, novel solutions; rest yields intellectual payoffs by making space for creativity and insight. In LLMs, even the low-hanging fruit (easy wins) now demands more, and more refined, resources to pick, deepening burnout and leaving researchers feeling they must keep pushing or fall behind.

One of the obvious topics of the Valley today is how hard everyone works. We’re inundated with comments on “The Great Lock In”, 996, 997, and now even a snarky 002 (midnight to midnight with a 2 hour break). Plenty of this is performative flexing on social media, but enough of it is real and reflects how trends are unfolding in the LLM space. I’m affected. My friends are affected.

All of this hard work is downstream of ever increasing pressure to be relevant in the most exciting technology of our generation. This is all reflective of the LLM game changing. The time window to be a player at the most cutting edge is actually a closing window, not just what feels like one. There are many different sizes and types of models that matter, but as the market is now more fleshed out with resources, all of them are facing a constantly rising bar in quality of technical output. People are racing to stay above the rising tide — often damning any hope of life balance.


AI is going down the path that other industries have before, but on steroids. There’s a famous section of the book Apple in China, where the author Patrick McGee describes the programs Apple put in place to save the marriages of engineers traveling so much to China and working incredible hours. In an interview on ChinaTalk, McGee added “Never mind the divorces, you need to look at the deaths.” This is a grim reality that is surely playing out in AI.

The Wall Street Journal recently published a piece on how *AI Workers Are Putting In 100-Hour Workweeks to Win the New Tech Arms Race*. The opening of the article does an excellent job of capturing how the last year or two has felt if you’re participating in the dance:

> Josh Batson no longer has time for social media. The AI researcher’s only comparable dopamine hit these days is on Anthropic’s Slack workplace-messaging channels, where he explores chatter about colleagues’ theories and experiments on large language models and architecture.

Work addicts abound in AI. I often count myself among them, but I put a lot of effort into making sure that work expands to fill the time available, rather than fitting everything else in around work. This WSJ article had a bunch of crazy comments that show the mental limits of individuals and the culture they act in, such as:

> Several top researchers compared the circumstances to war.

Comparing current AI research to war is out of touch (especially with the grounding of actual wars happening simultaneously to the AI race!). What they really are learning is that pursuing an activity in a collective environment at an elite level over multiple years is incredibly hard. It is! War is that and more.

In the last few months I’ve been making an increasing number of analogies between working at the sharp end of LLMs today and training with a team to be an elite athlete. The goals are far out and often singular; there are incredibly fine margins between success and failure; much of the grind is over tiny tasks that add up over time but that you don’t want to do in the moment; and you can never quite know how well your process is working until you compare your outputs with your top competition, which only happens a few times a year in both sports and language modeling.

In college I was a D1 lightweight rower at Cornell University. I walked onto the team and we ended up winning 3 championships in 4 years. Much of this was happenstance, as much greatness is, but it’s a crucial example in understanding how similar mentalities can apply in different domains across a life. My mindset around the LLM work I do today feels incredibly similar — complete focus and buy-in — but I don’t think I’ve yet found a work environment where the culture is as cohesive as athletics. While OpenAI’s culture is often described as culty, there are many signs that the core team members there absolutely love it, even if they’re working 996, 997, or 002. When you love it, it doesn’t feel like work. This is the same reason training 20 hours a week as a full-time student can feel easy.

Many AI researchers could learn from athletics to appreciate the value of rest. When you’re not rested, your mental acuity can drop off even faster than your peak physical performance does. Working too hard forces you into narrower and less creative approaches. The deeper I get into the hole of burnout while trying to deliver you the next Olmo model, the worse my writing gets. My ability to spot technical dead ends goes with it. The intellectual payoffs of rest are hard to see, but without it your schedule doesn’t have the space for creativity and insight.

Crafting the team culture in both of these environments is incredibly difficult. It’s the quality of the team culture that determines the outcome more than the individual components. Yes, with LLMs you can take brief shortcuts by hiring talent with years of experience from another frontier lab, but that doesn’t change the long-term dynamic. Yes, you obviously need as much compute as you can get. At the same time, culture is incredibly fickle. It’s easier to lose than it is to build.

Some argue that starting a new lab today can be an advantage over the established labs because you get to start from scratch with a cleaner codebase, but this is cope. There are three core ingredients of training: internal tools (recipes, codebases, etc.), resources (compute, data), and personnel. Leadership sets the direction and culture, and management executes on that direction. All of these elements are crucial and none can be overlooked. The further along the best models get, the harder starting from scratch becomes. Eventually, this dynamic will shift back in favor of starting from scratch, because public knowhow and tooling will catch up, but in the meantime the closed tools are improving at a far faster rate than the fully open ones.


The likes of SSI, Thinky, and Reflection¹ are likely the last efforts that are capitalized enough to maybe catch up in the near term, but the odds are not on their side. Getting infinite compute into a new company is meaningless if you don’t already have your code, data, and pretraining architectures ready. Eventually the clock will run out on plans that amount to catching up to the frontier first and figuring it out from there. The more these companies raise, the more the expectations on their first output will rise as well. It’s not an enviable position, but it’s certainly ambitious.

In many ways I see the culture of Chinese technology companies (and education systems) as better suited for this sort of catch-up work. Many top AI researchers trained in the US want to work on a masterpiece, but what it takes in language modeling is often extended grinding to stabilize and replicate something that you know can definitely work.

I used to think that the AI bubble would pop financially, as seen through a series of economic mergers, acquisitions, and similar deals. I’m shifting to see more limitations on the human capital than the financial capital thrown at today’s AI companies. As the technical standard of relevance increases (i.e. how good the models people want to use are, or the best open model of a given size category), it simply takes more focused work to get a model there. This work is hard to cheat in time.

This all relates to how I, and other researchers, always comment on the low-hanging fruit we see to keep improving the models. As the models have gotten better, our systems for building them have gotten more refined, complex, intricate, and numerically sensitive.² While I see a similar amount of low-hanging fruit today as I did a year ago, the effort (or physical resources, GPUs) it can take to unlock it has increased. This pushes people to keep going one step closer to their limits, piling on more burnout. This is also why the WSJ reported that top researchers “said repeatedly that they work long hours by choice.” The best feel like they need to do this work or they’ll fall behind. It’s running one more experiment, running one more vibe test, reviewing one more colleague’s PR, reading one more paper, chasing down one more data contract. The to-do list is never empty.

The amount of context that you need to keep in your brain to perform well in many LM training contexts is ever increasing. For example, leading post-training pipelines around the launch of ChatGPT looked like two or maybe three well separated training stages. Now there are tons of checkpoints flying around getting merged, sequenced, and chopped apart as part of the final project. Processes that used to be managed by one or two people now have teams coordinating many data and algorithmic efforts that are trying to land in just a few models a year. I’ve personally transitioned from a normal researcher to something like a tech lead who is always trying to predict blockers before they come up (at any point in the post-training process) and get resources to fix them. I bounce in and out of problems to wherever the most risk is.
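
To make “merging” concrete, here is a minimal sketch of one common form it can take: uniform weight averaging of checkpoints that share an architecture, sometimes called model souping. This is an illustrative assumption on my part rather than a description of any specific lab’s pipeline, and the file names and single-file checkpoint format are hypothetical.

```python
# A minimal, hypothetical sketch: uniform weight averaging ("model souping")
# of checkpoints that share an architecture. Not any lab's actual pipeline.
import torch

def average_checkpoints(paths: list[str]) -> dict[str, torch.Tensor]:
    """Uniformly average parameters of checkpoints with identical keys and shapes."""
    merged: dict[str, torch.Tensor] = {}
    for path in paths:
        state = torch.load(path, map_location="cpu")
        for name, tensor in state.items():
            # Accumulate in float32 so bf16/fp16 weights don't lose precision.
            merged[name] = merged.get(name, 0) + tensor.float() / len(paths)
    return merged

if __name__ == "__main__":
    # Hypothetical checkpoints from different post-training runs.
    soup = average_checkpoints(["sft_v1.pt", "sft_v2.pt", "rl_v3.pt"])
    torch.save(soup, "merged_candidate.pt")
```

In practice the merge is rarely this uniform; weighted averages, per-layer interpolation, and sequencing merged checkpoints through further training stages are all common variations.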

Cramming and keeping technical context pushes out hobbies and peace of mind.

Training general language models you hope others will adopt — via open weights or API — is becoming very much an all-in or all-out domain. Half-assing it is becoming an expensive way to make a model that no one will use. This wasn’t the case two years ago, when playing around with a certain part of the pipeline was legitimately impactful.

Culture is a fine line between performance and toxicity, and it’s often hard to know which you are until you get to a major deliverable to check in versus competitors.

Personally, I’m fighting off a double-edged sword here. I feel immense responsibility to make all the future Olmo models of the world great, while simultaneously trying to do a substantial amount of ecosystem work to create an informed discussion around the state of open models. My goal for this discussion is for more real things to get built. The ATOM Project is a manifestation of my feeling that both the U.S. ecosystem generally and the Olmo project specifically are falling behind.

It doesn’t really seem like there will be an immediate fix or end point to this, but looking back I’m sure it’ll be clear what the key moments were and whether or not my efforts here and elsewhere met my goals.

Will it all be worth it? How long do you plan to go on like this? It’s not like we’re really going to suddenly reach AGI and then all pack it up and go home. AI progress is a long haul now.

For me, the only reason to keep going is to try and make AI a wonderful technology for the world. Some feel the same. Others are going because they’re locked in on a path to generational wealth. Plenty don’t have either of these alignments, and the wall of effort comes sooner.


Thanks to Ross Taylor, Jordan Schneider, and Jasmine Sun for feedback on this post.

¹ Starting later is obviously far more challenging.

² For example, the more GPUs you train on and the bigger the model, the more types of errors you encounter across the massive system.
