Saturday Links: Browsers, $1.4T, and Advances in Continuous Learning
Reinforcement learning environments, orbital data centres, browser wars and blocking shopping bots.
This week: OpenAI asks for loan guarantees and then walks it back, Gemini may be coming to Siri, and data centres head to space (why is this not called Project Straylight?).
On to the most interesting stories:
- Meta and Hugging Face Launch OpenEnv, a Shared Hub for Agentic Environments. Hugging Face continues to do a great job of launching initiatives that share AI-adjacent resources. This announcement is about context and tooling for agents. Reinforcement Learning approaches are becoming more popular for AI agent training, and they require close control of the tools and actions available to an AI system. An interesting question will be whether OpenEnv specs will be usable for configuring production runtime environments; that seems like a requirement for agents trained in OpenEnv environments to be useful. The community itself is here.
- WEB WAR III: The Browser is Back. David Pierce at The Verge unpacks what is behind the recent surge of AI browser launches (still: please don't use them!). The article covers the power of browsers as the window onto the digital world that AI chatbots are now competing with. I agree with David that this is a fascinating reprise of battles that were mostly settled a decade ago by Chrome becoming dominant. What I'm less sure about is whether capturing browser market share matters for AI players. It seems more likely that as chat AIs become the first choice for some behaviours, they will include more and more browser-like functionality; browser usage share versus chatbots would then simply erode. OpenAI's strategy seems to be to threaten its competitors along almost every axis to keep them busy. Perplexity, with its Comet offering, perhaps sees this as a path to win users who are not yet native to chatbot AI.
- $20B in revenue run-rate and $1.4T in data centre commitments. In an X post this week, OpenAI CEO Sam Altman shared revenue numbers and projections. His quote was "We expect to end this year above $20 billion in annualized revenue run rate and grow to hundreds of billion [sic] by 2030. We are looking at commitments of about $1.4 trillion over the next 8 years." The press reaction has mostly scoffed at the large differential between $20B and $1.4T, and questioned whether we are at peak bubble. Altman is good at thinking big, though, and it doesn't take a real stretch to imagine that OpenAI could be in the hundreds of billions in revenue by 2030. $20B is already an impressive achievement, but they are just scratching the surface of how they can monetize (ads, affiliate fees for eCommerce, business productivity, and much more). At a $200-300B revenue run rate, the data centre commitments (which are not necessarily annual) are feasible. It's also sensible to try to lock in some commitments now. OpenAI will indeed be too big to fail.
- On-Policy Distillation (Kevin Lu and team @ Thinking Machines). [Via Esteve Almirall]. One of the biggest breakthroughs still required for AI systems is the ability to keep learning "on the job" after training. This "on-policy" learning isn't possible in a straightforward way for today's LLMs because the training process is long and involved, and even Reinforcement Learning (RL) post-training requires huge numbers of samples. Worse, trying to adjust model weights post-training often causes extreme loss of prior capabilities. In this new paper, the Thinking Machines team presents methods that use a powerful supervisor model to iteratively provide feedback to a smaller operational model as it answers real queries, scoring not just the answers but also the steps taken to reach them (this is the "distillation" mentioned in the title). The feedback is then used to create an incremental fine-tune to adapt the smaller model. Because the feedback covers not just the quality of the final answer but every token along the way, the signal for change is much stronger, and this seems to lead to much more effective learning. Also related to the FAIR paper from a few weeks ago.
- Amazon sends Perplexity cease-and-desist over AI browser agents making purchases. This seemingly small bit of news is actually caused by a serious long-term challenge for AI companies and existing web retailers. ChatGPT, Perplexity, and other AI services can now surface items for purchase (see Walmart's ChatGPT deal) and would like to automate the purchase process, ideally so they can capture some of the value. Amazon, for its part, would like to keep shoppers coming to Amazon.com, hence the blocking action. Further, Amazon does not want Perplexity to gain private user information. How long will this last? Until AI-initiated purchases become such a large component of purchases that destination shopping sites will have no choice but to "partner" with AI firms.
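To make the per-token feedback idea in the On-Policy Distillation item concrete, here is a minimal sketch of the core loss: the student samples a trajectory, and every sampled token position is scored against the teacher's distribution (the post uses a per-token reverse KL; this toy uses plain lists rather than real models or tensors, and all names and shapes are illustrative, not from the paper):

```python
import math

def softmax(logits):
    """Convert a vector of logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def per_token_reverse_kl(student_logits, teacher_logits):
    """Mean reverse KL(student || teacher), one term per sampled token.

    Unlike a single end-of-answer reward in RL post-training, every
    position in the student-sampled trajectory contributes a score,
    which is why the learning signal is so much denser.
    Inputs: one vocab-sized logit vector per token position.
    """
    losses = []
    for s_log, t_log in zip(student_logits, teacher_logits):
        p = softmax(s_log)  # student distribution at this position
        q = softmax(t_log)  # teacher (supervisor) distribution
        kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
        losses.append(kl)
    # In a real training loop this mean would be minimized by gradient
    # steps on the student (the incremental fine-tune); not shown here.
    return sum(losses) / len(losses)

# Toy trajectory: 3 sampled tokens over a 4-token vocabulary.
student = [[2.0, 0.5, 0.1, -1.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
teacher = [[2.0, 0.5, 0.1, -1.0], [0.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0]]
loss = per_token_reverse_kl(student, teacher)
print(f"mean per-token KL: {loss:.4f}")
```

The loss is zero only where the student already matches the teacher, so the gradient naturally concentrates on the tokens the smaller model gets wrong.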
Wishing you a great weekend!
