Fortune | FORTUNE, August 6
From OpenAI to Nvidia, researchers agree: AI agents have a long way to go

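The Agentic AI Summit, held in the Bay Area, brought together more than 2,000 students, researchers, and tech insiders to discuss AI agents. Top experts from OpenAI, Google DeepMind, Nvidia, and other companies shared the latest progress and prospects for agents, such as completing tasks autonomously and booking travel. Attendees were broadly cautious, however, noting that agents still face challenges in reliability, safety, and real-world use, and remain some way from large-scale deployment. Even so, the industry is confident about the future of AI agents and expects advances in infrastructure and hardware to boost their capabilities, with "narrow wins" already visible in specific domains such as coding.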

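💡 The Agentic AI Summit drew more than 2,000 students, researchers, and tech professionals working in AI, a sign of the strong pull of agent technology. It also gathered top experts from OpenAI, Google DeepMind, Nvidia, and other companies, underscoring the field's position at the frontier.

🚀 AI agents are defined as AI-powered systems that can complete tasks autonomously using other software tools, for example by planning and booking a vacation automatically. Compared with traditional RPA (robotic process automation), AI agents are meant to be more flexible and more powerful, adapting to complex and changing business needs.

⚠️ Despite the attention on AI agents, the experts at the summit were broadly cautious. Google DeepMind's Ed Chi stressed the gap between demo environments and real production environments, and OpenAI's Sherwin Wu noted that agents have not yet changed day-to-day work as much as expected. Reliability, safety, and trustworthiness remain key challenges.

🌟 Despite the challenges, the industry remains optimistic about the future of AI agents. Databricks' Ion Stoica welcomed improvements in infrastructure, and Nvidia's Bill Dally argued that hardware advances will enable more powerful and more efficient agent behavior. In specific domains such as coding, AI agents have already made notable progress, which points to their potential.

🌍 The article also covers other AI news: the U.S. government's approval of OpenAI, Google, and Anthropic for a federal AI vendor list; the effect of AI investment on the U.S. economy; a funding round for the AI sales tool Clay; and Google DeepMind's new "world model," Genie 3, which generates real-time interactive simulated environments from text prompts and is seen as a key step toward training advanced agents and achieving artificial general intelligence.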

Only in the Bay Area does spending a Saturday geeking out about AI agents—alongside 2,000 students, researchers, and tech insiders crammed into UC Berkeley—feel like a totally normal weekend plan. As I picked up my badge at the day-long Agentic AI Summit and watched the line snake through the student union lobby, it felt less like an academic conference and more like Silicon Valley’s version of a buzzy New York brunch spot.

This was certainly due in part to the speaker lineup, which was stacked with top AI researchers and scientists, including Jakub Pachocki, chief scientist at OpenAI; Ed Chi, VP of research at Google DeepMind; Bill Dally, chief scientist at Nvidia; Ion Stoica, a UC Berkeley professor and cofounder of Databricks and Anyscale; and Dawn Song, a pioneering UC Berkeley professor focused on AI security.

The popularity might also have been due to the buzzy topic: AI agents, generally defined as AI-powered systems that can complete tasks, mostly autonomously, using other software tools. Think not only suggesting a vacation itinerary, but also booking the flight and making the hotel reservation.

As my colleague Jeremy Kahn said in a recent article, “This kind of automation is a perennial C-suite fever dream. Over the past decade, companies embraced ‘robotic process automation,’ or RPA. This was software that could automate repetitive tasks, such as cutting and pasting between database programs. But traditional RPA systems are inflexible and unable to deal with exceptions, and can usually handle only one narrow task.” Agentic AI is meant to be both more flexible and more powerful, adapting to business needs.

In a January 2025 blog post, OpenAI CEO Sam Altman said, “We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.”

But despite the hype, the overall message at the Agentic AI Summit was cautious and grounded: Agents may be the buzziest trend in AI right now, but the tech still has a long way to go, speakers said. AI agents, unfortunately, aren’t always reliable. They may not remember what came before.

Google DeepMind’s Chi, for example, stressed the gap between what agents can do in curated demos and what’s still needed in real-world production environments. Pachocki highlighted concerns around the safety, security, and trustworthiness of agentic systems, particularly when they’re integrated into sensitive applications or operate autonomously.

“I still don’t think agents have really lived up to their promise,” said Sherwin Wu, head of engineering at OpenAI API. “Certain more generic cases have worked, but my day-to-day work doesn’t really feel that different with agents.”

While today’s agents may not live up to the massive hype (consider Salesforce CEO Marc Benioff’s recent claim that a shift to digital labor means he will be the “last CEO of Salesforce who only managed humans”), the speakers at the Agentic AI Summit still had plenty of optimism to share. Databricks’ Stoica expressed enthusiasm about infrastructure improvements that are making it easier to build agentic systems. Nvidia’s Dally suggested that continued hardware advances will enable more powerful and efficient agent behavior. Several pointed out “narrow wins” in specific domains, like coding.

Today’s AI agents may still have growing pains, but judging by the crowded UC Berkeley ballroom, the industry is keeping its eye on the prize: AI agents that can reliably operate in the real world. The payoff, they believe, will be well worth the wait.

With that, here’s more AI news.

Sharon Goldman
sharon.goldman@fortune.com
@sharongoldman

AI IN THE NEWS

U.S. agency approves OpenAI, Google, Anthropic for federal AI vendor list. Reuters reported today that the General Services Administration, which is the U.S. government's central purchasing arm, added OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude to a list of approved AI vendors in order to accelerate use of the technology by government agencies. The tools will be available to the agencies through a platform with contract terms in place. The GSA said approved AI providers "are committed to responsible use and compliance with federal standards."

The AI spending boom could have real consequences for the U.S. economy. According to the Washington Post, Big Tech’s record-breaking investment in artificial intelligence—more than $350 billion this year from Google, Meta, Amazon, and Microsoft—is becoming a major economic force, even as the broader U.S. economy shows signs of slowing. While job growth is cooling, this massive AI spending spree is fueling construction of data centers and driving demand for chips, servers, and networking gear—potentially boosting GDP growth by up to 0.7% in 2025. But economists warn the growing reliance on tech giants to prop up the economy is risky: if the AI boom loses steam, the economic fallout could be significant. 

AI sales tool Clay raises $100 million at a $3.1 billion valuation. The New York Times DealBook reported that Clay, which helps sales reps and marketers find new leads and turn them into customers, has raised $100 million at a $3.1 billion valuation. The round was led by CapitalG, an investment arm of Alphabet, Google’s parent company. Other participants included Meritech Capital Partners and Sequoia Capital. It comes around six months after the start-up raised money at a $1.25 billion valuation.

EYE ON AI RESEARCH

Google DeepMind's new Genie 3 'world model' creates real-time interactive simulations. Google DeepMind has unveiled Genie 3, a powerful new AI system that can generate rich, interactive virtual worlds from simple text prompts—making it possible to navigate dynamic environments in real time at 24 frames per second. But while it's tempting to immediately leap to using the model for the ultimate gaming experience, it’s actually the latest leap in the company’s long-term push toward 'world models'—or AI systems that can learn how the world works and simulate real-world environments. These are seen as key to training advanced agents and, eventually, achieving artificial general intelligence. Unlike prior video generators, Genie 3 allows users to move through AI-generated environments that stay visually consistent over several minutes—and even respond to commands like “make it snow” or “add a character.” For now, DeepMind is limiting access to Genie 3 to a small group of researchers and creators while it explores responsible deployment and risk.

FORTUNE ON AI

North Korean IT worker infiltrations exploded 220% over the past 12 months, with gen AI weaponized at every stage of the hiring process —by Amanda Gerut

AI is doing job interviews now—but candidates say they’d rather risk staying unemployed than talk to another robot —by Emma Burleigh

These charts show how China is pulling ahead of the U.S. in the race to power the AI future —by Matt Heimer and Nick Rapp

AI CALENDAR

Sept. 8-10: Fortune Brainstorm Tech, Park City, Utah.

Oct. 6-10: World AI Week, Amsterdam

Oct. 21-22: TedAI San Francisco.

Dec. 2-7: NeurIPS, San Diego

Dec. 8-9: Fortune Brainstorm AI San Francisco.

BRAIN FOOD

Could "depth of thought" be key to AI reasoning? 

A tiny new AI model is challenging what we know about how models learn to reason: Researchers from Singapore's Sapient Intelligence recently released the Hierarchical Reasoning Model (HRM), which draws inspiration from the brain’s layered thinking process—and the results have the AI community chattering. Despite being 100 times smaller than ChatGPT and trained on just 1,000 examples (with no internet data or step-by-step guidance), HRM solves tough logic problems like Sudoku, maze navigation, and abstract reasoning tasks that stump much larger models. Instead of mimicking human language, HRM reasons internally—quietly working through problems in hidden loops, much like a person thinking through a puzzle in their head. Its success hints at a radical shift in AI: one where depth of thought might matter more than scale.
