Last Week in AI, October 19, 00:05
AI Roundup: Chip Partnerships, Platform Upgrades, and Model Iterations

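The AI field has been busy lately. OpenAI and Broadcom struck a partnership to co-design and deploy custom AI accelerator chips, aiming to improve efficiency and reduce dependence on existing suppliers. OpenAI’s DevDay 2025 laid out its vision of turning ChatGPT into an app platform and agent operating system, introducing Apps inside ChatGPT, the Apps SDK, and AgentKit. Anthropic launched Claude Haiku 4.5, a smaller, cheaper AI model, in response to intensifying competition. Google updated its video generation model to Veo 3.1 and integrated it further into its Flow video editor. Meanwhile, the Sora copyright controversy continues to develop, and Microsoft, Anthropic, Salesforce, and others have released new AI tools and services aimed at enterprise applications and the developer experience.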

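💡 **OpenAI teams up with Broadcom on custom AI chips**: To improve AI compute efficiency and reduce its reliance on Nvidia and AMD, OpenAI is working with Broadcom to design and deploy custom AI accelerators. The move signals heavy infrastructure investment by OpenAI, with the goal of building compute capacity at scale, and could significantly lower its future compute costs.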

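🚀 **ChatGPT becomes an app platform and agent operating system**: At DevDay 2025, OpenAI announced several major updates that turn ChatGPT into a platform combining apps, an SDK, and AgentKit. With Apps inside ChatGPT and AgentKit, developers can build and deploy interactive AI agents, greatly extending ChatGPT’s capabilities and use cases. Sam Altman also disclosed that ChatGPT has reached 800 million weekly active users.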

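💰 **Anthropic launches the more affordable Claude Haiku 4.5**: To stay competitive in a crowded AI market, Anthropic released Claude Haiku 4.5, the smallest and lowest-cost model in its Claude 4.x family. The model is fast and performs on par with larger models, particularly on computer-use and coding tasks, giving users a more cost-effective option.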

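🎬 **Google strengthens Veo video generation and integrates it into Flow**: Google released Veo 3.1, which markedly improves the visual fidelity, prompt adherence, and editing controls of its video generation model. The new model supports synchronized audio and better image-to-video quality, and offers richer creation and editing options in the Flow video editor, the Gemini app, and the API.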

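⚖️ **Sora copyright controversy and industry compliance challenges**: OpenAI’s video generation model Sora is spreading quickly while facing serious scrutiny over the copyright status of its training data. Although OpenAI has changed its public stance, debate over content copyright, user demand, and platform governance continues, reflecting the compliance and ethics challenges common across AI development.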

Editorial note: apologies for the newsletter and podcast not having come out regularly lately; startup life has kept me and my co-host rather busy… I’ll do my best to resume a weekly release cadence for both the newsletter and podcast starting this week.

OpenAI Inks Deal With Broadcom to Design Its Own Chips for A.I.

OpenAI signed a deal with Broadcom to co‑design and deploy custom AI accelerators, aiming to roll out racks of OpenAI‑designed chips starting late next year. The systems will integrate compute, memory, and networking on Broadcom’s Ethernet stack, targeting major efficiency gains for OpenAI’s workloads while reducing reliance on Nvidia and AMD. The partnership fits into a plan to build roughly 10 gigawatts of compute capacity, with OpenAI already constructing a data center in Abilene, Texas and planning additional sites in Texas, New Mexico, Ohio, and the Midwest. Industry estimates put a 1‑gigawatt AI data center at around $50 billion—about $35 billion of which is chips at current Nvidia pricing—highlighting how custom silicon could significantly cut compute costs.
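The cost figures quoted above lend themselves to a rough back-of-envelope calculation. The sketch below simply scales the per-gigawatt estimate to the planned 10 gigawatts. The linear scaling and the `chip_discount` parameter are assumptions made for illustration, not figures from the deal.

```python
# Back-of-envelope scaling of the quoted estimates (all dollar figures in billions).
COST_PER_GW = 50.0        # quoted estimate for a 1-gigawatt AI data center
CHIP_COST_PER_GW = 35.0   # portion attributed to chips at current Nvidia pricing
PLANNED_GW = 10           # planned compute capacity mentioned in the article


def buildout_cost(gigawatts, chip_discount=0.0):
    """Total cost in $B, assuming linear scaling and a hypothetical chip discount."""
    chips = CHIP_COST_PER_GW * (1.0 - chip_discount)
    other = COST_PER_GW - CHIP_COST_PER_GW
    return gigawatts * (chips + other)


chip_share = CHIP_COST_PER_GW / COST_PER_GW
baseline = buildout_cost(PLANNED_GW)
# The 30% discount is purely illustrative; no savings figure has been disclosed.
with_discount = buildout_cost(PLANNED_GW, chip_discount=0.3)

print(f"chip share of cost: {chip_share:.0%}")
print(f"baseline: ${baseline:.0f}B, with 30% cheaper chips: ${with_discount:.0f}B")
```

Because chips make up most of the estimated cost, any saving on silicon carries through almost proportionally to the total, which is the point the estimate is making.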

OpenAI also has large agreements with Nvidia, Oracle, and AMD. Nvidia said it intends to invest $100 billion, and AMD effectively granted 160 million shares (around 10% of AMD) to support OpenAI’s buildout, whereas Broadcom is not investing equity. Broadcom’s custom AI chips (XPUs) have strong demand from hyperscalers, and its stock jumped about 9.9% on the news; however, Broadcom clarified OpenAI is not the previously disclosed $10 billion customer.

Everything OpenAI announced at DevDay 2025: AgentKit, Apps SDK, ChatGPT, and more

OpenAI’s DevDay 2025 reframed ChatGPT as an app platform and agent OS, debuting Apps inside ChatGPT, a preview Apps SDK, and AgentKit. Apps run directly in ChatGPT responses with interactive UIs, video, login, and actions via the Model Context Protocol (MCP). Launch partners include Canva, Zillow, Coursera, Figma, Spotify, Booking.com, and Expedia, with DoorDash, Instacart, Uber, and AllTrails “coming soon.” Live demos showcased end-to-end “talking to apps”: generating a poster in Canva, auto-building a pitch deck, and pulling Zillow listings with natural-language filters and maps, including full‑screen renders inside ChatGPT.

Anthropic launches Claude Haiku 4.5, a smaller, cheaper AI model

Anthropic introduced Claude Haiku 4.5, its smallest and most affordable Claude 4.x model, now available to all users including on the free tier. The company says Haiku 4.5 is notably fast and “punches above its weight,” outperforming older larger models and even surpassing Claude Sonnet 4 on computer‑use tasks. On coding, Haiku 4.5 scores comparably to Claude Sonnet 4 and OpenAI’s GPT‑5 on SWE‑bench Verified, a benchmark for real‑world bug fixing. Pricing-wise, Haiku models run at about one‑third the cost of Sonnet, and Sonnet is roughly one‑fifth the cost of Opus—making Haiku 4.5 the lowest‑cost paid option while granting more capacity to free users due to its smaller size.
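The two pricing ratios quoted above can be chained to see where Haiku sits relative to Opus. This is a small sketch that uses only those approximate ratios and normalizes Opus to 1.0. It does not use actual per-token prices, which the article does not give.

```python
from fractions import Fraction

# Approximate ratios as quoted in the article (not actual list prices).
SONNET_VS_OPUS = Fraction(1, 5)   # Sonnet is roughly one-fifth the cost of Opus
HAIKU_VS_SONNET = Fraction(1, 3)  # Haiku is about one-third the cost of Sonnet

# Relative cost of each tier with Opus normalized to 1.
relative_cost = {
    "Opus": Fraction(1),
    "Sonnet": SONNET_VS_OPUS,
    "Haiku": SONNET_VS_OPUS * HAIKU_VS_SONNET,
}

for tier, cost in relative_cost.items():
    print(f"{tier}: {float(cost):.3f}x the cost of Opus")
```

Under these rough ratios, Haiku comes out at about one-fifteenth the cost of Opus.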

The release follows Sonnet 4.5 (September) and Opus 4.1 (August), with an updated Opus targeted for late 2024 or early 2025. Anthropic, valued at $183 billion with a revenue run rate nearing $7 billion and over 300,000 business customers, is accelerating launches amid competition with Google and OpenAI, which released GPT‑5 and expanded with infrastructure deals and Sora.

Google releases Veo 3.1, adds it to Flow video editor

Google rolled out Veo 3.1, an upgrade to its Veo 3 video generation model, focused on higher‑fidelity visuals, stronger prompt adherence, and richer editing controls. The update improves image‑to‑video quality and introduces better audio output, adding synchronized audio to features like reference‑image character control, first/last‑frame guided clip generation, and clip extension from trailing frames. Veo 3.1 also supports granular edits, including adding objects that blend into a clip’s visual style, with object removal coming soon in Flow. The new model is available now in Flow, the Gemini app, and via Vertex and Gemini APIs.

OpenAI Reverses Stance on Use of Copyright Works in Sora

OpenAI faced intense scrutiny over Sora’s training data and copyright handling, prompting a shift in its public stance as legal and public‑relations pressures mounted. The launch sparked rapid user adoption, propelled the app to the top of the U.S. App Store, and also triggered a wave of copycat apps. Commentary from Altman and reporting from multiple outlets underscore how copyright risk, user demand, and platform governance are intersecting in Sora’s rapid rollout.

Other News

Tools

Microsoft launches ‘vibe working’ in Excel and Word. Microsoft’s new “Agent Mode,” powered by OpenAI’s GPT‑5 (with an Anthropic‑powered Office Agent in Copilot chat), can generate, plan, and execute complex spreadsheets, documents, and slide decks from simple prompts.

Anthropic AI Releases Petri: An Open-Source Framework for Automated Auditing by Using AI Agents to Test the Behaviors of Target Models on Diverse Scenarios. Petri automates large‑scale alignment audits by orchestrating an auditor agent to run multi‑turn, tool‑augmented probes against target models, synthesize realistic environments and tools, and use an LLM judge to score transcripts across a default 36‑dimension rubric.

Anthropic turns to ‘skills’ to make Claude more useful at work. Organizations can create and share reusable “Skills”—sets of instructions, scripts, and resources—that teach Claude to perform specific workplace tasks, integrating across Claude.ai, Claude Code, the API, and the Claude Agent SDK.

Salesforce announces Agentforce 360 as enterprise AI competition heats up. The update adds new prompting and builder tools (including a beta Agent Script and Agentforce Builder with “Vibes” app‑vibe coding), deepens Agentforce’s Slack integration, and lets customers use reasoning models from Anthropic, OpenAI, and Google to build more predictable, flexible enterprise agents.

Slack is turning Slackbot into an AI assistant. Slackbot is gaining the ability to compile plans and summaries from across channels and files, search the workspace with natural language, and coordinate calendars. It runs inside a VPC, and employers can opt out.

Google’s AI Mode image search is getting more conversational. Users will be able to refine searches with natural‑language follow‑ups and mix uploaded reference images with text prompts. The English rollout begins in the U.S. this week.

Google’s Search Live comes to India, AI Mode gets more languages. Google is launching Search Live in English and Hindi in India, expanding AI Mode to seven additional Indian languages, and leveraging local interactions to improve multimodal visual understanding over time.

Microsoft AI announces first image generator created in-house. Microsoft’s in‑house model prioritizes photorealism and speed, incorporates feedback from creative professionals to avoid generic styles, and has already ranked in the top 10 on AI benchmark site LMArena.

Zendesk says its new AI agent can solve 80% of support issues. Zendesk is introducing multiple LLM‑driven agents—an autonomous agent for most tickets, a co‑pilot for human technicians, and specialized admin, voice, and analytics agents—built from recent AI acquisitions and tested with customers who reported higher satisfaction.

Business

Amazon’s Zoox Robotaxis Have Arrived In Las Vegas. Zoox is offering free rides within a mapped, geofenced area along the Las Vegas Strip via a phone app; early rider reports have been mostly positive with no accidents reported.

Waymo’s robotaxis are coming to London. Waymo plans supervised data‑collection runs in London within weeks and aims to launch a fully driverless ride‑hail service via its app in 2026, with vehicles maintained by Moove.

OpenAI is the world’s most valuable private company after private stock sale. A secondary share sale paid $6.6 billion to current and former employees, with buyers including SoftBank and T. Rowe Price, valuing OpenAI at $500 billion and underscoring its fundraising momentum amid heavy infrastructure spending and ongoing product launches.

Meta partners up with Arm to scale AI efforts. Under a multi‑year deal, Meta will move ranking and recommendation systems onto Arm’s Neoverse platform to improve performance per watt as it expands data‑center capacity (including projects codenamed Prometheus and Hyperion).

Reflection AI raises $2B to be America’s open frontier AI lab, challenging DeepSeek. The new funding will secure large‑scale compute and recruit talent to train a frontier LLM (initially text‑focused, with future multimodal capabilities) whose publicly released model weights aim to offer an open‑access alternative, while monetization targets enterprise and sovereign deployments.

Supabase nabs $5B valuation, four months after hitting $2B. Supabase raised fresh funding, bringing total capital to $500 million, and included an option for community developers to buy stock as part of its Series E.

Character.AI removes Disney characters from platform after studio issues warning. Character.AI removed user‑created bots imitating Disney characters after receiving a cease‑and‑desist letter alleging unauthorized use of copyrighted and trademarked characters.

General Intuition lands $134M seed to teach agents spatial reasoning using video game clips. The startup is using Medal’s dataset of gaming clips to train agents and foundation models that learn spatial‑temporal reasoning from first‑person gameplay—targeting smarter in‑game bots and search‑and‑rescue drones—and raised $133.7M to scale research and engineering.

Research

Reasoning with Sampling: Your Base Model is Smarter Than You Think. Using a training‑free MCMC sampling method targeting a “power distribution” over base‑model outputs, the authors show inference‑time sampling can match or exceed RL post‑training (GRPO) on single‑shot and out‑of‑domain reasoning tasks while preserving multi‑sample diversity.
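To make the idea of sampling from a “power distribution” concrete, here is a toy sketch. It assumes the power distribution means the base distribution raised to an exponent `alpha` and renormalized, and it uses a plain Metropolis-Hastings chain over four hypothetical answers. The paper works with token sequences from a real language model, so this is an illustration of the general technique, not the authors’ sampler.

```python
import random
from collections import Counter

# Toy "base model": a fixed categorical distribution over four answers (hypothetical).
BASE = {"A": 0.5, "B": 0.3, "C": 0.15, "D": 0.05}
ALPHA = 4.0  # assumed sharpening exponent; alpha > 1 favors high-probability outputs


def metropolis_power_sample(base, alpha, steps, rng):
    """Metropolis-Hastings chain whose stationary distribution is base**alpha, normalized."""
    items = list(base)
    current = rng.choice(items)
    samples = []
    for _ in range(steps):
        proposal = rng.choice(items)  # symmetric uniform proposal
        accept = min(1.0, (base[proposal] / base[current]) ** alpha)
        if rng.random() < accept:
            current = proposal
        samples.append(current)
    return samples


rng = random.Random(0)
chain = metropolis_power_sample(BASE, ALPHA, steps=20000, rng=rng)
empirical = {k: v / len(chain) for k, v in Counter(chain).items()}

# Exact target for comparison: p(x)**alpha / sum over y of p(y)**alpha.
norm = sum(p ** ALPHA for p in BASE.values())
target = {k: p ** ALPHA / norm for k, p in BASE.items()}

print("empirical:", {k: round(empirical.get(k, 0.0), 3) for k in BASE})
print("target:   ", {k: round(target[k], 3) for k in BASE})
```

No training is involved: the chain only needs to evaluate the base probabilities, which is what makes this family of methods an inference-time technique.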

Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning. This paper presents a memory‑efficient, highly parallel evolution strategies implementation that directly searches billions of model parameters for LLM fine‑tuning, showing better sample efficiency, robustness, and stability than reinforcement learning on outcome‑only reasoning tasks.
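For readers unfamiliar with evolution strategies, the following is a minimal sketch of the basic loop on a three-parameter toy problem: perturb the parameters with Gaussian noise, score each perturbation using only its final reward, and move along the reward-weighted average of the noise. The reward function, hyperparameters, and names are all hypothetical, and the paper’s implementation at the scale of billions of parameters will differ substantially.

```python
import random


def reward(params):
    """Outcome-only reward: negative squared distance to a hidden target (toy task)."""
    target = [1.0, -2.0, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolution_strategies(n_iters=300, pop_size=30, sigma=0.1, lr=0.05, seed=0):
    """Basic ES loop with standardized rewards; no gradients of `reward` are used."""
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    for _ in range(n_iters):
        # Sample one Gaussian perturbation per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(pop_size)]
        rewards = [reward([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises]
        mean = sum(rewards) / pop_size
        std = (sum((r - mean) ** 2 for r in rewards) / pop_size) ** 0.5 or 1.0
        # Move theta along the reward-weighted average of the perturbations.
        for i in range(len(theta)):
            step = sum((r - mean) / std * eps[i] for r, eps in zip(rewards, noises)) / pop_size
            theta[i] += lr * step
    return theta


theta = evolution_strategies()
print("final parameters:", [round(t, 2) for t in theta])
print("final reward:", round(reward(theta), 4))
```

Each population member can be evaluated independently, which is why this style of search parallelizes well.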

Base Models Know How to Reason, Thinking Models Learn When. The authors argue that much of “thinking model” advantage comes from learning when to activate reasoning behaviors that base models already possess, enabling steered base models to recover most of the benchmark performance gap via a small fraction of targeted activation edits.

The Art of Scaling Reinforcement Learning Compute for LLMs. A predictive sigmoid‑like scaling framework and an RL recipe called ScaleRL—validated across hundreds of thousands of GPU‑hours—let researchers extrapolate RL performance from small runs, identify scalable methods, and improve asymptotic performance and compute efficiency.

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks. MemAct treats memory curation as explicit editing actions and pairs it with a Dynamic Context Policy Optimization algorithm so agents can autonomously manage and optimize working memory for long‑horizon tasks while controlling token and latency costs.

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity. Verbalized Sampling is a prompting technique that asks models to output multiple responses with probabilities, countering typicality bias in preference data and recovering pretrained diversity without retraining.

Can GenAI Improve Academic Performance? Evidence from the Social and Behavioral Sciences. Using author‑level panel data and a difference‑in‑differences design, the study finds that researchers who began using GenAI after ChatGPT’s release increased publication output—especially early‑career and non‑English‑speaking authors—and saw a modest rise in average journal impact.
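A difference-in-differences estimate compares how much an outcome changed for one group against how much it changed for a comparison group over the same period. The sketch below shows that arithmetic on invented publication counts. The numbers are made up for illustration and are not from the study, which uses author-level panel data and a fuller statistical model.

```python
from statistics import mean

# Hypothetical yearly publication counts per author (invented for illustration only).
adopters_before = [2, 3, 1, 2]
adopters_after = [4, 4, 2, 3]
non_adopters_before = [2, 2, 3, 1]
non_adopters_after = [2, 3, 3, 2]


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


effect = diff_in_diff(
    adopters_before, adopters_after, non_adopters_before, non_adopters_after
)
print(f"estimated effect: {effect:+.2f} publications per author per year")
```

Subtracting the comparison group’s change removes trends that affect everyone, under the assumption that both groups would otherwise have moved in parallel.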

Concerns

OpenAI’s internal Slack messages could cost it billions in copyright suit. Internal Slack and email discussions about deleting a pirated LibGen training dataset—and whether lawyers advised that deletion—are now key evidence as plaintiffs seek to show intentional destruction of evidence and secure access to privileged communications, potentially increasing damages dramatically.

AI users sue Microsoft in antitrust class action over OpenAI deal | Reuters. A proposed class action alleges Microsoft’s investment and arrangements with OpenAI violate antitrust laws, seeking remedies over competitive harm.

Policy

California becomes first state to regulate AI companion chatbots. The new law requires age verification, content warnings, suicide‑prevention protocols, and clear disclosures that interactions are AI‑generated, and bans chatbots from portraying themselves as healthcare professionals. Violations can carry penalties, including fines for illegal deepfakes.

Analysis

How ByteDance Made China’s Most Popular AI Chatbot. ByteDance’s Doubao combines chat, image and short‑video generation, multimodal voice and video interaction, customizable AI agents, and deep integration with Douyin to reach a broad, nontechnical user base—amassing over 157 million monthly active users.

Over 50 Percent of the Internet Is Now AI Slop, New Data Finds. An analysis by Graphite of 65,000 English‑language articles using the Surfer detector finds AI‑written content rose sharply after ChatGPT’s 2022 debut and now sits at about 52% of new articles, though detector accuracy and sample biases could mean the true share of human content is higher.
