MarkTechPost@AI, September 28
Google Updates the Gemini 2.5 Flash and Flash-Lite Models

Google has released updated versions of the Gemini 2.5 Flash and Gemini 2.5 Flash-Lite preview models, available on AI Studio and Vertex AI. The updates bring improved agentic tool use and more efficient reasoning to the Flash model, and stricter instruction following, less verbose output, and stronger multimodal and translation capabilities to the Flash-Lite model. External benchmarks show Gemini 2.5 Flash-Lite making notable gains in speed and efficiency, making it one of the fastest proprietary models currently tracked. Google also introduced rolling aliases (such as gemini-flash-latest) for faster iteration, but recommends pinning fixed model strings in production for stability. The new models additionally improve long-context handling, tool calling, and cost efficiency.

✨ **Model performance gains**: Gemini 2.5 Flash is stronger at agentic tool use and multi-step reasoning, with a significantly higher SWE-Bench Verified score, indicating progress in long-horizon planning and code navigation. Gemini 2.5 Flash-Lite focuses on more precise instruction following, reduced output verbosity, and improved multimodal and translation capabilities; internal testing shows a large drop in output token counts.

🚀 **Speed and efficiency breakthrough**: In external independent benchmarks, the Gemini 2.5 Flash-Lite preview (09-2025) reached an output speed of roughly 887 tokens/s in AI Studio throughput tests, making it the fastest proprietary model in that tracker. Meanwhile, Gemini 2.5 Flash emits about 24% fewer output tokens and Flash-Lite about 50% fewer, which directly cuts output-token cost and processing time, especially in throughput-constrained serving.
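The throughput and token-reduction figures above combine into a simple back-of-envelope latency estimate. A minimal sketch, assuming decode time is roughly output tokens divided by throughput (prefill and network time are ignored; the function name is illustrative):

```python
# Reported Flash-Lite preview output speed from the benchmark thread.
THROUGHPUT_TOK_S = 887.0

def decode_seconds(output_tokens: float, throughput: float = THROUGHPUT_TOK_S) -> float:
    """Rough seconds spent streaming the response body at a given throughput."""
    return output_tokens / throughput

# A 1,000-token reply vs. the same reply at ~50% fewer output tokens.
baseline = decode_seconds(1_000)
trimmed = decode_seconds(1_000 * 0.5)
print(round(baseline, 2), round(trimmed, 2))
```

Under these assumptions, halving output tokens halves streaming time, which is why the verbosity reduction compounds with the raw throughput gain.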

💡 **Cost and context optimization**: Gemini 2.5 Flash-Lite's GA base price is $0.10 per million input tokens and $0.40 per million output tokens. It supports a context window of up to 1 million tokens, along with a configurable "thinking budget" and tool connectivity (such as search grounding and code execution), making it well suited to agentic systems that interleave reading, planning, and multiple tool calls.

🔄 **Alias management and production guidance**: Google introduced rolling aliases such as `gemini-flash-latest` and `gemini-flash-lite-latest` so users can quickly adopt the newest preview. For stability and predictability in production, however, Google recommends pinning specific model strings (such as `gemini-2.5-flash`) and watching for behavioral changes when switching to an alias. Google will give two weeks' email notice before an alias is retargeted.
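The GA prices quoted above make per-request cost a one-line calculation. A minimal sketch using the Flash-Lite base rates from the post (the helper name is hypothetical, and real bills may differ with caching, thinking tokens, or tiered pricing):

```python
# GA base prices for Gemini 2.5 Flash-Lite, per the post (USD per token).
INPUT_PRICE = 0.10 / 1_000_000   # $0.10 per 1M input tokens
OUTPUT_PRICE = 0.40 / 1_000_000  # $0.40 per 1M output tokens

def flash_lite_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at the GA base rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# Example: a 10k-token prompt with a 2k-token reply.
print(f"${flash_lite_cost(10_000, 2_000):.4f}")  # → $0.0018
```

A ~50% reduction in output tokens cuts only the second term, but since output tokens are 4x the price of input tokens here, verbose replies dominate cost for short prompts.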

Google released an updated version of Gemini 2.5 Flash and Gemini 2.5 Flash-Lite preview models across AI Studio and Vertex AI, plus rolling aliases—gemini-flash-latest and gemini-flash-lite-latest—that always point to the newest preview in each family. For production stability, Google advises pinning fixed strings (gemini-2.5-flash, gemini-2.5-flash-lite). Google will give a two-week email notice before retargeting a -latest alias, and notes that rate limits, features, and cost may vary across alias updates.
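In code, the pin-versus-alias decision reduces to which model string you pass to the API. A minimal sketch of that selection, using the model IDs named above (the helper function and `production` flag are illustrative, not part of Google's SDK):

```python
# Fixed strings for production; rolling aliases for fast iteration.
PINNED = {
    "flash": "gemini-2.5-flash",
    "flash-lite": "gemini-2.5-flash-lite",
}
LATEST = {
    "flash": "gemini-flash-latest",
    "flash-lite": "gemini-flash-lite-latest",
}

def model_string(family: str, production: bool = True) -> str:
    """Return the model ID to request for this deployment tier."""
    table = PINNED if production else LATEST
    return table[family]

# Production pins a fixed string; a staging job can track the alias.
print(model_string("flash"))                         # gemini-2.5-flash
print(model_string("flash-lite", production=False))  # gemini-flash-lite-latest
```

Centralizing the lookup this way means an alias retarget (announced two weeks ahead by email, per the post) only affects jobs that opted into `-latest`, while production traffic stays on the pinned, predictable string.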

https://developers.googleblog.com/en/continuing-to-bring-you-our-latest-models-with-an-improved-gemini-2-5-flash-and-flash-lite-release/

What actually changed?


Independent Stats from the community thread

Artificial Analysis (the account behind the AI benchmarking site) received pre-release access and published external measurements across intelligence and speed. Highlights from the thread and companion pages:

Cost surface and context budgets (for deployment choices)

Browser-agent angle and the o3 claim

A circulating claim says the “new Gemini Flash has o3-level accuracy, but is 2× faster and 4× cheaper on browser-agent tasks.” This is community-reported, not in Google’s official post. It likely traces to private/limited task suites (DOM navigation, action planning) with specific tool budgets and timeouts. Use it as a hypothesis for your own evals; don’t treat it as a cross-bench truth.

Practical guidance for teams

Model strings (current)

Summary

Google’s new release tightens tool-use competence (Flash) and token/latency efficiency (Flash-Lite) and introduces -latest aliases for faster iteration. External benchmarks from Artificial Analysis indicate meaningful throughput and intelligence-index gains for the Sept 2025 previews, with Flash-Lite now testing as the fastest proprietary model in their harness. Validate on your own workload—especially browser-agent stacks—before committing to the aliases in production.

The post The Latest Gemini 2.5 Flash-Lite Preview is Now the Fastest Proprietary Model (External Tests) and 50% Fewer Output Tokens appeared first on MarkTechPost.
