https://simonwillison.net/atom/everything 09月30日
Google发布Gemini 2.5 Flash和Flash-Lite新模型
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

Google发布了最新的Gemini 2.5 Flash和Flash-Lite模型,重点改进了指令遵循能力、降低了回答的冗长性,并增强了多模态和翻译能力。新模型在复杂任务中表现出更好的工具使用效率,同时在保持高质量输出的同时减少了token使用和延迟,提升了成本效益。

🔍 Gemini 2.5 Flash-Lite模型在三个关键主题上进行了改进:更好地遵循复杂指令和系统提示、减少回答的冗长性、以及增强多模态和翻译能力。

🚀 新模型在复杂任务中表现出更好的工具使用效率,特别是在SWE-Bench验证基准测试中,性能提升了5%,达到54%。

💡 Gemini 2.5 Flash-Lite模型在推理模式下更加高效,使用50%更少的输出token,并且比之前的版本快约40%,在Google AI Studio的API端点性能基准测试中,每秒输出约887个token。

🌐 新模型支持可选的推理token,通过示例命令展示了在思考和非思考模式下生成图像的能力,例如生成 pelicans riding bicycles 的SVG图像。

🎨 新模型在描述图像方面也表现出色,例如对生成的 pelican riding a bicycle 的图像进行了详细的单行替代文本描述,展示了其在多模态理解上的进步。

Improved Gemini 2.5 Flash and Flash-Lite (via) Two new preview models from Google - updates to their fast and inexpensive Flash and Flash Lite families:

The latest version of Gemini 2.5 Flash-Lite was trained and built based on three key themes:

    Better instruction following: The model is significantly better at following complex instructions and system prompts.Reduced verbosity: It now produces more concise answers, a key factor in reducing token costs and latency for high-throughput applications (see charts above).Stronger multimodal & translation capabilities: This update features more accurate audio transcription, better image understanding, and improved translation quality.

[...]

This latest 2.5 Flash model comes with improvements in two key areas we heard consistent feedback on:

    Better agentic tool use: We've improved how the model uses tools, leading to better performance in more complex, agentic and multi-step applications. This model shows noticeable improvements on key agentic benchmarks, including a 5% gain on SWE-Bench Verified, compared to our last release (48.9% → 54%).More efficient: With thinking on, the model is now significantly more cost-efficient—achieving higher quality outputs while using fewer tokens, reducing latency and cost (see charts above).

They also added two new convenience model IDs: gemini-flash-latest and gemini-flash-lite-latest, which will always resolve to the most recent model in that family.

I released llm-gemini 0.26 adding support for the new models and new aliases. I also used the response.set_resolved_model() method added in LLM 0.27 to ensure that the correct model ID would be recorded for those -latest uses.

llm install -U llm-gemini

Both of these models support optional reasoning tokens. I had them draw me pelicans riding bicycles in both thinking and non-thinking mode, using commands that looked like this:

llm -m gemini-2.5-flash-preview-09-2025 -o thinking_budget 4000 "Generate an SVG of a pelican riding a bicycle"

I then got each model to describe the image it had drawn using commands like this:

llm -a https://static.simonwillison.net/static/2025/gemini-2.5-flash-preview-09-2025-thinking.png -m gemini-2.5-flash-preview-09-2025 -o thinking_budget 2000 'Detailed single line alt text for this image'

gemini-2.5-flash-preview-09-2025-thinking

A minimalist stick figure graphic depicts a person with a white oval body and a dot head cycling a gray bicycle, carrying a large, bright yellow rectangular box resting high on their back.

gemini-2.5-flash-preview-09-2025

A simple cartoon drawing of a pelican riding a bicycle, with the text "A Pelican Riding a Bicycle" above it.

gemini-2.5-flash-lite-preview-09-2025-thinking

A quirky, simplified cartoon illustration of a white bird with a round body, black eye, and bright yellow beak, sitting astride a dark gray, two-wheeled vehicle with its peach-colored feet dangling below.

gemini-2.5-flash-lite-preview-09-2025

A minimalist, side-profile illustration of a stylized yellow chick or bird character riding a dark-wheeled vehicle on a green strip against a white background.

Artificial Analysis posted a detailed review, including these interesting notes about reasoning efficiency and speed:

    In reasoning mode, Gemini 2.5 Flash and Flash-Lite Preview 09-2025 are more token-efficient, using fewer output tokens than their predecessors to run the Artificial Analysis Intelligence Index. Gemini 2.5 Flash-Lite Preview 09-2025 uses 50% fewer output tokens than its predecessor, while Gemini 2.5 Flash Preview 09-2025 uses 24% fewer output tokens.Google Gemini 2.5 Flash-Lite Preview 09-2025 (Reasoning) is ~40% faster than the prior July release, delivering ~887 output tokens/s on Google AI Studio in our API endpoint performance benchmarking. This makes the new Gemini 2.5 Flash-Lite the fastest proprietary model we have benchmarked on the Artificial Analysis website

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

Google Gemini 2.5 Flash Gemini 2.5 Flash-Lite AI模型 指令遵循 多模态能力
相关文章