Two Chinese AI models unveiled within two days

Recently, DeepSeek and Z.ai each released a new AI model: DeepSeek-V3.2-Exp and GLM-4.6. DeepSeek-V3.2-Exp introduces the DeepSeek Sparse Attention mechanism, aimed at improving training and inference efficiency in long-context scenarios, while GLM-4.6 expands the context window to 200K tokens and scores higher on code benchmarks.

Two new models from Chinese AI labs in the past few days. I tried them both out using llm-openrouter:

DeepSeek-V3.2-Exp from DeepSeek. Announcement, Tech Report, Hugging Face (690GB, MIT license).

As an intermediate step toward our next-generation architecture, V3.2-Exp builds upon V3.1-Terminus by introducing DeepSeek Sparse Attention—a sparse attention mechanism designed to explore and validate optimizations for training and inference efficiency in long-context scenarios.

This one felt very slow when I accessed it via OpenRouter - I probably got routed to one of the slower providers. Here's the pelican:

GLM-4.6 from Z.ai. Announcement, Hugging Face (714GB, MIT license).

The context window has been expanded from 128K to 200K tokens [...] higher scores on code benchmarks [...] GLM-4.6 exhibits stronger performance in tool using and search-based agents.

Here's the pelican for that:
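For reference, the llm-openrouter workflow for this kind of test looks roughly like the sketch below. The OpenRouter model slugs shown are assumptions, not confirmed identifiers; run llm models to see what the plugin actually exposes:

    # One-time setup: install the plugin and store an OpenRouter API key
    llm install llm-openrouter
    llm keys set openrouter

    # Confirm the exact model identifiers OpenRouter exposes
    llm models | grep -i -e deepseek -e glm

    # Run the same pelican prompt against each model
    # (the slugs below are assumed, not confirmed)
    llm -m openrouter/deepseek/deepseek-v3.2-exp \
      "Generate an SVG of a pelican riding a bicycle"
    llm -m openrouter/z-ai/glm-4.6 \
      "Generate an SVG of a pelican riding a bicycle"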
