MarkTechPost@AI · October 29, 06:36
MiniMax Releases M2: A Low-Cost, High-Performance Open-Source Model for Coding and Agents

The MiniMax team has released MiniMax-M2, a mixture-of-experts model optimized for coding and agentic workflows, open-sourced on Hugging Face under the MIT license. The model has 229B total parameters with roughly 10B activated per token, a design aimed at reducing memory footprint and latency while improving efficiency in multi-turn interaction and tool use. M2 uses an interleaved thinking format that wraps internal reasoning in <think>...</think> tags, and users are advised to keep these tags in the conversation history to preserve performance on multi-step tasks and tool chains. MiniMax officially claims M2 is priced at 8% of Claude Sonnet and runs at nearly twice its speed, with a free trial available.

🚀 MiniMax releases the open-source M2 model: a mixture-of-experts model built for coding and agentic workflows, with 229B total parameters and roughly 10B activated per token, published on Hugging Face under the MIT license.

💡 Architecture and activation-size advantage: M2 uses a compact MoE design that shrinks the activated parameter count to reduce memory pressure and tail latency, allowing more concurrent runs across CI, browsing, and retrieval, and yielding speed and cost advantages over dense models of comparable quality.

🧠 Interleaved thinking format: M2 wraps its internal reasoning in <think>...</think> tags, and the team stresses that retaining these tags across multi-turn conversations and tool chains is essential to preserving model performance.

📊 Benchmark results: the MiniMax team reports M2 scores on several coding- and agent-focused benchmarks, including Terminal-Bench (46.3), Multi SWE-Bench (36.2), BrowseComp (44.0), and SWE-bench Verified (69.4, with scaffold details).

💰 Cost and speed: MiniMax officially claims M2 is priced at roughly 8% of Claude Sonnet and runs at nearly twice its speed, offers a free trial window, and has published concrete token prices and the trial deadline.

Can an open-source MoE truly power agentic coding workflows at a fraction of flagship model costs while sustaining long-horizon tool use across MCP, shell, browser, retrieval, and code? The MiniMax team has just released MiniMax-M2, a mixture-of-experts (MoE) model optimized for coding and agent workflows. The weights are published on Hugging Face under the MIT license, and the model is positioned for end-to-end tool use, multi-file editing, and long-horizon plans. It lists 229B total parameters with about 10B active per token, which keeps memory and latency in check during agent loops.

https://github.com/MiniMax-AI/MiniMax-M2

Architecture: why activation size matters

MiniMax-M2 is a compact MoE that routes to about 10B active parameters per token. The smaller activations reduce memory pressure and tail latency in plan, act, and verify loops, and allow more concurrent runs in CI, browse, and retrieval chains. This is the performance budget that enables the speed and cost claims relative to dense models of similar quality.
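
To ground that claim, here is a minimal, self-contained sketch of top-k expert routing, the mechanism behind "about 10B active parameters per token." All dimensions, expert counts, and the value of k below are hypothetical illustration values, not MiniMax-M2's actual configuration.

```python
# Minimal sketch of top-k expert routing in a MoE layer. Each token only
# touches k of n_experts, so the parameters read per token are a small
# fraction of the layer's total. Sizes here are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)   # (tokens, n_experts)
        topw, topi = weights.topk(self.k, dim=-1)     # route each token to k experts
        topw = topw / topw.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in topi[:, slot].unique():
                mask = topi[:, slot] == e             # tokens routed to expert e
                out[mask] += topw[mask, slot:slot+1] * self.experts[int(e)](x[mask])
        return out
```

With k fixed and n_experts large, total capacity grows while per-token compute and weight traffic stay roughly flat, which is the trade that makes small activations cheap to serve.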

MiniMax-M2 is an interleaved thinking model. The research team wraps internal reasoning in <think>...</think> blocks and instructs users to keep these blocks in the conversation history across turns. Removing these segments harms quality in multi-step tasks and tool chains. This requirement is explicit on the Hugging Face model page.
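
This retention rule is easy to violate in chat clients that strip reasoning segments before re-sending history. A minimal sketch of the right and wrong behavior follows; `call_model` is a hypothetical stand-in for whatever client is actually used.

```python
# Sketch of the retention rule from the model card: append the assistant's
# full reply, <think>...</think> included, back into the history.
import re

def call_model(history: list[dict]) -> str:
    # Placeholder: in practice this would hit the MiniMax-M2 endpoint.
    return "<think>Inventory call sites, then edit, then test.</think> Step 1: ..."

history = [{"role": "user", "content": "Plan the repo-wide rename, then do it."}]
reply = call_model(history)

# Correct: keep the reply verbatim so later turns can see earlier reasoning.
history.append({"role": "assistant", "content": reply})

# Incorrect: stripping the think blocks, as below, is what the model card
# warns degrades multi-step and tool-use quality.
stripped = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
# history.append({"role": "assistant", "content": stripped})  # don't do this

history.append({"role": "user", "content": "Now run the test suite."})
reply = call_model(history)
```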

Benchmarks that target coding and agents

The MiniMax team reports a set of agent and code evaluations that are closer to developer workflows than static QA. On Terminal-Bench, the table shows 46.3. On Multi SWE-Bench, it shows 36.2. On BrowseComp, it shows 44.0. SWE-bench Verified is listed at 69.4, with the scaffold detail: OpenHands with 128k context and 100 steps.


MiniMax’s official announcement stresses pricing at 8% of Claude Sonnet and nearly 2x speed, plus a free access window. The same note provides the specific token prices and the trial deadline.

Comparison: MiniMax M1 vs MiniMax M2

| Aspect | MiniMax M1 | MiniMax M2 |
| --- | --- | --- |
| Total parameters | 456B total | 229B in model card metadata; model card text says 230B total |
| Active parameters per token | 45.9B active | 10B active |
| Core design | Hybrid Mixture of Experts with Lightning Attention | Sparse Mixture of Experts targeting coding and agent workflows |
| Thinking format | Thinking budget variants 40k and 80k in RL training; no think-tag protocol required | Interleaved thinking with <think>...</think> segments that must be preserved across turns |
| Benchmarks highlighted | AIME, LiveCodeBench, SWE-bench Verified, TAU-bench, long-context MRCR, MMLU-Pro | Terminal-Bench, Multi SWE-Bench, SWE-bench Verified, BrowseComp, GAIA (text only), Artificial Analysis intelligence suite |
| Inference defaults | temperature 1.0, top_p 0.95 | model card shows temperature 1.0, top_p 0.95, top_k 40; launch page shows top_k 20 |
| Serving guidance | vLLM recommended; Transformers path also documented | vLLM and SGLang recommended; tool-calling guide provided (see the serving sketch after the table) |
| Primary focus | Long-context reasoning, efficient scaling of test-time compute, CISPO reinforcement learning | Agent- and code-native workflows across shell, browser, retrieval, and code runners |
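
The inference defaults and serving guidance in the table suggest a simple local setup. The sketch below assumes vLLM's OpenAI-compatible server; the Hugging Face repo id is inferred from the launch materials and should be checked against the actual model page, and top_k rides in vLLM's extra_body extension to the OpenAI schema.

```python
# Sketch: query a locally served MiniMax-M2 through vLLM's OpenAI-compatible
# server, with the M2 sampling defaults from the model card row above.
# Start the server first (verify the exact repo id on Hugging Face):
#   vllm serve MiniMax-AI/MiniMax-M2 --trust-remote-code
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="MiniMax-AI/MiniMax-M2",
    messages=[{"role": "user", "content": "Write a shell one-liner to count TODOs."}],
    temperature=1.0,
    top_p=0.95,
    extra_body={"top_k": 40},  # top_k is a vLLM extension to the OpenAI schema
)
print(resp.choices[0].message.content)
```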

Key Takeaways

- M2 ships as open weights on Hugging Face under MIT, with safetensors in F32, BF16, and FP8 F8_E4M3.
- The model is a compact MoE with 229B total parameters and ~10B active per token, which the card ties to lower memory use and steadier tail latency in the plan, act, verify loops typical of agents.
- Outputs wrap internal reasoning in <think>...</think>, and the model card explicitly instructs retaining these segments in conversation history, warning that removal degrades multi-step and tool-use performance.
- Reported results cover Terminal-Bench, (Multi-)SWE-Bench, BrowseComp, and others, with scaffold notes for reproducibility, and day-0 serving is documented for SGLang and vLLM with concrete deploy guides.

Editorial Notes

MiniMax M2 lands with open weights under MIT, a mixture of experts design with 229B total parameters and about 10B activated per token, which targets agent loops and coding tasks with lower memory and steadier latency. It ships on Hugging Face in safetensors with FP32, BF16, and FP8 formats, and provides deployment notes plus a chat template. The API documents Anthropic compatible endpoints and lists pricing with a limited free window for evaluation. vLLM and SGLang recipes are available for local serving and benchmarking. Overall, MiniMax M2 is a very solid open release.
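
Since the API documents Anthropic-compatible endpoints, calls can reuse the standard Anthropic SDK with a redirected base URL. The sketch below is illustrative only: the base URL and model id are placeholders, and the real values should be taken from MiniMax's API documentation.

```python
# Sketch: calling MiniMax-M2 through the Anthropic-compatible endpoint the
# editorial notes mention. Base URL and model id below are placeholders.
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.minimax.example/anthropic",  # placeholder URL
    api_key="YOUR_MINIMAX_API_KEY",
)

message = client.messages.create(
    model="MiniMax-M2",  # placeholder model id; check the API doc
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function to be iterative."}],
)
print(message.content[0].text)
```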


Check out the API Doc and the Weights and Repo. Feel free to check out our GitHub Page for Tutorials, Codes and Notebooks.

The post MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster appeared first on MarkTechPost.

