VentureBeat · 6 minutes ago
Ant Group releases Ring-1T, a trillion-parameter open-source model, challenging the AI competitive landscape

Chinese tech company Ant Group has released its latest open-source reasoning model, Ring-1T. With a remarkable one trillion parameters, the model is designed to compete with models such as OpenAI's GPT-5 and Google's Gemini. Ring-1T excels at mathematics, logical reasoning, code generation and scientific problem-solving, achieving state-of-the-art performance on several challenging benchmarks despite relying solely on natural-language reasoning. To train a model of this scale, Ant Group developed three innovative training methods, IcePop, C3PO++ and ASystem, to address the challenges of large-scale model training. In benchmark testing, Ring-1T performed strongly, finishing second only to GPT-5 on several tests and ranking best among all open-source models evaluated. Its release underscores China's sustained investment and rapid progress in AI, intensifying the US-China AI rivalry.

🌟 Ring-1T, the first open-source trillion-parameter reasoning model, marks a major step forward for China in AI. Released by Ant Group, it aims to compete with industry-leading models such as GPT-5 and Gemini, demonstrating strong capabilities in mathematics, logical reasoning, code generation and scientific problem-solving. Its excellent results across multiple benchmarks, achieved while relying solely on natural-language reasoning, underscore its sophistication and potential.

🚀 Training Ring-1T required overcoming the technical challenges posed by a model of this size, for which Ant Group developed three key innovations: IcePop, C3PO++ and ASystem. IcePop stabilizes training through double-sided masking calibration, C3PO++ optimizes GPU utilization to process training data efficiently, and ASystem adopts an asynchronous-operation architecture. These breakthroughs were essential to training a model at this scale.

📈 Ring-1T is highly competitive against today's top models. On the AIME 25 leaderboard it scores second only to GPT-5 and first among all open-source models tested. In coding it even outperforms DeepSeek and Qwen, a result attributed to its carefully synthesized dataset, which lays a solid foundation for future work on agentic applications.

🌐 The release of Ring-1T further intensifies the global AI competitive landscape, particularly the "AI race" between China and the United States. Chinese tech companies are releasing and improving AI models at an unprecedented pace, as seen with Alibaba's Qwen3-Omni and DeepSeek's continued iteration. Ant Group's breakthrough in ultra-large-model training demonstrates China's growing strength and ambition in the field.

China’s Ant Group, an affiliate of Alibaba, detailed technical information around its new model, Ring-1T, which the company said is “the first open-source reasoning model with one trillion total parameters.”

Ring-1T aims to compete with other reasoning models, such as GPT-5 and the o-series from OpenAI, as well as Google's Gemini 2.5. With the release, Ant extends the geopolitical debate over who will dominate the AI race: China or the US.

Ant Group said Ring-1T is optimized for mathematical and logical problems, code generation and scientific problem-solving. 

“With approximately 50 billion activated parameters per token, Ring-1T achieves state-of-the-art performance across multiple challenging benchmarks — despite relying solely on natural language reasoning capabilities,” Ant said in a paper.

Ring-1T, first released in preview in September, adopts the same architecture as Ling 2.0 and was trained on the Ling-1T-base model the company released earlier this month. Ant said this allows the model to support a context window of up to 128,000 tokens.
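That profile, one trillion total parameters with only about 50 billion active per token, is characteristic of a mixture-of-experts (MoE) design, in which a router activates only a few experts for each token. The sketch below is a minimal, illustrative top-k MoE routing layer in PyTorch; the dimensions, expert count and top-k value are placeholder assumptions, not Ring-1T's actual configuration.

```python
import torch
import torch.nn.functional as F

def moe_forward(x, router_w, experts, k=2):
    """Minimal top-k mixture-of-experts routing (illustrative only).

    Only the k experts selected per token run, so the activated
    parameter count is a small fraction of the total.
    x: (tokens, d_model) activations; router_w: (d_model, n_experts).
    """
    logits = x @ router_w                        # (tokens, n_experts)
    weights, idx = logits.topk(k, dim=-1)        # choose k experts per token
    weights = F.softmax(weights, dim=-1)         # normalize over chosen experts
    out = torch.zeros_like(x)
    for slot in range(k):
        for e in range(len(experts)):
            mask = idx[:, slot] == e             # tokens routed to expert e
            if mask.any():
                out[mask] += weights[mask, slot, None] * experts[e](x[mask])
    return out

# Toy example: 8 experts, 2 active per token, so only ~25% of expert
# parameters are exercised for any given token.
d, n_experts = 16, 8
experts = [torch.nn.Linear(d, d) for _ in range(n_experts)]
y = moe_forward(torch.randn(4, d), torch.randn(d, n_experts), experts)
```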

To train a model as large as Ring-1T, researchers had to develop new methods to scale reinforcement learning (RL).

New methods of training

Ant Group developed three “interconnected innovations” to support the RL training of Ring-1T, a challenge given the model's size and the large compute requirements that scale entails. The three are IcePop, C3PO++ and ASystem.

IcePop removes noisy gradient updates to stabilize training without slowing inference. It helps eliminate catastrophic training-inference misalignment in RL. The researchers noted that when training models, particularly those using a mixture-of-experts (MoE) architecture like Ring-1T, there can often be a discrepancy in probability calculations. 

“This problem is particularly pronounced in the training of MoE models with RL due to the inherent usage of the dynamic routing mechanism. Additionally, in long CoT settings, these discrepancies can gradually accumulate across iterations and become further amplified,” the researchers said. 

IcePop “suppresses unstable training updates through double-sided masking calibration.”
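As a rough illustration of that calibration idea: per-token probabilities from the training engine are compared against those from the inference engine that generated the rollout, and tokens whose ratio falls outside a band in either direction are masked out of the loss. The sketch below is a reconstruction from the paper's description, not Ant's implementation; the thresholds, loss form and names are assumptions.

```python
import torch

def icepop_masked_loss(train_logp, infer_logp, advantages,
                       lower=0.5, upper=2.0):
    """Illustrative double-sided masking of noisy policy-gradient updates.

    train_logp / infer_logp: per-token log-probs from the training and
    inference engines for the same sampled tokens.
    lower / upper: assumed bounds on the probability ratio, not Ant's values.
    """
    ratio = torch.exp(train_logp - infer_logp)            # p_train / p_infer
    keep = (ratio >= lower) & (ratio <= upper)            # double-sided mask
    pg_loss = -(train_logp * advantages)                  # simple PG loss
    masked = torch.where(keep, pg_loss, torch.zeros_like(pg_loss))
    return masked.sum() / keep.sum().clamp(min=1)         # mean over kept tokens

# Toy usage: an 8-token rollout whose two engines disagree slightly.
t = torch.log(torch.rand(8).clamp_min(1e-3))
loss = icepop_masked_loss(t, t + 0.3 * torch.randn(8), torch.randn(8))
```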

The second method is C3PO++, an improved version of the C3PO system Ant previously developed. It manages how Ring-1T and other extra-large models generate and process training examples, or rollouts, so GPUs don't sit idle.

It breaks rollout work into pieces that can be processed in parallel across two groups: an inference pool, which generates new data, and a training pool, which collects results to update the model. C3PO++ sets a token budget to control how much data is processed in each iteration, ensuring GPUs are used efficiently.
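A toy sketch of that token-budget idea follows: each iteration, rollouts are generated in chunks until the budget is spent, finished rollouts are handed to the training pool, and unfinished ones carry over to the next iteration. This is an illustrative reconstruction under assumed names and policies, not C3PO++'s actual scheduler.

```python
from collections import deque

def schedule_rollouts(pending, token_budget, max_chunk=512):
    """Toy token-budget rollout scheduler (names and policy are assumed).

    pending: deque of dicts with 'generated' and 'target' token counts.
    Returns (completed rollouts for the training pool, carried-over queue).
    """
    completed, spent = [], 0
    while pending and spent < token_budget:
        rollout = pending.popleft()
        step = min(max_chunk,
                   rollout["target"] - rollout["generated"],
                   token_budget - spent)
        rollout["generated"] += step          # inference pool "generates"
        spent += step
        if rollout["generated"] >= rollout["target"]:
            completed.append(rollout)         # hand off to the training pool
        else:
            pending.append(rollout)           # resume in the next iteration
    return completed, pending

# Example: rollouts of 300, 800 and 400 tokens under a 1,000-token budget.
queue = deque({"generated": 0, "target": t} for t in (300, 800, 400))
done, queue = schedule_rollouts(queue, token_budget=1000)
print(len(done), "finished;", len(queue), "carried over")  # 1 finished; 2 carried over
```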

The last new method, ASystem, adopts a SingleController+SPMD (Single Program, Multiple Data) architecture to enable asynchronous operations.  
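In general terms, a single-controller-plus-SPMD setup has one controller dispatch the same program to many workers, each on its own data shard, and gather results without blocking on any one of them. The asyncio sketch below illustrates that generic pattern; it is not ASystem's implementation, and all names here are invented for the example.

```python
import asyncio

async def spmd_worker(rank, program, shard):
    """Every worker runs the same program on its own data shard (SPMD)."""
    return await program(rank, shard)

async def single_controller(program, shards):
    """One controller launches all workers and gathers their results
    asynchronously, so a slow worker never blocks dispatch."""
    tasks = [asyncio.create_task(spmd_worker(r, program, s))
             for r, s in enumerate(shards)]
    return await asyncio.gather(*tasks)

async def toy_step(rank, shard):
    await asyncio.sleep(0.01 * rank)      # stand-in for uneven GPU work
    return sum(shard)                     # stand-in for a local update

results = asyncio.run(single_controller(toy_step, [[1, 2], [3, 4], [5, 6]]))
print(results)                            # [3, 7, 11]
```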

Benchmark results

Ant evaluated Ring-1T on benchmarks measuring performance in mathematics, coding, logical reasoning and general tasks, testing it against models such as DeepSeek-V3.1-Terminus-Thinking, Qwen3-235B-A22B-Thinking-2507, Gemini 2.5 Pro and GPT-5 Thinking.

In benchmark testing, Ring-1T performed strongly, coming in second to OpenAI’s GPT-5 across most benchmarks. Ant said that Ring-1T showed the best performance among all the open-weight models it tested. 

The model posted a 93.4% score on the AIME 25 leaderboard, second only to GPT-5. In coding, Ring-1T outperformed both DeepSeek and Qwen.

“It indicates that our carefully synthesized dataset shapes Ring-1T’s robust performance on programming applications, which forms a strong foundation for future endeavors on agentic applications,” the company said. 

Ring-1T shows how much Chinese companies are investing in models 

Ring-1T is just the latest model from China aiming to dethrone GPT-5 and Gemini. 

Chinese companies have been releasing impressive models at a quick pace since DeepSeek's surprise launch of its R1 reasoning model in January. Alibaba, Ant's affiliate, recently released Qwen3-Omni, a multimodal model that natively unifies text, image, audio and video. DeepSeek has also continued to improve its models and earlier this month launched DeepSeek-OCR, which explores compressing text into visual tokens and rethinks how models process long contexts.

With Ring-1T and Ant’s development of new methods to train and scale extra-large models, the battle for AI dominance between the US and China continues to heat up.   
