Trending
Articles related to "MTP"
Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries
cs.AI updates on arXiv.org 2025-10-17T04:18:58.000000Z
Official release: two Qwen3-Next-80B-A3B models!
Tongyi 2025-09-12T17:07:19.000000Z
Accelerating SGLang with Multiple Token Prediction
Large Model Systems Organization 2025-07-17T22:19:22.000000Z