Articles related to "Inference Efficiency"
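"Taming" masked diffusion language models with more consistent trajectories and fewer decoding steps: large gains in inference performance and efficiency for diffusion language models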
机器之心
2025-11-05T08:23:47.000000Z
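"Taming" masked diffusion language models with more consistent trajectories and fewer decoding steps: large gains in inference performance and efficiency for diffusion language models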
机器之心
2025-11-05T07:43:26.000000Z
FlashEVA: Accelerating LLM inference via Efficient Attention
cs.AI updates on arXiv.org
2025-11-05T05:25:27.000000Z
DTS: Enhancing Large Reasoning Models via Decoding Tree Sketching
cs.AI updates on arXiv.org
2025-11-05T05:14:19.000000Z
NVIDIA saves you money: making large-model reasoning "short and precise", 5x faster
机器之心
2025-11-04T14:56:38.000000Z
NVIDIA saves you money: making large-model reasoning "short and precise", 5x faster
机器之心
2025-11-04T07:39:11.000000Z
Kuaishou's HiPO framework arrives to stop LLMs from rambling
机器之心
2025-11-03T17:22:01.000000Z
Are Language Models Efficient Reasoners? A Perspective from Logic Programming
cs.AI updates on arXiv.org
2025-10-30T04:20:45.000000Z
Huawei Computing: KunLun AI Space runs DeepSeek V3.1 FP8 inference on Ascend, halving costs
IT之家
2025-10-28T12:58:02.000000Z
NVIDIA, HKU, and MIT jointly release Fast-dLLM v2: 2.5x higher end-to-end throughput
机器之心
2025-10-27T09:42:20.000000Z
ARC-Encoder: learning compressed text representations for large language models
cs.AI updates on arXiv.org
2025-10-24T04:28:56.000000Z
Metis-HOME: Hybrid Optimized Mixture-of-Experts for Multimodal Reasoning
cs.AI updates on arXiv.org
2025-10-24T04:28:39.000000Z
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
cs.AI updates on arXiv.org
2025-10-22T04:20:51.000000Z
DVAGen: Dynamic Vocabulary Augmented Generation
cs.AI updates on arXiv.org
2025-10-21T04:27:51.000000Z
Compressing Many-Shots in In-Context Learning
cs.AI updates on arXiv.org
2025-10-21T04:21:03.000000Z
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model
cs.AI updates on arXiv.org
2025-10-21T04:15:01.000000Z
CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning
cs.AI updates on arXiv.org
2025-10-20T04:14:27.000000Z