Trending
Articles related to "Mamba-2"
Language Modeling With Factorization Memory
cs.AI updates on arXiv.org 2025-11-05T05:22:00.000000Z
IBM Released new Granite 4.0 Models with a Novel Hybrid Mamba-2/Transformer Architecture: Drastically Reducing Memory Use without Sacrificing Performance
MarkTechPost@AI 2025-10-02T22:51:24.000000Z
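The Compute Terminator Arrives: An All-Star Chinese Team Deals a "Dimensionality-Reduction Strike" to the Attention Bottleneck as AI Races into the Logarithmic Era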
36kr 2025-06-09T09:29:16.000000Z
IBM Previews Granite 4.0 Tiny Model: 12 GB of VRAM Supports 5 Concurrent Sessions with 128K Context
IT之家 2025-05-10T03:53:48.000000Z
This AI Paper Presents a Direct Experimental Comparison between 8B-Parameter Mamba, Mamba-2, Mamba-2-Hybrid, and Transformer Models Trained on Up to 3.5T Tokens
MarkTechPost@AI 2024-06-19T06:31:44.000000Z