"
模型扩展
" 相关文章
ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters
cs.AI updates on arXiv.org
2025-10-22T04:22:50.000000Z
Google × Yale jointly release an anti-cancer tool: AI reasoning precisely targets "invisible" cancer cells
36kr - Tech
2025-10-17T01:12:40.000000Z
Claude Skills are awesome, maybe a bigger deal than MCP
https://simonwillison.net/atom/everything
2025-10-16T21:39:00.000000Z
Scaling Laws and Symmetry, Evidence from Neural Force Fields
cs.AI updates on arXiv.org
2025-10-14T04:14:47.000000Z
Continual Adapter Tuning with Semantic Shift Compensation for Class-Incremental Learning
cs.AI updates on arXiv.org
2025-10-13T04:14:53.000000Z
QDeepGR4J: Quantile-based ensemble of deep learning and GR4J hybrid rainfall-runoff models for extreme flow prediction with uncertainty quantification
cs.AI updates on arXiv.org
2025-10-08T04:11:43.000000Z
Understanding Generative Recommendation with Semantic IDs from a Model-scaling View
cs.AI updates on arXiv.org
2025-10-01T05:57:42.000000Z
ICML 2025 | Latent-space memory takes the stage: M+ breaks the context limit, letting an 8B model remember 160K+ tokens
PaperWeekly
2025-07-27T09:01:18.000000Z
Supernova: Achieving More with Less in Transformer Architectures
cs.AI updates on arXiv.org
2025-07-22T04:34:16.000000Z
Microsoft and others propose the new "Chain-of-Model" paradigm: performance on par with Transformer, with better scalability and flexibility
机器之心
2025-06-02T06:54:10.000000Z
Not just "piling on parameters": Qwen's new breakthrough ParScale uses "parallelism" to make models smarter
掘金 (Juejin) - AI
2025-05-20T02:03:02.000000Z
Multimodal Models Don’t Need Late Fusion: Apple Researchers Show Early-Fusion Architectures are more Scalable, Efficient, and Modality-Agnostic
MarkTechPost@AI
2025-04-14T22:20:29.000000Z
Google's ultra-hardcore textbook is here! Jeff Dean reveals Gemini training secrets: scaling on TPUs
新智元
2025-02-24T01:15:55.000000Z
NVIDIA, together with MIT, Tsinghua, and Peking University, releases SANA 1.5: linear diffusion Transformer sets a new text-to-image SOTA
智源社区
2025-02-08T12:52:14.000000Z
Anthropic co-founder: AI has not yet hit its limits, and 2025 will bring continued breakneck progress
IT之家
2024-12-26T01:25:16.000000Z
Is AI progress slowing down?
AI Snake Oil
2024-12-18T16:51:49.000000Z
Tokenize everything, even the network: Peking University, Google & the Max Planck Institute propose TokenFormer, and the Transformer has never been this flexible
36氪 (36Kr) - Tech Channel
2024-11-14T11:43:46.000000Z
Tokenize everything, even the network! Peking University, Google & the Max Planck Institute propose TokenFormer, and the Transformer has never been this flexible!
机器之心
2024-11-14T05:54:48.000000Z
Jensen Huang: I never care about market share; NVIDIA's sole goal is to create new markets
智源社区
2024-10-25T00:38:32.000000Z
Microsoft Releases GRIN MoE: A Gradient-Informed Mixture of Experts MoE Model for Efficient and Scalable Deep Learning
MarkTechPost@AI
2024-09-21T17:35:32.000000Z