cs.AI updates on arXiv.org, Sep 29, 12:16
A Memory- and Compute-Efficient Method for Low-Communication Distributed Training

This paper proposes a memory- and compute-efficient low-communication distributed training method. By restricting backpropagation so that each node updates only a fixed subset of parameters, it significantly reduces peak memory usage and training FLOPs, while requiring no cross-node activation exchange. Experiments show that, under identical token and bandwidth budgets, the method matches the perplexity of existing low-communication approaches when training a 1.3B-parameter language model, while reducing training FLOPs and peak memory.

arXiv:2509.22418v1 Announce Type: cross Abstract: We introduce a memory- and compute-efficient method for low-communication distributed training. Existing methods reduce communication by performing multiple local updates between infrequent global synchronizations. We demonstrate that their efficiency can be significantly improved by restricting backpropagation: instead of updating all the parameters, each node updates only a fixed subset while keeping the remainder frozen during local steps. This constraint substantially reduces peak memory usage and training FLOPs, while a full forward pass over all parameters eliminates the need for cross-node activation exchange. Experiments on a $1.3$B-parameter language model trained across $32$ nodes show that our method matches the perplexity of prior low-communication approaches under identical token and bandwidth budgets while reducing training FLOPs and peak memory.
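To make the described scheme concrete, below is a minimal sketch of the idea in PyTorch (an assumption; the paper does not specify a framework and this is not the authors' code). Each node freezes all parameters except a fixed subset it owns, runs full forward passes, backpropagates only into its subset during local steps, and synchronizes the full parameter set only at infrequent intervals. The round-robin subset assignment, the helper names `assign_subset` and `sync_every`, and the plain parameter averaging are illustrative choices, not details from the paper.

```python
# Minimal sketch (assumption: PyTorch + torch.distributed; not the authors' implementation).
# Each node trains only a fixed subset of parameters locally, keeping the rest frozen,
# and synchronizes the full parameter set only every `sync_every` steps.
import torch
import torch.distributed as dist


def assign_subset(model, rank, world_size):
    """Freeze all parameters except a fixed subset owned by this node (round-robin split)."""
    params = list(model.parameters())
    for i, p in enumerate(params):
        p.requires_grad_(i % world_size == rank)
    return [p for p in params if p.requires_grad]


def train(model, data_loader, loss_fn, rank, world_size, sync_every=64, lr=1e-3):
    trainable = assign_subset(model, rank, world_size)
    # Optimizer state is kept only for the local subset, which lowers peak memory.
    opt = torch.optim.AdamW(trainable, lr=lr)
    for step, (x, y) in enumerate(data_loader, start=1):
        # Full forward pass over all parameters: no cross-node activation exchange needed.
        loss = loss_fn(model(x), y)
        # Backpropagation is restricted: gradients are computed only for the unfrozen subset.
        loss.backward()
        opt.step()
        opt.zero_grad(set_to_none=True)
        if step % sync_every == 0:
            # Infrequent global synchronization: average the full parameter set across nodes.
            with torch.no_grad():
                for p in model.parameters():
                    dist.all_reduce(p.data, op=dist.ReduceOp.SUM)
                    p.data.div_(world_size)
```

The synchronization step here is plain parameter averaging; the paper's actual outer update rule under its token and bandwidth budgets may differ, and is not reproduced here.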


Related tags

distributed training, low communication, memory efficiency, compute efficiency, backpropagation