cs.AI updates on arXiv.org, September 4
An Analysis of Training Efficiency Optimization for Deep Learning Models

This article analyzes the training times reported by MLPerf Training v4.1 on four workloads (BERT, Llama2 LoRA, RetinaNet, and Stable Diffusion), identifying configurations that balance performance, GPU usage, and efficiency, and pointing to a break-even point at which training time can be reduced while efficiency is maximized.

arXiv:2509.03263v1 Announce Type: cross Abstract: Training large-scale deep learning models has become a key challenge for the scientific community and industry. While the massive use of GPUs can significantly speed up training times, this approach has a negative impact on efficiency. In this article, we present a detailed analysis of the times reported by MLPerf Training v4.1 on four workloads: BERT, Llama2 LoRA, RetinaNet, and Stable Diffusion, showing that there are configurations that optimise the relationship between performance, GPU usage, and efficiency. The results point to a break-even point that allows training times to be reduced while maximising efficiency.
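The trade-off the abstract describes can be illustrated with a simple strong-scaling model. The sketch below is not from the paper: it assumes an Amdahl-style serial fraction and hypothetical single-GPU timings, and merely shows why training time shrinks as GPU count grows while parallel efficiency falls, producing a break-even region between the two. The paper's actual figures come from measured MLPerf Training v4.1 results, not from this model.

```python
# Minimal sketch (not from the paper): time vs. efficiency as GPU count grows,
# under an Amdahl-style strong-scaling assumption with hypothetical constants.

SERIAL_FRACTION = 0.05   # hypothetical non-parallelizable share of the workload
BASE_TIME_HOURS = 100.0  # hypothetical single-GPU training time

def training_time(gpus: int) -> float:
    """Estimated training time on `gpus` GPUs under the Amdahl model."""
    return BASE_TIME_HOURS * (SERIAL_FRACTION + (1 - SERIAL_FRACTION) / gpus)

def efficiency(gpus: int) -> float:
    """Parallel efficiency: speedup over one GPU divided by GPU count."""
    return (BASE_TIME_HOURS / training_time(gpus)) / gpus

for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"{n:3d} GPUs: time = {training_time(n):6.2f} h, "
          f"efficiency = {efficiency(n):5.1%}")
```

Running this shows, for example, that 64 GPUs cut the hypothetical training time from 100 h to about 6.5 h but at roughly 24% efficiency, while 8 GPUs still run at about 74% efficiency; where to stop scaling depends on how one weighs time savings against resource cost, which is the balance the paper quantifies on real workloads.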


Related tags

Deep Learning, Model Training, Efficiency Optimization, GPU Usage, MLPerf