machinelearning apple · October 1, 03:43
Research on Optimization Strategies for Quantization-Aware Training

This work studies optimization strategies for quantization-aware training (QAT). Through experiments it determines the optimal ratio of QAT to full-precision training, and it proposes a new method that fuses the learning-rate cooldown with QAT to make effective use of compute.

Quantization-aware training (QAT) is a leading technique for improving the accuracy of quantized neural networks. Previous work has shown that decomposing training into a full-precision (FP) phase followed by a QAT phase yields superior accuracy compared to QAT alone. However, the optimal allocation of compute between the FP and QAT phases remains unclear. We conduct extensive experiments with various compute budgets, QAT bit widths, and model sizes from 86.0M to 2.2B parameters to investigate how different QAT durations impact final performance. We demonstrate that, contrary to previous findings, the loss-optimal ratio of QAT to FP training increases with the total amount of compute. Moreover, the optimal fraction can be accurately predicted for a wide range of model sizes and quantization widths using the tokens-per-parameter-byte statistic. From experimental data, we derive a loss scaling law that predicts both optimal QAT ratios and final model performance across different QAT/FP compute allocation strategies and QAT bit widths. We use the scaling law to make further predictions, which we verify experimentally, including which QAT bit width is optimal under a given memory constraint and how QAT accuracy with different bit widths compares to full-precision model accuracy. Additionally, we propose a novel cooldown and QAT fusion approach that performs learning rate decay jointly with quantization-aware training, eliminating redundant full-precision model updates and achieving significant compute savings. These findings provide practical insights into efficient QAT planning and enable the training of higher-quality quantized models with the same compute budget.
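As background for the abstract above, the core QAT mechanic is "fake quantization": weights are quantized and immediately dequantized in the forward pass, so the network trains against quantization error. The sketch below is illustrative only; the symmetric per-tensor scheme, the function names, and the round-to-nearest choice are assumptions, not the paper's exact method. It also computes the tokens-per-parameter-byte statistic that the abstract uses to predict the optimal QAT fraction.

```python
import numpy as np

def fake_quantize(w, bits=4):
    """Simulate signed b-bit symmetric per-tensor quantization.

    Returns dequantized weights for the forward pass; in real QAT the
    gradient bypasses the rounding via the straight-through estimator.
    (A common QAT building block; the paper's exact scheme may differ.)
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for 4-bit signed
    scale = max(np.max(np.abs(w)), 1e-12) / qmax  # guard against all-zero w
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integer grid
    return q * scale                              # back to float

def tokens_per_parameter_byte(tokens, params, bits):
    """Training tokens divided by the quantized model's size in bytes."""
    return tokens / (params * bits / 8)
```

For example, a 2.2B-parameter model quantized to 4 bits occupies 1.1 GB, so training on 1T tokens gives roughly 909 tokens per parameter-byte; per the abstract, this single statistic predicts the loss-optimal QAT fraction across model sizes and bit widths.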

