Trending
Articles related to "quantized training"
NVFP4 Trains with Precision of 16-Bit and Speed and Efficiency of 4-Bit
Nvidia Developer 2025-09-03T15:10:33.000000Z
Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer
cs.AI updates on arXiv.org 2025-09-03T04:18:12.000000Z
SiLQ: Simple Large Language Model Quantization-Aware Training
cs.AI updates on arXiv.org 2025-07-24T05:31:06.000000Z
DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation
cs.AI updates on arXiv.org 2025-07-10T04:06:06.000000Z
Squat: Quant Small Language Models on the Edge
cs.AI updates on arXiv.org 2025-07-03T04:07:18.000000Z