Articles tagged "Quantization"
Microsoft's BitDistill compresses LLMs to 1.58 bits: 10× memory savings, 2.65× faster CPU inference
机器之心
2025-10-20T13:33:30.000000Z
QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration
MarkTechPost@AI
2025-10-16T04:32:29.000000Z
Unlock Faster, Smarter Edge Models with 7x Gen AI Performance on NVIDIA Jetson AGX Thor
Nvidia Developer
2025-10-15T18:40:15.000000Z
Huawei's new open source technique shrinks LLMs to make them run on less powerful, less expensive hardware
VentureBeat
2025-10-03T21:45:37.000000Z
Deploy FLAN-T5 XXL on Amazon SageMaker
philschmid RSS feed
2025-09-30T11:12:51.000000Z
Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker
philschmid RSS feed
2025-09-30T11:11:08.000000Z
Top 10 Local LLMs (2025): Context Windows, VRAM Targets, and Licenses Compared
MarkTechPost@AI
2025-09-28T06:31:26.000000Z
How large an AI model can a laptop with 8 GB of VRAM run? The formula 90% of people don't know!
掘金 人工智能
2025-09-16T11:01:52.000000Z
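The article above does not reproduce its formula here, but a widely used rule of thumb estimates VRAM as parameter count × bytes per weight, plus overhead for activations and KV cache. A minimal sketch, assuming that rule of thumb (the 20% overhead factor is an illustrative assumption, not from the article):

```python
def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate GB needed to load a model: weights (params * bits / 8)
    plus ~20% assumed overhead for activations and KV cache."""
    weight_bytes = params_billion * 1e9 * bits / 8
    return weight_bytes * overhead / 1e9

# On an 8 GB laptop GPU: a 7B model at 4-bit quantization fits (~4.2 GB),
# while the same model at fp16 (~16.8 GB) does not.
for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: ~{estimate_vram_gb(7, bits):.1f} GB")
```

By this estimate, halving the bit width halves the weight footprint, which is why 4-bit quantization is the usual route to fitting 7B-class models into 8 GB.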
I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation
cs.AI updates on arXiv.org
2025-09-15T08:34:16.000000Z
Making robot "brains" lighter and faster: SQAP-VLA is the first to combine quantization and pruning to accelerate VLA models
我爱计算机视觉
2025-09-14T10:06:02.000000Z
Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT
Nvidia Developer
2025-09-03T15:28:41.000000Z