Articles tagged "Quantization"
Microsoft's BitDistill compresses LLMs to 1.58 bits: 10× memory savings, 2.65× faster CPU inference
机器之心
2025-10-20T13:33:30.000000Z
QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration
MarkTechPost@AI
2025-10-16T04:32:29.000000Z
Unlock Faster, Smarter Edge Models with 7x Gen AI Performance on NVIDIA Jetson AGX Thor
Nvidia Developer
2025-10-15T18:40:15.000000Z
Huawei's new open source technique shrinks LLMs to make them run on less powerful, less expensive hardware
VentureBeat
2025-10-03T21:45:37.000000Z
Deploy FLAN-T5 XXL on Amazon SageMaker
philschmid RSS feed
2025-09-30T11:12:51.000000Z
Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker
philschmid RSS feed
2025-09-30T11:11:08.000000Z
Top 10 Local LLMs (2025): Context Windows, VRAM Targets, and Licenses Compared
MarkTechPost@AI
2025-09-28T06:31:26.000000Z
How large an AI model can a laptop with 8 GB of VRAM run? The formula 90% of people don't know!
掘金 人工智能
2025-09-16T11:01:52.000000Z
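The article above does not reproduce its formula here, but a widely used rule of thumb estimates VRAM as parameter count × bytes per weight, plus overhead for activations and KV cache. A minimal sketch, assuming that rule of thumb (the 20% overhead factor is an illustrative assumption, not from the article):

```python
def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate GB needed to load a model: weights (params * bits / 8)
    plus ~20% assumed overhead for activations and KV cache."""
    weight_bytes = params_billion * 1e9 * bits / 8
    return weight_bytes * overhead / 1e9

# On an 8 GB laptop GPU: a 7B model at 4-bit quantization fits (~4.2 GB),
# while the same model at fp16 (~16.8 GB) does not.
for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit: ~{estimate_vram_gb(7, bits):.1f} GB")
```

By this estimate, halving the bit width halves the weight footprint, which is why 4-bit quantization is the usual route to fitting 7B-class models into 8 GB.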
I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation
cs.AI updates on arXiv.org
2025-09-15T08:34:16.000000Z
Making robot "brains" lighter and faster: SQAP-VLA is the first to combine quantization and pruning to accelerate VLA models
我爱计算机视觉
2025-09-14T10:06:02.000000Z
Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT
Nvidia Developer
2025-09-03T15:28:41.000000Z