"
延迟降低
" 相关文章
zFLoRA: Zero-Latency Fused Low-Rank Adapters (cs.AI updates on arXiv.org, 2025-10-31)
SpecAgent: A Speculative Retrieval and Forecasting Agent for Code Completion (cs.AI updates on arXiv.org, 2025-10-22)
Optical Computation-in-Communication enables low-latency, high-fidelity perception in telesurgery (cs.AI updates on arXiv.org, 2025-10-17)
Speculative Actions: A Lossless Framework for Faster Agentic Systems (cs.AI updates on arXiv.org, 2025-10-07)
Accelerating LLM Inference with Precomputed Query Storage (cs.AI updates on arXiv.org, 2025-10-01)
Open-Vocabulary Spatio-Temporal Scene Graph for Robot Perception and Teleoperation Planning (cs.AI updates on arXiv.org, 2025-09-30)
AdaptJobRec: Enhancing Conversational Career Recommendation through an LLM-Powered Agentic System (cs.AI updates on arXiv.org, 2025-08-20)
Fail Fast, or Ask: Mitigating the Deficiencies of Reasoning LLMs with Human-in-the-Loop Systems Engineering (cs.AI updates on arXiv.org, 2025-07-22)
Only 7.6% of the tokens, with even stronger performance: a Chinese-led team proposes the new "Chain of Draft" (CoD), sharply cutting cost and latency (新智元, 2025-03-13)
Consistency Large Language Models (CLLMs): A New Family of LLMs Specialized for the Jacobi Decoding Method for Latency Reduction (MarkTechPost@AI, 2024-05-17)