SelfJudge: An Automatic Verifier for Accelerating LLM Inference

cs.AI updates on arXiv.org, October 6

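This paper proposes SelfJudge, which trains judge verifiers through self-supervision of the target model. This enables automatic verifier training across NLP tasks and yields a better trade-off between inference speed and accuracy for LLMs.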

arXiv:2510.02329v1 Announce Type: cross

Abstract: Speculative decoding accelerates LLM inference by verifying candidate tokens from a draft model against a larger target model. Recent work on judge decoding speeds this up further by relaxing the verification criteria, accepting draft tokens that may differ slightly from the target model's output. Existing methods, however, rely on human annotations or on tasks with verifiable ground truths, which limits how well they generalize across diverse NLP tasks. We propose SelfJudge, which trains judge verifiers via self-supervision of the target model. Our method measures semantic preservation by assessing whether token-substituted responses preserve the meaning of the original responses, enabling automatic verifier training across diverse NLP tasks. Our experiments show that SelfJudge achieves better inference-accuracy trade-offs than judge decoding baselines, offering a broadly applicable solution for faster LLM inference.
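To make the difference between strict verification and judge-style verification concrete, here is a minimal toy sketch in Python. It is not the paper's implementation. It assumes greedy decoding, treats tokens as plain strings, and assumes `target[i]` is the target model's own prediction at position `i` given the draft prefix. The names `verify_strict`, `verify_with_judge`, `toy_judge`, and the `INTERCHANGEABLE` table are hypothetical; in SelfJudge the judge would be a trained verifier, not a lookup table.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a learned judge: pairs of tokens that we
# assume can be swapped without changing the meaning of the response.
INTERCHANGEABLE = {frozenset(("big", "large"))}


def toy_judge(context: List[str], draft_tok: str, target_tok: str) -> bool:
    """Return True if the draft token is an acceptable substitute (toy rule)."""
    return frozenset((draft_tok, target_tok)) in INTERCHANGEABLE


def verify_strict(draft: Sequence[str], target: Sequence[str]) -> int:
    """Count leading draft tokens that exactly match the target's predictions."""
    accepted = 0
    for d, t in zip(draft, target):
        if d != t:
            break  # first mismatch ends the accepted prefix
        accepted += 1
    return accepted


def verify_with_judge(
    draft: Sequence[str],
    target: Sequence[str],
    judge: Callable[[List[str], str, str], bool],
) -> int:
    """Like verify_strict, but a mismatch is tolerated if the judge accepts it."""
    accepted = 0
    for i, (d, t) in enumerate(zip(draft, target)):
        if d != t and not judge(list(draft[:i]), d, t):
            break  # mismatch that the judge also rejects
        accepted += 1
    return accepted


draft = ["the", "big", "dog", "runs"]
target = ["the", "large", "dog", "sleeps"]

strict_len = verify_strict(draft, target)
judged_len = verify_with_judge(draft, target, toy_judge)
print(strict_len, judged_len)
```

In this toy case the strict check stops at the first mismatch, while the judge lets the near-synonym through and stops only at the token that changes the meaning. Longer accepted prefixes mean fewer target-model passes per generated token, which is where the speedup comes from.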
