MarkTechPost@AI, November 1, 2024
AUTO-CEI: A Curriculum and Expert Iteration Approach to Elevate LLMs’ Response Precision and Control Refusal Rates Across Diverse Reasoning Domains

AUTO-CEI is an innovative method for raising large language models’ (LLMs’) response precision and controlling their refusal rates on complex reasoning tasks. Through a dynamically adjusted curriculum combined with Expert Iteration, it lets LLMs calibrate their answers to their actual capabilities, and it performs strongly across a range of benchmarks.

🎯 AUTO-CEI, proposed by researchers from the National University of Singapore and Salesforce AI Research, introduces a structured curriculum approach that dynamically adjusts LLM training according to model performance.

💪 The method leverages Expert Iteration (EI): by repeatedly resampling responses and steering them along correct reasoning paths, it improves overall reasoning ability, encouraging confident answers within the model’s capability and appropriate refusals on tasks beyond it.

📈 AUTO-CEI performs strongly across benchmarks: it raises precision by 10% on BoardgameQA, improves LLMs’ handling of complex calculations on MATH, and keeps the refusal rate low on Blocksworld.

✨ Key results of AUTO-CEI include substantially higher precision, an effective balance between confidence and conservatism, more robust multi-step reasoning, and fewer step-by-step errors.

Large language models (LLMs) are increasingly applied to complex reasoning tasks that demand accurate responses across challenging scenarios, including logical reasoning, advanced mathematics, and intricate planning, domains such as decision-making and predictive modeling that require multi-step reasoning. As LLMs attempt to meet these demands, however, they run into two significant failure modes: “hallucination,” producing answers that appear plausible but lack accuracy, and “laziness,” defaulting to “I don’t know” whenever they are uncertain. Finding a method that lets LLMs deliver accurate, confidence-balanced responses without undue conservatism or inaccuracy has been a persistent goal.

LLMs face two central issues in these high-stakes reasoning tasks: they either overestimate their capabilities, leading to hallucinations, or become overly cautious, defaulting to refusals in situations they could handle effectively. Both behaviors stem from the need to manage complex, multi-step reasoning processes in which errors accumulate at each stage, compounding inaccuracies and reducing reliability. Existing techniques for mitigating hallucinations focus primarily on factual mistakes, integrating external knowledge, retrieval-based strategies, or reinforcement learning (RL). These techniques suit factual tasks but struggle in reasoning-based contexts, where inaccuracies arise from flaws in logical progression rather than factual missteps.

Researchers from the National University of Singapore and Salesforce AI Research have proposed an innovative approach called Automatic Curriculum Expert Iteration (AUTO-CEI). This new method introduces a structured “curriculum” approach to LLM training that dynamically adjusts based on the model’s performance, enabling LLMs to align their responses with their actual capabilities. AUTO-CEI leverages a specialized reinforcement learning technique, Expert Iteration (EI), which iteratively refines the model’s policy by resampling responses and guiding them along correct reasoning paths. This iterative approach promotes assertive responses within the model’s limits and appropriate refusals for complex tasks beyond those limits, enhancing overall reasoning capacity.
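Conceptually, one EI round resamples many candidate responses per problem, keeps only those that land on a correct reasoning path, and fine-tunes the model on its own best traces. The minimal Python sketch below illustrates that loop; every name here (Problem, sample_response, the filtering rule) is a hypothetical stand-in for illustration, not the authors’ implementation, and the random guesser merely keeps the toy runnable end to end.

```python
from dataclasses import dataclass
import random


@dataclass
class Problem:
    prompt: str
    answer: str


def sample_response(problem: Problem) -> str:
    # Stand-in for decoding a reasoning chain from the current policy;
    # a random guess keeps this toy loop runnable end to end.
    return random.choice([problem.answer, "wrong answer", "I don't know"])


def expert_iteration_round(problems, samples_per_problem=16):
    """One Expert Iteration round: resample, filter, collect expert traces."""
    expert_set = []
    for p in problems:
        candidates = [sample_response(p) for _ in range(samples_per_problem)]
        # Keep only responses that reach the reference answer; these
        # approximate the "correct reasoning paths" described above.
        expert_set.extend((p.prompt, c) for c in candidates if c == p.answer)
    return expert_set  # in practice, the model is then fine-tuned on this set


if __name__ == "__main__":
    demo = [Problem("2+2?", "4"), Problem("3*3?", "9")]
    print(len(expert_iteration_round(demo)), "expert traces collected")
```

In the real method, the surviving traces update the model’s policy, so each round’s sampling starts from a stronger model than the last.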

The AUTO-CEI process begins by training the LLM to assess its own performance boundaries, using the average number of reasoning steps required to reach a correct answer as a proxy for problem difficulty. Expert Iteration works within this curriculum, exploring possible reasoning paths to identify optimal, accurate responses. Correct answers receive positive rewards, whereas overly conservative or assertively incorrect answers incur penalties. The curriculum also adapts these rewards over time, incentivizing the LLM to engage in extended reasoning before opting to refuse, thus pushing the model’s limits incrementally and avoiding premature refusals. Through repeated cycles of Expert Iteration, the curriculum hones the model’s capacity to handle progressively complex reasoning tasks with greater robustness.
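To make this reward structure concrete, here is a minimal sketch of how a curriculum-adapted reward might look. The specific functional form, the 0.5 refusal ceiling, and the update rule are assumptions for illustration; the article only states that correct answers are rewarded, wrong or overly conservative answers are penalized, and the refusal incentive adapts over rounds.

```python
import math


def reward(correct: bool, refused: bool, steps: int, lam: float) -> float:
    """Hypothetical reward shaping consistent with the description above:
    correct answers earn full credit, confident wrong answers are penalized,
    and refusals earn partial credit that grows with the number of reasoning
    steps attempted first, discouraging premature "I don't know" responses.
    `lam` is the length scale the curriculum adjusts between EI rounds."""
    if refused:
        return 0.5 * (1.0 - math.exp(-steps / lam))
    return 1.0 if correct else -1.0


def update_curriculum(lam: float, refusal_rate: float,
                      target_refusal: float = 0.25,
                      factor: float = 1.5) -> float:
    """Toy curriculum update: if the model refuses too often, raise `lam`
    so refusals need more reasoning effort to pay off; otherwise relax it.
    The target rate and factor are illustrative, not from the paper."""
    return lam * factor if refusal_rate > target_refusal else lam / factor
```

Under this shaping, refusing after one step is worth far less than refusing after many, so the model is nudged toward exhausting its reasoning budget before giving up.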

In empirical testing across benchmarks including BoardgameQA, MATH, and Blocksworld, AUTO-CEI outperformed other state-of-the-art methods. On BoardgameQA, which involves logical reasoning over rule-based deductions, AUTO-CEI delivered a 10% increase in precision over the baseline, reaching 84.5% precision with a refusal rate of just 29.4%. On MATH, a challenging dataset requiring long chains of reasoning in algebra and geometry, AUTO-CEI attained 35.6% accuracy, a significant improvement in the model’s ability to work through and complete complex calculations. On Blocksworld, a planning task in which the model must sequence actions to reach a target block configuration, AUTO-CEI achieved a refusal rate of only 18.3%, balancing conservatism with the need for assertive reasoning.

AUTO-CEI thus offers a robust solution for mitigating both hallucinations and excessive refusals. It achieves the highest precision across the reasoning tasks while keeping refusals measured, avoiding unnecessary refusals in scenarios where a solution exists. AUTO-CEI surpasses existing reinforcement learning baselines in accuracy by 10–24% while holding refusal rates between 18% and 36%, significantly reducing the model’s error rate. This marks an improvement over techniques such as vanilla Expert Iteration and retrieval-based reinforcement learning methods, which either lack the needed control over assertiveness or fall short on task complexity.

The key takeaways from this research are:

- AUTO-CEI substantially improves response precision across logical reasoning (BoardgameQA), mathematical reasoning (MATH), and planning (Blocksworld) benchmarks.
- A performance-driven curriculum paired with Expert Iteration strikes an effective balance between confident answers and appropriate refusals.
- Rewarding extended reasoning before refusal makes multi-step reasoning more robust and reduces step-by-step errors.
- Aligning a model’s assertiveness with its actual capability boundary mitigates both hallucination and over-conservative “laziness.”

In conclusion, AUTO-CEI represents a significant advance in LLM training methodologies by balancing assertive and conservative behaviors based on reasoning limits. By incrementally enhancing the model’s problem-solving capacity while mitigating hallucinations and refusals, AUTO-CEI sets a new standard in reliable LLM reasoning across complex tasks, offering a scalable, adaptable solution for future AI development. This iterative, reward-based approach aligns the LLM’s behaviors with its limitations, ensuring more trustworthy and effective performance in critical applications across fields that demand accuracy and discernment.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter. Don’t forget to join our 55k+ ML SubReddit.


