cs.AI updates on arXiv.org, September 30
Agency Control and Risk Mitigation for LLM Agents

This paper proposes a method for directly measuring and controlling the agency of AI systems built on large language models (LLMs), aiming to address the potential harms that LLM agents may cause. Treating agency as a system property independent of intelligence-related measures, and drawing on the interdisciplinary literature, it operationalizes agency along the dimensions of preference rigidity, independent operation, and goal persistence. It further proposes a representation engineering approach to measure and control the agency of LLM agents, and on this basis offers a set of regulatory tools, including mandated testing protocols, domain-specific agency limits, agency-based risk assessment frameworks, and agency ceilings to prevent societal-scale risks.

arXiv:2509.22735v1 Announce Type: cross Abstract: As increasingly capable large language model (LLM)-based agents are developed, the potential harms caused by misalignment and loss of control grow correspondingly severe. To address these risks, we propose an approach that directly measures and controls the agency of these AI systems. We conceptualize the agency of LLM-based agents as a property independent of intelligence-related measures and consistent with the interdisciplinary literature on the concept of agency. We offer (1) agency as a system property operationalized along the dimensions of preference rigidity, independent operation, and goal persistence, (2) a representation engineering approach to the measurement and control of the agency of an LLM-based agent, and (3) regulatory tools enabled by this approach: mandated testing protocols, domain-specific agency limits, insurance frameworks that price risk based on agency, and agency ceilings to prevent societal-scale risks. We view our approach as a step toward reducing the risks that motivate the "Scientist AI" paradigm, while still capturing some of the benefits from limited agentic behavior.
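The abstract does not give implementation details, but representation engineering typically works by finding a direction in a model's activation space and projecting or steering activations along it. The sketch below is a hypothetical minimal illustration of that pattern for an "agency" axis: all names (agency_axis, agency_score, cap_agency) and the toy activation data are this note's assumptions, not the paper's actual method.

```python
# Hypothetical difference-of-means sketch in the spirit of representation
# engineering: estimate an "agency direction" from labeled activations,
# score new activations by projection, and enforce an agency ceiling by
# removing the excess component. Illustrative only, not the paper's code.
import numpy as np

def agency_axis(high_acts: np.ndarray, low_acts: np.ndarray) -> np.ndarray:
    """Unit vector from mean low-agency to mean high-agency activations."""
    v = high_acts.mean(axis=0) - low_acts.mean(axis=0)
    return v / np.linalg.norm(v)

def agency_score(act: np.ndarray, axis: np.ndarray) -> float:
    """Projection onto the agency axis: the 'measurement' step."""
    return float(act @ axis)

def cap_agency(act: np.ndarray, axis: np.ndarray, ceiling: float) -> np.ndarray:
    """Subtract any excess component along the axis: the 'control' step."""
    excess = agency_score(act, axis) - ceiling
    return act - excess * axis if excess > 0 else act

if __name__ == "__main__":
    # Toy activations: two clusters separated along a random direction.
    rng = np.random.default_rng(0)
    direction = rng.normal(size=16)
    high = rng.normal(size=(50, 16)) + 2.0 * direction  # "high-agency" runs
    low = rng.normal(size=(50, 16)) - 2.0 * direction   # "low-agency" runs
    axis = agency_axis(high, low)
    capped = cap_agency(high[0], axis, ceiling=1.0)
    print(agency_score(high[0], axis), agency_score(capped, axis))
```

In an actual LLM, the activations would be hidden states collected from contrastive prompts, and cap_agency would be applied as a hook during generation; an "agency ceiling" in the paper's regulatory sense would then correspond to an upper bound on this kind of score.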


Related tags

Large language models · Agency control · Risk mitigation · Representation engineering · Regulatory tools