Articles related to "Behavior Control"
Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
cs.AI updates on arXiv.org
2025-10-28T04:14:37.000000Z
Prompt Decorators: A Declarative and Composable Syntax for Reasoning, Formatting, and Control in LLMs
cs.AI updates on arXiv.org
2025-10-24T04:19:50.000000Z
VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models
cs.AI updates on arXiv.org
2025-10-01T06:00:33.000000Z
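"Precision Surgery" for Large Models: Meituan's Intelligent Customer Service Team Proposes a Reverse Learning Technique for Precise Correction, Improving Risk Control by 38%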
Synced (机器之心)
2025-09-25T10:01:11.000000Z
Psychometric Personality Shaping Modulates Capabilities and Safety in Language Models
cs.AI updates on arXiv.org
2025-09-23T05:12:41.000000Z
VISOR: Visual Input-based Steering for Output Redirection in Vision-Language Models
cs.AI updates on arXiv.org
2025-08-13T04:15:41.000000Z
Forcing LLMs to be evil during training can make them nicer in the long run
MIT Technology Review » Artificial Intelligence
2025-08-01T16:43:17.000000Z
Comparing the effectiveness of top-down and bottom-up activation steering for bypassing refusal on harmful prompts
LessWrong
2025-02-13T02:06:56.000000Z
What You (Want to)* Want
Paul Graham: Essays
2024-11-25T00:32:38.000000Z