Trending
Articles related to "steering vectors"
Enhancing LLM Steering through Sparse Autoencoder-Based Vector Refinement
cs.AI updates on arXiv.org 2025-09-30T04:05:30.000000Z
Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators
cs.AI updates on arXiv.org 2025-09-05T04:45:38.000000Z
Reasoning-Finetuning Repurposes Latent Representations in Base Models
LessWrong 2025-07-23T16:20:31.000000Z
Steering Vectors Can Help LLM Judges Detect Subtle Dishonesty
LessWrong 2025-06-03T20:47:30.000000Z
Text Steers Vision
LessWrong 2025-06-02T07:37:37.000000Z
AI Safety at the Frontier: Paper Highlights, July '24
LessWrong 2024-08-05T13:06:44.000000Z