Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions

cs.AI updates on arXiv.org 08月20日

Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions

本文探讨了大型语言模型（LLMs）在复杂推理和零样本泛化方面的能力，指出LLMs在教育、同行评审和数据质量评估中的应用潜力巨大，但同时也揭示了其在隐藏提示注入攻击下的脆弱性。

arXiv:2508.13214v1 Announce Type: cross Abstract: Large Language Models (LLMs) have recently demonstrated strong emergent abilities in complex reasoning and zero-shot generalization, showing unprecedented potential for LLM-as-a-judge applications in education, peer review, and data quality evaluation. However, their robustness under prompt injection attacks, where malicious instructions are embedded into the content to manipulate outputs, remains a significant concern. In this work, we explore a frustratingly simple yet effective attack setting to test whether LLMs can be easily misled. Specifically, we evaluate LLMs on basic arithmetic questions (e.g., "What is 3 + 2?") presented as either multiple-choice or true-false judgment problems within PDF files, where hidden prompts are injected into the file. Our results reveal that LLMs are indeed vulnerable to such hidden prompt injection attacks, even in these trivial scenarios, highlighting serious robustness risks for LLM-as-a-judge applications.

Fish AI Reader

AI辅助创作，多种专业模板，深度分析，高质量内容生成。从观点提取到深度思考，FishAI为您提供全方位的创作支持。新版本引入自定义参数，让您的创作更加个性化和精准。

FishAI

鱼阅，AI 时代的下一个智能信息助手，助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

大型语言模型提示注入攻击教育应用

相关文章

Is Claude 3 Outperforming GPT-4?

Harmonizing AI: Crafting Personalized Song Suggestions

AI News Weekly - Issue #377: Next in AI : Pioneers' Predictions! - Mar 21st 2024

COLLAGE: A New Machine Learning Approach to Deal with Floating-Point Errors in Low-Precision to Make LLM Training Accurate and Efficient

Leveraging Linguistic Expertise in NLP: A Deep Dive into RELIES and Its Impact on Large Language Models

Japanese Researchers Release “Fugaku-LLM” Trained on the Fugaku Supercomputer

Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680

Deep Learning, Transformers, and the Consequences of Scale with Oriol Vinyals - #546

AI Gateway Provider Portkey.ai Is In Partnership With F5

Powering AI with the World's Largest Computer Chip with Joel Hestness - #684