Trending
Articles related to "black-box attack"
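TPAMI 2025 | Setting the Record Straight on AI Adversarial Transferability Evaluation: The Attack and Defense Algorithms Whose Results Were Inflated All These Years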
机器之心 2025-10-27T13:05:27.000000Z
RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
cs.AI updates on arXiv.org 2025-10-14T04:19:55.000000Z
ArtPerception: ASCII Art-based Jailbreak on LLMs with Recognition Pre-test
cs.AI updates on arXiv.org 2025-10-14T04:18:15.000000Z
RIPRAG: Hack a Black-box Retrieval-Augmented Generation Question-Answering System with Reinforcement Learning
cs.AI updates on arXiv.org 2025-10-14T04:08:25.000000Z
AgentTypo: Adaptive Typographic Prompt Injection Attacks against Black-box Multimodal Agents
cs.AI updates on arXiv.org 2025-10-07T04:16:45.000000Z
Cross-Modal Content Optimization for Steering Web Agent Preferences
cs.AI updates on arXiv.org 2025-10-07T04:05:06.000000Z
Eliciting secret knowledge from language models
LessWrong 2025-10-02T21:06:29.000000Z
Boundary on the Table: Efficient Black-Box Decision-Based Attacks for Structured Data
cs.AI updates on arXiv.org 2025-09-30T04:03:34.000000Z
Discrete optimal transport is a strong audio adversarial attack
cs.AI updates on arXiv.org 2025-09-19T04:45:27.000000Z
PBI-Attack: Prior-Guided Bimodal Interactive Black-Box Jailbreak Attack for Toxicity Maximization
cs.AI updates on arXiv.org 2025-09-03T04:18:08.000000Z
Xi'an Jiaotong-Liverpool University | Goal-Oriented Generative Prompt Injection Attacks against Large Language Models
安全学术圈 2025-08-27T15:46:33.000000Z
Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance
cs.AI updates on arXiv.org 2025-08-22T04:02:29.000000Z
Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models
cs.AI updates on arXiv.org 2025-08-19T04:21:16.000000Z
Multi-task Adversarial Attacks against Black-box Model with Few-shot Queries
cs.AI updates on arXiv.org 2025-08-15T04:18:37.000000Z
LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage
cs.AI updates on arXiv.org 2025-08-11T04:08:29.000000Z
Generating Adversarial Point Clouds Using Diffusion Model
cs.AI updates on arXiv.org 2025-07-30T04:46:06.000000Z
Teach Me to Trick: Exploring Adversarial Transferability via Knowledge Distillation
cs.AI updates on arXiv.org 2025-07-30T04:12:15.000000Z
Attacking interpretable NLP systems
cs.AI updates on arXiv.org 2025-07-23T04:03:20.000000Z
How Does an AI Brain Get "Tricked"? Prompt Attacks and Defenses for Large Language Models Explained
安全客 2025-05-30T00:20:03.000000Z