Articles related to "MLLMs"
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
cs.AI updates on arXiv.org
2025-11-06T05:12:06.000000Z
From "Able to Think" to "Good at Creating": Deep Reasoning and Co-Evolution of Multimodal Large Language Models
我爱计算机视觉
2025-11-06T03:55:05.000000Z
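NeurIPS 2025 | University of Electronic Science and Technology of China and A*STAR Propose SCOPE: Balancing Saliency and Coverage for Efficient Token Pruning in Multimodal Large Language Models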
我爱计算机视觉
2025-11-06T03:55:04.000000Z
OmniBrainBench: A Comprehensive Multimodal Benchmark for Brain Imaging Analysis Across Multi-stage Clinical Tasks
cs.AI updates on arXiv.org
2025-11-05T05:27:44.000000Z
Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning
cs.AI updates on arXiv.org
2025-11-03T05:18:29.000000Z
VisJudge-Bench: Aesthetics and Quality Assessment of Visualizations
cs.AI updates on arXiv.org
2025-10-28T04:14:32.000000Z
Mitigating Coordinate Prediction Bias from Positional Encoding Failures
cs.AI updates on arXiv.org
2025-10-28T04:12:48.000000Z
Rethinking the Text-Vision Reasoning Imbalance in MLLMs through the Lens of Training Recipes
cs.AI updates on arXiv.org
2025-10-28T04:04:02.000000Z
I Spy With My Model's Eye: Visual Search as a Behavioural Test for MLLMs
cs.AI updates on arXiv.org
2025-10-23T04:22:12.000000Z
PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning
cs.AI updates on arXiv.org
2025-10-23T04:16:34.000000Z
RewardMap: Tackling Sparse Rewards in Fine-Grained Visual Reasoning with Multi-Stage Reinforcement Learning
机器之心
2025-10-21T08:56:11.000000Z
RewardMap: Tackling Sparse Rewards in Fine-Grained Visual Reasoning with Multi-Stage Reinforcement Learning
机器之心
2025-10-21T06:37:48.000000Z
CrossGuard: Safeguarding MLLMs against Joint-Modal Implicit Malicious Attacks
cs.AI updates on arXiv.org
2025-10-21T04:28:47.000000Z
Sequential Comics for Jailbreaking Multimodal Large Language Models via Structured Visual Storytelling
cs.AI updates on arXiv.org
2025-10-20T04:11:42.000000Z
NeurIPS 2025 | Breaking Closed-Source Multimodal Large Language Models: A Novel Adversarial Attack Based on Optimal Feature Alignment
机器之心
2025-10-17T13:34:39.000000Z