Trending
Articles related to "LVLM"
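VisionWeaver: From "Symptom Recognition" to "Root-Cause Diagnosis," Opening a New Chapter in AI Visual Hallucination Research
Bilibili Tech (哔哩哔哩技术) 2025-11-14T06:15:16.000000Z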
VisionWeaver: From "Symptom Recognition" to "Root-Cause Diagnosis," Opening a New Chapter in AI Visual Hallucination Research
Juejin AI (掘金 人工智能) 2025-11-14T05:01:53.000000Z
A Frustratingly Simple Yet Highly Effective Attack Baseline: Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1
cs.AI updates on arXiv.org 2025-10-28T04:14:38.000000Z
Watermarking for Factuality: Guiding Vision-Language Models Toward Truth via Tri-layer Contrastive Decoding
cs.AI updates on arXiv.org 2025-10-17T04:17:14.000000Z
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
cs.AI updates on arXiv.org 2025-10-13T04:13:49.000000Z
Defending LVLMs Against Vision Attacks through Partial-Perception Supervision
cs.AI updates on arXiv.org 2025-09-05T04:45:49.000000Z
In-Depth and In-Breadth: Pre-training Multimodal Language Models Customized for Comprehensive Chart Understanding
cs.AI updates on arXiv.org 2025-07-22T04:44:28.000000Z
A Satellite-Ground Synergistic Large Vision-Language Model System for Earth Observation
cs.AI updates on arXiv.org 2025-07-09T04:01:49.000000Z
INTER: Mitigating Hallucination in Large Vision-Language Models by Interaction Guidance Sampling
cs.AI updates on arXiv.org 2025-07-08T04:33:45.000000Z
Zero Overhead, Eliminating Image Hallucinations! Mining Normal-Sample Features via Null-Space Projection | CVPR 2025
BAAI Community (智源社区) 2025-06-28T07:46:49.000000Z
How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod
AWS Machine Learning Blog 2025-05-15T20:10:53.000000Z
Using AI Hallucinations to Evaluate Image Realism
Unite.AI 2025-03-25T12:27:59.000000Z
This AI Paper Introduces IXC-2.5-Reward: A Multi-Modal Reward Model for Enhanced LVLM Alignment and Performance
MarkTechPost@AI 2025-01-27T17:50:02.000000Z