Articles related to "Visual Reasoning"
2025.11.05 | Testing code with vector sketches; draw first, then think, to supplement vision
HuggingFace 每日AI论文速递
2025-11-06T06:56:26.000000Z
A Multi-Modal Neuro-Symbolic Approach for Spatial Reasoning-Based Visual Grounding in Robotics
cs.AI updates on arXiv.org
2025-11-03T05:19:02.000000Z
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
cs.AI updates on arXiv.org
2025-10-31T04:09:59.000000Z
Evaluating ChatGPT's Performance in Classifying Pneumonia from Chest X-Ray Images
cs.AI updates on arXiv.org
2025-10-28T04:11:10.000000Z
RewardMap: Solving Sparse Rewards in Fine-Grained Visual Reasoning via Multi-Stage Reinforcement Learning
机器之心
2025-10-21T08:56:11.000000Z
ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
cs.AI updates on arXiv.org
2025-10-21T04:30:24.000000Z
SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning
cs.AI updates on arXiv.org
2025-10-21T04:24:26.000000Z
RECODE: Reasoning Through Code Generation for Visual Question Answering
cs.AI updates on arXiv.org
2025-10-16T04:29:39.000000Z
CompoDistill: Attention Distillation for Compositional Reasoning in Multimodal LLMs
cs.AI updates on arXiv.org
2025-10-15T04:58:19.000000Z
EncQA: Benchmarking Vision-Language Models on Visual Encodings for Charts
machinelearning apple
2025-10-13T22:38:40.000000Z
1-trillion-parameter Ling-1T open-sourced: a strong showing for Chinese-developed LLMs
PaperAgent
2025-10-10T10:09:13.000000Z
To Sink or Not to Sink: Visual Information Pathways in Large Vision-Language Models
cs.AI updates on arXiv.org
2025-10-10T04:19:09.000000Z
VCoT-Grasp: Grasp Foundation Models with Visual Chain-of-Thought Reasoning for Language-driven Grasp Generation
cs.AI updates on arXiv.org
2025-10-08T04:14:35.000000Z
ChartAgent: A Multimodal Agent for Visually Grounded Reasoning in Complex Chart Question Answering
cs.AI updates on arXiv.org
2025-10-07T04:09:11.000000Z
RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning
cs.AI updates on arXiv.org
2025-10-03T04:18:42.000000Z
Can World Models Benefit VLMs for World Dynamics?
cs.AI updates on arXiv.org
2025-10-02T04:18:30.000000Z
NeurIPS 2025 | UniPixel: the first pixel-level reasoning framework to unify object referring and segmentation, letting large models understand every pixel
我爱计算机视觉
2025-10-01T09:39:51.000000Z
NeurIPS 2025 Spotlight | FSDrive unifies VLA and world models, moving autonomous driving toward visual reasoning
机器之心
2025-09-30T14:49:40.000000Z