Trending
Articles related to "Vision-Language Models"
Chain of Time: In-Context Physical Simulation with Image Generation Models
cs.AI updates on arXiv.org 2025-11-05T05:19:27.000000Z
Semantic Relation-Enhanced CLIP Adapter for Domain Adaptive Zero-Shot Learning
cs.AI updates on arXiv.org 2025-10-28T04:10:02.000000Z
Atlas Urban Index: A VLM-Based Approach for Spatially and Temporally Calibrated Urban Development Monitoring
cs.AI updates on arXiv.org 2025-10-28T04:03:16.000000Z
Frugal Federated Learning for Violence Detection: A Comparison of LoRA-Tuned VLMs and Personalized CNNs
cs.AI updates on arXiv.org 2025-10-21T04:28:41.000000Z
MIT work published in Nature: in 90 days, an "AI scientist" completes 3,500 electrochemical tests
36氪 (36Kr) - AI-related articles 2025-10-21T02:51:57.000000Z
Self-Augmented Visual Contrastive Decoding
cs.AI updates on arXiv.org 2025-10-16T04:26:48.000000Z
DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
cs.AI updates on arXiv.org 2025-10-16T04:25:52.000000Z
Phys2Real: Fusing VLM Priors with Interactive Online Adaptation for Uncertainty-Aware Sim-to-Real Manipulation
cs.AI updates on arXiv.org 2025-10-14T04:20:40.000000Z
Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling
cs.AI updates on arXiv.org 2025-10-10T04:09:06.000000Z
Being-VL's visual BPE approach: truly unifying "seeing" and "speaking"
机器之心 (Synced) 2025-10-09T09:53:06.000000Z
MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations
cs.AI updates on arXiv.org 2025-10-07T04:15:27.000000Z
Multimodal Carotid Risk Stratification with Large Vision-Language Models: Benchmarking, Fine-Tuning, and Clinical Insights
cs.AI updates on arXiv.org 2025-10-06T04:27:55.000000Z
VaPR -- Vision-language Preference alignment for Reasoning
cs.AI updates on arXiv.org 2025-10-03T04:06:42.000000Z
CHAI: Command Hijacking against embodied AI
cs.AI updates on arXiv.org 2025-10-02T04:17:14.000000Z
BEV-VLM: Trajectory Planning via Unified BEV Abstraction
cs.AI updates on arXiv.org 2025-10-01T05:59:56.000000Z
OpenDataLab releases the technical report for MinerU2.5, a vision-language model for document parsing
oschina.net 2025-09-30T06:34:53.000000Z
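[Weekend Special] Hottest AI papers of the fifth week of September | Qwen3-Omni takes the open-source crown; Baseer sets a new mark for Arabic OCR by freezing the vision side and training the decoder
HuggingFace Daily AI Paper Digest 2025-09-28T03:22:35.000000Z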
AnyPlace: Learning Generalized Object Placement for Robot Manipulation
cs.AI updates on arXiv.org 2025-09-26T04:23:29.000000Z