Vision-Language Models_Fishai

Trending

Articles related to "Vision-Language Models"

Chain of Time: In-Context Physical Simulation with Image Generation Models

cs.AI updates on arXiv.org 2025-11-05T05:19:27.000000Z

Semantic Relation-Enhanced CLIP Adapter for Domain Adaptive Zero-Shot Learning

cs.AI updates on arXiv.org 2025-10-28T04:10:02.000000Z

Atlas Urban Index: A VLM-Based Approach for Spatially and Temporally Calibrated Urban Development Monitoring

cs.AI updates on arXiv.org 2025-10-28T04:03:16.000000Z

Frugal Federated Learning for Violence Detection: A Comparison of LoRA-Tuned VLMs and Personalized CNNs

cs.AI updates on arXiv.org 2025-10-21T04:28:41.000000Z

MIT Work Published in Nature: In 90 Days, an "AI Scientist" Completes 3,500 Electrochemical Tests

36Kr - AI-related articles 2025-10-21T02:51:57.000000Z

Self-Augmented Visual Contrastive Decoding

cs.AI updates on arXiv.org 2025-10-16T04:26:48.000000Z

DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models

cs.AI updates on arXiv.org 2025-10-16T04:25:52.000000Z

Phys2Real: Fusing VLM Priors with Interactive Online Adaptation for Uncertainty-Aware Sim-to-Real Manipulation

cs.AI updates on arXiv.org 2025-10-14T04:20:40.000000Z

Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling

cs.AI updates on arXiv.org 2025-10-10T04:09:06.000000Z

Being-VL's Visual BPE Approach: Truly Unifying "Seeing" and "Speaking"

机器之心 (Synced) 2025-10-09T09:53:06.000000Z

MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations

cs.AI updates on arXiv.org 2025-10-07T04:15:27.000000Z

Multimodal Carotid Risk Stratification with Large Vision-Language Models: Benchmarking, Fine-Tuning, and Clinical Insights

cs.AI updates on arXiv.org 2025-10-06T04:27:55.000000Z

VaPR -- Vision-language Preference alignment for Reasoning

cs.AI updates on arXiv.org 2025-10-03T04:06:42.000000Z

CHAI: Command Hijacking against embodied AI

cs.AI updates on arXiv.org 2025-10-02T04:17:14.000000Z

BEV-VLM: Trajectory Planning via Unified BEV Abstraction

cs.AI updates on arXiv.org 2025-10-01T05:59:56.000000Z

OpenDataLab Releases Technical Report for MinerU2.5, a Vision-Language Model for Document Parsing

oschina.net 2025-09-30T06:34:53.000000Z

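[Weekend Special] Hottest AI Papers of Week 5 of September | Qwen3-Omni Takes the Open-Source Crown; Freezing Vision and Training the Decoder, Baseer Sets a New Mark in Arabic OCR

HuggingFace Daily AI Paper Digest 2025-09-28T03:22:35.000000Z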

AnyPlace: Learning Generalized Object Placement for Robot Manipulation

cs.AI updates on arXiv.org 2025-09-26T04:23:29.000000Z

Copyright © 2019 FISHAI. All Rights Reserved