Scene Understanding_Fishai

Articles related to "Scene Understanding"

AI Powered High Quality Text to Video Generation with Enhanced Temporal Consistency

cs.AI updates on arXiv.org 2025-11-05T05:19:14.000000Z

Enhancing Vision-Language Models for Autonomous Driving through Task-Specific Prompting and Spatial Reasoning

cs.AI updates on arXiv.org 2025-10-29T04:26:34.000000Z

Differences Between Li Auto's VLM and VLA in Blind-Spot Deceleration

Li Auto TOP2 (理想 TOP2) 2025-10-18T16:50:29.000000Z

Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models

cs.AI updates on arXiv.org 2025-10-17T04:14:42.000000Z

Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction

cs.AI updates on arXiv.org 2025-10-07T04:17:31.000000Z

Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics

cs.AI updates on arXiv.org 2025-09-29T04:15:30.000000Z

[Zhipu AutoGLM] In-Depth Hands-On Report and Analysis of Its Principles

产品白苏GLBai 2025-09-25T10:02:02.000000Z

A "Visual IQ" Test for Multimodal AI: An In-Depth Evaluation of Complex Scene Understanding

Juejin AI (掘金人工智能) 2025-09-22T02:11:49.000000Z

Multimodal SAM-adapter for Semantic Segmentation

cs.AI updates on arXiv.org 2025-09-15T08:34:51.000000Z

ICCV 2025 | HERMES: The First World Model Unifying 3D Scene Understanding and Generation

Synced (机器之心) 2025-08-14T08:35:09.000000Z

MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders

cs.AI updates on arXiv.org 2025-07-29T04:22:41.000000Z

Understanding 3D Without Extensive Annotation: New Research Selected as an ICLR 2025 Spotlight

36Kr - Technology Channel (36氪 - 科技频道) 2025-03-07T08:20:55.000000Z

The First Autonomous-Driving World Model Unifying 3D Scene Understanding and Generation

I Love Computer Vision (我爱计算机视觉) 2025-02-12T13:41:25.000000Z

Text Labeling and Image Resolution with the Monkey Chat Vision Model and DigitalOcean+Paperspace GPUs

Hello Paperspace 2024-11-27T08:36:34.000000Z

DriveVLM: Tsinghua MARS Lab and Partners Launch the First Autonomous-Driving Vision-Language Model Deployed in Vehicles

BAAI Community (智源社区) 2024-09-06T14:07:59.000000Z

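Renmin University Team Tackles Object Segmentation in Complex Spatiotemporal Scenes, with Applications in Autonomous Driving and Image Analysis

MIT Technology Review - This Week's Top Stories (MIT 科技评论 - 本周热榜) 2024-07-07T16:02:11.000000Z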

Copyright © 2019 FISHAI. All Rights Reserved