Articles related to "Scene Understanding"
AI Powered High Quality Text to Video Generation with Enhanced Temporal Consistency
cs.AI updates on arXiv.org
2025-11-05T05:19:14.000000Z
Enhancing Vision-Language Models for Autonomous Driving through Task-Specific Prompting and Spatial Reasoning
cs.AI updates on arXiv.org
2025-10-29T04:26:34.000000Z
Differences in Blind-Spot Deceleration Between Li Auto's VLM and VLA
Li Auto TOP2
2025-10-18T16:50:29.000000Z
Efficient Few-Shot Learning in Remote Sensing: Fusing Vision and Vision-Language Models
cs.AI updates on arXiv.org
2025-10-17T04:14:42.000000Z
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction
cs.AI updates on arXiv.org
2025-10-07T04:17:31.000000Z
Lightweight Structured Multimodal Reasoning for Clinical Scene Understanding in Robotics
cs.AI updates on arXiv.org
2025-09-29T04:15:30.000000Z
[Zhipu AutoGLM] In-Depth Hands-On Report and Analysis of How It Works
产品白苏GLBai
2025-09-25T10:02:02.000000Z
A "Visual IQ" Test for Multimodal AI: An In-Depth Evaluation of Complex Scene Understanding
Juejin Artificial Intelligence
2025-09-22T02:11:49.000000Z
Multimodal SAM-adapter for Semantic Segmentation
cs.AI updates on arXiv.org
2025-09-15T08:34:51.000000Z
ICCV 2025 | HERMES: The First World Model to Unify 3D Scene Understanding and Generation
机器之心
2025-08-14T08:35:09.000000Z
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders
cs.AI updates on arXiv.org
2025-07-29T04:22:41.000000Z
Understanding 3D Without Large-Scale Annotation: New Research Selected as an ICLR 2025 Spotlight
36Kr - Technology Channel
2025-03-07T08:20:55.000000Z
The First Autonomous Driving World Model to Unify 3D Scene Understanding and Generation
我爱计算机视觉
2025-02-12T13:41:25.000000Z
Text Labeling and Image Resolution with the Monkey Chat Vision Model and DigitalOcean+Paperspace GPUs
Hello Paperspace
2024-11-27T08:36:34.000000Z
DriveVLM: Tsinghua MARS Lab and Collaborators Release the First Autonomous Driving Vision-Language Model Deployed in Vehicles
BAAI Community
2024-09-06T14:07:59.000000Z
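Renmin University Team Tackles Object Segmentation in Complex Spatiotemporal Scenes, with Applications in Autonomous Driving and Image Analysis
MIT Technology Review - This Week's Top Stories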
2024-07-07T16:02:11.000000Z