Video-Language Models_Fishai

Articles related to "Video-Language Models"

Breakdance Video classification in the age of Generative AI

cs.AI updates on arXiv.org 2025-10-24T04:25:47.000000Z

Improving Temporal Understanding Logic Consistency in Video-Language Models via Attention Enhancement

cs.AI updates on arXiv.org 2025-10-10T04:16:09.000000Z

Oracle-RLAIF: An Improved Fine-Tuning Framework for Multi-modal Video Models through Reinforcement Learning from Ranking Feedback

cs.AI updates on arXiv.org 2025-10-06T04:27:13.000000Z

VideoNSA: Native Sparse Attention Scales Video Understanding

cs.AI updates on arXiv.org 2025-10-03T04:18:55.000000Z

Automated Procedural Analysis via Video-Language Models for AI-assisted Nursing Skills Assessment

cs.AI updates on arXiv.org 2025-09-23T05:16:11.000000Z

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

cs.AI updates on arXiv.org 2025-08-22T04:02:34.000000Z

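Breaking Through the Bottleneck of Video Multimodal Large Models! "Synthetic Data" Plays a Major Role, and the Project Is Open Source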

机器之心 2024-10-21T08:11:33.000000Z

Copyright © 2019 FISHAI. All Rights Reserved