Trending
Articles related to "Long Video Understanding"
FLoC: Facility Location-Based Efficient Visual Token Compression for Long Video Understanding
cs.AI updates on arXiv.org 2025-11-05T05:20:37.000000Z
FOCUS: Efficient Keyframe Selection for Long Video Understanding
cs.AI updates on arXiv.org 2025-11-03T05:19:29.000000Z
ICCV 2025 | Can AI Follow a Movie Plot? VRBench Launches the First "Long-Video Reasoning Exam"
PaperWeekly 2025-10-22T15:13:53.000000Z
ICCV 2025 | Can AI Follow a Movie Plot? VRBench Launches the First "Long-Video Reasoning Exam"
PaperWeekly 2025-10-22T14:32:56.000000Z
NeurIPS 2025 | KAUST and Meta AI Propose Vgent: Graph-Augmented RAG Beats SOTA by 8.6% on Long Video Understanding
我爱计算机视觉 2025-10-20T14:40:21.000000Z
Lightweight, Efficient, and Plug-and-Play: Video-RAG Brings a New Paradigm to Long Video Understanding
机器之心 2025-10-20T14:13:56.000000Z
K-frames: Scene-Driven Any-k Keyframe Selection for long video understanding
cs.AI updates on arXiv.org 2025-10-17T04:13:57.000000Z
VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization
cs.AI updates on arXiv.org 2025-10-08T04:14:54.000000Z
VideoNSA: Native Sparse Attention Scales Video Understanding
cs.AI updates on arXiv.org 2025-10-03T04:18:55.000000Z
Video Panels for Long Video Understanding
cs.AI updates on arXiv.org 2025-09-30T04:05:17.000000Z
StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
cs.AI updates on arXiv.org 2025-08-22T04:02:21.000000Z
Episodic Memory Representation for Long-form Video Understanding
cs.AI updates on arXiv.org 2025-08-14T04:19:27.000000Z
VSI: Visual Subtitle Integration for Keyframe Selection to enhance Long Video Understanding
cs.AI updates on arXiv.org 2025-08-12T04:39:06.000000Z
LVBench: An Extreme Long Video Understanding Benchmark
cs.AI updates on arXiv.org 2025-08-12T04:02:22.000000Z
Enhancing Long Video Question Answering with Scene-Localized Frame Grouping
cs.AI updates on arXiv.org 2025-08-06T04:02:17.000000Z
Deep Video Discovery: Setting a New Bar for Agents That Can "Understand" Long Videos
微软研究院AI头条 2025-07-21T10:33:52.000000Z
Iterative Zoom-In: Temporal Interval Exploration for Long Video Understanding
cs.AI updates on arXiv.org 2025-07-08T05:53:47.000000Z
Breaking the Long Video Understanding Bottleneck: HoPE Hybrid Position Encoding Improves VLM Length Generalization
机器之心 2025-06-29T12:30:28.000000Z