Trending
Articles related to "audio-visual correspondence"
Hear-Your-Click: Interactive Video-to-Audio Generation via Object-aware Contrastive Audio-Visual Fine-tuning
cs.AI updates on arXiv.org 2025-07-08T05:54:12Z