cs.AI updates on arXiv.org, October 10
A Video Synthesis Method with Mixed Controls

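This paper proposes a video synthesis method that mixes user controls with generative models, giving high controllability for specified elements and diversity for under-specified ones. The method uses variational inference to approximate a composed distribution and draws on multiple video generation backbones. To make the optimization tractable, it applies step-wise KL divergence minimization together with a context-conditioned factorization technique. Experiments suggest that the method improves on prior work in controllability, diversity, and 3D consistency.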

arXiv:2510.07670v1 Announce Type: cross Abstract: Many video workflows benefit from a mixture of user controls with varying granularity, from exact 4D object trajectories and camera paths to coarse text prompts, while existing video generative models are typically trained for fixed input formats. We develop a video synthesis method that addresses this need and generates samples with high controllability for specified elements while maintaining diversity for under-specified ones. We cast the task as variational inference to approximate a composed distribution, leveraging multiple video generation backbones to account for all task constraints collectively. To address the optimization challenge, we break down the problem into step-wise KL divergence minimization over an annealed sequence of distributions, and further propose a context-conditioned factorization technique that reduces modes in the solution space to circumvent local optima. Experiments suggest that our method produces samples with improved controllability, diversity, and 3D consistency compared to prior works.
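The abstract does not give the concrete formulation, so the following is only a toy sketch of the general idea of step-wise KL minimization over an annealed sequence of distributions, not the paper's algorithm. It assumes two 1-D Gaussian "experts" standing in for the video backbones, a composed target proportional to their product, and a Gaussian variational family fitted by plain gradient descent. The names `annealed_target`, `fit_annealed`, and the `betas` schedule are hypothetical and chosen for illustration.

```python
import math


def kl_gauss(mu_q, var_q, mu_p, var_p):
    """Closed-form KL(q || p) between two 1-D Gaussians."""
    return 0.5 * (math.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)


def annealed_target(beta, expert_a, expert_b):
    """Mean and variance of the Gaussian proportional to p_a(x) * p_b(x)**beta.

    beta = 0 gives expert_a alone; beta = 1 gives the full product of both experts.
    """
    (mu_a, var_a), (mu_b, var_b) = expert_a, expert_b
    precision = 1.0 / var_a + beta / var_b
    mean = (mu_a / var_a + beta * mu_b / var_b) / precision
    return mean, 1.0 / precision


def fit_annealed(expert_a, expert_b, betas, steps=200, lr=0.1):
    """Minimize KL(q || p_beta) for each beta in turn, warm-starting q each time."""
    mu, log_sigma = -1.0, math.log(2.0)  # arbitrary initial variational parameters
    for beta in betas:
        mu_p, var_p = annealed_target(beta, expert_a, expert_b)
        for _ in range(steps):
            var_q = math.exp(2.0 * log_sigma)
            # Analytic gradients of KL(q || p) w.r.t. mu and log(sigma).
            grad_mu = (mu - mu_p) / var_p
            grad_log_sigma = var_q / var_p - 1.0
            mu -= lr * grad_mu
            log_sigma -= lr * grad_log_sigma
    return mu, math.exp(2.0 * log_sigma)


if __name__ == "__main__":
    expert_a = (0.0, 1.0)  # (mean, variance), hypothetical
    expert_b = (2.0, 1.0)
    betas = [0.0, 0.25, 0.5, 0.75, 1.0]
    mu, var = fit_annealed(expert_a, expert_b, betas)
    target_mu, target_var = annealed_target(1.0, expert_a, expert_b)
    print(mu, var, kl_gauss(mu, var, target_mu, target_var))
```

In this Gaussian toy every intermediate target is unimodal, so annealing is not strictly needed; it only illustrates the warm-started, stage-by-stage structure. The motivation stated in the abstract, avoiding local optima, applies to multi-modal targets that this sketch does not model.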


Related tags

video synthesis, mixed controls, variational inference, optimization algorithms, 3D consistency