Stable Virtual Camera: An Innovative Scene View Generation Model

Stability AI Research, September 19

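This article introduces Stable Virtual Camera, a generalist diffusion model that generates novel views of a scene from any number of input views and target cameras. Through a simplified model design, an optimized training recipe, and a flexible sampling strategy, the model generalizes across a variety of view synthesis tasks at test time. Its samples remain highly consistent without any additional 3D representation, which streamlines view synthesis for real-world scenes. Experiments show that the method outperforms existing methods across multiple datasets and settings.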

We present Stable Virtual Camera, a generalist diffusion model that creates novel views of a scene, given any number of input views and target cameras. Existing works struggle to generate either large viewpoint changes or temporally smooth samples, while relying on specific task configurations. Our approach overcomes these limitations through simple model design, optimized training recipe, and flexible sampling strategy that generalize across view synthesis tasks at test time. As a result, our samples maintain high consistency without requiring additional 3D representation-based distillation, thus streamlining view synthesis in the wild. Furthermore, we show that our method can generate high-quality videos lasting up to half a minute with seamless loop closure. Extensive benchmarking demonstrates that Stable Virtual Camera outperforms existing methods across different datasets and settings.
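To make "target cameras" and "loop closure" concrete, here is a minimal sketch of a closed camera trajectory of the kind a view synthesis model might be asked to render. It is not taken from the paper or from any Stable Virtual Camera code: the function names, the circular orbit, and the parameters `radius` and `height` are assumptions chosen for illustration only.

```python
import math

# Hypothetical helpers for illustration; not part of any released model code.

def orbit_camera_positions(num_frames, radius=2.0, height=0.5):
    """Return camera centers on a circular orbit around the origin.

    Angles cover one full revolution without repeating the endpoint, so the
    step from the last frame back to the first has the same length as every
    other step. That is what makes the trajectory a closed loop.
    """
    positions = []
    for i in range(num_frames):
        angle = 2.0 * math.pi * i / num_frames
        positions.append((radius * math.cos(angle), height, radius * math.sin(angle)))
    return positions


def look_at_direction(position, target=(0.0, 0.0, 0.0)):
    """Unit vector pointing from the camera center toward the target."""
    delta = [t - p for p, t in zip(position, target)]
    norm = math.sqrt(sum(c * c for c in delta))
    return tuple(c / norm for c in delta)


cams = orbit_camera_positions(num_frames=8)

# Distance between consecutive cameras, wrapping from the last back to the first.
step_sizes = [
    math.dist(cams[i], cams[(i + 1) % len(cams)]) for i in range(len(cams))
]
```

In this sketch every entry of `step_sizes` is the same chord length, including the wrap-around step, so a video rendered along these cameras would have no pose jump when it repeats. Whether the rendered frames also match at the seam depends on the model, which is the property the abstract calls seamless loop closure.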

