3DFacePolicy: Audio-Driven 3D Facial Animation Based on Action Control

cs.AI updates on arXiv.org, August 13

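This paper proposes 3DFacePolicy, a new method for audio-driven 3D facial animation that predicts action sequences to produce natural facial motion across consecutive frames. Experiments show that it outperforms existing techniques, particularly in producing dynamic, expressive, and naturally smooth animation.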

arXiv:2409.10848v2 Announce Type: replace-cross Abstract: Audio-driven 3D facial animation has achieved significant progress in both research and applications. Recent baselines struggle to generate natural and continuous facial movements because they generate vertices frame by frame. We propose 3DFacePolicy, a pioneering work that defines vertex trajectory changes across consecutive frames through the concept of an "action". By predicting, for each vertex, action sequences that encode frame-to-frame movements, we reformulate the vertex generation approach as an action-based control paradigm. Specifically, we leverage a robotic control mechanism, diffusion policy, to predict action sequences conditioned on both audio and vertex states. Extensive experiments on the VOCASET and BIWI datasets demonstrate that our approach significantly outperforms state-of-the-art methods and is particularly strong at producing dynamic, expressive, and naturally smooth facial animations.
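
To make the "action" idea concrete, here is a minimal sketch in plain Python. It assumes that an action is simply the per-vertex displacement between two consecutive frames, which is one plausible reading of "frame-to-frame movements"; the abstract does not give the exact definition. The names `frames_to_actions` and `rollout` are hypothetical, the numbers are made up, and the diffusion policy that would predict the actions from audio and vertex states is not modeled here.

```python
from typing import List, Tuple

Vertex = Tuple[float, ...]   # (x, y, z) position of one mesh vertex
Frame = List[Vertex]         # one mesh pose: positions of all vertices


def frames_to_actions(frames: List[Frame]) -> List[Frame]:
    """Assumed definition: action t is the displacement of every vertex
    from frame t to frame t + 1."""
    return [
        [tuple(b - a for a, b in zip(v0, v1)) for v0, v1 in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]


def rollout(initial: Frame, actions: List[Frame]) -> List[Frame]:
    """Apply the actions one after another, starting from an initial
    vertex state, to recover the full animation."""
    frames = [initial]
    for action in actions:
        prev = frames[-1]
        frames.append(
            [tuple(p + d for p, d in zip(v, dv)) for v, dv in zip(prev, action)]
        )
    return frames


# Toy mesh with two vertices over three frames (illustrative values only).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.5, 0.0), (1.0, 0.25, 0.0)],
    [(0.0, 0.75, 0.0), (1.0, 0.5, 0.0)],
]
actions = frames_to_actions(frames)
reconstructed = rollout(frames[0], actions)
```

Under this assumption, each new frame is built from the previous one rather than generated independently, which is the contrast with frame-by-frame vertex generation that the abstract draws. In the described method the actions would come from a learned policy instead of being computed from ground-truth frames as in this toy example.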
