AWS Machine Learning Blog 09月05日
利用AI提升故事板制作的视觉一致性
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文探讨了如何利用Amazon Nova Canvas等AI基础模型来克服故事板制作中的视觉一致性挑战。文章详细介绍了提示工程和角色开发流水线,分享了确保生成结果可靠、一致的提示模式。通过调整描述性提示、分离风格信息、利用种子值生成角色变体以及控制cfgScale参数,可以显著提高角色在不同场景中的连贯性。文章还展示了如何构建一个端到端的流水线,将文本描述转化为视觉上统一的故事板,并介绍了如何结合Amazon Nova Reel生成动态的动画故事板。尽管AI技术在提升一致性方面取得了显著进步,但仍存在细微的变异,文章预告了下一部分将深入探讨高级微调技术。

💡 **AI赋能故事板创作,解决视觉一致性难题**:文章指出,AI基础模型(FMs)如Amazon Nova Canvas正在革新故事板制作流程,但同时也带来了角色设计和风格连贯性的挑战。通过精细的提示工程和参数调优,可以有效解决这些问题,提升内容创作效率。

🎨 **精细化提示工程实现角色与风格一致性**:通过在提示中包含具体的角色属性、服装细节,并分离风格描述(如“图形小说风格插画”)和风格细节(如“粗线条、戏剧性阴影”),可以确保生成图像在角色特征和整体视觉风格上保持高度一致,并支持多种艺术风格的探索。

⚙️ **参数调优提升可控性与一致性**:利用种子值(seed)可以生成同一角色的不同变体,而cfgScale参数则用于控制AI对提示的遵循程度。通过合理设置这些参数,可以在保持角色身份的同时,探索多样化的视觉表现,并确保角色在不同场景中的连续性。

🎬 **端到端流水线与视频动画生成**:文章介绍了如何构建一个端到端的流水线,将场景和角色描述转化为视觉上连贯的故事板。此外,还展示了如何结合Amazon Nova Reel,将静态故事板转化为带有细微动态和镜头运动的短视频片段,进一步丰富内容创作的可能性。

The art of storyboarding stands as the cornerstone of modern content creation, weaving its essential role through filmmaking, animation, advertising, and UX design. Though traditionally, creators have relied on hand-drawn sequential illustrations to map their narratives, today’s AI foundation models (FMs) are transforming this landscape. FMs like Amazon Nova Canvas and Amazon Nova Reel offer capabilities in transforming text and image inputs into professional-grade visuals and short clips that promise to revolutionize preproduction workflows.

This technological leap forward, however, presents its own set of challenges. Although these models excel at generating diverse concepts rapidly—a boon for creative exploration—maintaining consistent character designs and stylistic coherence across scenes remains a significant hurdle. Even subtle modifications to prompts or model configurations can yield dramatically different visual outputs, potentially disrupting narrative continuity and creating additional work for content creators.

To address these challenges, we’ve developed this two-part series exploring practical solutions for achieving visual consistency. In Part 1, we deep dive into prompt engineering and character development pipelines, sharing tested prompt patterns that deliver reliable, consistent results with Amazon Nova Canvas and Amazon Nova Reel. Part 2 explores techniques like fine-tuning Amazon Nova Canvas to achieve exceptional visual consistency and precise character control.

Consistent character design with Amazon Nova Canvas

The foundation of effective storyboarding begins with establishing well-defined character designs. Amazon Nova Canvas offers several powerful techniques to create and maintain character consistency throughout your visual narrative. To help you implement these techniques in your own projects, we’ve provided comprehensive code examples and resources in our GitHub repository. We encourage you to follow along as we walk through each step in detail. If you’re new to Amazon Nova Canvas, we recommend first reviewing Generating images with Amazon Nova to familiarize yourself with the basic concepts.

Basic text prompting

Amazon Nova Canvas transforms text descriptions into visual representations. Unlike large language models (LLMs), image generation models don’t interpret commands or engage in reasoning—they respond best to descriptive captions. Including specific details in your prompts, such as physical attributes, clothing, and styling elements, directly influences the generated output.

For example, “A 7-year-old Peruvian girl with dark hair in two low braids wearing a school uniform” provides clear visual elements for the model to generate an initial character concept, as shown in the following example image.

Visual style implementation

Consistency in storyboarding requires both character features and unified visual style. Our approach separates style information into two key components in the prompt:

This structured technique enables exploration of various artistic styles, including graphic novels, sketches, and 3D illustrations, while maintaining character consistency throughout the storyboard. The following is an example prompt template and some style information you can experiment with:

{style_description} A 7 year old peruvian girl with dark hair in two low braids wearing a    school uniform. {style_details}styles = [    {        "name": "graphic-novel",        "description": "A graphic novel style illustation of",        "details": "Bold linework, dramatic shadows, and flat color palettes. Use            high contrast lighting and cinematic composition typical of comic book            panels. Include expressive line work to convey emotion and movement.",    },    {        "name": "sketch",        "description": "A simple black and white line sketch of",        "details": "Rough, sketch-like lines create a storyboard aesthetic. High            contrast. No color",    },    {        "name": "digital-illustration",        "description": "A 3D digital drawing of",        "details": "High contrast. Rounded character design. Smooth rendering.            Soft texture. Luminous lighting",    },]

Character variation through seed values

The seed parameter serves as a tool for generating character variations while adhering to the same prompt. By keeping the text description constant and varying only the seed value, creators can explore multiple interpretations of their character design without starting from scratch, as illustrated in the following example images.

Seed = 1

Seed = 20

Seed = 57

Seed = 139

Seed = 12222

Prompt adherence control with cfgScale

The cfgScale parameter is another tool for maintaining character consistency, controlling how strictly Amazon Nova Canvas follows your prompt. Operating on a scale from 1.1–10, lower values give the model more creative freedom and higher values enforce strict prompt adherence. The default value of 6.5 typically provides an optimal balance, but as demonstrated in the following images, finding the right setting is crucial. Too low a value can result in inconsistent character representations, whereas too high a value might overemphasize prompt elements at the cost of natural composition.

Seed = 57,
cfgScale = 1.1
Seed = 57,
cfgScale = 3.5
Seed = 57,
cfgScale = 6.5
Seed = 57,
cfgScale = 8.0
Seed = 57,
cfgScale = 10

Scene integration with consistent parameters

Now we can put these techniques together to test for character consistency across different narrative contexts, as shown in the following example images. We maintain consistent input for style, seed, and cfgScale, varying only the scene description to make sure character remains recognizable throughout the scene sequences.

Seed = 57, Cfg_scale: 6.5 Seed = 57, Cfg_scale: 6.5 Seed = 57, Cfg_scale: 6.5
A graphic novel style illustration of a 7 year old Peruvian girl with dark hair in two low braids wearing a school uniform riding a bike on a mountain pass Bold linework, dramatic shadows, and flat color palettes. Use high contrast lighting and cinematic composition typical of comic book panels. Include expressive line work to convey emotion and movement. A graphic novel style illustation of a 7 year old Peruvian girl with dark hair in two low braids wearing a school uniform walking on a path through tall grass in the Andes Bold linework, dramatic shadows, and flat color palettes. Use high contrast lighting and cinematic composition typical of comic book panels. Include expressive line work to convey emotion and movement. A graphic novel style illustration of a 7 year old Peruvian girl with dark hair in two low braids wearing a school uniform eating ice cream at the beach Bold linework, dramatic shadows, and flat color palettes. Use high contrast lighting and cinematic composition typical of comic book panels. Include expressive line work to convey emotion and movement.

Storyboard development pipeline

Building upon the character consistency techniques we’ve discussed, we can now implement an end-to-end storyboard development pipeline that transforms written scene and character descriptions into visually coherent storyboards. This systematic approach uses our established parameters for style descriptions, seed values, and cfgScale values to provide character consistency while adapting to different narrative contexts. The following are some example scene and character descriptions:

"scenes":[    {        "description": "Mayu stands at the edge of a mountainous path, clutching            a book. Her mother, Maya, kneels beside her, offering words of encouragement            and handing her the book. Mayu looks nervous but determined as she prepares            to start her journey."    },    {        "description": "Mayu encounters a 'danger' sign with a drawing of a            snake. She looks scared, but then remembers her mother's words. She takes a            deep breath, looks at her book for reassurance, and then searches for a stick            on the ground."    },    {        "description": "Mayu bravely makes her way through tall grass, swinging            her stick and making noise to scare off potential snakes. Her face shows a            mix of fear and courage as she pushes forward on her journey."    }],"characters":{    "Mayu":  "A 7-year-old Peruvian girl with dark hair in two low braids wearing a        school uniform",    "Maya":  "An older Peruvian woman with long dark hair tied back in a bun, wearing        traditional Peruvian clothing"}

Our pipeline uses Amazon Nova Lite to first craft optimized image prompts incorporating our established best practices, which are then passed to Amazon Nova Canvas for image generation. By setting numberOfImages higher (typically three variations), while maintaining consistent seed and cfgScale values, we give creators multiple options that preserve character consistency. We used the following prompt for Amazon Nova Lite to generate optimized image prompts:

Describe an image that best represents the scene described. Here are some examples:scene: Rosa is in the kitchen, rummaging through the pantry, looking for a snack. She    hears a strange noise coming from the back of the pantry and becomes startled.imagery: A dimly lit pantry with shelves stocked with various food items, and Rosa    peering inside, her face expressing curiosity and a hint of fear.scene: Rosa says goodbye to her mother, Maya. Maya offers her words of encouragement.imagery: A wide shot of Rosa's determined face, facing Maya and receiving a small wrapped    gift.Only describe the imagery. Use no more than 60 words.scene: {scene_description}imagery:

Our pipeline generated the following storyboard panels.

Mayu stands at the edge of a mountainous path, clutching a book. Her mother, Maya, kneels beside her, offering words of encouragement and handing her the book. Mayu looks nervous but determined as she prepares to start her journey.
Mayu encounters a ‘danger’ sign with a drawing of a snake. She looks scared, but then remembers her mother’s words. She takes a deep breath, looks at her book for reassurance, and then searches for a stick on the ground.
Mayu bravely makes her way through tall grass, swinging her stick and making noise to scare off potential snakes. Her face shows a mix of fear and courage as she pushes forward on her journey.

Although these techniques noticeably improve character consistency, they aren’t perfect. Upon closer inspection, you will notice that even images within the same scene show variations in character consistency. Using consistent seed values helps control these variations, and the techniques outlined in this post significantly improve consistency compared to basic prompt engineering. However, if your use case requires near-perfect character consistency, we recommend proceeding to Part 2, where we explore advanced fine-tuning techniques.

Video generation for animated storyboards

If you want to go beyond static scene images to transform your storyboard into short, animated video clips, you can use Amazon Nova Reel. We use Amazon Nova Lite to convert image prompts into video prompts, adding subtle motion and camera movements optimized for the Amazon Nova Reel model. These prompts, along with the original images, serve as creative constraints for Amazon Nova Reel to generate the final animated sequences. The following is the example prompt and its resulting animated scene in GIF format:

A sunlit forest path with a 'Danger' sign featuring a snake. A 7-year-old Peruvian girl    stands, visibly scared but resolute. Bold linework, dramatic shadows, and flat color    palettes. High contrast lighting and cinematic composition. Mist slowly drifting.    Camera dolly in.
Input Image Output Video

Conclusion

In this first part of our series, we explored fundamental techniques for achieving character and style consistency using Amazon Nova Canvas, from structured prompt engineering to building an end-to-end storyboarding pipeline. We demonstrated how combining style descriptions, seed values, and careful cfgScale parameter control can significantly improve character consistency across different scenes. We also showed how integrating Amazon Nova Lite with Amazon Nova Reel can enhance the storyboarding workflow, enabling both optimized prompt generation and animated sequences.

Although these techniques provide a solid foundation for consistent storyboard generation, they aren’t perfect—subtle variations might still occur. We invite you to continue to Part 2, where we explore advanced model fine-tuning techniques that can help achieve near-perfect character consistency and visual fidelity.


About the authors

Alex Burkleaux is a Senior AI/ML Specialist Solution Architect at AWS. She helps customers use AI Services to build media solutions using Generative AI. Her industry experience includes over-the-top video, database management systems, and reliability engineering.

James Wu is a Senior AI/ML Specialist Solution Architect at AWS, helping customers design and build AI/ML solutions. James’s work covers a wide range of ML use cases, with a primary interest in computer vision, deep learning, and scaling ML across the enterprise. Prior to joining AWS, James was an architect, developer, and technology leader for over 10 years, including 6 years in engineering and 4 years in marketing & advertising industries.

Vladimir Budilov is a Principal Solutions Architect at AWS focusing on agentic & generative AI, and software architecture. He leads large-scale GenAI implementations, bridging cutting-edge AI capabilities with production-ready business solutions, while optimizing for cost and solution resilience.

Nora Shannon Johnson is a Solutions Architect at Amazon Music focused on discovery and growth through AI/ML. In the past, she supported AWS through the development of generative AI prototypes and tools for developers in financial services, health care, retail, and more. She has been an engineer and consultant in various industries including DevOps, fintech, industrial AI/ML, and edtech in the United States, Europe, and Latin America.

Ehsan Shokrgozar is a Senior Solutions Architect specializing in Media and Entertainment at AWS. He is passionate about helping M&E customers build more efficient workflows. He combines his previous experience as Technical Director and Pipeline Engineer at various Animation/VFX studios with his knowledge of building M&E workflows in the cloud to help customers achieve their business goals.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

AI 故事板 Amazon Nova Canvas 视觉一致性 提示工程 内容创作 Amazon Nova Reel AI Art Storyboard Visual Consistency Prompt Engineering Content Creation
相关文章