AWS Machine Learning Blog, September 19
Amazon Bedrock opens a new chapter in enterprise image creation

 

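Amazon Bedrock now supports Stability AI Image Services, offering 9 tools that improve how businesses create and modify images. The technology extends the Stable Diffusion and Stable Image models so that users can precisely control image generation and editing. Clear prompts are the key: they give the AI system art direction and control specific elements such as tone, texture, lighting, and composition to achieve the intended visual result. This article looks at how to use Stability AI Image Services in Amazon Bedrock, and in particular how advanced prompting techniques can maximize image quality and precision for enterprise applications.

✨ **Fine-grained image control and editing**: Amazon Bedrock integrates Stability AI Image Services. By extending the Stable Diffusion and Stable Image models, it gives businesses 9 image generation and editing tools that offer precise control over image detail and meet professional requirements.

📝 **Prompt engineering is the core**: Clear, structured prompts are the key to guiding the AI toward the intended image. The article covers the choice between natural language, tag-based, and hybrid prompt formats, and shows how to build prompts modularly by combining prefix, subject, modifiers, action, environment, style, and camera/lighting elements for highly customized output.

🚫 **Negative prompts refine the output**: To obtain more professional, artifact-free images, the article introduces negative prompts. Explicitly naming unwanted elements, such as low quality, anatomical errors, and style clashes, removes distractions and improves the overall quality and appeal of the image.

⚖️ **Prompt weights for precise tuning**: Prompt weights (specified with parentheses or numeric values) control how much attention the AI pays to specific elements. Users can prioritize the subject, lighting, or background to achieve finer visual results while keeping the image coherent.

🎨 **Stylistic guidance and references**: The article explains how style tags (such as retro Y2K, clean modern, bold streetwear, and hyperreal surrealism) and named styles (such as Van Gogh or Wes Anderson) guide the overall aesthetic. It also covers image-to-image style transfer with reference images to match pose, color, or composition.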

Amazon Bedrock now offers Stability AI Image Services: 9 tools that improve how businesses create and modify images. The technology extends Stable Diffusion and Stable Image models to give you precise control over image creation and editing. Clear prompts are critical—they provide art direction to the AI system. Strong prompts control specific elements like tone, texture, lighting, and composition to create the desired visual outcomes. This capability serves professional needs across product photography, concept, and marketing campaigns.

In this post, we expand on the post Understanding prompt engineering: Unlock the creative potential of Stability AI models on AWS. We show how to use advanced prompting techniques to maximize image generation quality and precision for enterprise applications using Stability AI Image Services in Amazon Bedrock.

Solution overview

Stability AI Image Services are available as APIs in Amazon Bedrock, with capabilities such as in-painting, style transfer, recoloring, background removal, object removal, and more.

In the following sections, we first discuss prompt structure for maximum control of image generation, then we provide advanced techniques of prompting for stylistic guidance. Code samples can be found in the following GitHub repository.

Prerequisites

To get started with Stability AI Image Services in Amazon Bedrock, follow the instructions in Getting started with the API to complete the following prerequisites:

    1. Set up your AWS account.
    2. Acquire credentials to grant programmatic access.
    3. Attach the Amazon Bedrock permission to an AWS Identity and Access Management (IAM) user or role.
    4. Request access to the Amazon Bedrock models.

Structure prompts that maximize control

To maximize the granular capabilities of Stability AI Image Services in Amazon Bedrock, you must construct prompts that enable fine-grained control.

This section outlines best practices for building effective prompts that produce the desired output. We demonstrate how prompt structure affects results and why more structured prompts typically yield more consistent and controllable outcomes.

Choose the right prompt type for your use case

Selecting the right prompt format helps the model better understand your intent. Three primary prompt formats deliver different levels of control and readability: natural language, tag-based, and hybrid.

The following table provides examples of these three common ways to phrase your prompts. Each prompt format has its strengths depending on your goal or the interface you’re using.

| Prompt type | Prompt example | Generated image using Stable Image Ultra in Amazon Bedrock | Description and use case |
| --- | --- | --- | --- |
| Basic prompt (natural language) | “A clean product photo of a perfume bottle on a marble countertop” | [image] | This is readable and intuitive. Great for exploration, conversational tools, and some model types. Stable Diffusion 3.5 responds best to this style. |
| Tag-based prompt | “perfume bottle, marble surface, soft light, high quality, product photo” | [image] | Used in many generation UIs or with models trained on datasets like LAION or Danbooru. Compact and good for stacking details. |
| Hybrid prompt | “perfume bottle on marble counter, soft studio lighting, sharp focus, f/2.8 lens” | [image] | Best of both worlds. Add emphasis with weighting syntax to influence the model’s priorities. |

Build modular prompts

Modular prompting enhances AI image generation effectiveness. This approach divides prompts into distinct components, each specifying what to draw and how it should appear. Modular structures provide several benefits: they help prevent conflicting or confusing instructions, allow for precise output control, and simplify prompt debugging. By isolating individual elements, you can quickly identify and adjust effective or ineffective parts of your prompts. This method ultimately leads to more refined and targeted AI-generated images.

The following table provides examples of prompt modules. Experiment with different module orders for your desired outcome; for example, placing the style before the subject gives the style more visual weight.

| Module | Example | Description |
| --- | --- | --- |
| Prefix | “fashion editorial portrait of” | Sets the tone and intent for a high-fashion styled portrait |
| Subject | “a woman with medium-brown skin and short coiled hair” | Gives the model’s look and surface detail to help guide facial features |
| Modifiers | “wearing an asymmetrical black mesh top, metallic jewelry” | Adds stylized clothing and accessories for visual interest |
| Action | “seated with her shoulders angled, eyes locked on camera, one arm lifted” | Describes body language and pose to give dynamic composition |
| Environment | “bathed in intersecting beams of hard directional light through window slats” | Adds context for dramatic light play and atmosphere |
| Style | “high-contrast chiaroscuro lighting, sculptural and abstract” | Informs the aesthetic and mood (shadow-driven, moody, architectural) |
| Camera/Lighting | “shot on 85mm, studio setup, layered shadows and light falling across face and body” | Adds technical precision and helps control realism and fidelity |

The following example illustrates how to use a modular prompt to generate the desired output.

Modular prompt (image generated using Stable Image Ultra in Amazon Bedrock)

“fashion editorial portrait of a woman with medium-brown skin and short coiled hair, wearing an asymmetrical black mesh top and metallic jewelry, seated with shoulders angled and one arm lifted, eyes locked on camera, bathed in intersecting beams of hard directional light through window slats, layered shadows and highlights sculpting her face and body, high-contrast chiaroscuro lighting, abstract and bold, shot on 85mm in studio”
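One way to keep the modules separate in code is to store each one in its own field and join them only when the prompt is sent. The following is a minimal sketch, not part of the original post: the `ModularPrompt` class and its field names are hypothetical and simply mirror the modules listed in the table.

```python
from dataclasses import dataclass, fields


@dataclass
class ModularPrompt:
    """Hypothetical container: one field per prompt module, in join order."""

    prefix: str = ""
    subject: str = ""
    modifiers: str = ""
    action: str = ""
    environment: str = ""
    style: str = ""
    camera_lighting: str = ""

    def render(self) -> str:
        # The prefix flows straight into the subject ("portrait of" + "a woman ...").
        head = " ".join(p for p in (self.prefix.strip(), self.subject.strip()) if p)
        # The remaining modules are appended as comma-separated clauses.
        tail = [getattr(self, f.name).strip() for f in fields(self)[2:]]
        return ", ".join(part for part in [head, *tail] if part)


prompt = ModularPrompt(
    prefix="fashion editorial portrait of",
    subject="a woman with medium-brown skin and short coiled hair",
    modifiers="wearing an asymmetrical black mesh top, metallic jewelry",
    action="seated with her shoulders angled, eyes locked on camera, one arm lifted",
    environment="bathed in intersecting beams of hard directional light through window slats",
    style="high-contrast chiaroscuro lighting, sculptural and abstract",
    camera_lighting="shot on 85mm, studio setup, layered shadows and light falling across face and body",
)
print(prompt.render())
```

Because each module lives in its own field, you can change one module (for example, the environment) and regenerate without touching the rest, which makes it easier to see which part of the prompt caused a change in the output.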

Use negative prompts for polished output

Negative prompts improve AI output quality by removing specific visual elements. Explicitly stating what not to include guides the model’s output and typically leads to more professional results. A negative prompt acts like a retoucher’s checklist of issues to address so that the image gains quality and appeal. For example, “No weird hands. No blurry corners. No cartoon filters. Definitely no watermarks.” Negative prompts result in clean, confident compositions, free of distracting elements and distortions.

The following table provides examples of additional tokens that can be used in negative prompts.

| Artifact type | Tokens to use |
| --- | --- |
| Low quality or noise | blurry, lowres, jpeg artifacts, noisy |
| Anatomy or model issues | deformed, extra limbs, bad hands, missing fingers |
| Style clashes | cartoon, illustration, anime, painting |
| Technical errors | watermark, text, signature, overexposed |
| General cleanup | ugly, poorly drawn, distortion, worst quality |

The following example illustrates how a well-structured negative prompt can enhance photorealism.

Without Negative Prompt

Prompt

“(medium full shot) of (charming office cubicle) made of glass material, multiple colors, modern style, space-saving, upholstered seat, patina, gold trim, located in a modern garden, with sleek furniture, stylish decor, bright lighting, comfortable seating, Masterpiece, best quality, raw photo, realistic, very aesthetic, dark”

With Negative Prompt

Prompt

“(medium full shot) of (charming office cubicle) made of glass material, multiple colors, modern style, space-saving, upholstered seat, patina, gold trim, located in a modern garden, with sleek furniture, stylish decor, bright lighting, comfortable seating, Masterpiece, best quality, raw photo, realistic, very aesthetic, dark”

Negative Prompt

“cartoon, 3d render, cgi, oversaturated, smooth plastic textures, unreal lighting, artificial, matte surface, painterly, dreamy, glossy finish, digital art, low detail background”
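The token categories above can be kept as reusable lists and combined per use case. The sketch below is illustrative only: the dictionary keys and helper names are hypothetical, and the `prompt` and `negative_prompt` field names are an assumption about the request body that the Stability AI models in Amazon Bedrock accept, so check them against the model’s documentation before relying on them.

```python
import json

# Hypothetical grouping of the negative-prompt tokens by artifact type.
NEGATIVE_TOKENS = {
    "low_quality": ["blurry", "lowres", "jpeg artifacts", "noisy"],
    "anatomy": ["deformed", "extra limbs", "bad hands", "missing fingers"],
    "style_clash": ["cartoon", "illustration", "anime", "painting"],
    "technical": ["watermark", "text", "signature", "overexposed"],
    "cleanup": ["ugly", "poorly drawn", "distortion", "worst quality"],
}


def build_negative_prompt(*categories: str) -> str:
    """Join the tokens of the chosen categories, skipping duplicates."""
    tokens = []
    for category in categories:
        for token in NEGATIVE_TOKENS[category]:
            if token not in tokens:
                tokens.append(token)
    return ", ".join(tokens)


def build_request_body(prompt: str, negative_prompt: str = "") -> str:
    """Serialize a request body; the field names are assumed, not confirmed."""
    body = {"prompt": prompt}
    if negative_prompt:
        body["negative_prompt"] = negative_prompt
    return json.dumps(body)


negative = build_negative_prompt("low_quality", "technical")
print(build_request_body("raw photo of a glass office cubicle in a garden", negative))
```

For a photorealistic product shot you might combine the `style_clash` and `technical` groups; for a portrait, `anatomy` is usually the first group to add.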

Emphasize or suppress elements with prompt weighting

Prompt weighting controls the influence of individual elements in AI image generation. These numerical weights prioritize specific prompt components over others. For example, to emphasize the character over the background, you can apply a weight of 1.8 to “character” (character:1.8) and 1.1 to “background” (background:1.1), so that the model prioritizes character detail while maintaining environmental context. This targeted emphasis produces more precise outputs by minimizing competition between prompt elements and clarifying the model’s priorities.

The syntax for prompt weights is (<term>:<weight>). You can also use a shorthand such as ((<term>)), where the number of parentheses represents the weight. Values between 0.0 and 1.0 deemphasize the term, and values between 1.1 and 2.0 emphasize it.

The following example shows how prompt weights contribute to the generated output.

Prompt with weights

“editorial product photo of (a translucent gel moisturizer jar:1.4) placed on a (frosted glass pedestal:1.2), surrounded by (dewy pink flower petals:1.1), with soft (diffused lighting:1.3), subtle water droplets, shallow depth of field”

Prompt without weights

“editorial product photo of a translucent gel moisturizer jar placed on a frosted glass pedestal, surrounded by dewy pink flower petals, with soft diffused lighting, subtle water droplets, shallow depth of field”

You can also use weights in negative prompts to reduce how strongly the model avoids something. For example, “(text:0.5), (blurry:0.2), (lowres:0.1).” Because each weight is below 1.0, the model still steers away from text, blur, and low-resolution content, but less forcefully than it would with unweighted terms.
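When weights are tuned repeatedly, it helps to generate the `(<term>:<weight>)` syntax programmatically instead of typing it by hand. The following sketch uses two hypothetical helpers, `weight` and `strip_weights`; the 0.0 to 2.0 range check follows the range described above.

```python
import re


def weight(term: str, value: float) -> str:
    """Format a term in the (<term>:<weight>) syntax."""
    if not 0.0 <= value <= 2.0:
        raise ValueError(f"weight {value} is outside the 0.0-2.0 range")
    return f"({term}:{value:g})"


def strip_weights(prompt: str) -> str:
    """Remove (<term>:<weight>) wrappers to get the unweighted prompt."""
    return re.sub(r"\(([^():]+):\d+(?:\.\d+)?\)", r"\1", prompt)


weighted_prompt = (
    "editorial product photo of "
    f"{weight('a translucent gel moisturizer jar', 1.4)} placed on a "
    f"{weight('frosted glass pedestal', 1.2)}, surrounded by "
    f"{weight('dewy pink flower petals', 1.1)}, with soft "
    f"{weight('diffused lighting', 1.3)}, subtle water droplets, shallow depth of field"
)
print(weighted_prompt)
print(strip_weights(weighted_prompt))
```

Generating both versions from the same source makes side-by-side comparisons fair, because the only difference between the two prompts is the weighting itself.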

Giving specific stylistic guidance

Effective prompt writing when using Stability AI Image Services such as Style Transfer and Style Guide requires a good understanding of style matching and reference-driven prompting. These techniques help provide clear stylistic direction for both text-to-image and image-to-image creation.

Image-to-image style transfer extracts stylistic elements from an input image (control image) and uses them to guide the creation of an output image based on the prompt. Approach writing the prompt as if you’re directing a professional photographer or stylist. Focus on materials, lighting quality, and artistic intention, not just objects. For example, a well-structured prompt might read: “Close-up editorial photo of a translucent green lip gloss tube on crushed iridescent plastic, diffused colored lighting, shallow DOF, high fashion product styling.”

Style tag layering: Known aesthetic labels that align with brand identity

The art of crafting effective prompts often relies on incorporating established style tags that resonate with familiar visual languages and datasets. By strategically blending terms from recognized aesthetic categories (ranging from editorial photography and analog film to anime, cyberpunk cityscapes, and brutalist structures), creators can guide the AI toward specific visual outcomes that align with their brand identity. These style descriptors serve as powerful anchors in the prompt engineering process. The versatility of these tags extends further through their ability to be combined and weighted, allowing for nuanced control over the final aesthetic. For instance, a skincare brand might blend the clean lines of product photography with dreamy, surreal elements, whereas a tech company could merge brutalist structure with cyberpunk elements for a distinctive visual identity. This approach to style mixing helps creators improve their outputs while maintaining clear ties to recognizable visual genres that resonate with their target audience. The key is understanding how these style tags interact and using their combinations to create unique, yet culturally relevant, visual expressions that serve specific creative or commercial objectives. The following table provides examples of prompts for a desired aesthetic.

| Desired aesthetic | Prompt phrases | Example use case |
| --- | --- | --- |
| Retro / Y2K | 2000s nostalgia, flash photography, candy tones, harsh lighting | Metallic textures, thin fonts, early digital feel. |
| Clean modern | neutral tones, soft gradients, minimalist styling, editorial layout | Great for wellness or skincare products. |
| Bold streetwear | urban background, oversized fit, strong pose, midday shadow | Fashion photography and lifestyle ads. Prioritize outfit structure and location cues. |
| Hyperreal surrealism | dreamcore lighting, glossy textures, cinematic DOF, surreal shadows | Plays well in music, fashion, or alt-culture campaigns. |

Invoke a named style as a reference

Some prompt structures benefit from invoking a named visual signature from a specific artist, especially when combined with your own stylistic phrasing or workflows, as shown in the following example.

Prompt

“editorial studio portrait of a woman with glowing skin in minimalist glam makeup, high-contrast lighting, clean background, (depiction of Van Gogh style:1.3)”

The following is a more conceptual example.

Prompt

“product shot of a silver hair oil bottle with soft reflections on curved chrome, (depiction of Wes Anderson style:1.2), under cold studio lighting”

These phrases function like calling on a genre; they imply choices around materials, lighting, layout, and color tonality.

Use reference images to guide style

Another useful technique is using a reference image to guide the pose, color, or composition of the output. For use cases like matching a pose from a lookbook image, transferring a color palette from a campaign still, or copying shadowplay from a photo shoot, you can extract and apply structure or style from reference images.

Stability AI Image Services support a variety of image-to-image workflows where you can use a reference image (control image) to guide the output, such as Structure, Sketch, and Style. Tools like ControlNet (a neural network architecture that conditions image generation on inputs such as edges, depth, or pose), IP-Adapter (an image prompt adapter), or CLIP-based captioning also enable further control when paired with Stability AI models.

We will discuss ControlNet, IP-Adapter, and CLIP-based captioning in a subsequent post.

The following is an example of an image-to-image workflow:

    1. Find a high-quality editorial reference.
    2. Use it with a depth, canny, or seg ControlNet to lock a pose.
    3. Style with a prompt.

Prompt

“fashion editorial of a model in layered knitwear, dramatic colored lighting, strong shadows, high ISO texture”
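For the Structure service mentioned earlier, a request pairs the reference image with the styling prompt. The sketch below only builds the JSON body and makes several assumptions that are not confirmed by this post: that the control image is sent as a base64 string in an `image` field, that a `control_strength` value between 0 and 1 sets how closely the output follows the reference, and that the helper name `build_structure_request` is purely illustrative. Sending the body also requires the Structure model ID for your Region, which is not shown here.

```python
import base64
import json


def build_structure_request(
    image_bytes: bytes, prompt: str, control_strength: float = 0.7
) -> str:
    """Build a JSON body for a structure-guided request (field names assumed)."""
    if not 0.0 <= control_strength <= 1.0:
        raise ValueError("control_strength is assumed to lie between 0 and 1")
    body = {
        # Reference (control) image, base64-encoded so it fits in JSON.
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "prompt": prompt,
        "control_strength": control_strength,
    }
    return json.dumps(body)


# Placeholder bytes stand in for a real reference image read from disk.
reference = b"placeholder-image-bytes"
request_body = build_structure_request(
    reference,
    "fashion editorial of a model in layered knitwear, dramatic colored lighting, "
    "strong shadows, high ISO texture",
)
print(request_body[:80])
```

In practice you would read the reference with `open(path, "rb").read()` and lower `control_strength` when you want the prompt to have more influence than the reference pose.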

Create the right mood with lighting control

In a prompt, lighting sets tone, adds dimensionality, and mimics the language of photography. It shouldn’t just be “bright vs. dark.” Lighting is often the style itself, especially for audiences like Gen Z: think TikTok aesthetics, early-aughts flash, harsh backlight, and color gels. The following table provides some useful lighting style prompt terms.

| Lighting style | Prompt terms | Example use case |
| --- | --- | --- |
| High-contrast studio | hard directional light, deep shadows, controlled highlights | Beauty, tech, fashion with punchy visuals |
| Soft editorial | diffused light, soft shadows, ambient glow, overcast | Skincare, fashion, wellness |
| Colored gel lighting | blue and pink gel lighting, dramatic color shadows, rim lighting | Nightlife, music-adjacent fashion, youth-forward styling |
| Natural bounce | golden hour, soft natural light, sun flare, warm tones | Outdoors, lifestyle, brand-friendly minimalism |

Build intent with posing and framing terms

Good posing helps products feel aspirational and digital models more dynamic. With AI, you must be intentional. Framing and pose cues help avoid stiffness, anatomical errors, and randomness. The following table provides some useful posing and framing prompt terms.

| Prompt cue | Description | Tip |
| --- | --- | --- |
| looking off camera | Creates candid or editorial energy | Useful for lookbooks or ad pages |
| hands in motion | Adds realism and fluidity | Avoids awkward, static body posture |
| seated with body turned | Adds depth and twist to the torso | Reduces symmetry, feels natural |
| shot from low angle | Power or status cue | Works well for stylized streetwear or product hero shots |

Example: Putting it all together

The following example puts together what we’ve discussed in this post.

Prompt

“studio portrait of a model with platinum hair in metallic cargo pants and a cropped mesh hoodie, seated with legs wide on (acrylic stairs:1.6), magenta and teal gel lighting from left and behind, dramatic contrast, shot on 50mm, streetwear editorial for Gen Z campaign”

Negative prompt

“blurry, extra limbs, watermark, cartoon, distorted face, missing fingers, bad anatomy”

Let’s break down the preceding prompt. We direct the look of the subject (platinum hair, metallic clothes), specify their pose (seated wide-legged, confident, unposed), define the environment (acrylic stairs and studio setup, controlled, modern), state the lighting (mixed gel sources, bold stylization), designate the lens (50mm, portrait realism), and lastly detail the purpose (for Gen Z campaign, sets visual and cultural tone). Together, the prompt produces the desired result.
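The prompt and negative prompt above can be sent to Amazon Bedrock with the AWS SDK for Python (Boto3). The following is a sketch under stated assumptions rather than a reference implementation: the model ID, the Region, the request fields (`prompt`, `negative_prompt`, `seed`, `output_format`), and the `images` key in the response are assumptions about the Stable Image Ultra interface and should be checked against the Amazon Bedrock documentation for the model you have access to.

```python
import base64
import json

# Assumed model ID and Region; verify both in the Amazon Bedrock console.
MODEL_ID = "stability.stable-image-ultra-v1:1"
REGION = "us-west-2"

PROMPT = (
    "studio portrait of a model with platinum hair in metallic cargo pants and a "
    "cropped mesh hoodie, seated with legs wide on (acrylic stairs:1.6), magenta "
    "and teal gel lighting from left and behind, dramatic contrast, shot on 50mm, "
    "streetwear editorial for Gen Z campaign"
)
NEGATIVE_PROMPT = (
    "blurry, extra limbs, watermark, cartoon, distorted face, missing fingers, "
    "bad anatomy"
)


def build_body(prompt: str, negative_prompt: str, seed: int = 0) -> str:
    """Serialize the request body (field names are assumptions)."""
    return json.dumps(
        {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "seed": seed,
            "output_format": "png",
        }
    )


def generate_image(body: str) -> bytes:
    """Invoke the model and return the decoded image bytes."""
    import boto3  # imported here so build_body works without the AWS SDK

    client = boto3.client("bedrock-runtime", region_name=REGION)
    response = client.invoke_model(modelId=MODEL_ID, body=body)
    payload = json.loads(response["body"].read())
    # The response is assumed to carry base64-encoded images in an "images" list.
    return base64.b64decode(payload["images"][0])


body = build_body(PROMPT, NEGATIVE_PROMPT, seed=42)
print(json.loads(body)["output_format"])
```

With credentials and model access in place, `generate_image(body)` returns bytes that you can write to a `.png` file. Fixing the seed keeps reruns comparable while you adjust individual prompt modules.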

Best practices and troubleshooting

Prompting is rarely a one-and-done task, especially for creative use cases. Most great images come from refining an idea over multiple attempts. Consider the following methodology to iterate over your prompts:

Sometimes things go wrong—maybe the model ignores your prompt, or the image looks messy. These issues are common and often quick to fix, and you can get sharper, cleaner, and more intentional outputs with every adjustment. The following table provides useful tips for troubleshooting your prompts.

| Problem | Cause of issue | How to fix it |
| --- | --- | --- |
| Style feels random | Model is confused or terms are vague | Clarify style, add weight, remove conflicts |
| Face gets warped | Over-styled or lacks facial cues | Add portrait of, headshot, or adjust pose or lighting |
| Image is too dark | Lighting not defined | Add softbox from left, natural light, or time of day |
| Repetitive poses | Same seed or static structure | Switch seed or change camera angle or subject action |
| Lacks realism or feels “AI-ish” | Wrong tone or artifacts | Add negatives like cartoon, digital texture, distorted |
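The fix for repetitive poses (switch the seed or change the camera angle) lends itself to a small sweep. The sketch below generates one request per combination of seed and camera cue; the `variations` helper is hypothetical, and the `seed` field is an assumption about the request body.

```python
from itertools import product


def variations(base_prompt: str, seeds: list[int], camera_cues: list[str]) -> list[dict]:
    """Return one request dict per (seed, camera cue) combination."""
    return [
        {"prompt": f"{base_prompt}, {cue}", "seed": seed}
        for seed, cue in product(seeds, camera_cues)
    ]


requests = variations(
    "fashion editorial of a model in layered knitwear",
    seeds=[7, 21],
    camera_cues=["shot from low angle", "looking off camera"],
)
for request in requests:
    print(request["seed"], "|", request["prompt"])
```

Recording the seed next to each output lets you return to a promising variation later and refine only the prompt text.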

Conclusion

Mastering advanced prompting techniques can turn basic image generation into professional creative outputs. Stability AI Image Services in Amazon Bedrock provide precise control over visual creation and editing, helping businesses convert concepts into production-ready assets. The combination of technical expertise and creative intent can help creators achieve the precision and consistency required in professional settings. This control proves valuable across multiple applications, such as marketing campaigns, brand consistency, and product visualizations. This post demonstrated how to optimize Stability AI Image Services in Amazon Bedrock to produce high-quality imagery that aligns with your creative goals.

To implement these techniques, access Stability AI Image Services through Amazon Bedrock or explore Stability AI’s foundation models available in Amazon SageMaker JumpStart. You can also find practical code examples in our GitHub repository.


About the authors

Maxfield Hulker is the VP of Community and Business Development at Stability AI. He is a longtime leader in the generative AI space. He has helped build creator-focused platforms like Civitai and Dream Studio. Maxfield regularly publishes guides and tutorials to make advanced AI techniques more accessible.

Suleman Patel is a Senior Solutions Architect at Amazon Web Services (AWS), with a special focus on machine learning and modernization. Leveraging his expertise in both business and technology, Suleman helps customers design and build solutions that tackle real-world business problems. When he’s not immersed in his work, Suleman loves exploring the outdoors, taking road trips, and cooking up delicious dishes in the kitchen.

Isha Dua is a Senior Solutions Architect based in the San Francisco Bay Area working with generative AI model providers and helping customers optimize their generative AI workloads on AWS. She helps enterprise customers grow by understanding their goals and challenges, and guides them on how they can architect their applications in a cloud-based manner while supporting resilience and scalability. She’s passionate about machine learning technologies and environmental sustainability.

Fabio Branco is a Senior Customer Solutions Manager at Amazon Web Services (AWS) and a strategic advisor, helping customers achieve business transformation, drive innovation through generative AI and data solutions, and successfully navigate their cloud journeys. Prior to AWS, he held Product Management, Engineering, Consulting, and Technology Delivery roles across multiple Fortune 500 companies in industries, including retail and consumer goods, oil and gas, financial services, insurance, and aerospace and defense.
