Amazon Bedrock 助力企业图像创作新篇章

Amazon Bedrock now offers Stability AI Image Services: 9 tools that improve how businesses create and modify images. The technology extends Stable Diffusion and Stable Image models to give you precise control over image creation and editing. Clear prompts are critical—they provide art direction to the AI system. Strong prompts control specific elements like tone, texture, lighting, and composition to create the desired visual outcomes. This capability serves professional needs across product photography, concept, and marketing campaigns.

In this post, we expand on the post Understanding prompt engineering: Unlock the creative potential of Stability AI models on AWS. We show how to effectively use advanced prompting techniques to maximize image generation quality and precision for enterprise application using Stability AI Image Services in Amazon Bedrock.

Solution overview

Stability AI Image Services are available as APIs in Amazon Bedrock, featuring capabilities such as, in-painting, style transfer, recoloring, background removal, object removal, style transfer, and much more.

In the following sections, we first discuss prompt structure for maximum control of image generation, then we provide advanced techniques of prompting for stylistic guidance. Code samples can be found in the following GitHub repository.

Prerequisites

To get started with Stability AI Image Services in Amazon Bedrock, follow the instructions in Getting started with the API to complete the following prerequisites:

AWS Identity and Access Management

Structure prompts that maximize control

To maximize the granular capabilities of Stability AI Image Services in Amazon Bedrock, you must construct prompts that enable fine-grained control.

This section outlines best practices for building effective prompts that produce the desired output. We demonstrate how prompt structure affects results and why more structured prompts typically yield more consistent and controllable outcomes.

Choose the right prompt type for your use case

Selecting the right prompt format helps the model better understand your intent. Three primary prompt formats deliver different levels of control and readability:

Natural language maximizes readability and is best for general usage Tag-based formats enable precise structural control and are ideal for technical application Hybrid formats combine natural language and the structural elements of tags to provide even more control

The following table provides examples of these three common ways to phrase your prompts. Each prompt format has its strengths depending on your goal or the interface you’re using.

Prompt type Prompt example Generated image using Stable Image Ultra in Amazon BedrockDescription and use caseBasic Prompt (Natural Language) “A clean product photo of a perfume bottle on a marble countertop”

This is readable and intuitive. Great for exploration, conversational tools, and some model types. Stable Diffusion 3.5 responds best to this style.Tag-Based Prompt “perfume bottle, marble surface, soft light, high quality, product photo”

Used in many generation UIs or with models trained on datasets like LAION or Danbooru. Compact and good for stacking details.Hybrid Prompt “perfume bottle on marble counter, soft studio lighting, sharp focus, f/2.8lens”

Best of both worlds. Add emphasis with weighting syntax to influence the model’s priorities.

Build modular prompts

Modular prompting enhances AI image generation effectiveness. This approach divides prompts into distinct components, each specifying what to draw and how it should appear. Modular structures provide several benefits: they help prevent conflicting or confusing instructions, allow for precise output control, and simplify prompt debugging. By isolating individual elements, you can quickly identify and adjust effective or ineffective parts of your prompts. This method ultimately leads to more refined and targeted AI-generated images.

The following table provides examples of modular prompt modules. Experiment with different prompt sequences for your desired outcome; for example, placing the style before the subject will give it a more visual weight.

ModuleExampleDescription Prefix “fashion editorial portrait of” Sets the tone and intent for a high-fashion styled portrait Subject “a woman with medium-brown skin and short coiled hair” Gives the model’s look and surface detail to help guide facial features Modifiers “wearing an asymmetrical black mesh top, metallic jewelry” Adds stylized clothing and accessories for visual interest Action “seated with her shoulders angled, eyes locked on camera, one arm lifted” Describes body language and pose to give dynamic composition Environment “bathed in intersecting beams of hard directional light through window slats” Adds context for dramatic light play and atmosphere Style “high-contrast chiaroscuro lighting, sculptural and abstract” Informs the aesthetic and mood (shadow-driven, moody, architectural) Camera/Lighting “shot on 85mm, studio setup, layered shadows and light falling across face and body” Adds technical precision and helps control realism and fidelity

The following example illustrates how to use a modular prompt to generate the desired output.

Modular PromptGenerated Image Using Stable Image Ultra in Amazon Bedrock “fashion editorial portrait of a woman with medium-brown skin and short coiled hair, wearing an asymmetrical black mesh top and metallic jewelry, seated with shoulders angled and one arm lifted, eyes locked on camera, bathed in intersecting beams of hard directional light through window slats, layered shadows and highlights sculpting her face and body, high-contrast chiaroscuro lighting, abstract and bold, shot on 85mm in studio”

Use negative prompts for polished output

Negative prompts improve AI output quality by removing specific visual elements. Explicitly defining what not to include in the prompt guides the model’s output, typically leading to professional outputs. Negative prompts act like a retoucher’s checklist used to address aspects of an image to enhance quality and appeal. For example, “No weird hands. No blurry corners. No cartoon filters. Definitely no watermarks.” Negative prompts result in clean, confident, compositions, free of distracting element and distortions.

The following table provides examples of additional tokens that can be used in negative prompts.

Artifact TypeTokens to Use Low quality or noise blurry, lowres, jpeg artifacts, noisy Anatomy or model issues deformed, extra limbs, bad hands, missing fingers Style clashes cartoon, illustration, anime, painting Technical errors watermark, text, signature, overexposed General cleanup ugly, poorly drawn, distortion, worst quality

The following example illustrates how a well-structured negative prompt can enhance photorealism.

Without Negative Prompt

Prompt

“(medium full shot) of (charming office cubicle) made of glass material, multiple colors, modern style, space-saving, upholstered seat, patina, gold trim, located in a modern garden, with sleek furniture, stylish decor, bright lighting, comfortable seating, Masterpiece, best quality, raw photo, realistic, very aesthetic, dark “

With Negative Prompt

Prompt

Negative Prompt

“cartoon, 3d render, cgi, oversaturated, smooth plastic textures, unreal lighting, artificial, matte surface, painterly, dreamy, glossy finish, digital art, low detail background”

Emphasize or suppress elements with prompt weighting

Prompt weighting controls the influence of individual elements in AI image generation. These numerical weights prioritize specific prompt components over others. For example, to emphasize the character over the background, you can apply a 1.8 weight to “character” (character: 1.8) and 1.1 to “background” (background: 1.1), which makes sure the model prioritizes character detail while maintaining environmental context. This targeted emphasis produces more precise outputs by minimizing competition between prompt elements and clarifying the model’s priorities.

The syntax for prompt weights is (<term>:<weight>). You can also use a shorthand such as ((<term>)), where the number of parentheses represent the weight. Values between 0.0–1.0 deemphasize the term, and values between 1.1–2.0 emphasize the term.For example:

(term:1.2): Emphasize (term:0.8): Deemphasize ((term)): Shorthand for (term:1.2) (((((((((term)))))))): Shorthand for (term:1.8)

The following example shows how prompt weights contribute to the generated output.

Prompt with weights

“editorial product photo of (a translucent gel moisturizer jar:1.4) placed on a (frosted glass pedestal:1.2), surrounded by (dewy pink flower petals:1.1), with soft (diffused lighting:1.3), subtle water droplets, shallow depth of field”

Prompt without weights

“editorial product photo of a translucent gel moisturizer jar placed on a frosted glass pedestal, surrounded by dewy pink flower petals, with soft, subtle water droplets, shallow depth of field”

You can also use weights in negative prompts to reduce how strongly the model avoids something. For example, “(text:0.5), (blurry:0.2), (lowres:0.1).” This tells the model to be especially sure to avoid generating blurry text or low-resolution content.

Giving specific stylistic guidance

Effective prompt writing when using Stability AI Image Services such as Style Transfer and Style Guide requires a good understanding of style matching and reference-driven prompting. These techniques help provide clear stylistic direction for both text-to-image and image-to-image creation.

Image-to-image style transfer extracts stylistic elements from an input image (control image) and uses it to guide the creation of an output image based on the prompt. Approach writing the prompt as if you’re directing a professional photographer or stylist. Focus on materials, lighting quality, and artistic intention—not just objects. For example, a well-structured prompt might read: “Close-up editorial photo of a translucent green lip gloss tube on crushed iridescent plastic, diffused colored lighting, shallow DOF, high fashion product styling.”

Style tag layering: Known aesthetic labels that align with brand identity

The art of crafting effective prompts often relies on incorporating established style tags that resonate with familiar visual languages and datasets. By strategically blending terms from recognized aesthetic categories (ranging from editorial photography and analog film to anime, cyberpunk cityscapes, and brutalist structures), creators can guide the AI toward specific visual outcomes that align with their brand identity. These style descriptors serve as powerful anchors in the prompt engineering process. The versatility of these tags extends further through their ability to be combined and weighted, allowing for nuanced control over the final aesthetic. For instance, a skincare brand might blend the clean lines of product photography with dreamy, surreal elements, whereas a tech company could merge brutalist structure with cyberpunk elements for a distinctive visual identity. This approach to style mixing helps creators improve their outputs while maintaining clear ties to recognizable visual genres that resonate with their target audience. The key is understanding how these style tags interact and using their combinations to create unique, yet culturally relevant, visual expressions that serve specific creative or commercial objectives. The following table provides examples of prompts for a desired aesthetic.

Desired aestheticPrompt phrasesExample use case Retro / Y2K 2000s nostalgia, flash photography, candy tones, harsh lighting Metallic textures, thin fonts, early digital feel. Clean modern neutral tones, soft gradients, minimalist styling, editorial layout Great for wellness or skincare products. Bold streetwear urban background, oversized fit, strong pose, midday shadow Fashion photography and lifestyle ads. Prioritize outfit structure and location cues. Hyperreal surrealism dreamcore lighting, glossy textures, cinematic DOF, surreal shadows Plays well in music, fashion, or alt-culture campaigns.

Invoke a named style as a reference

Some prompt structures benefit from invoking a named visual signature from a specific artist, especially when combined with your own stylistic phrasing or workflows, as shown in the following example.

Prompt

“editorial studio portrait of a woman with glowing skin in minimalist glam makeup, high-contrast lighting, clean background, (depiction of Van Gogh style:1.3)”

The following is a more conceptual example.

Prompt

“product shot of a silver hair oil bottle with soft reflections on curved chrome, (depiction of Wes Anderson style:1.2), under cold studio lighting”

These phrases function like calling on a genre; they imply choices around materials, lighting, layout, and color tonality.

Use reference images to guide style

Another useful technique is using a reference image to guide the pose, color, or composition of the output. For use cases like matching a pose from a lookbook image, transferring a color palette from a campaign still, or copying shadowplay from a photo shoot, you can extract and apply structure or style from reference images.

Stability AI Image Services support a variety of image-to-image workflows where you can use a reference image (control image) to guide the output, such as Structure, Sketch, and Style. Tools like ControlNet (a neural network architecture developed by Stability AI that enhances control), IP-Adapter (an image prompt adapter), or clip-based captioning also enable further control when paired with Stability AI models.

We will discuss ControlNet, IP-Adapter, and clip-based captioning in a subsequent post.

The following is an example of an image-to-image workflow:

Find a high-quality editorial reference. Use it with a depth, canny, or seg ControlNet to lock a pose. Style with a prompt.

Prompt

“fashion editorial of a model in layered knitwear, dramatic colored lighting, strong shadows, high ISO texture”

Create the right mood with lighting control

In a prompt, lighting sets tone, adds dimensionality, and mimics the language of photography. It shouldn’t just be “bright vs. dark.” Lighting is often the style itself, especially for audiences like Gen Z, for instance TikTok, early-aughts flash, harsh backlight, and color gels. The following table provides some useful lighting style prompt terms.

Lighting stylePrompt termsExample use case High-contrast studio hard directional light, deep shadows, controlled highlights Beauty, tech, fashion with punchy visuals Soft editorial diffused light, soft shadows, ambient glow, overcast Skincare, fashion, wellness Colored gel lighting blue and pink gel lighting, dramatic color shadows, rim lighting Nightlife, music-adjacent fashion, youth-forward styling Natural bounce golden hour, soft natural light, sun flare, warm tones Outdoors, lifestyle, brand-friendly minimalism

Build intent with posing and framing terms

Good posing helps products feel aspirational and digital models more dynamic. With AI, you must be intentional. Framing and pose cues help avoid stiffness, anatomical errors, and randomness. The following table provides some useful posing and framing prompt terms.

Prompt cueDescriptionTip looking off camera Creates candid or editorial energy Useful for lookbooks or ad pages hands in motion Adds realism and fluidity Avoids awkward, static body posture seated with body turned Adds depth and twist to the torso Reduces symmetry, feels natural shot from low angle Power or status cue Works well for stylized streetwear or product hero shots

Example: Putting it all together

The following example puts together what we’ve discussed in this post.

Prompt

“studio portrait of a model with platinum hair in metallic cargo pants and a cropped mesh hoodie, seated with legs wide on (acrylic stairs:1.6), magenta and teal gel lighting from left and behind, dramatic contrast, shot on 50mm, streetwear editorial for Gen Z campaign”

Negative prompt

“blurry, extra limbs, watermark, cartoon, distorted face missing fingers, bad anatomy”

Let’s break down the preceding prompt. We direct the look of the subject (platinum hair, metallic clothes), specify their pose (seated wide-legged, confident, unposed), define the environment (acrylic stairs and studio setup, controlled, modern), state the lighting (mixed gel sources, bold stylization), designate the lens (50mm, portrait realism), and lastly detail the purpose (for Gen Z campaign, sets visual and cultural tone). Together, the prompt produces the desired result.

Best practices and troubleshooting

Prompting is rarely a one-and-done task, especially for creative use cases. Most great images come from refining an idea over multiple attempts. Consider the following methodology to iterate over your prompts:

Keep a prompt log Change one variable at a time Save seeds and base images Use comparison grids

Sometimes things go wrong—maybe the model ignores your prompt, or the image looks messy. These issues are common and often quick to fix, and you can get sharper, cleaner, and more intentional outputs with every adjustment. The following table provides useful tips for troubleshooting your prompts.

ProblemCause of issueHow to fix it Style feels random Model is confused or terms are vague Clarify style, add weight, remove conflicts Face gets warped Over-styled or lacks facial cues Add portrait of, headshot, or adjust pose or lighting Image is too dark Lighting not defined Add softbox from left, natural light, or time of day Repetitive poses Same seed or static structure Switch seed or change camera angle or subject action Lacks realism or feels “AI-ish” Wrong tone or artifacts Add negatives like cartoon, digital texture, distorted

Conclusion

Mastering advanced prompting techniques can turn basic image generation into professional creative outputs. Stability AI Image Services in Amazon Bedrock provide precise control over visual creation and editing, helping businesses convert concepts into production-ready assets. The combination of technical expertise and creative intent can help creators achieve the precision and consistency required in professional settings. This control proves valuable across multiple applications, such as marketing campaigns, brand consistency, and product visualizations. This post demonstrated how to optimize Stability AI Image Services in Amazon Bedrock to produce high-quality imagery that aligns with your creative goals.

To implement these techniques, access Stability AI Image Services through Amazon Bedrock or explore Stability AI’s foundation models available in Amazon SageMaker JumpStart. You can also find practical code examples in our GitHub repository.

About the authors

Maxfield Hulker is the VP of Community and Business Development at Stability AI. He is a longtime leader in the generative AI space. He has helped build creator-focused platforms like Civitai and Dream Studio. Maxfield regularly publishes guides and tutorials to make advanced AI techniques more accessible.

Suleman Patel is a Senior Solutions Architect at Amazon Web Services (AWS), with a special focus on machine learning and modernization. Leveraging his expertise in both business and technology, Suleman helps customers design and build solutions that tackle real-world business problems. When he’s not immersed in his work, Suleman loves exploring the outdoors, taking road trips, and cooking up delicious dishes in the kitchen.

Isha Dua is a Senior Solutions Architect based in the San Francisco Bay Area working with generative AI model providers and helping customer optimize their generative AI workloads on AWS. She helps enterprise customers grow by understanding their goals and challenges, and guides them on how they can architect their applications in a cloud-based manner while supporting resilience and scalability. She’s passionate about machine learning technologies and environmental sustainability.

Fabio Branco is a Senior Customer Solutions Manager at Amazon Web Services (AWS) and a strategic advisor, helping customers achieve business transformation, drive innovation through generative AI and data solutions, and successfully navigate their cloud journeys. Prior to AWS, he held Product Management, Engineering, Consulting, and Technology Delivery roles across multiple Fortune 500 companies in industries, including retail and consumer goods, oil and gas, financial services, insurance, and aerospace and defense.

Solution overview

Prerequisites

Structure prompts that maximize control

Choose the right prompt type for your use case

Build modular prompts

Use negative prompts for polished output

Emphasize or suppress elements with prompt weighting

Giving specific stylistic guidance

Style tag layering: Known aesthetic labels that align with brand identity

Invoke a named style as a reference

Use reference images to guide style

Create the right mood with lighting control

Build intent with posing and framing terms

Example: Putting it all together

Best practices and troubleshooting

Conclusion

About the authors

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签