Expert prompt engineering for FLUX.1 Kontext image generation and editing. Use when users request AI image generation, image editing, style transformations, or visual content modifications. This skill covers both text-to-image generation and image-to-image editing capabilities.
Expert prompt engineering for FLUX.1 Kontext image generation and editing. Use when users request AI image generation, image editing, style transformations, or visual content modifications. This skill covers both text-to-image generation and image-to-image editing capabilities.
When to Use This Skill
Trigger this skill when users ask for:
Image Generation: Creating new images from text descriptions
Use when: Open weights needed for custom implementations
Core Capabilities
1. Text-to-Image Generation
Create images from scratch using text prompts with strong prompt adherence and fast generation.
Key Features:
Aspect ratios from 3:7 to 7:3
Default 1024x1024 (1 megapixel total)
Reproducible results with seed parameter
2. Image-to-Image Editing
Edit existing images with simple text instructions while preserving unmentioned elements.
Key Features:
Context-aware editing (understands image content)
Maintains style and composition automatically
No complex workflows or fine-tuning required
3. Character Consistency
Maintain character identity across multiple edits and transformations.
Key Features:
Preserves facial features, hairstyle, and distinctive attributes
Works across dramatic scene and style changes
Supports iterative editing workflows
4. Text Editing in Images
Replace text in signs, posters, labels while maintaining original styling.
Key Features:
Precise text replacement using quotation marks
Preserves font style, color, and positioning
Works with various text contexts
5. Style Transformation
Transform images into different artistic styles or apply reference image styles.
Key Features:
Name specific styles (Bauhaus, watercolour, oil painting)
Reference known artists or movements
Use input images as style references
Prompt Engineering Framework
Basic Principles
Be Specific: Precise language yields better results
Use exact color names, detailed descriptions, clear action verbs
Avoid vague terms like "nice," "good," or "artistic"
Start Simple: Begin with core changes before adding complexity
Test basic edits first
Build upon successful results iteratively
Preserve Intentionally: Explicitly state what should remain unchanged
"while maintaining the same [facial features/composition/lighting]"
"everything else should stay [black and white/in the same position]"
Name Subjects Directly: Use specific descriptors instead of pronouns
✅ "the woman with short black hair"
❌ "her" or "she"
✅ "the red car"
❌ "it"
Maximum Token Limit: 512 tokens per prompt
Prompt Precision Levels
Quick Edits (Simple Changes)
For straightforward modifications where style changes are acceptable.
Example: "Change to daytime"
Risk: May alter the style of the input image
Controlled Edits (Style-Preserving Changes)
For modifications that should maintain the original style.
Example: "Change to daytime while maintaining the same style of the painting"
Benefit: Results closely match input image style
Complex Transformations (Multi-Element Changes)
For multiple modifications with detailed instructions.
Example: "Change the setting to daytime, add a lot of people walking the sidewalk while maintaining the same style of the painting"
Best Practice: Add as many details as possible as long as instructions per edit aren't too complicated
Text-to-Image Prompting
Effective Prompt Structure
Components to Include:
Subject: Main focus of the image
Action/Pose: What the subject is doing
Setting/Environment: Where the scene takes place
Style: Artistic style or technique
Details: Colors, lighting, mood, composition
Technical specs: Camera angle, depth of field, etc.
Example Prompt Breakdown:
"A cute round rusted robot repairing a classic pickup truck, colourful,
futuristic, vibrant glow, van gogh style"
- Subject: cute round rusted robot
- Action: repairing a classic pickup truck
- Style: van gogh style
- Details: colourful, futuristic, vibrant glow
Style Keywords
Artistic Styles:
Abstract expressionist
Pop Art
Cubism
Van Gogh style
Renaissance painting style
Watercolour painting
Oil painting with visible brushstrokes
Pencil sketch with cross-hatching
Technical Descriptors:
Cinematic composition
Shallow depth of field
Ultra-detailed textures
Dynamic lighting
Atmospheric fog
High contrast
Monochromatic palette
Photorealistic
Mood & Atmosphere:
Moody composition
Surreal and ominous mood
Mysterious, grim, provocative
Warm colors, vibrant glow
Glossy textures
Image-to-Image Editing
Basic Object Modifications
For straightforward changes like color, size, or simple replacements.
Prompt Structure: "[Action] [object] to [new state]"
Examples:
"Change the car color to red"
"Make the shirt blue"
"Replace the hat with a crown"
Style Transfer
Using Text Prompts
Framework:
Name the specific style: "Transform to Bauhaus art style"
Reference known artists: "Convert to Renaissance painting style"
Detail key characteristics: "Transform to oil painting with visible brushstrokes, thick paint texture, and rich color depth"
Preserve what matters: "Change to Bauhaus art style while maintaining the original composition and object placement"
Examples:
"Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture"
"Transform to oil painting with visible brushstrokes and rich colors"
"Change to watercolour style while maintaining the same composition"
Using Reference Images
When you have a style reference image, use prompts like:
"Using this style, [describe the scene you want to create]"
Example: "Using this style, a bunny, a dog and a cat are having a tea party seated around a small white table"
Character Consistency Framework
To maintain the same character across edits:
Establish the Reference: Clearly identify your character
"This person..."
"The woman with short black hair..."
"The character with [distinctive features]..."
Specify the Transformation: Clearly state what's changing
Environment: "...now in a tropical beach setting"
Activity: "...now picking up weeds in a garden"
Style: "Transform to Claymation style while keeping the same person"
Preserve Identity Markers: Explicitly mention what should remain
"...while maintaining the same facial features, hairstyle, and expression"
"...keeping the same identity and personality"
"...preserving their distinctive appearance"
Common Mistake: Using vague references like "her" instead of "The woman with short black hair"
Text Editing in Images
Prompt Structure: Replace '[original text]' with '[new text]'
Best Practices:
Use quotation marks around the exact text to change
Specify preservation when needed: "Replace 'joy' with 'BFL' while maintaining the same font style and color"
Keep text length similar to avoid layout issues
Match case of original text (uppercase/lowercase)
Examples:
"Replace 'Choose joy' with 'Choose BFL'"
"Change 'MONTREAL' to 'FREIBURG'"
"Replace 'Sync & Bloom' with 'FLUX & JOY'"
Visual Cues and Annotation Boxes
Use bright coloured boxes in the input image to mark areas for targeted editing.
Example: "Add hats in the boxes"
Benefits:
Precise targeting of specific regions
Especially effective for text edits requiring repositioning
Makes referencing image areas seamless
Note: Annotation boxes are automatically removed in the output
Iterative Editing
For dramatic transformations, work in steps:
First Edit: Make the primary change
Second Edit: Refine or add secondary changes
Continue: Build on previous results iteratively
Example Sequence:
"Remove the object from her face"
"She is now taking a selfie in the streets of Freiburg, it's a lovely day out"
"It's now snowing, everything is covered in snow"
Troubleshooting Common Issues
Character Identity Changes Too Much
Problem: Character looks different after transformation
Solutions:
✅ Be specific about identity markers: "maintain the exact same face, hairstyle, and distinctive features"
✅ Use detailed prompts: "Transform the man into a viking warrior while preserving his exact facial features, eye color, and facial expression"
✅ Focus on changing only what's needed: "Change the clothes to be a viking warrior" (instead of "Transform to a viking")
Why It Happens: The verb "transform" without qualifiers signals complete change. Use more specific verbs when you want to maintain aspects.
Composition Control Issues
Problem: Subject position, scale, or pose changes unexpectedly
Solutions:
❌ Simple: "He's now on a sunny beach"
❌ Vague: "Put him on a beach"
✅ Precise: "Change the background to a beach while keeping the person in the exact same position, scale, and pose. Maintain identical subject placement, camera angle, framing, and perspective. Only replace the environment around them"
Why It Happens: Vague instructions leave interpretation open. The model might adjust framing to match typical compositions for the new setting.
Style Isn't Applying Correctly
Problem: Style changes inconsistently or loses important elements
Solutions:
❌ Basic: "Make it a sketch"
✅ Precise: "Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture while preserving all architectural details and composition"
Why It Happens: Simple style prompts don't provide enough guidance on what to preserve during transformation.
General Troubleshooting
If the model changes elements you want unchanged:
Be explicit about preservation: "everything else should stay black and white"
State what should remain: "maintain all other aspects of the original image"
Specify exactly what should change and what shouldn't
API Integration Guidelines
Text-to-Image Generation
Endpoint: /v1/flux-kontext-pro
Required Parameters:
prompt: Text description of desired image
Optional Parameters:
aspect_ratio: "1:1" (default), supports 3:7 to 7:3
safety_tolerance: 0 (strict) to 6 (permissive), default: 2
output_format: "jpeg" (default) or "png"
Image-to-Image Editing
Endpoint: /v1/flux-kontext-pro
Required Parameters:
prompt: Text description of edit
input_image: Base64 encoded image (up to 20MB or 20 megapixels)
Optional Parameters: Same as text-to-image
Important:
Output dimensions try to match input (rounded to multiples of 32)
Use aspect_ratio to override default behaviour
Signed URLs valid for 10 minutes only
Workflow Examples
Example 1: Simple Object Edit
User: "I have a yellow car, make it red"
Assistant approach:
1. Acknowledge the task
2. Prepare simple, direct prompt: "Change the car color to red"
3. No need for complex instructions
4. Process with input image + prompt
Example 2: Style Transfer with Preservation
User: "Turn this photo into a watercolour painting"
Assistant approach:
1. Acknowledge the task
2. Create detailed prompt: "Transform to watercolour painting with fluid brush strokes, soft color transitions, and visible paper texture while maintaining the original composition and subject positioning"
3. Explain that style will change but composition will be preserved
4. Process with input image + prompt
Example 3: Character Consistency Across Scenes
User: "Take this photo of me and put me on a beach, then in the snow"
Assistant approach:
1. First edit - precise positioning: "Change the background to a beach while keeping the person in the exact same position, scale, and pose. Maintain identical subject placement, camera angle, framing, and perspective"
2. Second edit - use output of first as input: "Change the environment to a snowy winter scene, everything is covered in snow, while maintaining the person's exact position, features, and pose"
3. Emphasise character features stay consistent
Example 4: Text Replacement
User: "Change the sign from 'Open' to 'Closed'"
Assistant approach:
1. Acknowledge the task
2. Use quotation mark format: "Replace 'Open' with 'Closed'"
3. Optional preservation: "Replace 'Open' with 'Closed' while maintaining the same font style, color, and size"
4. Process with input image + prompt
Example 5: Complex Multi-Element Edit
User: "Make this painting daytime and add people"
Assistant approach:
1. Acknowledge complexity
2. Create detailed prompt: "Change the setting to daytime, add a lot of people walking the sidewalk while maintaining the same style of the painting"
3. Include as many details as possible
4. Process with input image + prompt
5. May suggest iterative approach if too complex
Best Practices Summary
Be Specific: Use exact colors, detailed descriptions, clear action verbs
Start Simple: Test basic edits before adding complexity; Kontext handles iterative editing well
Preserve Intentionally: Explicitly state what shouldn't change
Iterate When Needed: Break dramatic changes into sequential edits
Name Subjects Directly: Use descriptors instead of pronouns
Use Quotation Marks for Text: Quote exact text to change
Control Composition Explicitly: Specify camera angle and positioning preservation
Choose Verbs Carefully: "Transform" implies complete change; "change the clothes" or "replace the background" offers more control
Quick Reference
For Image Generation
Include: subject, action, setting, style, details
Use technical descriptors for quality
Reference known artistic styles
Specify mood and atmosphere
For Image Editing
Name what changes explicitly
State what should stay the same
Use quotation marks for text edits
Be specific about character features to preserve
Consider iterative approach for complex edits
For Style Transfer
Name the specific style clearly
Reference known artists or movements
Detail key visual characteristics
Explicitly preserve composition if needed
For Character Consistency
Describe character specifically (not with pronouns)
State transformation explicitly
List identity markers to preserve
Use "while maintaining" phrases liberally
Token Limit Reminder
Maximum prompt length: 512 tokens
Keep prompts concise while being specific. If approaching limit, prioritise:
Core change description
Preservation statements
Style specifics
Secondary details
Notes
Kontext excels at understanding context, so detailed scene descriptions help
More explicit instructions rarely hurt unless too complicated per edit
For dramatic transformations, step-by-step approach works best
Character consistency is a strength - leverage it for sequential edits
Text editing works best with clear, readable fonts
Visual annotation boxes enhance precision for local edits