Create ad-ready product images (single or collage) by back-solving sub-image sizes from the target output ratio, grounding scene design with media_comprehension, generating images via image_generator with strict request params and explicit actor-count control, and pairing each deliverable with a short social tagline for 小红书/抖音.
Generate advertising images from product assets with two output styles:
- Single image: one standalone ad at the target size.
- Collage: several sub-images stitched into one canvas (layouts such as 2x2, 1x3, 1x2).
Core method: decide the final target ratio first, then compute sub-image sizes, and call image_generator directly with matching size (no manual pre-crop/pre-pad on source assets).
Workflow:
1. Decide the final target ratio and size (e.g. 16:9, 1920x1080).
2. For a collage, pick a layout (2x2, 1x3, 1x2) and back-solve the sub-image size.
3. Run SKILL__active_skill(skill_name="media_comprehension") on the product assets to ground the scene design.
4. Call image_generator with exact request params.

When the ad needs a supporting actor beyond the product (either because the user asked for one or because they supplied material), do not fetch companion assets from TikTok or similar platforms. Use what is already available:
- User-provided image: use it in reference_images for image_generator after a quick media_comprehension check that the image shows the intended actor/look.
- User-provided video: extract candidate frames, run SKILL__active_skill(skill_name="media_comprehension") on each candidate frame, and pick a frame where the model confirms the desired supporting actor/appearance. Use that frame image as reference_images.
- No usable image or video reference: you may still proceed. Call image_generator without actor reference_images and describe the supporting actor in the prompt so the model generates that character in-scene, still following the actor-count rules below.
image_generator request contract (keep these fields):

```python
image_generator(
    content="...",
    info={
        "image_url": "/path/to/product.jpg",
        "size": "960x540",
        "output_dir": "/path/to/output"
    }
)
```
- content: prompt describing scene and composition.
- info.image_url: primary product image path.
- info.size: output size string in "WIDTHxHEIGHT" format.
- info.output_dir: output directory.
- info.reference_images (optional): e.g. `["/path/to/ref1.jpg", "/path/to/ref2.jpg"]`, for supporting-actor references.
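As a sketch, the contract above can be checked before the call; `build_info` is a hypothetical helper, not part of the image_generator API:

```python
import re

def build_info(image_url, size, output_dir, reference_images=None):
    """Assemble the image_generator info dict, validating the size format."""
    if not re.fullmatch(r"\d+x\d+", size):
        raise ValueError(f'size must be "WIDTHxHEIGHT", got {size!r}')
    info = {"image_url": image_url, "size": size, "output_dir": output_dir}
    if reference_images:  # optional field: include only when references exist
        info["reference_images"] = list(reference_images)
    return info
```

Building the dict through one helper keeps the optional reference_images field out of requests that have no actor reference.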
When using Mode 2/3, the model may generate too many actors unless the count is explicit.
Make the actor count explicit with phrases such as "only one", "a single", "at most two" (只有一只/个, 最多两只/个):

```python
# Good
content = "Create a warm living-room scene. There is only one cat interacting with the cat tree."
# Bad
content = "Create a warm scene with cats interacting with the cat tree."
```
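One way to keep the count explicit in every prompt is to append a fixed clause; `with_actor_count` is an assumed helper name, not part of any tool:

```python
def with_actor_count(prompt, actor, count=1):
    """Append an explicit actor-count clause so the scene names an exact limit."""
    clause = f"only one {actor}" if count == 1 else f"at most {count} {actor}s"
    return f"{prompt.rstrip('. ')}. Include {clause}."
```

For example, `with_actor_count("Create a warm living-room scene", "cat")` yields a prompt ending in "Include only one cat."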
| Final size | Layout | Sub-image size |
|---|---|---|
| 1920x1080 (16:9) | 2x2 | 960x540 |
| 1920x1080 (16:9) | 1x3 | 640x1080 |
| 1920x1080 (16:9) | 1x2 | 960x1080 |
| 1600x1200 (4:3) | 2x2 | 800x600 |
| 1600x1200 (4:3) | 1x3 | 533x1200 |
| 1080x1080 (1:1) | 2x2 | 540x540 |
| 1080x1080 (1:1) | 1x3 | 360x1080 |
| 1080x1920 (9:16) | 2x2 | 540x960 |
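Each row of the table follows from integer division of the canvas by the layout grid (layout is "rows x cols"); a minimal sketch, with `sub_image_size` as an assumed name:

```python
def sub_image_size(final_size, layout):
    """Back-solve sub-image size: layout "RxC" splits width by C columns, height by R rows."""
    rows, cols = (int(n) for n in layout.split("x"))
    width, height = final_size
    return (width // cols, height // rows)
```

Integer division is why 1600x1200 at 1x3 gives a 533-pixel sub-width; the one-pixel remainder is absorbed when stitching.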
```python
from PIL import Image

# 1) Analyze product style first
SKILL__active_skill(skill_name="media_comprehension")

# 2) Decide target and layout
final_size = (1920, 1080)
layout = "2x2"
sub_size = (960, 540)

# 3) Generate sub-images
for scene in scenes:
    content = scene["prompt"]  # include explicit actor count for Mode 2/3
    info = {
        "image_url": product_image,
        "size": f"{sub_size[0]}x{sub_size[1]}",
        "output_dir": output_dir
    }
    if scene.get("reference_images"):
        info["reference_images"] = scene["reference_images"]
    image_generator(content=content, info=info)

# 4) Stitch
canvas = Image.new("RGB", final_size, (255, 255, 255))
# paste each sub-image by layout...
canvas.save("final_ad.png")  # PNG ignores the JPEG-style quality option; save as .jpg with quality=95 if needed
```
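The paste step walks the layout grid row-major; the offsets can be computed without Pillow (`paste_boxes` is an assumed helper name):

```python
def paste_boxes(layout, sub_size):
    """Top-left paste coordinates for each sub-image, row-major over an RxC grid."""
    rows, cols = (int(n) for n in layout.split("x"))
    sub_w, sub_h = sub_size
    return [(col * sub_w, row * sub_h) for row in range(rows) for col in range(cols)]
```

Each tuple can then feed `canvas.paste(sub_image, box)` in generation order.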
Final checks:
- content includes explicit actor counts where needed.
- info.size values match the computed sub-image dimensions.
- Rely on info.size, not prompt wording, for size control in generation.