Guides · May 8, 2026 · 18 min read

How ecommerce brands automate product photography for Amazon, Shopify, and Etsy with AI (2026)

Real specs, real APIs, real cost math. Pipeline playbook.

Apoorv Sharma

Performance Marketing Manager

A typical studio shoot runs $150-$500 per image depending on complexity, lighting, and post-production. A brand with 100 SKUs selling across Amazon, Shopify, and Etsy needs roughly 600-1,500 unique compliant images once you account for angles, lifestyle context, and marketplace-specific formatting. The studio math at scale: $90,000 to $750,000.

The API math: $1.50-$3.50 per fully automated, marketplace-ready image. For the same 600-1,500 images, that's $900-$5,250 total. Two orders of magnitude cheaper, and a calendar measured in hours instead of weeks.

This is why ecommerce brands are shifting product photography from studio reshoots to API pipelines in 2026. The driver is the marketplace tax (the work of producing platform-specific compliant images at scale) and studio math that stops working at any meaningful catalog size.

This article is the operator playbook for that shift. The pipeline shape, the marketplace specs you need to hit, the specific Runflow Solutions that do the work, the cost math per listing, and the worked examples of how brands actually run this in production. By the end, you'll have a concrete picture of what to build and what to budget.

A note on disclosure. Runflow builds the Solutions APIs in this playbook. The pipeline pattern works with other providers too, and I'll flag where you might substitute. I'm walking through Runflow's APIs because they're the ones I know best and they're priced and structured to fit together without much custom glue. The longer engineering story behind these APIs lives in Building an AI Image Generator API: 14 things that broke, if you want to see the infrastructure side.

What is product photography automation?

Product photography automation is the use of AI-powered APIs to convert a single source image (lifestyle shot, on-model photo, or studio capture) into multiple marketplace-compliant outputs without reshooting. The pipeline typically handles product extraction, background standardization, gap repair, and multi-format resizing in a sequence of API calls.

This is distinct from "AI product photo tools" you'll see on consumer sites (Pebblely, Flair, Pixelcut, Claid). Those are SaaS products with their own UI, designed for non-technical users to drag-and-drop one image at a time. Automation is the API layer underneath, designed for teams running hundreds or thousands of images through a programmatic pipeline.

The category shifted in 2024-2025 as image-generation models (Google's Nano Banana, OpenAI's gpt-image-2, Flux) reached production quality. By 2026, specialized API providers have built the supporting infrastructure: prompt-guided product extraction, reference-based inpainting for gap repair, generative reframing for marketplace-specific aspect ratios. The pipeline now exists as composable primitives.

What automation does well:

High-volume catalog work (100+ SKUs, multi-marketplace)
Reusing existing assets (lifestyle shots, on-model campaigns) for marketplace listings
Multi-format fan-out (one source to every aspect ratio)
Variant generation (color and pattern swaps for fashion)
Recurring catalog refreshes when marketplace specs change

What automation doesn't replace:

The original creative direction
New product introductions where no source asset exists
Premium hero imagery for brand campaigns

Most brands need both: studio shoots for hero assets, automation for the marketplace fan-out. The article below is about the automation half.

Why this matters: the marketplace tax

Every marketplace has different image requirements. Amazon mandates pure white backgrounds. Shopify wants square. Etsy permits lifestyle context. TikTok Shop wants vertical. Walmart wants white with hard frame fill rules.

Selling to multiple marketplaces means producing different image sets for each. A single product listed on Amazon, Shopify, Etsy, Instagram Shop, and TikTok Shop needs roughly 8-12 distinct images per channel: a main, 4-7 supporting, lifestyle context, sometimes video frames. Multiplied across the catalog, that's the marketplace tax.

For context, here's what the tax looks like for a fictional 100-SKU DTC apparel brand selling across three marketplaces:

Amazon: 8 images per listing × 100 SKUs = 800 images
Shopify (own store): 12 images per listing × 100 SKUs = 1,200 images
Etsy: 6 images per listing × 100 SKUs = 600 images
Total: 2,600 unique compliant images

Studio cost at $150-$500 per image (low end for batch work, high end for hero shots): $390,000 to $1.3 million. For a brand doing $5M-$20M in annual revenue, the studio bill alone is roughly 2-26% of revenue. That math doesn't work, and brands operating at this scale either accept low-quality marketplace presence (one bad photo replicated across all listings) or invest in automation.

Now run the API math on the same 2,600 images. Assume each pass through the pipeline costs $1.50-$3.50 depending on operations needed:

Low end: 2,600 × $1.50 = $3,900
High end: 2,600 × $3.50 = $9,100

Roughly 100× cheaper at the low end and 140× at the high end. And the entire catalog refreshes in days instead of months when marketplace specs change or you add a new channel.
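The arithmetic above is easy to rerun with your own catalog numbers. Here is a minimal sketch that uses only the figures quoted in this section (images per listing, the $150-$500 studio range, and the $1.50-$3.50 pipeline range). The function and variable names are illustrative, not part of any API.

```python
# Figures taken from the worked example above; names are illustrative.
IMAGES_PER_LISTING = {"Amazon": 8, "Shopify": 12, "Etsy": 6}
SKUS = 100
STUDIO_COST_PER_IMAGE = (150.00, 500.00)  # low, high (USD)
API_COST_PER_IMAGE = (1.50, 3.50)         # low, high (USD)


def total_images(images_per_listing: dict[str, int], skus: int) -> int:
    """Images needed across all channels for the whole catalog."""
    return sum(count * skus for count in images_per_listing.values())


def cost_range(images: int, per_image: tuple[float, float]) -> tuple[float, float]:
    """Low and high total cost for a given image count."""
    low, high = per_image
    return images * low, images * high


images = total_images(IMAGES_PER_LISTING, SKUS)
studio_low, studio_high = cost_range(images, STUDIO_COST_PER_IMAGE)
api_low, api_high = cost_range(images, API_COST_PER_IMAGE)

print(f"images: {images:,}")
print(f"studio: ${studio_low:,.0f} to ${studio_high:,.0f}")
print(f"api:    ${api_low:,.0f} to ${api_high:,.0f}")
print(f"ratio:  {studio_low / api_low:.0f}x to {studio_high / api_high:.0f}x")
```

Swap in your own SKU count and per-channel image counts to see where your catalog lands.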

This is the editorial position the rest of this article rests on. Automation wins in 2026 on the math. Photographer-versus-AI debates are a separate conversation about hero imagery, not catalog operations.

The pipeline shape: 5 stages (plus a quality layer)

The pipeline has five stages from source asset to marketplace-ready output. Each stage maps to one or two API calls. A sixth layer, automated quality scoring, runs over the whole thing in production-grade setups.

Stage 1: Source asset. This is your input. It can be:

A studio shot on white (already mostly compliant)
An on-model fashion photo (lifestyle, needs extraction)
A campaign or brand shoot (high-quality, wrong background for marketplaces)
A flatlay or styled product scene (needs cleanup)
A 3D model render or screenshot (increasingly common, covered later in this article)

The wedge for automation is reusing existing assets. Most brands have years of campaign imagery that could be retargeted to marketplace formats but never has been because manual editing was too expensive.

Stage 2: Extract the product. Pull just the product out of whatever scene it's in. The API: Product Isolation at $0.27/image. For simpler inputs (clean studio shots), Background Removal at $0.045/image is enough. We covered the background removal layer in detail in our top 6 background remover APIs guide.

Stage 3: Set the marketplace background. Different marketplaces want different things:

Amazon: pure white (#FFFFFF) → Background Color Fix
Shopify: configurable (most use white, some use lifestyle) → Background Color Fix or Replace Background
Etsy: lifestyle context permitted, even encouraged → Replace Background with lifestyle reference

Stage 4: Fix and refine. Optional, but essential for some workflows:

Damaged regions in the source → Reference-Based Inpainting at $0.55/image
Visible price tags, brand labels → Tag Removal
Need to remove a model from on-model garment → Model Removal
Need to remove props or clutter → Object Removal

Stage 5: Resize to every marketplace spec. This is where one source becomes many outputs. The API: Smart Resize at $0.55/image. Eight aspect ratios, three resolution tiers (1K, 2K, 4K). Generative reframing instead of cropping, so subjects stay intact across formats.

Layer 6: Quality validation (Sentinel). This sits across the whole pipeline and is what separates a hobbyist setup from a production-grade one. We'll cover it in detail in a later section because most articles in this space skip it entirely.

Total cost per listing across the pipeline: ~$1.50-$3.50 depending on which stages your input requires. A clean studio source might only need Stage 5 (resize for each marketplace). A lifestyle campaign shot might run through all five stages.
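To see how the per-source cost moves with the stages you actually need, here is a small sketch built on the per-image prices quoted above. The operation keys and the `pipeline_cost` helper are hypothetical names, and operations whose price is not stated in this article are left out.

```python
from decimal import Decimal

# Per-image prices quoted in this article (USD). Operations with no price
# stated here (Replace Background, Tag/Model/Object Removal) are omitted.
PRICES = {
    "product_isolation": Decimal("0.27"),
    "background_removal": Decimal("0.045"),
    "background_color_fix": Decimal("0.045"),
    "reference_inpainting": Decimal("0.55"),
    "smart_resize": Decimal("0.55"),
}


def pipeline_cost(operations: list[str], resize_outputs: int = 1) -> Decimal:
    """Cost of one source image: each listed operation once, plus one
    Smart Resize call per output format."""
    cost = sum((PRICES[op] for op in operations), Decimal("0"))
    return cost + PRICES["smart_resize"] * resize_outputs


# Clean studio source: resize only, three output formats.
studio = pipeline_cost([], resize_outputs=3)

# Lifestyle campaign shot: isolate, whiten the background, repair one
# region, then three output formats.
campaign = pipeline_cost(
    ["product_isolation", "background_color_fix", "reference_inpainting"],
    resize_outputs=3,
)

print(studio)    # 1.65
print(campaign)  # 2.515
```

`Decimal` is used instead of floats so that fractions of a cent ($0.045) add up exactly.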

The next section walks through the marketplace specs you're targeting in Stage 5, because the resize call is where most of the per-listing economics live.

Marketplace requirements: the specs that matter

Each marketplace publishes detailed image standards. Most brands don't read them, which is why so many listings are non-compliant or under-optimized. The specs below are the operationally relevant ones for automation.

Amazon

Amazon's product image requirements are the strictest of the major marketplaces, and non-compliance is enforced through automatic listing suppression.

Background: Pure white, RGB (255, 255, 255), no exceptions for the main product image
Frame fill: Product must occupy at least 85% of the image area
Resolution: Longest side 1,600 pixels minimum (2,000+ recommended for zoom functionality)
Format: JPEG strongly preferred; TIFF accepted; PNG and GIF allowed but not optimal
Image count: 1 main + 8 supporting (varies by category)
Forbidden in main image: Watermarks, text, borders, color blocks, props not included in the sale, multiple views of the product, alternate angles, model heads (for apparel main shots)
Supporting images can include: Lifestyle context, models, multiple angles, infographics, scale references

The 85% frame fill rule is where most non-compliant listings fail. Studio shoots tend to leave more white space than Amazon wants. Smart Resize at the 1:1 aspect ratio, framed tightly around the product, fixes this on the resize call.
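If you want to pre-check your own images against the two rules that cause most failures, a rough version fits in a few lines. This sketch makes two assumptions that are mine, not Amazon's: "frame fill" is measured as the bounding box of all non-white pixels divided by the image area, and "pure white" is tested only on the outer edge pixels. It works on a plain grid of RGB tuples so it runs without an imaging library.

```python
Pixel = tuple[int, int, int]
WHITE: Pixel = (255, 255, 255)


def frame_fill(pixels: list[list[Pixel]]) -> float:
    """Bounding-box area of all non-white pixels divided by image area."""
    ys = [y for y, row in enumerate(pixels) if any(p != WHITE for p in row)]
    xs = [x for row in pixels for x, p in enumerate(row) if p != WHITE]
    if not ys:
        return 0.0  # blank white image
    box_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    return box_area / (len(pixels) * len(pixels[0]))


def edges_are_white(pixels: list[list[Pixel]]) -> bool:
    """True if every pixel on the outer edge is exactly (255, 255, 255)."""
    edge = (
        pixels[0]
        + pixels[-1]
        + [row[0] for row in pixels]
        + [row[-1] for row in pixels]
    )
    return all(p == WHITE for p in edge)


def amazon_main_ok(pixels: list[list[Pixel]], min_fill: float = 0.85) -> bool:
    return edges_are_white(pixels) and frame_fill(pixels) >= min_fill


def square_with_product(size: int, margin: int) -> list[list[Pixel]]:
    """Synthetic test image: a dark square product inset by `margin` pixels."""
    dark: Pixel = (30, 30, 30)
    return [
        [
            dark if margin <= x < size - margin and margin <= y < size - margin else WHITE
            for x in range(size)
        ]
        for y in range(size)
    ]


tight = square_with_product(40, 1)   # fill = 38 * 38 / 1600
loose = square_with_product(40, 10)  # fill = 20 * 20 / 1600
print(amazon_main_ok(tight), amazon_main_ok(loose))  # True False
```

On real JPEGs you would likely need a small tolerance around white, since compression leaves near-white pixels that an exact comparison treats as product.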

Shopify

Shopify's image requirements are looser because Shopify is the store software, not a marketplace with editorial control. You set the standards for your own store.

Recommended size: Square, 2,048 × 2,048 pixels (this is Shopify's stated recommendation for product zoom)
Maximum file size: 20 MB per image; most stores stay under 3 MB for performance
Format: JPEG, PNG, GIF, WebP, HEIC all supported
Alt text: Required for accessibility and SEO
Background: No platform requirement; most brands use white or alpha-channel PNG for product shots
Image count: Up to 250 images per product
Video: Supported as product media, uploaded alongside images

Shopify's flexibility means automation here is about consistency across your own catalog, not compliance with platform rules. The Smart Resize call to 1:1 at 2K is the standard target.

Etsy

Etsy is unique among major marketplaces in actively encouraging lifestyle and contextual imagery.

Minimum size: 2,000 pixels on the shortest side
Recommended: 2,700 × 2,025 pixels (4:3 aspect ratio)
Format: JPEG or PNG
Image count: Up to 10 photos per listing + 1 video
Background: No restriction; lifestyle, styled, or plain backgrounds all permitted
Orientation: Horizontal (landscape) recommended for listing cards

Etsy's openness to lifestyle context is a real advantage for handmade and creator brands. Your campaign imagery often works for Etsy with minimal modification. Smart Resize to 4:3 at 2K, no background change needed if your source is lifestyle-styled.

Other marketplaces (briefly)

Walmart Marketplace: Similar to Amazon (white background, 85% fill, 2,000+ pixels minimum) but enforced less aggressively
eBay: Minimum 500 pixels longest side, no maximum, white or neutral background recommended for product shots
TikTok Shop: Vertical 9:16 strongly preferred for native feed integration; 1:1 for catalog
Instagram Shop: 1:1 minimum 1,000 × 1,000 pixels; lifestyle context allowed
Faire (wholesale marketplace): White background, 1,000+ pixels, similar to Amazon's editorial standards

The pipeline below produces compliant output for any of these from a single source asset. The differentiator is which aspect ratios and background colors you target in Stages 3 and 5.
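One way to keep these specs from drifting is to hold them as data and check every output against them before upload. The sketch below encodes only the minimum-size figures listed above where the side is stated (Amazon, Etsy, eBay, Instagram Shop). The `MinSize` type and marketplace keys are hypothetical names, and the numbers should be re-checked against each marketplace's current documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinSize:
    side: str    # "longest" or "shortest"
    pixels: int


# Minimum sizes as listed in this section; entries without a stated
# side (Walmart, Faire) are intentionally left out.
MIN_SIZE = {
    "amazon": MinSize("longest", 1600),
    "etsy": MinSize("shortest", 2000),
    "ebay": MinSize("longest", 500),
    "instagram_shop": MinSize("shortest", 1000),
}


def meets_min_size(marketplace: str, width: int, height: int) -> bool:
    rule = MIN_SIZE[marketplace]
    measured = max(width, height) if rule.side == "longest" else min(width, height)
    return measured >= rule.pixels


def failing_marketplaces(width: int, height: int) -> list[str]:
    """Marketplaces whose minimum size this image does not meet."""
    return [name for name in MIN_SIZE if not meets_min_size(name, width, height)]


print(failing_marketplaces(2048, 2048))  # []
print(failing_marketplaces(2048, 1536))  # ['etsy']
print(failing_marketplaces(1024, 1024))  # ['amazon', 'etsy']
```

The middle case is worth a look: if a 4:3 output has 2,048 px on the long side, the short side is 1,536 px, which falls under the 2,000 px Etsy figure listed above. For Etsy, a larger output tier may be the safer target.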

Building the pipeline with Runflow's APIs

This is the operator section. Each subsection covers the API call, when to use it, the cost, and a concrete code shape. You can copy-paste these into your codebase or hand them to Claude Code to scaffold the pipeline. (If you want to see the AI-coding-assistant pattern in action on a related project, our Claude Code AI headshot tool playbook walks through the same scaffolding approach end-to-end.)

Product Isolation: extracting the product from lifestyle shoots

runflow.io/api/product-isolation

When to use Product Isolation: you have an on-model fashion shot, a lifestyle scene with multiple objects, or a busy campaign image, and you need just the product on an alpha-channel background.

The wedge over ordinary background removal: prompt-guided extraction. You name the product in plain language ("the handbag", "the red sneaker", "the watch on the wrist") and the API pulls only that. Background Removal pulls the foreground, which is usually the model in on-model shots. Product Isolation pulls what you asked for.

API call:

curl -X POST https://api.runflow.io/v1/run/product-isolation \
  -H "Authorization: Bearer $RUNFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://your-cdn.com/campaign-shot-001.jpg",
    "prompt": "the handbag"
  }'

Response: PNG cutout with full alpha channel. Pixel-accurate edges on hair, fabric, jewelry chain, glass, see-through objects.

Pricing: $0.27 per image. Under 5 seconds processing. Up to 4096×4096 pixel inputs.

Real-world flow: You have a hero campaign shot for a fashion line. The model is wearing the handbag, posing in a styled environment with props. You need a PDP cutout of just the handbag for Amazon. One Product Isolation call, $0.27, and you have the bag on a fully transparent background with the model and props gone. Ready for Stage 3.

If your input is already a clean studio shot on white, skip this. Use Background Removal at $0.045/image instead.
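For teams calling the API from application code rather than the shell, the Product Isolation curl call above translates roughly as follows. This is a sketch using only the Python standard library. The endpoint, headers, and JSON fields are copied from the curl example, while `build_request` is a hypothetical helper. The request is built but not sent, because the response format is not documented in this article.

```python
import json
import os
import urllib.request

API_BASE = "https://api.runflow.io/v1/run"  # taken from the curl example


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl call."""
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "product-isolation",
    {
        "image_url": "https://your-cdn.com/campaign-shot-001.jpg",
        "prompt": "the handbag",
    },
    # Placeholder fallback so the sketch runs without a real key.
    api_key=os.environ.get("RUNFLOW_API_KEY", "YOUR_API_KEY"),
)

print(request.get_method(), request.full_url)

# To send it: urllib.request.urlopen(request, timeout=30)
# Parsing is left out because the response shape is not specified here.
```

Keeping request construction separate from sending makes the payloads easy to unit test and to log before a batch run.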

Background Color Fix and Replace Background: marketplace-compliant backgrounds

Once you have the product isolated, you need the right background for each marketplace.

Background Color Fix sets or corrects the background to a specific color. For Amazon's #FFFFFF requirement, this is the call. $0.045 per image.

Replace Background swaps in a new background entirely. For Etsy lifestyle context or Shopify lifestyle PDP imagery, this is the call. You can provide a reference background image or let the API generate context-appropriate scenes.

The marketplace mapping:

Amazon main image → Background Color Fix to #FFFFFF
Shopify standard → Background Color Fix to #FFFFFF (most common) or Replace Background for lifestyle
Etsy listings → Replace Background with reference lifestyle scene, or leave the original lifestyle context if it's strong
TikTok Shop → Replace Background to seasonal/trending scenes for native feel

Cost math: $0.045 per Background Color Fix call. For an Amazon-ready pipeline, that's $0.045 added to the $0.27 Product Isolation = $0.315 to convert a campaign shot into an Amazon-compliant cutout. Before Stage 5 (resize), you're at roughly 30 cents per listing.

Reference-Based Inpainting: fixing the one thing that broke

runflow.io/api/reference-based-inpainting

The most common production failure mode in AI product imagery is not "the whole image is bad." It's "the image is 95% correct, except for one specific region that broke." A hand with too many fingers. A wrong logo on the chest pocket. A phantom pocket the model invented. A plastic-looking patch of skin. A badge AI got wrong because the brand wasn't in the training data.

Reference-Based Inpainting is built for that pattern. You mask the failing region, hand the API a reference image showing what should be there, and the API repaints just that area. The rest of the image stays untouched. Original resolution preserved.

In one production case study we walked a footwear display platform through, the failure pattern was an obscure Gore-Tex brass badge on a hiking shoe. Standard image-generation models had no representation for the badge and produced a garbled version of it every time. The fix: mask the badge region, pass a clean reference photo of the badge as the inpainting guide, and the API replaced just that 10% of the image with the correct badge. Everything else stayed as-generated.

Common production use cases:

Logo and badge correction. When the AI gets a brand-specific detail wrong, mask and reference-inpaint with the correct asset.
Hand and finger fixes. AI-generated hands remain the single most reliable failure mode. Mask the hand region, reference-inpaint from a clean source.
Phantom feature removal. Models sometimes invent pockets, sleeve plackets, or seams that don't exist in the source garment. Mask the region, reference-inpaint with the correct fabric pattern, the phantom feature disappears.
Fabric variant generation. Photograph one version of a garment, generate fabric variants using swatch references as the guide. Saves a separate shoot per colorway.
Catalog rescue. Older catalog images with damaged regions (scratches, JPEG artifacts, unwanted reflections) can be repaired by masking the affected area and using a clean reference of the product.

API call:

curl -X POST https://api.runflow.io/v1/run/reference-inpaint \
  -H "Authorization: Bearer $RUNFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://your-cdn.com/product-with-bad-region.jpg",
    "mask_url": "https://your-cdn.com/inpaint-mask.png",
    "reference_url": "https://your-cdn.com/badge-reference.jpg"
  }'

Pricing: $0.55 per image. Under 5 seconds.

The call requires a mask defining the region to inpaint. Mask generation is itself a topic; for most ecommerce automation flows, you'll generate masks programmatically from Product Isolation output or hand-draw them with a quick design tool.
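For the simplest case, a rectangular region such as a badge or logo, a mask can be generated with nothing but the standard library. The sketch below writes an 8-bit grayscale PNG that is white inside a box and black everywhere else. Treating white as "repaint this area" is an assumption on my part, so check the mask convention your inpainting endpoint expects. The coordinates are made up for illustration.

```python
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """One PNG chunk: length, type, data, CRC over type + data."""
    return (
        struct.pack(">I", len(data))
        + kind
        + data
        + struct.pack(">I", zlib.crc32(kind + data) & 0xFFFFFFFF)
    )


def rect_mask_png(width: int, height: int, box: tuple[int, int, int, int]) -> bytes:
    """8-bit grayscale PNG: 255 inside box, 0 elsewhere.

    box is (left, top, right, bottom); right and bottom are exclusive.
    """
    left, top, right, bottom = box
    inside = bytes(255 if left <= x < right else 0 for x in range(width))
    outside = bytes(width)
    # Every scanline starts with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + (inside if top <= y < bottom else outside) for y in range(height)
    )
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # default compression, default filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


# Mask a 200 x 120 px region in a 2048 x 2048 image (hypothetical badge box).
mask = rect_mask_png(2048, 2048, (900, 1400, 1100, 1520))
print(len(mask), "bytes")
```

Write `mask` to a file or upload it to your CDN, then pass its URL as `mask_url`. Irregular regions need a real segmentation step, which is out of scope for this sketch.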

Smart Resize: the marketplace fan-out

runflow.io/api/smart-resize

This is the final stage and the most-used API in the pipeline. Smart Resize converts a single source image into any of eight aspect ratios at any of three resolution tiers, without cropping the subject or distorting proportions.

The wedge over traditional resize: generative reframing. Traditional resize crops to the target aspect ratio (cutting parts of the subject off) or stretches the image (distorting proportions). Smart Resize detects the subject, generates new pixels around it to match the target frame, and outputs at the requested resolution. The product stays intact, centered, and on-brand across every output format.

Supported aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 21:9, 2:3, 3:2
Supported resolution tiers: 1K (1024px), 2K (2048px), 4K (4096px) on the long side
Pricing: $0.55 per image, fixed (no resolution surcharge)
Processing: Async batch with webhook delivery; per-image processing under a few seconds

API call:

curl -X POST https://api.runflow.io/v1/run/smart-resize \
  -H "Authorization: Bearer $RUNFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://your-cdn.com/product-on-white.jpg",
    "aspect_ratio": "1:1",
    "resolution": "2K"
  }'

The marketplace mapping for Smart Resize:

Marketplace	Aspect ratio	Resolution tier	Notes
Amazon main	1:1	2K or 4K	85% frame fill enforced; pick 4K if you want zoom support
Amazon supporting	1:1 or 4:3	2K	Lifestyle and detail shots
Shopify	1:1	2K	Shopify's documented recommendation
Etsy listing	4:3	2K	Etsy's recommended ratio
Etsy hero	1:1	2K	Listing card display
Instagram feed	1:1	2K	Native format
Instagram Story	9:16	2K	Native format
TikTok Shop	9:16	2K	Vertical native; 1:1 for catalog
Display ads	16:9 or 21:9	2K	Banner placements

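The operator math: One source shot, six marketplace formats. Six Smart Resize calls at $0.55 each = $3.30 to ship every channel from one asset, which lands at the upper end of the $1.50-$3.50 range cited at the top of this article. If the source also needs Stages 2-3 (Product Isolation plus Background Color Fix, $0.315), a full six-format fan-out comes to about $3.62, slightly above that range.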

A fashion brand fanning a single campaign shot across Amazon (1:1 + 4:3), Shopify (1:1), Etsy (4:3 + 1:1), Instagram feed (1:1), and TikTok Shop (9:16) is seven Smart Resize calls = $3.85. Total time from source asset to all seven outputs: under two minutes async, including webhook delivery.

The quality layer: Sentinel and why most automation pipelines skip it

Every "AI product photo" article on the internet walks you through the generation steps and stops. The articles assume the model output is the deliverable. In production it isn't.

At scale, image generation has a real failure rate. Hands break. Logos drift. Colors shift slightly. Pockets and seams get invented out of nowhere. A 5% failure rate is invisible at 10 images and catastrophic at 10,000. That's 500 bad listings shipped to your marketplace presence before a human notices. Manual QA across the full output is the obvious answer and the wrong one: it doesn't scale past dozens of images per day and it's the most expensive line item in your operation.

The right answer is automated quality validation as a pipeline layer. At Runflow we call it Sentinel. The pattern:

Generate the image through Stages 2-5.
Score the output against multiple quality dimensions before delivery (face fidelity if there's a model, fit and pose accuracy, color match, prompt alignment, brand-guideline compliance, artifact detection).
Pass: ship the image to the marketplace pipeline.
Fail one dimension: trigger targeted regeneration with a refined prompt, or trigger Reference-Based Inpainting on just the failing region.
Fail two or more: hold for human review.
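The routing rules in that list are simple enough to express directly. This is a sketch of the decision logic only, not Sentinel's implementation: the dimension names, the 0-1 score scale, and the thresholds are all hypothetical values chosen for illustration.

```python
from enum import Enum


class Action(Enum):
    SHIP = "ship"
    REPAIR = "regenerate_or_inpaint"
    HUMAN_REVIEW = "human_review"


def failed_dimensions(scores: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """Dimensions scoring below their threshold. A dimension with no
    score at all counts as failed."""
    return [
        dim for dim, minimum in thresholds.items()
        if scores.get(dim, 0.0) < minimum
    ]


def route(scores: dict[str, float], thresholds: dict[str, float]) -> Action:
    failures = failed_dimensions(scores, thresholds)
    if not failures:
        return Action.SHIP
    if len(failures) == 1:
        return Action.REPAIR  # targeted regeneration or masked inpainting
    return Action.HUMAN_REVIEW


# Hypothetical thresholds on a 0-1 scale.
THRESHOLDS = {"color_match": 0.90, "artifact_free": 0.95, "background_white": 0.99}

print(route({"color_match": 0.96, "artifact_free": 0.98, "background_white": 1.0}, THRESHOLDS))
print(route({"color_match": 0.80, "artifact_free": 0.98, "background_white": 1.0}, THRESHOLDS))
print(route({"color_match": 0.80, "artifact_free": 0.60, "background_white": 1.0}, THRESHOLDS))
```

Returning the list of failed dimensions alongside the action is useful in practice, because it tells the repair step which region or prompt term to target.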

This is the workflow that took our largest production customer's AI headshot operation from 60% gross margins to 87% on the same models. The full numbers are documented in our AI headshot generator API guide and the engineering side in Building an AI Image Generator API: 14 things that broke. The short version: generate ~4× more candidates than you ship, score them all, deliver only what passes, and your GPU cost as a share of revenue drops from 40% to 11%.

For product photography specifically, the dimensions Sentinel checks are configurable: fit accuracy against a reference garment, color match against brand swatches, presence/absence of required brand elements, background compliance with marketplace requirements (e.g., is this really #FFFFFF and is the product really filling 85% of frame for Amazon). The configurability is the point. Different brands and different marketplaces need different things validated. The pipeline lets you set those rules once and apply them across every output.

Sentinel is included in the base price when you route image generation through Runflow workflows, or available as a standalone scoring API at ~$0.05/image for teams running their own generation infrastructure.

The over-generate-and-filter pattern

The companion to Sentinel is the over-generation pattern. Rather than asking the model for one output and hoping it's correct, ask for ~10 variants per shot, score them all, deliver the top 2. This is what's happening behind the "auto-ranked, best first" responses you see from production AI image APIs.

The math is counterintuitive at first: generating several times more images than you ship is more expensive per attempt, but cheaper per acceptable output. A 60% per-image success rate with manual QA costs more in human review and reshipment than a 99% delivered-quality rate with 4-5× over-generation and automated scoring. The customer who established this pattern publicly (BetterPic, in the AI headshot space) ran 240 candidates per user and delivered the top 60, with zero manual review. The same playbook applies to product photography: the volumes are different, the principle is identical.
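Both halves of this pattern can be sketched briefly: picking the best candidates, and estimating how often a batch yields enough acceptable ones. The second function assumes each candidate passes independently with the same probability, which is a simplification (real failures tend to cluster on hard inputs). The scores and IDs are invented for illustration.

```python
from math import comb


def select_best(candidates: list[tuple[str, float]], deliver: int, min_score: float) -> list[str]:
    """Drop candidates below min_score, rank the rest best first,
    and return up to `deliver` image ids."""
    passing = [c for c in candidates if c[1] >= min_score]
    ranked = sorted(passing, key=lambda c: c[1], reverse=True)
    return [image_id for image_id, _ in ranked[:deliver]]


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n candidates pass, assuming each
    passes independently with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


scored = [("v1", 0.91), ("v2", 0.42), ("v3", 0.97), ("v4", 0.88), ("v5", 0.63)]
print(select_best(scored, deliver=2, min_score=0.85))  # ['v3', 'v1']

# 10 candidates at a 60% per-image pass rate, 2 needed for delivery.
print(round(p_at_least(2, 10, 0.6), 4))  # 0.9983
```

Under that independence assumption, ten candidates at a 60% pass rate produce at least two shippable images about 99.8% of the time, which is where the high delivered-quality figure comes from.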

If you already have 3D: a stronger pipeline path

A growing share of fashion, footwear, and home-goods brands have product 3D models (built in CLO 3D for apparel, Blender or Rhino for industrial design, Style3D and similar tools for the new wave of fashion-native 3D platforms). If you have a 3D model of your product, you have access to the strongest pipeline path available in 2026.

The trick: a 3D screenshot at a specified pitch, yaw, and roll becomes a perfectly-controlled reference for image generation. The 3D model can't hallucinate. It has the right number of fingers (or buttons, or zipper teeth) by construction. Lighting can be predefined to match your brand. Every angle is available without a reshoot.

The pipeline:

Render or screenshot the 3D model at the angles you need (front, side, back, three-quarters, top), as Stage 1 source assets.
Run the 3D screenshot through an image-generation pass that uses it as a structural reference, generating photorealistic output that preserves the exact shape and silhouette.
Continue through Stages 3-5 (backgrounds, refinement, resize) normally.
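The first step above amounts to a fixed list of camera poses applied to every SKU. Here is a sketch of such a render plan. The angle conventions (yaw around the vertical axis with 0 facing the camera, pitch of 90 looking straight down), the specific angles per view, and the output naming are all assumptions to adapt to your own 3D tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    name: str
    yaw: float         # degrees around the vertical axis; 0 = product faces camera
    pitch: float       # degrees above horizontal; 90 = looking straight down
    roll: float = 0.0  # camera tilt; kept at 0 for catalog shots


# The five views named in the list above, with assumed angles.
VIEWS = [
    View("front", 0, 0),
    View("three_quarters", 45, 0),
    View("side", 90, 0),
    View("back", 180, 0),
    View("top", 0, 90),
]


def render_jobs(sku: str, views: list[View] = VIEWS) -> list[dict]:
    """One render job per view, with a predictable output filename."""
    return [
        {
            "sku": sku,
            "view": v.name,
            "yaw": v.yaw,
            "pitch": v.pitch,
            "roll": v.roll,
            "output": f"{sku}_{v.name}.png",
        }
        for v in views
    ]


for job in render_jobs("SKU-001"):
    print(job["output"], job["yaw"], job["pitch"])
```

Fixing the poses once means every SKU in the catalog gets the same set of angles, which is what makes multi-angle output consistent across hundreds of products.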

The 3D-as-reference path is the answer to the hardest production failure modes: complex product shapes (wide-toe footwear, unusual silhouettes, technical apparel), unfamiliar brand details (badges, logos, hardware), and consistent multi-angle generation across hundreds of SKUs. Standard generation alone fails on these inputs because the model has weak representations. With 3D as the structural reference, the model is generating texture and lighting over a guaranteed-correct shape.

If you're already running a ComfyUI workflow for AI generation, integrating a 3D screenshot input takes one extra step: the workflow accepts an additional image and uses it as a ControlNet-style structural guide. If you're working at the Solutions API layer, the 3D screenshot becomes the image_url input and the model is conditioned to preserve shape from that source.

Brands without 3D models can still build automation pipelines through Stages 2-5. Brands with 3D get a meaningfully better quality floor on hard cases.

Real-world automation flows

Four patterns showing how teams actually run this in production. Names anonymized; the API costs, pipeline shapes, and pain points are pulled from real customer engagements.

Flow A: a DTC fashion brand migrating Shopify catalog to Amazon

The scenario: A 200-SKU DTC apparel brand has its Shopify catalog populated with on-model lifestyle imagery. They want to expand to Amazon. Amazon requires white-background main images. Their existing assets are on-model, lifestyle-styled, and not Amazon-compliant.

The manual path: reshoot 200 SKUs on white. Studio cost at $200 per SKU: $40,000. Calendar time: 3-4 weeks of studio work plus post-production.

The automation path:

Product Isolation on each of 200 on-model shots, prompt: "the [garment type]" → 200 alpha-channel PNGs of just the garment ($0.27 × 200 = $54)
Background Color Fix to #FFFFFF on each → 200 white-background cutouts ($0.045 × 200 = $9)
Smart Resize to 1:1 at 2K (Amazon main image format) → 200 Amazon-ready images ($0.55 × 200 = $110)
Sentinel scoring to verify white-background compliance, 85% frame fill, and no extraction artifacts before shipping to Amazon

Total API cost: $173. Sentinel included if routed through Runflow workflows. Total time: Async batch, complete in hours. Compliance: Amazon-spec 1:1 at 2K with pure white background.

For supporting images (multiple angles, lifestyle context), the brand reuses the original on-model shots resized via Smart Resize at $0.55 each. Assuming two supporting images per SKU, that adds $220 (400 × $0.55), for a total Amazon expansion cost of about $393 across the catalog: under $500, versus $40,000 for the reshoot path.

Flow B: a brand fanning one campaign shot across every channel

The scenario: A specialty coffee brand just shot a hero image for a winter blend launch. One image, professionally art-directed, lifestyle-styled with proper holiday vibes. They want to use it across Amazon, Shopify, Etsy, Instagram, TikTok Shop, and a Meta display ad campaign.

The manual path: design team takes the source asset, crops and resizes for each format, often re-shooting the same scene at different aspect ratios because traditional crops cut the subject. Calendar time: half a day to a full day of designer work per launch.

The automation path:

Smart Resize to 1:1 at 2K → Amazon supporting image, Shopify, Instagram feed ($0.55)
Smart Resize to 4:3 at 2K → Etsy listing card ($0.55)
Smart Resize to 9:16 at 2K → Instagram Story, TikTok Shop ($0.55)
Smart Resize to 16:9 at 2K → Meta display banner ($0.55)
Smart Resize to 21:9 at 2K → Cinematic display banner ($0.55)

Total cost: $2.75. Total time: Under two minutes. Output: Five formats covering eight platform placements from one source.

The differentiator: generative reframing kept the coffee bag (the subject) intact and properly centered across every aspect ratio. Traditional crop would have cut the bag in half on the 21:9 widescreen format. Stretching would have distorted it on 9:16 vertical.
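The reason this flow costs five calls rather than eight is that placements sharing an aspect ratio share one output. Here is a sketch of that grouping, producing one Smart Resize payload per distinct ratio. The payload fields mirror the curl example earlier in the article. The source URL and the `plan_resize_calls` helper are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

SMART_RESIZE_PRICE = Decimal("0.55")  # per image, as quoted above

# (placement, aspect ratio) pairs from this flow.
PLACEMENTS = [
    ("Amazon", "1:1"),
    ("Shopify", "1:1"),
    ("Instagram feed", "1:1"),
    ("Etsy listing card", "4:3"),
    ("Instagram Story", "9:16"),
    ("TikTok Shop", "9:16"),
    ("Meta display banner", "16:9"),
    ("Cinematic display banner", "21:9"),
]


def plan_resize_calls(
    image_url: str, placements: list[tuple[str, str]], resolution: str = "2K"
) -> list[dict]:
    """One call per distinct aspect ratio, with the placements it serves."""
    by_ratio: dict[str, list[str]] = defaultdict(list)
    for name, ratio in placements:
        by_ratio[ratio].append(name)
    return [
        {
            "payload": {
                "image_url": image_url,
                "aspect_ratio": ratio,
                "resolution": resolution,
            },
            "serves": names,
        }
        for ratio, names in by_ratio.items()
    ]


calls = plan_resize_calls("https://your-cdn.com/winter-blend-hero.jpg", PLACEMENTS)
total = SMART_RESIZE_PRICE * len(calls)

for call in calls:
    print(call["payload"]["aspect_ratio"], "->", ", ".join(call["serves"]))
print(f"{len(calls)} calls, ${total}")
```

This grouping only works when placements that share a ratio can also share a resolution tier. If one channel needs 4K, it becomes its own call.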

Flow C: a digital-native fashion brand inside a large retailer, generating from 3D + studio inputs

The scenario: A digital-native fashion brand inside a major European retailer produces 45+ AI-generated fashion images per week for marketing assets and virtual try-on. They start with high-quality 3D renders at 8 angles per garment (front, side, back, three-quarters) generated from CLO 3D designs, plus 4-5 input studio images. They run these through a ComfyUI workflow with Nano Banana Pro as the inference step. They serve a marketplace that requires 3,000×4,000 px output natively.

Their original pain points:

Inpainting reduced resolution, requiring manual Photoshop upscaling
The inference provider silently downgraded to a smaller model under load, with no notification, causing unpredictable output quality
No feedback loop on output quality (every generation was a roll of the dice)
AI consistently hallucinated phantom pockets and sleeve plackets that didn't exist in the 3D source
Processed one image at a time; no batch capability

The automation path that solved it:

3D model screenshot at specified angles as the structural reference input (Stage 1)
Workflow inference with reserved capacity instead of public API endpoints, eliminating the silent-downgrade failure mode
Sentinel scoring against fit accuracy, brand guidelines, and "no phantom features" rules. Failures automatically trigger the masked inpainting step that follows.
Masked Reference-Based Inpainting at original resolution for the regions that fail scoring, fixing just the bad areas without resolution loss or manual Photoshop steps
Native 4K output matching the 3,000×4,000 px marketplace requirement, no upscaling step

This pattern is what production AI fashion imagery actually looks like when it works at scale. The same composable primitives apply to product photography in any vertical where shape, fit, and brand details matter: Sentinel, Reference-Based Inpainting, reserved-capacity inference, and 3D-grounded references.

Flow D: a marketplace platform onboarding artisan sellers

The scenario: A handmade-goods marketplace platform onboards 500 new artisan sellers per month. Each seller uploads roughly 8-10 lifestyle photos of their products. The platform wants to convert these into marketplace-compliant catalog cutouts automatically.

The manual path: dedicated catalog ops team manually edits each photo, removes backgrounds, fixes color, resizes. Cost per photo at ~$3 (offshore design ops): $12,000-$15,000 per month for 4,000-5,000 photos.

The automation path:

Product Isolation on each lifestyle photo, prompt extracted from the seller's product description → clean cutout ($0.27 × 4,500 = $1,215)
Background Color Fix to #FFFFFF for the marketplace standard → ($0.045 × 4,500 = $202.50)
Smart Resize to 1:1 at 2K for catalog grid → ($0.55 × 4,500 = $2,475)
Sentinel scoring for spec compliance before publishing → no incremental cost when routed through Runflow

Total API cost: $3,892.50 per month, versus $12,000-$15,000 manual. Time: Async batch with webhook delivery; complete in hours instead of a two-week onboarding cycle.

The platform saves $8,000-$11,000 per month and reduces seller onboarding from two weeks to two days. The savings compound: better seller experience → faster catalog growth → more transactions.

Sidebar: building this feature as a platform (PIM, feed manager, marketplace)

A note for the platform audience. If you're building a PIM, product feed manager, DAM, marketplace, or any B2B SaaS where ecommerce brands manage their catalogs through your software, the pipeline above is a feature you can ship to your customers rather than something each brand has to build alone.

The pattern that works: route your customers' product images through Runflow's Solutions API in your pipeline. Background Removal is free up to meaningful volume. Many platforms use it as a launch feature replacing a paid dependency on remove.bg or similar. Product Isolation, Smart Resize, and the rest of the catalog work follow on the paid tier.

We currently work with PIM and product data platforms on this exact pattern. Typical entry deal is $500-$1,000/month, expanding as their customers' usage grows. We're priced and structured to be embedded inside your product, not to compete with you.

If that's the shape you're thinking about, see Runflow's homepage and the Solutions API documentation. We work through partners on inbound brand inquiries rather than selling direct.

What this doesn't solve

A few honest limits, because operator playbooks don't ship without them.

New product launches. If no source asset exists, you can't automate it. You still need an initial shoot or a 3D model. Once you have one good source, automation handles every downstream marketplace format.

Brand-specific aesthetics that require human direction. A premium fashion brand's hero campaign isn't an automation problem. A creative director's eye is the value; APIs are the support layer for the catalog work that follows. Use both.

Video and motion content. This article covers still imagery. Video product content (Etsy listing videos, TikTok Shop native video, Amazon Posts video) is a separate category and a different pipeline shape.

Color accuracy for certain materials. Jewelry, makeup, technical fabrics, and high-saturation prints can drift in automation pipelines. The output is usually 95%+ accurate, which is fine for most use cases but not for products where exact color match is the selling point. For those, plan to manually QA a sample before scaling, or wire up Sentinel with strict color-delta thresholds against a reference swatch.
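To make "color-delta threshold" concrete, here is a minimal sketch of one common measure, CIE76 Delta E, computed between a reference swatch and a sampled output color. It assumes 8-bit sRGB inputs and a D65 white point. The function names and the 2.0 threshold are illustrative assumptions, not Sentinel's actual API or defaults.

```python
import math

# D65 reference white (CIE 1931 2-degree observer), normalized so Y = 1.
_WHITE = (0.95047, 1.0, 1.08883)


def _srgb_to_linear(channel: int) -> float:
    """Undo the sRGB transfer curve for one 8-bit channel."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def srgb_to_lab(rgb):
    """Convert an (R, G, B) tuple of 0-255 ints to CIE L*a*b*."""
    r, g, b = (_srgb_to_linear(v) for v in rgb)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t: float) -> float:
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), _WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(rgb_a, rgb_b) -> float:
    """CIE76 color difference: Euclidean distance in L*a*b* space."""
    return math.dist(srgb_to_lab(rgb_a), srgb_to_lab(rgb_b))


def passes_color_check(swatch_rgb, output_rgb, max_delta_e: float = 2.0) -> bool:
    """True if the output color is within the (hypothetical) threshold."""
    return delta_e76(swatch_rgb, output_rgb) <= max_delta_e
```

CIE76 is the simplest Delta E formula. CIEDE2000 tracks human perception more closely and may be the better choice for saturated colors, at the cost of a longer implementation. Either way, the useful part is comparing against a physical reference swatch rather than the source photo.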

Marketplace compliance can change. Amazon updates its requirements occasionally. Etsy adjusts its lifestyle policies. Build your pipeline so the aspect ratios and background targets are configurable, not hardcoded. When specs change, you re-run the pipeline rather than rebuilding it.
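One way to keep specs configurable is to hold them in a single table that the pipeline reads at run time. The sketch below is an assumption about how you might structure that, not a Runflow feature: `ImageSpec`, `SPECS`, and `check_dimensions` are hypothetical names, and the values are the ones quoted in this article's marketplace specs.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List, Optional


@dataclass(frozen=True)
class ImageSpec:
    aspect_ratio: Fraction                # width / height
    min_longest_side: int = 0             # pixels; 0 means no constraint
    min_shortest_side: int = 0            # pixels; 0 means no constraint
    background_hex: Optional[str] = None  # None means no mandated background


# Edit this table when a marketplace changes its rules, then re-run the pipeline.
SPECS = {
    "amazon_main": ImageSpec(Fraction(1, 1), min_longest_side=1600, background_hex="#FFFFFF"),
    "shopify": ImageSpec(Fraction(1, 1), min_longest_side=2048),  # recommended, not mandated
    "etsy": ImageSpec(Fraction(4, 3), min_shortest_side=2000),
}


def check_dimensions(spec: ImageSpec, width: int, height: int, tolerance: float = 0.01) -> List[str]:
    """Return human-readable violations; an empty list means the size is compliant."""
    problems = []
    if abs(width / height - float(spec.aspect_ratio)) > tolerance:
        problems.append(f"aspect ratio {width}x{height} does not match {spec.aspect_ratio}")
    if max(width, height) < spec.min_longest_side:
        problems.append(f"longest side {max(width, height)} px is below {spec.min_longest_side} px")
    if min(width, height) < spec.min_shortest_side:
        problems.append(f"shortest side {min(width, height)} px is below {spec.min_shortest_side} px")
    return problems


if __name__ == "__main__":
    print(check_dimensions(SPECS["etsy"], 2700, 2025))  # compliant size
    print(check_dimensions(SPECS["etsy"], 2048, 1536))  # right ratio, too small
```

This only checks dimensions. Background color and frame fill need pixel-level checks, but they can read their targets from the same table.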

The good news: everything outside these limits is catalog operations work, and automation handles that at scale, alongside the creative and editorial work that makes a brand. That is a problem you can actually solve.

Where to start: a 60-minute setup

Five steps to get the pipeline running end-to-end on your own catalog.

Step 1: Audit your existing asset library. Look through campaign shoots, lifestyle imagery, on-model fashion, and styled flatlays from the last 12-24 months. Anywhere you've got a strong source asset that wasn't repurposed for marketplaces is automation territory. You probably have more usable material than you think.

Step 2: Pick one marketplace and one product line to test. Don't try to migrate the whole catalog on day one. Pick a marketplace (Amazon is usually the highest ROI) and a product line with 5-10 SKUs. Run the full pipeline end-to-end on those. Verify quality manually.

Step 3: Sign up for Runflow. $10 in free GPU credits covers the initial test. At the prices quoted in this article ($0.27 Product Isolation, $0.045 Background Color Fix, $0.55 Smart Resize), one full pass costs $0.865, so $10 covers about 11 full passes. That's enough for a 10-SKU pilot at one output per SKU; budget extra for additional Smart Resize outputs.

Step 4: Run 10 SKUs through the pipeline. Either by direct API calls with the curl snippets above, or scaffold a quick pipeline script with Claude Code or ChatGPT Codex. The pipeline is short enough that an AI coding assistant produces working code in one prompt. We walk through that exact pattern in our Claude Code build playbook on the headshot side.

Step 5: Manually verify marketplace compliance before scaling. Upload your generated images to a draft listing on the target marketplace. Confirm the marketplace's automated checks pass. Verify the frame fill, color accuracy, and aspect ratio against the specs above. Once a representative sample passes, the rest of the batch should too, provided it runs with the same pipeline settings; keep spot-checking as you scale.

The 60-minute window is for the pilot. Scaling to 100+ SKUs is async batch processing with webhook delivery, which runs in hours regardless of catalog size.

FAQ

What is product photography automation?
Product photography automation is the use of AI-powered APIs to convert source images into marketplace-compliant outputs without manual editing or reshoots. A typical pipeline extracts the product, sets the right background, resizes to platform-specific aspect ratios, and delivers ready-to-list imagery at $1.50-$3.50 per listing instead of $150-$500 per studio shot.

How much does it cost to automate product photography for Amazon?
For Amazon specifically, the automation cost is roughly $0.30-$1.50 per output image depending on your source asset. A simple flow (Product Isolation + Background Color Fix + Smart Resize to 1:1 at 2K) costs $0.27 + $0.045 + $0.55 = $0.865 per Amazon-ready cutout. Additional supporting images cost $0.55 each via Smart Resize. For a 100-SKU Amazon catalog with 8 images per listing (one main cutout plus seven supporting images), that is $0.865 + 7 × $0.55 = $4.715 per listing, or about $470 in total, with $400-$800 a reasonable budget range.

What are Amazon's product image requirements?
Amazon requires pure white background (#FFFFFF) for the main product image, with the product occupying at least 85% of the frame. Minimum 1,600 pixels on the longest side (2,000+ recommended for zoom). JPEG format preferred. No watermarks, text, borders, or props not included in the sale in the main image. Supporting images (up to 8 additional) allow lifestyle context, models, and infographics. Non-compliant listings are automatically suppressed.

What are Shopify's product image requirements?
Shopify recommends square 2,048 × 2,048 pixel images for product zoom functionality. Maximum file size is 20 MB but most performant stores stay under 3 MB. JPEG, PNG, GIF, WebP, and HEIC formats are all supported. Alt text is required for accessibility. Shopify allows up to 250 images per product. No platform-mandated background; most brands use white or alpha-channel PNG for product shots.

What are Etsy's product image requirements?
Etsy requires minimum 2,000 pixels on the shortest side and recommends 2,700 × 2,025 pixels (4:3 aspect ratio). JPEG or PNG formats. Up to 10 photos per listing plus one video. Etsy uniquely permits and encourages lifestyle context in listing imagery, unlike Amazon's strict white-background requirement.

Can I use AI-generated product photos on Amazon?
Yes. Amazon has no policy prohibiting AI-generated or AI-edited product images, as long as the resulting images accurately represent the product and comply with the technical specs (white background, 85% frame fill, resolution minimums). Most AI-automated photos use real product photography as the source and apply AI for background, resize, and compositing operations, which is fully permitted.

Will AI-generated product photos hurt my listing performance?
Not when done well. Amazon's ranking algorithm prioritizes click-through rate and conversion, which depend on image quality and compliance with specs rather than how the image was produced. Poorly-executed AI imagery (artifacts, distorted proportions, obvious cutout halos) hurts performance the same way poorly-executed studio photography does. Well-executed automation that produces clean, spec-compliant imagery performs identically to studio shots in our customers' tests. The trick is the quality validation layer. See the Sentinel section above.

How do I resize product photos for multiple marketplaces?
The most efficient approach is generative reframing with a Smart Resize API. Unlike traditional crop or stretch, generative reframing detects the subject and generates new pixels around it to match the target aspect ratio. One source image becomes outputs at 1:1 (Amazon, Shopify, Instagram feed), 4:3 (Etsy), 9:16 (TikTok Shop, Stories), and 16:9 (display ads) in a single batch call. Cost is fixed at $0.55 per output regardless of resolution tier.

Is product photography automation worth it for small catalogs?
For catalogs under 20 SKUs on a single marketplace, the ROI is marginal compared to manual editing. For 20-100 SKUs on a single marketplace, automation pays for itself within the first batch. For multi-marketplace selling (3+ channels) or catalogs over 100 SKUs, automation is the only viable approach economically. The break-even point is roughly 50 listings.

What's the difference between Background Removal and Product Isolation?
Background Removal pulls the foreground subject (whatever is most prominent) and leaves a fully transparent (alpha-channel) background. Product Isolation uses a text prompt to extract a specific named product from a scene, even when the product isn't the foreground (for example, an on-model shot where the model is the foreground but you want just the handbag). Product Isolation is the right call for lifestyle and on-model imagery. Background Removal is right for studio shots already on plain backgrounds. We compared the broader background removal landscape in our top 6 background remover APIs guide.

Can I automate product photos from my existing lifestyle shoots?
Yes, and this is the highest-ROI use case for automation. Brands typically have years of campaign and lifestyle imagery that was never repurposed for marketplace listings because the manual editing cost was prohibitive. A pipeline of Product Isolation + Background Color Fix + Smart Resize converts existing assets into marketplace-ready cutouts for under $1 per output, often producing better results than fresh studio shoots because the source imagery has stronger art direction.

How do I handle AI-generated images that come out wrong?
The production answer is a quality validation layer that catches failures before they ship. Sentinel scores every generated image against multiple dimensions (face fidelity, fit accuracy, color match, brand guidelines, artifact detection) before delivery. Failures trigger targeted regeneration via Reference-Based Inpainting on just the problem region, preserving the rest of the image at full resolution. The over-generate-and-filter pattern (produce ~10 variants per shot, score them all, deliver the top 2) is how production AI imagery hits 99% delivered quality without manual review.
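The filter step of that pattern is small enough to sketch. The code below assumes you already have a scoring function that returns a number per image; `score` here is a hypothetical stand-in for whatever quality scorer you use, and the function name, `keep`, and `min_score` are illustrative.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def filter_top_variants(
    variants: Iterable[T],
    score: Callable[[T], float],
    keep: int = 2,
    min_score: float = 0.0,
) -> List[Tuple[float, T]]:
    """Score every variant, drop those under min_score, return the best `keep`."""
    scored = [(score(v), v) for v in variants]
    passing = [pair for pair in scored if pair[0] >= min_score]
    # Sort on the score only, so variants themselves never need to be comparable.
    passing.sort(key=lambda pair: pair[0], reverse=True)
    return passing[:keep]


if __name__ == "__main__":
    # Stand-in scores for ten generated variants of one shot.
    fake_scores = {f"variant_{i}": s for i, s in enumerate(
        [0.62, 0.91, 0.48, 0.88, 0.95, 0.71, 0.33, 0.79, 0.84, 0.57]
    )}
    best = filter_top_variants(fake_scores, fake_scores.get, keep=2, min_score=0.8)
    print(best)
```

If fewer than `keep` variants clear `min_score`, the function returns what it has. That shortfall is the signal to regenerate or to route the shot to targeted inpainting rather than ship a weak image.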

Can I use 3D product models as input?
Yes, and if you have them, you should. A 3D screenshot at a specified angle gives the generation step a guaranteed-correct shape, eliminating most hallucination on complex products (wide-toe footwear, unusual silhouettes, technical hardware, brand-specific badges). The 3D-as-reference path is the strongest pipeline available for fashion, footwear, and home-goods brands that already have 3D assets from CLO 3D, Blender, or Style3D pipelines.

What can't be automated?
New product launches without source assets still need a shoot or a 3D model. Premium hero campaigns where creative direction is the value can't be automated. Some technical materials (jewelry, certain fabrics, high-saturation prints) drift on color accuracy and benefit from manual QA or strict Sentinel thresholds. Video content is a separate API category. Most brands combine automation for catalog work with manual studio shoots for hero assets.

Where to go from here

The pipeline shape is clear. The APIs exist. The cost math works. What's left is the doing.

A concrete sequence for the week ahead:

Audit your asset library for existing campaign and lifestyle imagery that could be repurposed for marketplace listings. Most brands underestimate how much usable material they already have.
Pick one marketplace and 5-10 SKUs for a pilot. Amazon is usually the highest-ROI starting point because the compliance gap between your existing imagery and Amazon's specs is widest.
Sign up for Runflow at runflow.io. $10 in free credits covers the pilot end-to-end.
Run the pipeline on the 5-10 SKUs using the API call shapes in this article. Verify quality manually before scaling. If you're already running ComfyUI for AI generation, the ComfyUI API endpoints post covers how to plug Sentinel into the workflows you already have.
Plan the batch run for the rest of the catalog once the pilot validates.

For deep specifics on the individual Solutions:

Product Isolation for extracting products from on-model and lifestyle scenes
Reference-Based Inpainting for fixing the regions AI gets wrong
Smart Resize for the marketplace fan-out
Background Removal for the simple cutout case
Sentinel for the quality validation layer that makes production-grade automation work

For related context across the AI image API ecosystem:

Top 6 background remover APIs in 2026 covers the segmentation layer across providers
The 4 best AI headshot generator APIs covers the people-imagery category and the BetterPic case study in detail
Building an AI Image Generator API: 14 things that broke is the engineering field-notes companion to this article
Most "Free AI Image Generation API" lists are lying to you covers what "free" actually means at production volumes
How I built an AI headshot tool in 4 hours with Claude Code shows the AI-coding-assistant scaffolding pattern applied to a related product

The brands shipping catalogs to multiple marketplaces in 2026 are the ones who automated the pipeline. The math has shifted. The tooling exists. The path to running it is in this article.

