Runflow skill for AI agents

Turn your agent into a production image stack.

One markdown file. 50 models across image, video, and audio. Plain HTTP from any agent.

Install the skill once. Claude auto-loads it when relevant.

Step 01

Get an API key

Sign in at app.runflow.io and create a key in Settings → API Keys. New accounts get $10 in credits. Export it as RUNFLOW_API_KEY in your shell.
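
A minimal Python sketch of how a script might pick up that key once it is exported. The helper name `auth_headers` is hypothetical; the `Authorization: Bearer` format follows the run shape excerpt from SKILL.md quoted later on this page.

```python
import os


def auth_headers(env=None):
    """Build request headers from RUNFLOW_API_KEY (helper name is hypothetical)."""
    env = os.environ if env is None else env
    key = env.get("RUNFLOW_API_KEY", "").strip()
    if not key:
        # Fail early with a clear message instead of sending an unauthenticated call.
        raise RuntimeError("RUNFLOW_API_KEY is not set; export it in your shell first")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing a plain dict as `env` keeps the helper easy to exercise in tests without touching the real environment.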

Step 02

Install the skill locally

Install the skill into a skills directory. Use a repo-local directory when you want the whole project to carry the integration guide.

mkdir -p .agents/skills/runflow
curl -fsSL https://www.runflow.io/.well-known/agent-skills/runflow/SKILL.md -o .agents/skills/runflow/SKILL.md

Step 03

Ask it to do work

Start a Claude session in any project and ask in plain English. The skill kicks in, Claude picks the right Runflow endpoint, runs it.

> remove the background from ./shoe.png and save the cutout

BetterPic ships 35M+ headshots a year on the Runflow API. 99.9% uptime. Multi-cloud GPU pool.

First call

What it looks like the first time your agent uses it.

Real prompt. Real model the agent picked. Real HTTP endpoint. Real image returned.

Claude
Edit this selfie into three professional headshot variations for our team page.
tool: google/nano-banana-pro/edit · completed in 14.2s
3 variations · 1024×1024
http: POST /v1/models/google/nano-banana-pro/edit/runs
01 · Natural
[Image: Headshot variation 1, soft natural daylight, off-white linen backdrop]
02 · Studio
[Image: Headshot variation 2, classic studio key from upper right, charcoal backdrop]
03 · Dramatic
[Image: Headshot variation 3, dramatic warm side-lit, dark plum backdrop]
Done. Saved three variants to /assets/headshots-{id}/. Want me to upscale the top pick to 4K via topaz/upscale/image?
runflow.io

Inside the skill

A complete image stack, inside your agent.

The skill points at every endpoint your agent needs: pre-built workflows for the common tasks, direct model access for everything else, custom ComfyUI graphs for anything you bring yourself.

Generative media

Image and video, on every model that matters.

Nano Banana Pro, FLUX Kontext, FLUX 2, GPT Image 2, Ideogram v3, Wan 2.7, Veo 3.1, Kling v3 Pro, Seedance 2.0. 50 models active. Up to 4K resolution. Any aspect ratio. Any duration up to your model's limit. Your agent picks the right one.

Claude
Generate a clean editorial portrait of this subject in studio lighting at 1024×1024.
tool: google/nano-banana-pro · completed in 8.7s
text→image · 1024×1024
http: POST /v1/models/google/nano-banana-pro/runs
[Image: Editorial portrait generated by nano-banana-pro at 1024×1024]

Solutions API

Pre-built workflows. One endpoint per task.

Background removal, outpainting, object removal, AI headshots, product isolation, skin retouch, tag removal. Each Solution is a multi-step pipeline behind a single endpoint. Your agent follows that endpoint's response, polling, or callback contract.

Claude
Remove the background from this product shot. Keep the cast shadow under the bag.
tool: runflow/background-removal · completed in 1.4s
http: POST /v1/models/runflow/background-removal/runs
[Image: Tan leather handbag isolated on a transparent background with a soft cast shadow]

Comparison

Run four models on one prompt. Pick the winner.

Fan-out is a documented pattern in the skill. Your agent fires the same input at nano-banana-pro, gpt-image-2, flux-pro/kontext, and ideogram/v3 in parallel, returns all four, and lets you choose. No new tool. Just N parallel POSTs.
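
The fan-out pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's own code: `fan_out` and `send` are hypothetical names, the HTTP call is injected by the caller, and the `"prompt"` input field is a guess that should be checked against each model's llms.txt. The model paths are the ones listed elsewhere on this page.

```python
from concurrent.futures import ThreadPoolExecutor

# Full model paths as they appear in the use-case lists on this page.
MODELS = [
    "google/nano-banana-pro",
    "openai/gpt-image-2",
    "black-forest-labs/flux-pro/kontext",
    "ideogram/v3",
]


def fan_out(prompt, send, models=MODELS):
    """POST the same input to every model in parallel and collect the results.

    `send(url, body)` is a caller-supplied function that performs the HTTP
    call (hypothetical); injecting it keeps the sketch transport-agnostic.
    ASSUMPTION: the "prompt" input field is illustrative only.
    """
    def run(model):
        url = f"https://api.runflow.io/v1/models/{model}/runs"
        return model, send(url, {"input": {"prompt": prompt}})

    # One worker per model, so all N POSTs are in flight at the same time.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return dict(pool.map(run, models))
```

The return value maps each model path to whatever `send` returned, which is enough for the agent to lay the results out side by side.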

Claude
Generate this same scene on four image models and let me pick.
tool: fan-out (nano-banana-pro · gpt-image-2 · flux-pro/kontext · ideogram/v3) · completed in 9.7s
4 models · parallel
[Image: A 2x2 grid showing the same chair-and-lamp scene rendered in four different art styles by four different models]
All four ran in parallel. Top by visual fidelity: gpt-image-2. Want me to regenerate the others with that one's seed?

Custom workflows

Ship your ComfyUI graph as an endpoint.

Drop a graph in the dashboard. We deploy it as POST /v1/comfyui-workflows/{owner}/{slug}/runs. Same auth, same callbacks, same shape as the rest of the API. Your agent invokes it like any other model.
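
Because the two endpoint families differ only in their path, an agent could build both URLs with a pair of small helpers. The sketch below assumes the `https://api.runflow.io` base URL from the SKILL.md run shape excerpt; the function names are hypothetical.

```python
BASE_URL = "https://api.runflow.io"  # base URL taken from the SKILL.md run shape


def model_run_url(model_path):
    """URL for a catalog model or Solution, e.g. 'runflow/background-removal'."""
    return f"{BASE_URL}/v1/models/{model_path.strip('/')}/runs"


def comfyui_run_url(owner, slug):
    """URL for a ComfyUI graph deployed from the dashboard."""
    return f"{BASE_URL}/v1/comfyui-workflows/{owner}/{slug}/runs"
```

For example, `comfyui_run_url("acme", "photo-restore")` (with `acme` as a made-up owner) yields a URL that takes the same auth header and POST body as any model run.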

Cursor
Run my custom photo-restore ComfyUI workflow on this scan.
tool: {your-org}/photo-restore · completed in 5.4s
custom workflow
http: POST /v1/comfyui-workflows/{owner}/{slug}/runs
[Image: A side-by-side before-after diptych showing a degraded photo on the left and the restored version on the right]

Scale

Process a hundred shots while you sleep.

Batch endpoint takes a list of inputs and a callback URL. Per-item progress tracking. One signed callback when the whole batch finishes. Built for agents that drain queues, not one-offs.
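
A rough Python sketch of assembling a batch body follows. The page only says the endpoint takes a list of inputs and a callback URL, so the field names `model`, `inputs`, and `callback_url`, and the `image_url` input used in the usage note, are assumptions; the real schema is in the public OpenAPI spec.

```python
import json


def build_batch_payload(model, inputs, callback_url):
    """Assemble a JSON body for POST /v1/batches.

    ASSUMPTION: the field names "model", "inputs", and "callback_url" are
    illustrative. Check the public OpenAPI spec for the actual schema.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("a batch needs at least one input")
    return json.dumps({
        "model": model,
        "inputs": inputs,
        "callback_url": callback_url,
    })
```

An agent draining a CSV might call it as `build_batch_payload("runflow/background-removal", [{"image_url": u} for u in urls], "https://your.server/webhook")` and POST the result with the usual bearer header.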

Agent
Process all 240 product photos in this CSV: remove backgrounds, return URLs.
tool: POST /v1/batches → runflow/background-removal
240 items · async
http: POST /v1/batches
[Image: A grid of eight product cutouts produced by a batch run]
Batch queued. I'll check for results every 2 minutes and keep appending the cutout URLs to your CSV.

Video

Cinematic video, in seconds, in the chat.

Wan 2.7, Veo 3.1 (Lite + Fast), Kling v3 Pro, Seedance 2.0, HeyGen. Text-to-video, image-to-video. Up to 1080p. Up to 12 seconds. Same auth, same call shape as image models.

Agent
Generate a 5-second cinematic product reveal for the new sneaker drop.
tool: alibaba/wan/v2.7/text-to-video · completed in 38.0s
1080p · 5s
http: POST /v1/models/alibaba/wan/v2.7/text-to-video/runs
[Image: A still frame from a cinematic product reveal video]
Saved as runs/{id}/output.mp4. 38s on wan/v2.7. Want me to also fan it out to google/veo3.1 for comparison?

Real-world flows

From request to delivered asset, in one conversation.

E-commerce product shots

Drop a phone photo of a product. Get a clean catalog cutout, an outpainted lifestyle scene, and a square hero in three calls. Ready to publish.

Models the agent calls

  • runflow/background-removal
  • runflow/outpaint
  • google/nano-banana-pro/edit

Social content at scale

Generate dozens of variations for one post. Different models, different aspect ratios, different prompts. Pick the best for each platform.

Models the agent calls

  • google/nano-banana-pro
  • openai/gpt-image-2
  • ideogram/v3

Marketing campaigns

Brand-consistent assets across formats and styles. Use reference images for identity. Edit prompts on the fly. Fan out across models for the best result.

Models the agent calls

  • openai/gpt-image-2/edit
  • black-forest-labs/flux-pro/kontext
  • google/veo3.1

On-model and product imagery

Ghost-mannequin, model removal, tag removal, skin fix, background replacement, color correction. Eight image-editing Solutions wrapping the catalog's top edit models.

Models the agent calls

  • runflow/model-removal
  • runflow/tag-removal
  • runflow/skin-fix

Storyboards and previs

Storyboard a sequence in nano-banana-pro. Animate the keyframe in wan or veo3.1. Outpaint to match scene aspect. The skill points the agent at the right model for each step.

Models the agent calls

  • google/nano-banana-pro
  • alibaba/wan/v2.7/image-to-video
  • google/veo3.1

Sample prompts

Paste these into your agent. Watch the skill route the call.

These are the exact prompts a skill-armed agent handles in one shot. The expected flow is what the agent picks based on the decision rule in SKILL.md.

Single asset

Generate one finished thing.

  • Generate a cinematic wide shot of a neon-lit Tokyo alley at night, anamorphic lens.

    The agent picks google/nano-banana-pro, sets the input fields per the model's llms.txt, returns a finished image.

  • Remove the background from ./shoe.png and return the cutout as a transparent PNG.

    The agent calls runflow/background-removal directly. Solution endpoint, ~1.4s, finished cutout.

Full production

Multi-step. Fan-out. Save and ship.

  • Take this packshot. Generate 6 lifestyle scenes in different settings, then a 5-second product reveal video for the winner.

    The agent runs 6 parallel calls to gpt-image-2/edit with different scene prompts, picks the best, then calls alibaba/wan/v2.7/image-to-video on it.

  • Process every product photo in this CSV: bg-removal, outpaint to 16:9, save to S3.

    The agent posts a batch to /v1/batches with your callback URL, polls per-item progress while it runs, and collects the results once the signed completion callback arrives.

Multi-model

Same prompt, different models, side-by-side.

  • Generate this scene on four image models in parallel. Show me side-by-side.

    The agent fires four parallel POSTs to nano-banana-pro, gpt-image-2, flux-pro/kontext, and ideogram/v3, then returns the grid.

The catalog

Four numbers your agent should know.

Active models

50

Across image, video, audio. Catalog at /models-catalog.json.

Video models

17

Wan 2.7, Veo 3.1, Kling v3 Pro, Seedance 2.0, HeyGen. Up to 1080p, up to 12 seconds.

Image models

26

Nano Banana Pro, FLUX Kontext, FLUX 2, GPT Image 2, Ideogram v3, Bria, Topaz, Reve, plus first-party Runflow Solutions.

Public endpoints

94

Customer OpenAPI surface. Verify at docs.runflow.io/api/openapi.public.json.
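
One way to check that number yourself is to count operations in the downloaded spec. The Python sketch below relies only on the standard OpenAPI layout (`paths`, then one key per HTTP method); whether the 94 figure counts operations or paths is not stated on this page, so treat the result as a sanity check rather than an exact match.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def count_operations(spec):
    """Count path + HTTP method pairs in a parsed OpenAPI document."""
    total = 0
    for path_item in spec.get("paths", {}).values():
        # Path items also hold non-operation keys such as "parameters"; skip those.
        total += sum(1 for key in path_item if key.lower() in HTTP_METHODS)
    return total
```

Load the JSON file with `json.load` and pass the resulting dict in.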

The artifact

The whole skill is 115 lines of markdown.

Plain markdown, plain HTTP. Your agent reads three things: a decision rule, a request shape, and an error map. Then it makes calls.

We publish a sha256 alongside the file so your CI can pin it. When we ship a new version, the hash changes and your agent picks up the update on the next session.
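
A CI step that pins the file could look like the Python sketch below. Where the published hash lives is not stated here, so the pinned digest is assumed to be a value you have copied into your repository; the function name is hypothetical.

```python
import hashlib


def verify_skill(data: bytes, pinned_sha256: str) -> bool:
    """Return True when the SKILL.md bytes match the pinned hex digest."""
    return hashlib.sha256(data).hexdigest() == pinned_sha256.strip().lower()
```

In a pipeline, read `.agents/skills/runflow/SKILL.md` in binary mode, call `verify_skill`, and fail the job on `False` so a changed skill is reviewed before agents pick it up.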

Read the full skill
Excerpt 01 · Decision rule · SKILL.md
1. Browse https://www.runflow.io/api. If a Solution covers the task, use it.
2. If no Solution fits, fall back to the Model API.
3. Do not reach for the Model API to reimplement something a Solution already does.
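
Read as code, the decision rule is a lookup with a fallback. The Python sketch below is one possible reading; the task labels and the `choose_endpoint` name are hypothetical, and the Solution paths are the ones named on this page.

```python
# Task labels are hypothetical; the Solution paths appear elsewhere on this page.
SOLUTIONS = {
    "background-removal": "runflow/background-removal",
    "outpaint": "runflow/outpaint",
    "tag-removal": "runflow/tag-removal",
}


def choose_endpoint(task, fallback_model, solutions=SOLUTIONS):
    """Prefer a Solution when one covers the task, else fall back to the Model API."""
    if task in solutions:
        return solutions[task]
    return fallback_model
```

`choose_endpoint("outpaint", "google/nano-banana-pro/edit")` picks the Solution, while an uncovered task returns the fallback model path.
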
Excerpt 02 · Run shape · SKILL.md
POST https://api.runflow.io/v1/models/{owner}/{slug}/runs
Authorization: Bearer $RUNFLOW_API_KEY
Content-Type: application/json

{ "input": { ... }, "callback_url": "https://your.server/webhook" }
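
The same request can be built with Python's standard library. This sketch only constructs the request object and does not send it; the `build_run_request` name is hypothetical, and the contents of `input` depend on the model, so the `prompt` field in the usage note is an assumption.

```python
import json
import urllib.request


def build_run_request(owner, slug, input_fields, api_key, callback_url=None):
    """Build (but do not send) a POST request matching the run shape above."""
    body = {"input": input_fields}
    if callback_url:
        body["callback_url"] = callback_url
    return urllib.request.Request(
        url=f"https://api.runflow.io/v1/models/{owner}/{slug}/runs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen`, for example after `build_run_request("google", "nano-banana-pro", {"prompt": "..."}, key)`.
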
Excerpt 03 · Errors · SKILL.md
401: bad/missing key, reprompt user
403: key lacks scope, user adjusts in dashboard
429: rate limited, back off
5xx: transient, check /v1/health, retry
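
The error map translates naturally into a small dispatch function. In the Python sketch below the action names and the backoff values are illustrative choices, not part of the skill.

```python
def next_action(status):
    """Map an HTTP status code to the action listed in the error excerpt."""
    if status == 401:
        return "reprompt_user"             # bad or missing key
    if status == 403:
        return "ask_user_to_adjust_scope"  # key lacks scope
    if status == 429:
        return "back_off"                  # rate limited
    if 500 <= status <= 599:
        return "check_health_then_retry"   # transient, see /v1/health
    return "ok" if 200 <= status <= 299 else "fail"


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential backoff schedule in seconds (values are illustrative)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

An agent would sleep through `backoff_delays(n)` between retries on `back_off` or `check_health_then_retry`, and stop immediately on the two user-facing actions.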

What you get

Six things every Runflow integration ships with.

Plain HTTP, real prices, machine-readable specs. The same shape across image, video, and audio. Every primitive an agent needs to ship a feature, none of the runtime lock-in.

01

Portable HTTP, not vendor RPC.

Your agent reads the skill once and makes plain HTTP calls. The same code runs from a script, a serverless function, a CI job, or another agent. The skill works without an MCP runtime.

02

OpenAPI-first.

Real machine-readable public spec at docs.runflow.io/api/openapi.public.json. Generate clients in any language. Pin schemas in your tests. Version drift is detectable.
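
One simple way to make drift detectable is to fingerprint the spec in your test suite. The Python sketch below hashes a canonical JSON serialization so key order does not matter; the function name and the idea of storing the fingerprint in the repo are assumptions, not something the page prescribes.

```python
import hashlib
import json


def spec_fingerprint(spec):
    """Hash a parsed OpenAPI document in a key-order-independent way."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A test can compare `spec_fingerprint(downloaded_spec)` against a committed value and fail when the public surface changes.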

03

Multi-cloud GPU pool.

Calls are routed across providers based on availability and latency. Failover is automatic. The skill itself never tells your agent which datacenter ran the job.

04

One call shape across image, video, audio.

Same auth header, same POST body, same callback contract whether you're calling nano-banana-pro, wan/v2.7, or elevenlabs/tts. The agent learns it once and reuses it everywhere.

05

Signed webhook callbacks.

Pass callback_url with any run. We deliver a signed payload when it finishes. Signing keys live at /v1/callback-secrets so your handler can verify the call came from us.
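
A handler-side check might look like the Python sketch below. The page does not describe the signing scheme, so this assumes an HMAC-SHA256 over the raw request body, hex-encoded and delivered alongside the payload; the real algorithm, encoding, and header name must come from the API reference.

```python
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, signature_hex: str, secret: bytes) -> bool:
    """Verify a callback signature.

    ASSUMPTION: HMAC-SHA256 over the raw body, hex-encoded. The actual
    scheme is not stated on this page.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(expected, signature_hex.strip().lower())
```

Verify against the raw bytes before parsing the JSON, since re-serializing the payload can change the bytes that were signed.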

06

Composable Flows.

Multi-step pipelines you build from primitives: a model run, a check, a fan-out, a callback. Run on a real graph runtime your agent calls directly. No black-box agent flows hidden behind a vendor UI.

Case study · BetterPic

From a self-managed AI stack to 87% gross margin.

BetterPic processed 35 million headshots through Runflow. Their gross margin moved from 40% to 87%. Same product, lower infrastructure cost, less time keeping GPUs alive.

Read the case study

Headshots shipped

35M+

Gross margin

40 → 87%

Engineers managing GPUs

0

Uptime SLA

99.9%

Common questions

What agent builders ask first.

  • Do I need MCP to use the Runflow skill?

    No. The Runflow skill is a markdown integration guide that any agent can read. Once read, the agent makes regular HTTP calls to api.runflow.io. It works with every agent that reads markdown, not just MCP-compatible clients, and you keep portable HTTP code that runs from a script, a serverless function, or a CI job.

  • What does the skill cover?

    Every public customer endpoint in the Runflow API. That's the Solutions API (pre-built workflows like background removal, headshots, outpainting), the Model API (50 active models across image, video, and audio), ComfyUI Deploy (your custom graphs as endpoints), Flows (multi-step composition), and Batch (fan-out with callback). Full reference at https://docs.runflow.io/api/openapi.public.json.

  • Which models can my agent call?

    Image: Nano Banana Pro, GPT Image 2, FLUX 2, Ideogram v3, Reve, Bria. Image edit: GPT Image 2 Edit, FLUX Kontext, Nano Banana Pro Edit, Qwen Image Edit, Reve Edit. Video: Wan 2.7, Veo 3.1 (Lite + Fast), Kling v3 Pro, Seedance 2.0, HeyGen. Audio: ElevenLabs TTS v3, Gemini TTS. Plus first-party Runflow Solutions (Background Removal, Outpaint, Object Removal, Skin Fix, Tag Removal, Product Isolation, Model Removal, Background Color). 50 active in the catalog at runflow.io/models-catalog.json.

  • What do I need to install?

    No package, no plugin, no SDK required. Install SKILL.md into a project or user skill directory, then let the agent read it when Runflow work comes up. Use a short system prompt or rules file only when the agent cannot read skill directories.

  • How is pricing structured?

    Per-call. Each model has a fixed price published in its llms.txt and the catalog. Solutions like Background Removal and Headshots are priced on the Solutions API page. New accounts get $10 in credits to try it.

  • How reliable is the infrastructure?

    Multi-cloud GPU pool with automatic failover. 99.9% uptime SLA. BetterPic runs 35 million images a year on this infrastructure through the Runflow API.

  • Does it support webhook callbacks?

    Yes. Pass a callback_url with any run. We deliver a signed callback when the run finishes. Signing keys live at /v1/callback-secrets. Your agent does not have to poll.

  • Can I run my own ComfyUI workflows?

    Yes. Upload a graph at app.runflow.io/deploy. We host it at POST /v1/comfyui-workflows/{owner}/{slug}/runs with the same auth and callback shape as every other endpoint. Your agent invokes it like any other model.

  • How is the skill different from llms.txt?

    The skill is the opinionated install. llms.txt is the long-form reference. The skill tells the agent how to make decisions (Solutions first, Model API as fallback) and gives a working curl example. llms.txt is a complete catalog of every URL we expose. Most agents only need the skill.

Hand your agent the skill. Ship today.

$10 in credits to start. No credit card. The skill works in any agent that can read markdown and make HTTP calls. That's all of them.