Guides Jun 24, 2026 9 min read

ComfyUI workflow tutorial: from node graph to API endpoint (2026)

ComfyUI workflow tutorial: how the node canvas becomes JSON, how JSON becomes an API call, and what breaks at scale. The 2026 builder guide.

Thibaut Hennau

CMO - building the expert's marketplace

The question we hear most from new builders is still the same one: how do I turn this ComfyUI canvas into something my app can call?

The answer starts with the JSON.

This ComfyUI workflow tutorial covers the part most tutorials skip: what the visual canvas actually stores, how the /prompt endpoint receives it, how to poll for results from Python, and what stops working when you run the same workflow a hundred times simultaneously instead of once on your desktop.


What a ComfyUI workflow actually is

A ComfyUI workflow is a JSON object that describes a directed node graph: each node has an ID, a class type, and an inputs dictionary that references either literal values or the outputs of other nodes by ID.

The canvas you see in ComfyUI is a visual editor for this JSON. When you click "Save" or export, the browser writes the JSON to disk. The positioning data (x, y, size) stays for the visual editor. The "Save (API Format)" export strips that out and gives you only the graph, which is what the /prompt endpoint expects.

Here's the minimal structure:

{
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "model": ["4", 0],
      "positive": ["6", 0],
      "negative": ["7", 0],
      "latent_image": ["5", 0],
      "seed": 42,
      "steps": 20,
      "cfg": 7.0,
      "sampler_name": "euler",
      "scheduler": "normal",
      "denoise": 1.0
    }
  },
  "4": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {
      "ckpt_name": "v1-5-pruned-emaonly.safetensors"
    }
  }
}

Node "3" references node "4" at output index 0 via ["4", 0]. That array notation is the entire wiring system. Once you see it, you can read any workflow export without touching the canvas.
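To make the wiring concrete, here is a minimal sketch that walks an API-format workflow and prints every link it finds. The function name iter_edges is ours, not part of ComfyUI, and the sketch assumes every link is a two-item list of [node ID string, output index], as in the export above.

```python
from typing import Iterator


def iter_edges(workflow: dict) -> Iterator[tuple[str, int, str, str]]:
    """Yield (source_id, output_index, target_id, input_name) for every link."""
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            # A link is a two-item list: [source node ID, output index].
            # Literal values (numbers, strings) are skipped.
            if (isinstance(value, list) and len(value) == 2
                    and isinstance(value[0], str) and isinstance(value[1], int)):
                yield value[0], value[1], node_id, name


workflow = {
    "3": {"class_type": "KSampler", "inputs": {"model": ["4", 0], "seed": 42}},
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
}

for src, out, dst, name in iter_edges(workflow):
    print(f"{src}[{out}] -> {dst}.{name}")  # prints: 4[0] -> 3.model
```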

Two things are worth noting before going further. First, the seed field on KSampler controls reproducibility: leave it to ComfyUI's randomizer and your API results will vary between calls; set it explicitly and identical inputs produce identical outputs. Second, the export format matters. The regular Save export is not what you want for the API. Use "Save (API Format)" from the gear menu (newer front ends list it as Export (API) under the Workflow menu), or pull a saved workflow from a running instance via the /api/userdata endpoint, after checking that what comes back really is in API format.
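A quick guard can catch the wrong export before it reaches /prompt. This is a heuristic sketch, not an official check: it assumes the editor export carries top-level "nodes" and "links" keys, while the API export maps node IDs straight to objects with a class_type. The function name is_api_format is ours.

```python
def is_api_format(data: dict) -> bool:
    """Heuristic check that a loaded workflow JSON is the API export."""
    if not data:
        return False
    # Assumption: the editor export stores the graph under "nodes" and "links".
    if "nodes" in data and "links" in data:
        return False
    # API export: every top-level value is a node object with a class_type.
    return all(isinstance(node, dict) and "class_type" in node
               for node in data.values())


api_export = {"4": {"class_type": "CheckpointLoaderSimple",
                    "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}}}
editor_export = {"last_node_id": 9, "nodes": [], "links": []}

print(is_api_format(api_export), is_api_format(editor_export))
```

Running it on load and failing fast is usually cheaper than decoding a validation error from the server later.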

The five node types you will see in every workflow

In a standard image-generation workflow, five node types appear in almost every graph: CheckpointLoaderSimple, CLIPTextEncode, EmptyLatentImage, KSampler, and VAEDecode.

Node	Role	Key input
CheckpointLoaderSimple	Load model weights from disk	ckpt_name
CLIPTextEncode	Encode a text prompt to conditioning	text, clip
EmptyLatentImage	Create the noise starting canvas	width, height, batch_size
KSampler	Run the diffusion loop	seed, steps, cfg, model, positive, negative, latent_image
VAEDecode	Decode the latent result to pixels	samples, vae

The SaveImage node is technically optional at the API level. In local ComfyUI it writes the output to the output/ directory. When calling the API programmatically, you typically replace it with a custom output node that writes to object storage and returns a URL, or you use a provider that handles output routing for you.

One detail that trips people up: batch_size on EmptyLatentImage controls how many images one KSampler run produces. Batch size greater than one means more VRAM and a longer single-run time, but fewer total API calls. For a production API with concurrent users, batch size of one with higher concurrency usually beats batch size of four with a single queue.
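In practice you rarely edit the exported file by hand; you patch a copy per request. The sketch below overrides inputs by node ID and refuses unknown input names, so a typo fails loudly instead of being silently ignored. patch_inputs is a hypothetical helper, and node "5" is assumed to be the EmptyLatentImage node, as in the export earlier.

```python
import copy


def patch_inputs(workflow: dict, overrides: dict) -> dict:
    """Return a copy of the workflow with some node inputs replaced.

    overrides maps node ID -> {input name: new value}.
    """
    patched = copy.deepcopy(workflow)  # never mutate the loaded template
    for node_id, new_inputs in overrides.items():
        inputs = patched[node_id]["inputs"]  # KeyError if the node ID is wrong
        for name, value in new_inputs.items():
            if name not in inputs:
                raise KeyError(f"node {node_id} has no input named {name!r}")
            inputs[name] = value
    return patched


base = {
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 4}},
}

# One image per call; concurrency is handled outside the workflow.
single = patch_inputs(base, {"5": {"batch_size": 1}})
print(single["5"]["inputs"]["batch_size"])  # prints: 1
```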

How Queue Prompt works (and the API behind it)

When you click Queue Prompt, ComfyUI's web UI sends your workflow JSON to POST /prompt along with a client_id. The server queues the job and returns a prompt_id. You then poll GET /history/{prompt_id} until the outputs appear.

Here is that loop in Python:

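import json
import time
import uuid

import httpx

COMFY_URL = "http://localhost:8188"
CLIENT_ID = str(uuid.uuid4())
TIMEOUT_S = 300  # give up instead of polling forever; tune for your workload

with open("my_workflow_api.json") as f:
    workflow = json.load(f)

# Submit the job; an HTTP error here usually means the graph failed validation
resp = httpx.post(f"{COMFY_URL}/prompt", json={
    "prompt": workflow,
    "client_id": CLIENT_ID
})
resp.raise_for_status()
prompt_id = resp.json()["prompt_id"]

# Poll until the job shows up in history or the deadline passes
deadline = time.monotonic() + TIMEOUT_S
while True:
    history = httpx.get(f"{COMFY_URL}/history/{prompt_id}").json()
    if prompt_id in history:
        outputs = history[prompt_id]["outputs"]
        break
    if time.monotonic() > deadline:
        raise TimeoutError(f"prompt {prompt_id} did not finish in {TIMEOUT_S}s")
    time.sleep(0.5)

# outputs["9"]["images"] -> [{"filename": "...", "subfolder": "", "type": "output"}]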

The node ID in outputs corresponds to the ID of the SaveImage node in your JSON. If your workflow has node "9" as the SaveImage node, the output appears at outputs["9"]["images"].
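The entries under "images" are file references, not image bytes. ComfyUI serves the files through GET /view, so a small helper can turn an outputs dict into download URLs. This is a sketch; image_urls is our own name, and the defaults for subfolder and type are assumptions for entries that omit them.

```python
from urllib.parse import urlencode


def image_urls(outputs: dict, base_url: str = "http://localhost:8188") -> list[str]:
    """Build a /view URL for every image listed in a history outputs dict."""
    urls = []
    for node_output in outputs.values():
        # Nodes without image results simply have no "images" key.
        for image in node_output.get("images", []):
            query = urlencode({
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
            urls.append(f"{base_url}/view?{query}")
    return urls


sample_outputs = {
    "9": {"images": [{"filename": "ComfyUI_00001_.png",
                      "subfolder": "", "type": "output"}]}
}
print(image_urls(sample_outputs))
```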

ComfyUI also exposes a WebSocket at ws://localhost:8188/ws?clientId={CLIENT_ID} for live progress events. The UI uses this to show the preview while sampling. For server-side polling you almost never need it; the history endpoint is simpler and sufficient.
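If you do listen on the WebSocket, the part worth getting right is the completion check. The sketch below assumes the message shape used in ComfyUI's bundled WebSocket example script: a JSON text frame of type "executing" whose data.node is null once the prompt has finished. Treat that shape as an assumption and confirm it against your ComfyUI version. is_run_finished is a hypothetical helper and does not open a socket itself.

```python
import json
from typing import Union


def is_run_finished(raw_message: Union[str, bytes], prompt_id: str) -> bool:
    """Return True when a WebSocket frame signals that prompt_id is done."""
    if isinstance(raw_message, bytes):
        return False  # binary frames carry preview images, not status
    message = json.loads(raw_message)
    if message.get("type") != "executing":
        return False
    data = message.get("data", {})
    # Assumed convention: node is null when execution of the prompt has ended.
    return data.get("node") is None and data.get("prompt_id") == prompt_id


running = json.dumps({"type": "executing",
                      "data": {"node": "3", "prompt_id": "abc"}})
finished = json.dumps({"type": "executing",
                       "data": {"node": None, "prompt_id": "abc"}})
print(is_run_finished(running, "abc"), is_run_finished(finished, "abc"))
```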

What breaks at scale

Four failure modes appear repeatedly when ComfyUI workflows move from local development to a production API: seed randomization errors, model loading overhead, VRAM fragmentation, and queue starvation.

Seed randomization is the first one. If your workflow JSON has "control_after_generate": "randomize" on any node, ComfyUI generates a new seed each run. That is the correct behavior for interactive use. It is the wrong behavior for a production API where a job should be reproducible from its inputs. Grep your workflow JSON for "control_after_generate" before shipping and set those values to "fixed".
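A pre-flight pass can do the grep for you. The sketch below sets any "control_after_generate" value it finds to "fixed" and writes an explicit seed into every node that has one. pin_seeds is our own helper; the input names "seed" and "noise_seed" are assumptions that cover the built-in samplers we know of, so extend the tuple for custom nodes. If the key is absent from your export, the function only sets the seed.

```python
def pin_seeds(workflow: dict, seed: int) -> list[str]:
    """Pin seeds in place and return the IDs of the nodes that were touched."""
    touched = []
    for node_id, node in workflow.items():
        inputs = node.get("inputs", {})
        changed = False
        if inputs.get("control_after_generate") not in (None, "fixed"):
            inputs["control_after_generate"] = "fixed"
            changed = True
        for key in ("seed", "noise_seed"):
            # Skip seeds wired in from another node (links are lists).
            if key in inputs and not isinstance(inputs[key], list):
                inputs[key] = seed
                changed = True
        if changed:
            touched.append(node_id)
    return touched


wf = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 987654321, "control_after_generate": "randomize"}},
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
}
print(pin_seeds(wf, seed=42))  # prints: ['3']
```

Storing the seed you pinned alongside the job record makes any output reproducible later from its inputs alone.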

Model loading overhead is the one people underestimate most. On a cold instance, loading a 6 GB checkpoint can take 8 to 20 seconds depending on disk speed; a model already held in the VRAM cache skips that cost. If your ComfyUI instance switches models between requests, your P99 latency includes that load time on every cache miss. The fix is either to keep a hot instance per model, or to use a provider that pre-warms the relevant checkpoints.
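The "hot instance per model" option can be as simple as routing on the checkpoint name before you submit. A minimal sketch, assuming one ComfyUI instance per checkpoint; the hostnames are hypothetical placeholders and pick_instance is our own helper.

```python
def checkpoint_names(workflow: dict) -> list[str]:
    """List the checkpoints a workflow loads via CheckpointLoaderSimple."""
    return sorted({
        node["inputs"]["ckpt_name"]
        for node in workflow.values()
        if node.get("class_type") == "CheckpointLoaderSimple"
    })


def pick_instance(workflow: dict, hot_instances: dict, fallback: str) -> str:
    """Send single-checkpoint workflows to the instance that keeps it loaded."""
    names = checkpoint_names(workflow)
    if len(names) == 1 and names[0] in hot_instances:
        return hot_instances[names[0]]
    return fallback  # mixed or unknown models go to a general-purpose pool


# Hypothetical internal hostnames, one warm instance per checkpoint.
HOT = {"v1-5-pruned-emaonly.safetensors": "http://comfy-sd15.internal:8188"}
FALLBACK = "http://comfy-general.internal:8188"

wf = {"4": {"class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}}}
print(pick_instance(wf, HOT, FALLBACK))
```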

VRAM fragmentation happens when you run multiple model types in sequence without freeing the cache. ComfyUI exposes a POST /free endpoint that tells it to release models from GPU memory. Call it between model-switching batches if you're orchestrating mixed workloads.
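Here is a sketch of that call using only the standard library. The body keys unload_models and free_memory reflect our reading of the ComfyUI server code, not a documented contract, so verify them against the version you run. build_free_request is a hypothetical helper that only builds the request; sending it is left to the caller.

```python
import json
import urllib.request


def build_free_request(base_url: str) -> urllib.request.Request:
    """Build a POST /free request asking ComfyUI to release GPU memory."""
    # Assumed body keys; check your ComfyUI version before relying on them.
    body = json.dumps({"unload_models": True, "free_memory": True}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/free",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_free_request("http://localhost:8188")
print(request.get_method(), request.full_url)

# To actually send it between model-switching batches:
# urllib.request.urlopen(request, timeout=10)
```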

Queue starvation is a scheduling problem, not a GPU problem. If your queue is FIFO and you mix short image-generation jobs (a few seconds) with long video-generation jobs (30 to 90 seconds), short jobs wait behind long ones. Separate queues per job class are the standard fix.
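The separate-queues fix lives in your own dispatcher, in front of ComfyUI. A minimal in-memory sketch, assuming the caller labels each job with a class such as "image" or "video" and each class has its own worker (and its own ComfyUI instance) pulling from its queue. ClassedQueues is an illustrative name, not a library class.

```python
from collections import deque
from typing import Any, Optional


class ClassedQueues:
    """One FIFO queue per job class, so slow jobs cannot block fast ones."""

    def __init__(self, job_classes: list[str]) -> None:
        self._queues = {name: deque() for name in job_classes}

    def submit(self, job_class: str, job: Any) -> None:
        self._queues[job_class].append(job)

    def next_for(self, job_class: str) -> Optional[Any]:
        """Return the oldest waiting job of this class, or None if idle."""
        queue = self._queues[job_class]
        return queue.popleft() if queue else None


queues = ClassedQueues(["image", "video"])
queues.submit("video", "long-video-job")   # arrived first
queues.submit("image", "quick-image-job")  # arrived second

# The image worker is not stuck behind the video job.
next_image = queues.next_for("image")
print(next_image)  # prints: quick-image-job
```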

(We've run into all four of these in production. The seed one is the most embarrassing because it shows up as flaky results rather than an obvious error, and it takes a while to trace back to the JSON.)

Running ComfyUI workflows via Runflow

Runflow's ComfyUI Deploy exposes any ComfyUI workflow as a typed REST endpoint: POST your inputs to https://api.runflow.io/v1/models/{owner}/{slug}/runs, poll GET /v1/runs/{id}, get a URL back.

The practical difference from self-hosting ComfyUI and calling /prompt yourself:

No GPU machine to manage. Jobs run on warm hardware and scale with demand.
No polling loop to write. One endpoint for submit, one for status.
No VRAM management. Model caching and eviction happen on the infrastructure side.
No cold-load latency for models already deployed on the platform.

Here is the equivalent Python call using the Runflow API:

import os
import time

import httpx

KEY = os.environ["RUNFLOW_API_KEY"]
BASE = "https://api.runflow.io/v1"
HEADERS = {"Authorization": f"Bearer {KEY}"}

# Submit
resp = httpx.post(
    f"{BASE}/models/yourname/your-workflow/runs",
    headers=HEADERS,
    json={"prompt": "a white ceramic mug on a marble countertop", "seed": 42}
)
resp.raise_for_status()
run = resp.json()

# Poll until the run reaches a terminal state
run_id = run["id"]
while True:
    status = httpx.get(f"{BASE}/runs/{run_id}", headers=HEADERS).json()
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(2)

# Only read outputs from a run that completed
if status["status"] == "failed":
    raise RuntimeError(f"run {run_id} failed")
image_url = status["outputs"][0]["url"]

The model catalog lists workflows already deployed and callable. If you want to deploy your own workflow, the ComfyUI Builder Program is the path.

For teams that want to keep calling the raw ComfyUI API but move off local hardware, the choice between that and using a managed endpoint like Runflow comes down to volume and team bandwidth. If you have the engineering time and consistent high volume, self-hosting on cloud GPUs is cheaper per-run. If you want the workflow-to-endpoint path without the infrastructure work, managed is the faster route.

Frequently asked questions

What is the difference between the standard ComfyUI workflow export and the API format?
The standard export includes visual positioning data (x, y, size, node slot positioning) used by the canvas editor. The API format strips that and keeps only the node graph. Use "Save (API Format)" from the gear icon when your target is the /prompt endpoint or any programmatic call.

Can I call the ComfyUI API from a browser?
Yes, via fetch to your ComfyUI instance's /prompt endpoint. In a production deployment you would proxy this through your backend to keep the ComfyUI instance URL private and to add authentication.

How do I get the output image after polling /history/{prompt_id}?
The response includes an outputs dict keyed by node ID. For a SaveImage node at ID "9", the images are at outputs["9"]["images"]. Each image entry has a filename and subfolder field. Fetch the actual file from GET /view?filename=...&subfolder=...&type=output.

Why does my workflow produce different results on each API call?
The most common cause is seed randomization. Check your workflow JSON for "control_after_generate": "randomize" on any node and set it to "fixed". Also confirm you are passing an explicit seed value in the KSampler inputs.

What step count should I use for production?
20 steps is standard for SD 1.5 and SDXL workflows. Flux models (Dev, Schnell) typically use 4 to 28 steps depending on the variant and required quality. Fewer steps means faster generation and lower cost at the expense of fine detail.

How do I cancel a queued ComfyUI job?
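Send POST /queue with the body {"delete": ["prompt_id_here"]}. To clear the entire queue, use {"clear": true}. Both act on jobs that are still waiting; to stop the job that is currently executing, send POST /interrupt.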
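A small sketch of the cancel call with the standard library. build_cancel_request is a hypothetical helper that only builds the request, so you can log or inspect it before sending.

```python
import json
import urllib.request


def build_cancel_request(base_url: str, prompt_ids: list[str]) -> urllib.request.Request:
    """Build a POST /queue request that deletes the given queued prompts."""
    body = json.dumps({"delete": prompt_ids}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/queue",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


cancel = build_cancel_request("http://localhost:8188", ["prompt_id_here"])
print(cancel.get_method(), cancel.full_url, cancel.data.decode())

# To send it: urllib.request.urlopen(cancel, timeout=10)
```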

What is the /free endpoint for?
It tells ComfyUI to release loaded models from GPU memory. Useful when switching between checkpoints in a mixed-model workload. Call it between model-switching batches to avoid VRAM fragmentation.

How do I handle errors in the polling loop?
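Check the status object in the /history/{prompt_id} entry. In the ComfyUI builds we have inspected it carries a status_str ("success" or "error"), a completed flag, and a messages list; for failed jobs, the execution_error message holds the reason. Field names can shift between versions, so inspect a real failed entry from your own instance. Log the prompt_id and inputs for debugging. Common failure reasons: model file not found, VRAM out of memory, and invalid node configuration.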
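As a sketch of that check: the helper below reads one history entry and returns a short error description, or None if the run did not fail. It assumes the entry carries a status object with status_str and a messages list of [event, data] pairs, which is how the builds we have looked at report failures. Those field names, and run_error itself, are assumptions to verify against your instance.

```python
from typing import Optional


def run_error(history_entry: dict) -> Optional[str]:
    """Return a one-line error description, or None if the run succeeded."""
    status = history_entry.get("status", {})
    if status.get("status_str") != "error":
        return None
    for event, data in status.get("messages", []):
        if event == "execution_error":
            return (f"{data.get('node_type')} (node {data.get('node_id')}): "
                    f"{data.get('exception_message')}")
    return "unknown error"  # failed, but no execution_error message found


failed_entry = {
    "outputs": {},
    "status": {
        "status_str": "error",
        "completed": False,
        "messages": [
            ["execution_start", {"prompt_id": "abc"}],
            ["execution_error", {"node_id": "4",
                                 "node_type": "CheckpointLoaderSimple",
                                 "exception_message": "model file not found"}],
        ],
    },
}
print(run_error(failed_entry))
```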

Can I run ComfyUI workflows in parallel?
A single ComfyUI instance processes one prompt at a time (serial queue). For parallel execution you need multiple instances behind a load balancer, or a managed provider that handles concurrency for you. The built-in queue is designed for sequential processing, not concurrent throughput.

How large can a workflow JSON get?
There is no hard size limit in ComfyUI's API. In practice, workflows with embedded base64 images can reach several MB. Keep the JSON clean by loading images from disk paths or URLs rather than embedding them inline.

Do I need to restart ComfyUI to load a new model?
No. Include a CheckpointLoaderSimple node pointing to the new checkpoint name. ComfyUI loads it on the next run. Depending on available VRAM, the previous model may be evicted from memory.

What is client_id used for in the /prompt request?
It associates the job with a specific client connection, used by the WebSocket system to route progress events back to the right UI session. For server-side polling via /history, it has no functional effect but is good practice to include.

Where to go next

Export your current workflow in API format: open ComfyUI, click the gear icon, select "Save (API Format)." Inspect the JSON and confirm the seed is set explicitly on KSampler.
Run a local test call to /prompt using the Python snippet in this tutorial. Confirm the polling loop works before adding more complexity.
Read "Every ComfyUI API endpoint, documented" for the full list beyond /prompt.
The ComfyUI API developer guide covers custom output nodes, model management, and production error handling.
When you're ready to move off local hardware, Runflow's ComfyUI Deploy gives you the same API surface with GPU infrastructure managed for you.

Start free at runflow.io.
