
Image generation

hal0 exposes OpenAI-compatible image generation at POST /v1/images/generations, served by a ComfyUI provider running inside the img slot. hal0 owns the OpenAI ↔ ComfyUI translation — the upstream is treated as a black box that speaks POST /prompt, GET /history/<id>, and GET /view.

The route is implemented in src/hal0/api/routes/v1.py and the provider in src/hal0/providers/comfyui.py. Workflow templates live under src/hal0/providers/workflows/.
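The translation step can be sketched as follows. This is a hypothetical illustration, not hal0's actual workflow template: the node ids and template shape are invented, but the mechanism — fill a ComfyUI workflow graph from OpenAI-style request fields, then submit it as the body of POST /prompt — matches the description above.

```python
import copy

# Illustrative stand-in for a workflow template under
# src/hal0/providers/workflows/ — node ids and shape are assumptions.
SDXL_TURBO_TEMPLATE = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 0, "steps": 1, "cfg": 1.0}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": ""}},
}

def build_comfyui_prompt(prompt, size="1024x1024", seed=0, steps=1, cfg=1.0):
    """Map OpenAI-style request fields onto a ComfyUI workflow graph,
    producing the JSON body for POST /prompt."""
    wf = copy.deepcopy(SDXL_TURBO_TEMPLATE)
    width, height = (int(v) for v in size.split("x"))
    wf["6"]["inputs"]["text"] = prompt
    wf["3"]["inputs"].update(seed=seed, steps=steps, cfg=cfg)
    wf["5"]["inputs"].update(width=width, height=height)
    return {"prompt": wf}

payload = build_comfyui_prompt("a cat in a hat", seed=42)
```

The provider then polls GET /history/&lt;id&gt; until the graph finishes and fetches the rendered image via GET /view.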

  • Endpoint: POST /v1/images/generations
  • Slot: img — part of BUILTIN_SLOTS in src/hal0/slots/manager.py, same lifecycle as primary, embed, stt, tts.
  • Backend: ComfyUI inside the ghcr.io/hal0ai/hal0-toolbox-comfyui:v1 toolbox image (pinned by sha256 in hal0/manifest.json).
  • Hardware: ROCm-capable AMD GPU. Strix Halo’s iGPU is the reference target — the 128 GB unified pool keeps an SDXL Turbo checkpoint and a primary chat model warm at the same time.

The picker UI surfaces three curated entries spanning the licensing spectrum (see src/hal0/registry/curated.py):

| Id | Family | On-disk | Min VRAM | License |
| --- | --- | --- | --- | --- |
| `sdxl-turbo` | SDXL distilled | ~6.5 GB | 8 GB | SAI Non-Commercial Research Community |
| `sd-1.5-pruned-emaonly` | SD 1.5 | ~4.3 GB | 4 GB | CreativeML Open RAIL-M |
| `flux-schnell` | FLUX.1 [schnell] | ~23.8 GB | 24 GB | Apache-2.0 |
```sh
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sdxl-turbo",
    "prompt": "a cat in a hat, studio lighting",
    "size": "1024x1024",
    "response_format": "url"
  }'
```

Honoured body fields (subset of OpenAI):

  • model (required) — curated id, e.g. sdxl-turbo.
  • prompt (required).
  • n — batch size; default 1.
  • size — WxH string, e.g. 1024x1024.
  • response_format — "url" (default) or "b64_json".

hal0 extensions via extra_body: seed, steps, cfg, negative_prompt.
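A minimal Python sketch of the same request, using only the standard library. The extension fields (seed, steps, cfg, negative_prompt) are placed top-level in the JSON here — the OpenAI Python client's extra_body parameter merges its keys into the top-level body the same way; whether hal0 also accepts them nested is not assumed.

```python
import json
from urllib import request

def build_body(prompt, *, model="sdxl-turbo", size="1024x1024",
               response_format="url", **extras):
    """Assemble the request JSON; extra keywords are hal0 extensions."""
    body = {"model": model, "prompt": prompt,
            "size": size, "response_format": response_format}
    body.update(extras)
    return body

def generate(body, base_url="http://localhost:8080"):
    """POST to /v1/images/generations and decode the JSON response."""
    req = request.Request(base_url + "/v1/images/generations",
                          data=json.dumps(body).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

body = build_body("a cat in a hat, studio lighting",
                  seed=42, steps=4, cfg=1.5)
# resp = generate(body)  # run against a live hal0 instance
```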

A successful response:

```json
{
  "created": 1716000000,
  "data": [
    { "url": "/api/images/cache/<uuid>.png" }
  ],
  "_hal0": {
    "workflow": "sdxl_turbo_simple",
    "checkpoint": "sd_xl_turbo_1.0_fp16.safetensors"
  }
}
```

When response_format is b64_json, each data[] entry carries b64_json (base64-encoded PNG) instead of url. The _hal0 debug field carries the workflow translator’s metadata so a misrouted prompt is easy to diagnose.
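Decoding b64_json entries is straightforward. The helper below is a sketch; the stub payload stands in for a real response, where each entry carries a full base64-encoded PNG.

```python
import base64
from pathlib import Path

def save_images(response, out_dir="out"):
    """Write each b64_json entry of a generations response to disk
    and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, entry in enumerate(response["data"]):
        path = out / f"image_{i}.png"
        path.write_bytes(base64.b64decode(entry["b64_json"]))
        paths.append(path)
    return paths

# Stub payload for illustration (a real entry holds an entire PNG):
stub = {"created": 1716000000,
        "data": [{"b64_json": base64.b64encode(b"\x89PNG\r\n").decode()}]}
paths = save_images(stub)
```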

The v1 first-class target is a ROCm-capable AMD GPU — the Strix Halo Ryzen AI Max+ 395 iGPU specifically. SDXL Turbo runs comfortably alongside a small or mid chat model on a 128 GB unified pool; on a discrete AMD GPU you’ll want at least 8 GB of dedicated VRAM for SDXL Turbo, 4 GB for SD 1.5.

NVIDIA discrete GPUs are not yet wired — the ComfyUI provider defaults to the rocm backend. See the hardware overview for the full matrix and follow the upstream roadmap for CUDA support.

A minimal slots/img.toml is shaped like every other slot — the default backend is rocm:

```toml
# /etc/hal0/slots/img.toml
enabled = true
backend = "rocm"

[model]
default = "sdxl-turbo"
```

After a model lands in the curated catalogue’s per-id directory under /var/lib/hal0/comfyui/models/checkpoints/, start the slot:

```sh
hal0 slot load img --model sdxl-turbo
```

The OpenAI-shaped /v1/images/generations request will route there automatically; the dispatcher’s heuristics already pin /v1/images/* to the img slot.
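The dispatch heuristic amounts to prefix routing. This is a hypothetical sketch — the real logic lives in hal0's dispatcher and the non-image routes are assumptions inferred from the slot names (embed, stt, tts) listed above.

```python
# Hypothetical path -> slot table; only the /v1/images/* pin is
# documented, the rest are illustrative assumptions.
ROUTES = {
    "/v1/images/": "img",
    "/v1/embeddings": "embed",
    "/v1/audio/transcriptions": "stt",
    "/v1/audio/speech": "tts",
}

def pick_slot(path, default="primary"):
    """Return the slot a request path should be routed to."""
    for prefix, slot in ROUTES.items():
        if path.startswith(prefix):
            return slot
    return default
```

So /v1/images/generations lands on the img slot with no extra configuration, and unmatched paths fall through to the primary chat slot.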

  • First-pull is heavy. The ComfyUI toolbox image is the largest one hal0 ships — the CI build takes ~19 minutes and the layer set is sizeable. Expect a long first docker pull on a fresh box; subsequent restarts hit the local layer cache.
  • No perf claims yet. No verified seconds-per-image numbers are in the repo for ComfyUI on Strix Halo iGPU. Marked [TODO: verify] in hal0-web/CONTENT_BRIEF.md until a real measurement lands.
  • Flux workflow. flux-schnell is catalogued, but the default workflow template can’t drive it. A Flux-specific workflow is the gating item before Flux is fully picker-grade.
  • License spread. The three curated entries each have a different license. The picker UI surfaces the badge so you pick consciously; check the bundled license_url field before shipping output anywhere production-facing.