
Image generation

hal0 exposes OpenAI-compatible image generation at POST /v1/images/generations, served by a ComfyUI provider running inside the img slot. hal0 owns the OpenAI ↔ ComfyUI translation — the upstream is treated as a black box that speaks POST /prompt, GET /history/<id>, and GET /view.

The route is implemented in src/hal0/api/routes/v1.py and the provider in src/hal0/providers/comfyui.py. Workflow templates live under src/hal0/providers/workflows/.
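The translation step can be sketched as follows. This is a hypothetical illustration, not hal0's actual workflow template: the node ids and template shape are invented, but the mechanism — fill a ComfyUI workflow graph from OpenAI-style request fields, then submit it as the body of POST /prompt — matches the description above.

```python
import copy

# Illustrative stand-in for a workflow template under
# src/hal0/providers/workflows/ — node ids and shape are assumptions.
SDXL_TURBO_TEMPLATE = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 0, "steps": 1, "cfg": 1.0}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": ""}},
}

def build_comfyui_prompt(prompt, size="1024x1024", seed=0, steps=1, cfg=1.0):
    """Map OpenAI-style request fields onto a ComfyUI workflow graph,
    producing the JSON body for POST /prompt."""
    wf = copy.deepcopy(SDXL_TURBO_TEMPLATE)
    width, height = (int(v) for v in size.split("x"))
    wf["6"]["inputs"]["text"] = prompt
    wf["3"]["inputs"].update(seed=seed, steps=steps, cfg=cfg)
    wf["5"]["inputs"].update(width=width, height=height)
    return {"prompt": wf}

payload = build_comfyui_prompt("a cat in a hat", seed=42)
```

The provider then polls GET /history/&lt;id&gt; until the graph finishes and fetches the rendered image via GET /view.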

  • Endpoint: POST /v1/images/generations
  • Slot: img — part of BUILTIN_SLOTS in src/hal0/slots/manager.py, same lifecycle as primary, embed, stt, tts.
  • Backend: ComfyUI inside the ghcr.io/hal0ai/hal0-toolbox-comfyui:v1 toolbox image (pinned by sha256 in hal0/manifest.json).
  • Hardware: ROCm-capable AMD GPU. Strix Halo’s iGPU is the reference target — the 128 GB unified pool keeps an SDXL Turbo checkpoint and a primary chat model warm at the same time.

The picker UI surfaces three curated entries spanning the licensing spectrum (see src/hal0/registry/curated.py):

| Id | Family | On-disk | Min VRAM | License |
| --- | --- | --- | --- | --- |
| `sdxl-turbo` | SDXL distilled | ~6.5 GB | 8 GB | SAI Non-Commercial Research Community |
| `sd-1.5-pruned-emaonly` | SD 1.5 | ~4.3 GB | 4 GB | CreativeML Open RAIL-M |
| `flux-schnell` | FLUX.1 [schnell] | ~23.8 GB | 24 GB | Apache-2.0 |
```sh
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sdxl-turbo",
    "prompt": "a cat in a hat, studio lighting",
    "size": "1024x1024",
    "response_format": "url"
  }'
```

Honoured body fields (subset of OpenAI):

  • model (required) — curated id, e.g. sdxl-turbo.
  • prompt (required).
  • n — batch size; default 1.
  • size — WxH string, e.g. 1024x1024.
  • response_format — "url" (default) or "b64_json".

hal0 extensions via extra_body: seed, steps, cfg, negative_prompt.
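A minimal Python sketch of the same request, using only the standard library. The extension fields (seed, steps, cfg, negative_prompt) are placed top-level in the JSON here — the OpenAI Python client's extra_body parameter merges its keys into the top-level body the same way; whether hal0 also accepts them nested is not assumed.

```python
import json
from urllib import request

def build_body(prompt, *, model="sdxl-turbo", size="1024x1024",
               response_format="url", **extras):
    """Assemble the request JSON; extra keywords are hal0 extensions."""
    body = {"model": model, "prompt": prompt,
            "size": size, "response_format": response_format}
    body.update(extras)
    return body

def generate(body, base_url="http://localhost:8080"):
    """POST to /v1/images/generations and decode the JSON response."""
    req = request.Request(base_url + "/v1/images/generations",
                          data=json.dumps(body).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

body = build_body("a cat in a hat, studio lighting",
                  seed=42, steps=4, cfg=1.5)
# resp = generate(body)  # run against a live hal0 instance
```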

A successful response:

```json
{
  "created": 1716000000,
  "data": [
    { "url": "/api/images/cache/<uuid>.png" }
  ],
  "_hal0": {
    "workflow": "sdxl_turbo_simple",
    "checkpoint": "sd_xl_turbo_1.0_fp16.safetensors"
  }
}
```

When response_format is b64_json, each data[] entry carries b64_json (base64-encoded PNG) instead of url. The _hal0 debug field carries the workflow translator’s metadata so a misrouted prompt is easy to diagnose.
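Decoding b64_json entries is straightforward. The helper below is a sketch; the stub payload stands in for a real response, where each entry carries a full base64-encoded PNG.

```python
import base64
from pathlib import Path

def save_images(response, out_dir="out"):
    """Write each b64_json entry of a generations response to disk
    and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, entry in enumerate(response["data"]):
        path = out / f"image_{i}.png"
        path.write_bytes(base64.b64decode(entry["b64_json"]))
        paths.append(path)
    return paths

# Stub payload for illustration (a real entry holds an entire PNG):
stub = {"created": 1716000000,
        "data": [{"b64_json": base64.b64encode(b"\x89PNG\r\n").decode()}]}
paths = save_images(stub)
```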

The v1 first-class target is a ROCm-capable AMD GPU — the Strix Halo Ryzen AI Max+ 395 iGPU specifically. SDXL Turbo runs comfortably alongside a small or mid chat model on a 128 GB unified pool; on a discrete AMD GPU you’ll want at least 8 GB of dedicated VRAM for SDXL Turbo, 4 GB for SD 1.5.

NVIDIA discrete GPUs are not yet wired — the ComfyUI provider defaults to the rocm backend. See the hardware overview for the full matrix and follow the upstream roadmap for CUDA support.

A minimal slots/img.toml is shaped like every other slot — the default backend is rocm:

```toml
# /etc/hal0/slots/img.toml
enabled = true
backend = "rocm"

[model]
default = "sdxl-turbo"
```

After a model lands in the curated catalogue’s per-id directory under /var/lib/hal0/comfyui/models/checkpoints/, start the slot:

```sh
hal0 slot load img --model sdxl-turbo
```

The OpenAI-shaped /v1/images/generations request will route there automatically; the dispatcher’s heuristics already pin /v1/images/* to the img slot.
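The dispatch heuristic amounts to prefix routing. This is a hypothetical sketch — the real logic lives in hal0's dispatcher and the non-image routes are assumptions inferred from the slot names (embed, stt, tts) listed above.

```python
# Hypothetical path -> slot table; only the /v1/images/* pin is
# documented, the rest are illustrative assumptions.
ROUTES = {
    "/v1/images/": "img",
    "/v1/embeddings": "embed",
    "/v1/audio/transcriptions": "stt",
    "/v1/audio/speech": "tts",
}

def pick_slot(path, default="primary"):
    """Return the slot a request path should be routed to."""
    for prefix, slot in ROUTES.items():
        if path.startswith(prefix):
            return slot
    return default
```

So /v1/images/generations lands on the img slot with no extra configuration, and unmatched paths fall through to the primary chat slot.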

  • First-pull is heavy. The ComfyUI toolbox image is the largest one hal0 ships — the CI build takes ~19 minutes and the layer set is sizeable. Expect a long first docker pull on a fresh box; subsequent restarts hit the local layer cache.
  • No perf claims yet. No verified seconds-per-image numbers are in the repo for ComfyUI on Strix Halo iGPU. Marked [TODO: verify] in hal0-web/CONTENT_BRIEF.md until a real measurement lands.
  • Flux workflow. flux-schnell is catalogued, but the default workflow template can’t drive it. A Flux-specific workflow is the gating item before Flux is fully picker-grade.
  • License spread. The three curated entries each have a different license. The picker UI surfaces the badge so you pick consciously; check the bundled license_url field before shipping output anywhere production-facing.