
Slot as model

hal0’s OpenAI-compatible API accepts two kinds of identifiers in the model field:

  • A registry ref — e.g. "qwen2.5-0.5b-instruct-q4_k_m" — picks that exact model file.
  • A slot name — e.g. "primary" — picks whatever model is currently loaded in that slot.

Both routes go through the dispatcher. Slot names are stable; registry refs change every time you pull a newer quant. Most clients should address slots, not registry refs.

Imagine you’re a Python script that wants “the chat model”:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")

# Stable across model swaps:
resp = client.chat.completions.create(
    model="primary",
    messages=[{"role": "user", "content": "Hello!"}],
)

Tomorrow you swap primary from a 7B Qwen-Coder to a 70B Hermes-4 — the script still works. Address by slot, swap models independently of clients.

When you actually want a specific model — say you’re benchmarking two quants side-by-side — use the registry ref:

resp = client.chat.completions.create(
    model="qwen2.5-0.5b-instruct-q4_k_m",
    messages=[{"role": "user", "content": "Hello!"}],
)

The dispatcher resolves the ref, picks the slot that owns it, and proxies the request. If the model isn’t loaded in any slot, you get a structured model.not_loaded error.
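A minimal sketch of handling that error body on the client side. Only the `model.not_loaded` code comes from the docs above; the `message` text and the fallback choice are illustrative assumptions.

```python
# Sketch: reacting to the structured model.not_loaded error body.
# The "message" text here is an assumption; the docs only specify the code.
import json

body = json.loads(
    '{"error": {"code": "model.not_loaded", '
    '"message": "ref is not loaded in any slot"}}'
)
code = body["error"]["code"]
if code == "model.not_loaded":
    # Fall back to the stable slot alias instead of the exact registry ref.
    fallback_model = "primary"

print(code)  # model.not_loaded
```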

GET /v1/models returns both flavours so the client can discover either route:

{
  "object": "list",
  "data": [
    {"id": "primary", "object": "model", "owned_by": "hal0"},
    {"id": "embed", "object": "model", "owned_by": "hal0"},
    {"id": "stt", "object": "model", "owned_by": "hal0"},
    {"id": "tts", "object": "model", "owned_by": "hal0"},
    {"id": "qwen2.5-0.5b-instruct-q4_k_m", "object": "model", "owned_by": "hal0"}
  ]
}

OpenWebUI and most clients default to the first entry in the list. That’s why the slot aliases come first: they’re stable.
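A client can apply the same preference itself. This sketch parses the `/v1/models` payload shown above and prefers the stable `primary` alias over any registry ref; the fallback-to-first-entry rule is an assumption about what a reasonable client would do, not hal0 behavior.

```python
# Sketch: choosing a model id from GET /v1/models, preferring a stable
# slot alias over a registry ref. The payload mirrors the example above.
import json

payload = json.loads("""{
  "object": "list",
  "data": [
    {"id": "primary", "object": "model", "owned_by": "hal0"},
    {"id": "embed", "object": "model", "owned_by": "hal0"},
    {"id": "qwen2.5-0.5b-instruct-q4_k_m", "object": "model", "owned_by": "hal0"}
  ]
}""")

ids = [m["id"] for m in payload["data"]]
# Prefer the chat slot if the server exposes it; otherwise fall back to the first id.
chosen = "primary" if "primary" in ids else ids[0]
print(chosen)  # primary
```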

You sent → dispatcher resolves to:

  • "primary" → the model currently in the primary slot.
  • "embed" → the model currently in the embed slot.
  • "stt" → the model currently in the stt slot.
  • "tts" → the model currently in the tts slot.
  • A custom slot name → that slot’s current model.
  • A registry ref → the slot that currently owns that ref.
  • An external upstream ref → the upstream provider (OpenRouter, Anthropic, OpenAI, …).
  • Anything else → {"error": {"code": "model.not_found", ...}}
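The resolution order above can be sketched as a pure function. Everything here is illustrative: the slot table, the quant refs, and the "provider/model" prefix check for upstream refs are assumptions, not hal0’s actual internals.

```python
# Sketch of the dispatch order: slot name first, then registry ref,
# then external upstream ref; anything else is model.not_found.
SLOTS = {
    "primary": "qwen2.5-0.5b-instruct-q4_k_m",
    "embed": "example-embed-model-q8_0",  # hypothetical ref
}

def resolve(model: str):
    # 1. Slot name: the model currently loaded in that slot.
    if model in SLOTS:
        return ("slot", model, SLOTS[model])
    # 2. Registry ref: the slot that currently owns that ref.
    owners = {ref: slot for slot, ref in SLOTS.items()}
    if model in owners:
        return ("slot", owners[model], model)
    # 3. External upstream ref (assumed "provider/model" shape).
    if "/" in model:
        return ("upstream", model, model)
    # 4. Anything else: structured not-found error.
    return ("error", {"error": {"code": "model.not_found"}}, None)

print(resolve("primary")[0])                        # slot
print(resolve("qwen2.5-0.5b-instruct-q4_k_m")[1])   # primary
print(resolve("bogus")[0])                          # error
```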