
fix(dispatcher): container slot preempts composite registry binding (cutover routing) #676

Merged
thinmintdev merged 1 commit into main from fix/dispatch-container-slot-preempts-composite
Jun 9, 2026

@thinmintdev

Contributor

Why

The final wiring gap surfaced by the #662 live cutover: container slots never actually receive traffic.

The model registry binds every registered id — including models served by a container slot — to the synthetic composite hal0 upstream (which forwards to lemonade). dispatch() step 1 (registry lookup) resolves there and returns before the container remote is considered. So hal0/chatqwopus3.6-27b-v2 → composite → lemonade → 404 (lemond can't load it), even though the chat container on :8102 advertises the id and is healthy. Confirmed live via /api/upstreams: composite hal0 and remote chat both advertise qwopus3.6-27b-v2; the composite wins.

Fix

Add step 0 to dispatch(): a loaded container remote (kind="remote" + slot_name) that advertises the requested model wins over the registry/composite binding — a running container slot is authoritative for its model.
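
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual dispatcher code: the `Upstream` dataclass, the `dispatch()` signature, the `registry` mapping shape, and the model id `lemonade-only-model` are all assumptions made for the example. Only `kind="remote"`, `slot_name`, the upstream names `hal0` and `chat`, and the id `qwopus3.6-27b-v2` come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Upstream:
    """Hypothetical stand-in for an upstream entry; field names are assumed."""
    name: str
    kind: str                          # "remote" or "composite"
    slot_name: Optional[str] = None    # set only for container-backed remotes
    models: set[str] = field(default_factory=set)  # ids the upstream advertises


def dispatch(model: str, upstreams: list[Upstream],
             registry: dict[str, str]) -> Optional[Upstream]:
    # Step 0: a container remote that advertises the model wins outright.
    for up in upstreams:
        if up.kind == "remote" and up.slot_name and model in up.models:
            return up
    # Step 1: otherwise fall back to the registry binding (model id -> upstream name).
    bound = registry.get(model)
    for up in upstreams:
        if up.name == bound:
            return up
    return None


# Both upstreams advertise the container-backed id; the registry binds it to hal0.
composite = Upstream("hal0", "composite",
                     models={"qwopus3.6-27b-v2", "lemonade-only-model"})
chat = Upstream("chat", "remote", slot_name="chat", models={"qwopus3.6-27b-v2"})
registry = {"qwopus3.6-27b-v2": "hal0", "lemonade-only-model": "hal0"}

container_hit = dispatch("qwopus3.6-27b-v2", [composite, chat], registry)
registry_hit = dispatch("lemonade-only-model", [composite, chat], registry)
print(container_hit.name, registry_hit.name)
```

In this sketch the container-backed id resolves to the `chat` remote even though the registry still binds it to `hal0`, while a model that no container advertises keeps its registry binding. That mirrors the two cases the tests in this PR are named after.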

Tests

TDD: test_container_slot_preempts_composite_registry_binding + test_non_container_model_still_uses_registry_binding. 304 passed; ruff clean.

Completes the container-runtime cutover routing. Relates to #652, #662. Stacked on #674 (devices/ctx) + #675 (alias).

🤖 Generated with Claude Code

Final cutover (#662) wiring gap. The model registry binds every registered
id — including models a container slot serves — to the synthetic composite
`hal0` upstream, which forwards to lemonade. dispatch() step 1 (registry)
short-circuited there, so hal0/* requests for a container-backed model (e.g.
qwopus3.6-27b-v2 served by the `chat` container on :8102) routed to lemonade
and 404'd, even though the container advertised the id and was healthy.

Add step 0: a loaded container remote (kind="remote" + slot_name) that
advertises the requested model wins over the registry/composite binding — a
running container slot is authoritative for its model. Only fires on a warm
cache hit for a container remote; lemonade-only models are untouched (added a
guard test). Down container → existing readiness gate returns slot.loading 503
rather than silently serving from lemonade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@thinmintdev merged commit b317d50 into main Jun 9, 2026
4 checks passed
@thinmintdev deleted the fix/dispatch-container-slot-preempts-composite branch June 9, 2026 02:02