fix: preflight managed dolt publication before startup store work #1763
Open
azanar wants to merge 13 commits into gastownhall:main from
* fix: add label fallback to polecat work query
  The polecat work query only checked metadata-based routing (gc.routed_to). Manual dispatch via `bd update --add-label pool:<pool>` sets a label instead, causing work to never be found. Now checks labels as a fallback after metadata.
  witness: salvage uncommitted work from orphaned polecat (sa-ml2, recovery #14)
* test: update work query expectations for pool label fallback
The maintenance pack qualifies the dog agent as "maintenance.dog" but all dispatch (deacon, witnesses, formulas) uses the short name "dog". EffectiveScaleCheck and EffectiveWorkQuery both derive their routing target from QualifiedName, so the controller saw zero demand and never spawned dogs. Add explicit scale_check and work_query overrides using "dog" as the routing key so the controller and work query both see the correct demand. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without work_query, gc hook always exits 1 (no work found) even when pool beads are ready. Polecats get the pool notification but cannot claim work. Root cause: gc hook sets BEADS_DIR to the rig's beads store, so pool lookups hit rig beads (which have no pool beads). Fix uses BEADS_DIR override to point at city-level beads for pool routing. GC_AGENT is set to the template name by gc hook, so pool:$GC_AGENT resolves to the correct pool label (e.g. pool:sazabi/polecat). Pattern matches the existing dog agent work_query. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add [[named_session]] for polecat (scope=rig, mode=on_demand) so each rig auto-spawns a polecat when work is routed to it. Mirrors refinery pattern. Fixes permanent problem of city-scoped polecats not seeing rig-local work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When --on creates a molecule, the work bead was routed but the molecule itself never received gc.routed_to metadata. This caused polecat's work query to find zero molecules, leaving step beads unclaimed. Route the molecule root after routing the work bead, ensuring both have gc.routed_to=<target> so workers can discover and claim the work. Fixes issue where gc sling <target> <bead> --on <formula> created a molecule that was invisible to workers.
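The routing behavior described in the commits above can be sketched in Go. This is a minimal illustration, not the code in this branch: `Bead`, `route`, `routedTo`, and `findWork` are hypothetical names, and the only details taken from the commit messages are the `gc.routed_to` metadata key and the `pool:<pool>` label form. It shows metadata checked first, the label used as a fallback, and the molecule root routed alongside the work bead.

```go
package main

import "fmt"

// Bead is a hypothetical, simplified stand-in for a work item.
type Bead struct {
	ID       string
	Metadata map[string]string
	Labels   []string
}

// route stamps gc.routed_to on every bead passed in, so a work bead and
// its molecule root can be routed together.
func route(target string, beads ...*Bead) {
	for _, b := range beads {
		if b.Metadata == nil {
			b.Metadata = map[string]string{}
		}
		b.Metadata["gc.routed_to"] = target
	}
}

// routedTo checks metadata-based routing first and falls back to the
// pool:<target> label set by manual dispatch.
func routedTo(b Bead, target string) bool {
	if v, ok := b.Metadata["gc.routed_to"]; ok && v == target {
		return true
	}
	for _, l := range b.Labels {
		if l == "pool:"+target {
			return true
		}
	}
	return false
}

// findWork returns the IDs of all beads addressed to target.
func findWork(beads []Bead, target string) []string {
	var ids []string
	for _, b := range beads {
		if routedTo(b, target) {
			ids = append(ids, b.ID)
		}
	}
	return ids
}

func main() {
	work := Bead{ID: "work-1"}
	molecule := Bead{ID: "mol-1"}
	manual := Bead{ID: "manual-1", Labels: []string{"pool:polecat"}}

	// Route both the work bead and the molecule root.
	route("polecat", &work, &molecule)

	all := []Bead{work, molecule, manual, {ID: "unrouted-1"}}
	fmt.Println(findWork(all, "polecat")) // [work-1 mol-1 manual-1]
}
```

If only `work` were routed, `mol-1` would drop out of the result, which is the invisible-molecule symptom the last commit describes.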
Summary
This adds a managed-Dolt publication preflight before early startup/reconcile paths touch bead-store-backed controller state.
The failure mode was that startup could reach store-dependent controller/runtime paths before managed Dolt runtime state had been published, which could surface as transient startup errors and missing-port behavior during restart.
What changed
* `CityRuntime` startup.
* `cmd/gc` package tests.
Why
Some startup paths were assuming managed Dolt publication had already happened. When that assumption was false, controller/store work could race ahead of runtime publication.
This change closes that ordering gap.
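The ordering guarantee can be sketched as an idempotent "ensure published" step that runs before any store-backed work. This is an illustrative sketch under assumptions, not the implementation in this PR: `cityRuntime`, `ensurePublished`, `storeWork`, `start`, and `tick` are hypothetical names, and the store work is reduced to a log entry.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotPublished = errors.New("managed dolt state not published")

// cityRuntime is a hypothetical stand-in for the real runtime type.
type cityRuntime struct {
	publish   func() error // assumed hook that publishes managed Dolt state
	published bool
	log       []string // records the order of operations for illustration
}

// ensurePublished is the preflight: it publishes at most once and is
// safe to call from every entry point.
func (r *cityRuntime) ensurePublished() error {
	if r.published {
		return nil
	}
	if r.publish == nil {
		return errNotPublished
	}
	if err := r.publish(); err != nil {
		return fmt.Errorf("publish managed dolt state: %w", err)
	}
	r.published = true
	r.log = append(r.log, "publish")
	return nil
}

// storeWork models any store-dependent controller path; it refuses to
// run before publication.
func (r *cityRuntime) storeWork() error {
	if !r.published {
		return errNotPublished
	}
	r.log = append(r.log, "store")
	return nil
}

// start and tick both run the preflight before touching the store.
func (r *cityRuntime) start() error {
	if err := r.ensurePublished(); err != nil {
		return err
	}
	return r.storeWork()
}

func (r *cityRuntime) tick() error {
	if err := r.ensurePublished(); err != nil {
		return err
	}
	return r.storeWork()
}

func main() {
	r := &cityRuntime{publish: func() error { return nil }}
	if err := r.start(); err != nil {
		fmt.Println("start:", err)
	}
	if err := r.tick(); err != nil {
		fmt.Println("tick:", err)
	}
	fmt.Println(r.log) // [publish store store]
}
```

Calling the preflight from each entry point, rather than once in a constructor, keeps the guarantee local to the code that needs it.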
Tests
Targeted tests passed:
go test ./cmd/gc -run 'Test(NewCityRuntimePreflightsManagedDoltPublicationBeforeStartupStoreWork|CityRuntimeEnsureManagedDoltPublishedForTick|EnsureBeadsProvider_execDoesNotReclassifyProviderAfterStart|InitBeadsForDir_execPassesCanonicalDoltDatabase|CityRuntimeTick_LogsWispGCPurgeCountWithNonFatalError)' -count=1
Validation notes
During PR prep I found a deterministic `cmd/gc` regression in the original test seam for this change. That regression is fixed in this branch by removing package-global test hook mutation and using per-runtime injection instead.
I would not claim a fully green repo-wide suite here. In this environment, baseline `main` and `upstream/main` are already noisy/red in broader `cmd/gc` coverage, so the confidence signal for this PR is the targeted regression coverage and the specific startup-ordering fix.
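For readers unfamiliar with the test-seam distinction in the validation notes, here is a minimal sketch of per-runtime injection. The names (`Runtime`, `NewRuntime`, `defaultPublish`) are hypothetical and do not reflect the actual types in `cmd/gc`. The idea is that each runtime carries its own hook, so a test can swap behavior without mutating shared package state.

```go
package main

import "fmt"

// A package-global seam would look like this (shown only for contrast):
//
//	var publishHook = defaultPublish // tests overwrite and must restore it
//
// Tests that mutate such a variable can interfere with each other.

// Runtime is a hypothetical runtime that owns its publish hook.
type Runtime struct {
	publish func() error
}

// defaultPublish stands in for the production behavior.
func defaultPublish() error { return nil }

// NewRuntime accepts an optional hook; nil selects the default.
func NewRuntime(publish func() error) *Runtime {
	if publish == nil {
		publish = defaultPublish
	}
	return &Runtime{publish: publish}
}

// Start invokes whichever hook this particular runtime was built with.
func (r *Runtime) Start() error { return r.publish() }

func main() {
	calls := 0
	// A test-style runtime with a counting hook.
	tested := NewRuntime(func() error { calls++; return nil })
	// A second runtime is unaffected by the first one's hook.
	other := NewRuntime(nil)

	_ = tested.Start()
	_ = other.Start()
	fmt.Println(calls) // 1
}
```

Because the hook lives on the instance, there is nothing to restore after a test finishes.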