feat(doctor,events,pgauth): postgres-auth check + pg.credential_resolved event#1796
Draft
quad341 wants to merge 10 commits intogastownhall:mainfrom
Draft
feat(doctor,events,pgauth): postgres-auth check + pg.credential_resolved event#1796quad341 wants to merge 10 commits intogastownhall:mainfrom
quad341 wants to merge 10 commits intogastownhall:mainfrom
Conversation
…tauth
Slice 2/4 of the PG-auth chain (parent design ga-dga2; design ga-3hay).
Adds the new internal/pgauth package that resolves a Postgres password
for a (scope, host, port, user) tuple via a deterministic 7-tier chain:
1. envMap["GC_POSTGRES_PASSWORD"]
2. envMap["BEADS_POSTGRES_PASSWORD"]
3. os.Getenv("GC_POSTGRES_PASSWORD")
4. <scope>/.beads/.env BEADS_POSTGRES_PASSWORD (chmod-checked)
5. os.Getenv("BEADS_POSTGRES_PASSWORD")
6. $BEADS_CREDENTIALS_FILE [host:port] section (chmod-checked, parse-checked)
7. ~/.config/beads/credentials [host:port] section (chmod-checked, parse-checked)
Public surface mirrors the design's pinned contract:
- Endpoint{Host,Port,User}, Resolved{User,Password,Source}, Source enum
with stable iota ordering (consumers may persist the int value).
- Source.String() returns the eight snake_case identifiers slice 4's
event payload and gc trace will read verbatim ("none",
"projected_gc", "projected_beads", "process_env_gc", "scope_file",
"process_env_beads", "credentials_file_env", "credentials_file_home").
- ResolveFromEnv(envMap, scopeRoot, endpoint) (Resolved, error). Pass
envMap == nil to skip tiers 1+2.
- ReadStoreLocalPassword(scopeRoot) — slice 4's doctor probe; applies
the same chmod predicate as tier 4.
Errors are typed for slice 4's errors.As discrimination:
- ErrNoPasswordResolvable (sentinel; chain exhaustion wraps with %w).
- *PermissivePermissionError on tiers 4/6/7 when mode & 0177 != 0.
STOPS the chain — does not fall through, since silent fall-through
would hide the operator's intent.
- *CredentialsParseError on tiers 6/7 for malformed credentials files.
Closed reason vocabulary: unterminated section header, empty section
header, missing '=' in key/value line. Unknown keys inside a matching
section are silently ignored per design §7.
Whitespace-only values count as empty (mirrors doltauth). Quoted
values in credentials files honour strconv.Unquote escapes.
Tests cover every tier (positive), exhaustion, NilEnv-skips-1+2, all
three chain-termination paths (perm scope file, perm credentials file,
malformed credentials file), per-reason parse-error variants, absent
file fall-through, non-matching section fall-through, whitespace
emptiness, quoted password, unknown-key tolerance, endpoint fields
echoed in error wording, Source.String() table, and stable iota
ordering. ReadStoreLocalPassword is exercised on happy/permissive/
absent/empty-scope paths.
CI guard TestNoOsGetenvOutsidePgauth walks cmd/+internal/ and asserts
no other package reads BEADS_POSTGRES_* or GC_POSTGRES_* via os.Getenv,
keeping resolution centralised before slices 3 and 4 wire it up.
Coverage: 88.4% of statements; remaining gap is Windows APPDATA branch
in DefaultCredentialsPath plus os.Stat / os.Open error paths that can't
be triggered after Stat passes — genuinely platform-conditional.
Bead: ga-vt6q
Source design: ga-3hay
Reference impl: internal/doltauth (mirror by structure)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 1/4 of the PG-auth design (ga-dga2). Foundation for slices 2-4 (pgauth, bd_env wiring, doctor checks). - Extend MetadataState with PostgresHost, PostgresPort, PostgresUser, PostgresDatabase plus JSON tags on all eight fields. Existing Dolt scopes round-trip with no diff (omitempty on every backend-specific field). - Add LoadMetadataState (fs, path) -> (state, ok, err) with the deterministic short-circuit specified by the design (E1 -> E3 -> E2 -> E4 -> E5). Validation failures are wrapped in *MetadataParseError so slice 4 can discriminate parse vs IO errors via errors.As. - Extend EnsureCanonicalMetadata with the four PG keys in defaults and a backend-discriminated cross-backend scrub (dolt drops postgres_*, postgres drops dolt_*, empty backend preserves both). The json.Unmarshal swallow at the canonicaliser stays untouched. - 14 fixtures under internal/beads/contract/testdata/metadata/ cover every valid and rejected case plus the three mixedBackendField selector branches. Tests cover loader idempotence, byte-equality round-trip on every valid_*.json, the sibling regression TestEnsureCanonicalMetadataPreservesExistingPostgresHostWhenStateOmitsIt, empty-object behaviour, and the *MetadataParseError typed-error contract. Bead: ga-pnqg Design: ga-0nmb Architecture: ga-dga2 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Misspell linter (revive/misspell) flags 'behaviour' as a US-spelling violation in CI Preflight. Single-character comment-only fix. Reviewer finding F1 on PR gastownhall#1727 (ga-cwq1). Bead: ga-pnqg Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 3/4 of the PG-auth chain. When a scope's MetadataState.Backend == "postgres", project the resolver's output into the bd subprocess env; Dolt-backed scopes flow through unchanged under the same dispatch arm. Wrapper-side wiring: - Rename applyCanonicalScopeDoltEnv -> applyCanonicalScopeBackendEnv; the Dolt body lives under case "", "dolt"; postgres takes a new arm calling applyResolvedScopePostgresEnv; default returns "unsupported backend %q for scope %s". - Add applyResolvedScopePostgresEnv (calls pgauth.ResolveFromEnv, projects six PG keys), mirrorBeadsPostgresEnv, and projectedPostgresEnvKeys. - Extend mergeRuntimeEnv strip list with all six PG keys so a stale parent BEADS_POSTGRES_* never leaks into a child. - bdCommandRunnerWithManagedRetry gains a PG branch BEFORE the bdTransportRetryableError check: PG transport errors are wrapped with "postgres at %s:%s is unreachable; gc does not manage external PG endpoints" and never trigger recoverManagedBDCommand. Tests: - TestApplyResolvedScopePostgresEnv_HappyPath / TestApplyResolvedScopePostgresEnv_NoPasswordResolvable - TestBdRuntimeEnvForRig_PostgresBackend - TestMergeRuntimeEnvScrubsPostgresKeys - TestPGTransportError_NoManagedRecovery / _MarkerCoverage - TestMixedBackendCity_PerScopeDispatch - TestProjectedKeysCoverage symmetry guard - TestApplyCanonicalScopeBackendEnv_UnsupportedBackend Refs: ga-wvka, design ga-4qvs, architecture ga-dga2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Validation-gate grep for applyCanonicalScopeDoltEnv must return zero hits in cmd/gc/. The doc comment retained the old name as a "Renamed from" historical marker, which both tripped the gate and violated the project's "no comments for removed functionality" rule. Refs: ga-wvka, design ga-4qvs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cleared by the deployer's gate FAIL on ga-wvka @ ea9af92: - cmd/gc/bd_env.go: Centralised → Centralized (misspell) - cmd/gc/bd_env_test.go: rename unused env param to _ (revive) - internal/pgauth/auth.go: add doc comment on Source const block (revive exported-const) + gofumpt re-aligns trailing comments - internal/pgauth/auth_test.go: drop unused homeDir/credPath returns from pinHome (unparam); all four call sites discard Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ved event (slice 4/4) Slice 4/4 of the PG-auth chain. Closes ga-yih2 (design ga-5c4x). Adds operator-facing observability for Postgres credential resolution: - gc doctor reports per-scope auth status across five branches (OK, fragile-from-env, no-creds, permissive-file, malformed-file) with verbatim wording from design §3.3. - gc doctor --explain-postgres-auth prints the seven-tier resolution table per scope with [YES]/[no]/[skip]/[ERR] tokens. Status right- aligned at column 70; footer line carries the Source identifier and position. Honest skip-after-winner semantics — tiers after the winner are [skip], never [no]. - pg.credential_resolved is registered as a typed events.Payload with six string fields (scope_kind, scope_name, source, host, port, user). Slice-3's applyResolvedScopePostgresEnv now emits the event after a successful resolve. Best-effort recorder; failures do NOT propagate. No password field anywhere. - TestPostgresEventOmitsPassword regression test asserts the payload, the full event envelope, and execenv.RedactText all keep the password literal off every observation surface. Negative-control sub-test guards against a regression that breaks emission entirely. Pinning notes: - Check lives at internal/doctor/postgres_auth.go (flat layout). Design §3.1 named internal/doctor/checks/postgres_auth.go but no checks/ sub-package exists in this codebase — every existing check is flat next to types.go. Following existing patterns per project rules. - The explain-table renderer wires through a new opt-in Renderer interface in internal/doctor/types.go (design §4.1 alternative b — smaller diff than threading a writer through CheckContext). - The §7 warm-up alert deliverable is deferred — no dolt-side warm-up mail-to-mayor path exists in cmd/gc/cmd_start.go today. Filed as ga-perl18 (needs-design routed to gascity/architect) per design §7.3. - Slice-3's applyResolvedScopePostgresEnv signature gains cityPath. The slice-3 forward-compat seam (no new parameters) was explicit about slice 4 lifting it; the helper needs cityPath to find the city's events.Recorder. Validation: - go test ./internal/doctor/ -run TestPostgresAuth -count=1 — PASS. - go test ./internal/pgauth/ -run TestPostgresEventOmitsPassword -count=1 — PASS. - go test ./internal/events/ -count=1 — PASS (registry coverage holds). - go test ./internal/api/ -count=1 — PASS (TypedEventEnvelopeUnionsCoverKnownEventTypes includes pg.credential_resolved; TestOpenAPISpecInSync clean). - go vet ./... — clean. - grep -rn '"password"' internal/pgauth/events.go — zero hits. - grep -n PostgresCredentialResolved internal/events/events.go — present in KnownEventTypes. - ZFC: no role names anywhere in the diff. Refs: ga-yih2, design ga-5c4x, architecture ga-dga2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-emptive cleanup before slice 4 reaches the deployer's gate. Same class of findings the deployer hit on slice 3 (ga-wvka): - internal/api/event_payloads.go: blank line before pgauth blank-import comment block (goimports) - internal/doctor/postgres_auth.go: anglicized → US spellings (unrecognised, recognise) + rename unused cityPath param to _ - internal/doctor/types.go: honours → honors - internal/pgauth/events_redact_test.go: localises → localizes - internal/doctor/postgres_auth_test.go: drop the redundant type-alias conversion in humanSourceLabelByIndex's switch + drop the always-"devpw" password param from writePGScopeEnv (4 callers) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three findings from gascity/reviewer on ga-j65i3c at HEAD 557093c: 1. cmd/gc/bd_env.go: emitPostgresCredentialResolved leaked the FileRecorder fd because Record() returned without closing the underlying *os.File. Add `defer rec.Close()` (best-effort, errcheck silenced) so emission stays a no-fail side effect on long-running processes (gc supervise, mayor sessions). 2. internal/doctor/postgres_auth_test.go: TestHumanSourceLabel was driving a parallel humanSourceLabelByIndex shim instead of the production humanSourceLabel(pgauth.Source). Drop the shim, import pgauth, drive the production function with the actual Source enum values. Adds a none_returns_empty case so the SourceNone branch is locked too. 3. internal/doctor/postgres_auth.go: drop the leading `_ string` placeholder from loadPostgresAuthScope's signature; both call sites were passing cityPath even though the body never read it. Validation gates re-run after the fixes: - go test ./internal/doctor/ -run TestPostgresAuth -count=1 — PASS - go test ./internal/doctor/ -run TestHumanSourceLabel -count=1 — PASS (8 sub-tests incl. the new SourceNone case) - go test ./internal/pgauth/ -count=1 — PASS - go test ./internal/events/ ./internal/api/ -count=1 — PASS - go test ./cmd/gc/ -run "TestBd|TestPG|TestMergeRuntimeEnv|TestApply|TestMixedBackend|TestProjectedKeysCoverage" -count=1 — PASS - go vet ./... — clean - golangci-lint run ./... — 0 issues - spec-ci no drift Pre-existing tree-state failures noted in the original packet (BeadsStoreCheck environment fixtures + frontend.env fixture) still fail on a clean origin/main checkout. Refs: ga-j65i3c (review verdict: request-changes), ga-yih2 (closed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verdict: PASS at HEAD c7fb142 on quad341:builder/ga-yih2-1. Slice 4/4 of PG-auth chain. Stacked on PR gastownhall#1792 (slices 1+2+3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this changes
Final slice (4/4) of the PG-auth chain. After the resolver and bd_env
wiring land in #1792, this slice makes the result observable:
gc doctorpostgres-auth check — for every PG-backed rig,attempts credential resolution and reports which source was used
(env / password file / pgpass / managed-store / unresolved). Status
branches: OK / WARNING / ERROR / SKIPPED / NOT_APPLICABLE.
pg.credential_resolvedtyped event — emitted bygc bdonceper resolution. Payload carries the source label and a timestamp; the
password is never in the event (locked by
TestPostgresEventOmitsPassword).humanSourceLabel— eightpgauth.Source.String()mappings forconsistent operator messaging across doctor + trace + events.
The result: an operator running
gc doctorlearns whether PG-backedrigs have resolvable credentials and which tier resolved them, and an
auditor reading
gc tracesees the chosen source on every bd subprocessspawn.
Review notes
humanSourceLabeltable is locked byTestHumanSourceLabel. Areviewer-noted test-smell about the test using a parallel
humanSourceLabelByIndexshim was addressed inc7fb142e— thetest now exercises production
humanSourceLabeldirectly.c7fb142e: the password-filereader's
os.Fileis closed in all paths.events.RegisterPayloadandsatisfies the
TestEveryKnownEventTypeHasRegisteredPayloadinvariant.
to its own design slice (
ga-perl18, routed to architect) — nodolt-side warm-up
Notify/SendMail/Mayorpath exists today, sothe alert wiring is meaningful only after that lands.
Test plan
go build ./...,go vet ./...,golangci-lint runclean (full repo).TestPostgresAuth*(10/10),TestHumanSourceLabel,TestPostgresEventOmitsPassword,internal/events(1.19s),internal/pgauth(114ms).origin/mainsucceeded with no conflicts.release-gates/ga-j65i3c-pg-auth-slice-4-gate.md🤖 Deployed by actual-factory (stacked on #1792)
Need help on this PR? Tag
@codesmithwith what you need.