Skip to content

feat(doctor,events,pgauth): postgres-auth check + pg.credential_resolved event#1796

Draft
quad341 wants to merge 10 commits intogastownhall:mainfrom
quad341:builder/ga-yih2-1
Draft

feat(doctor,events,pgauth): postgres-auth check + pg.credential_resolved event#1796
quad341 wants to merge 10 commits intogastownhall:mainfrom
quad341:builder/ga-yih2-1

Conversation

@quad341
Copy link
Copy Markdown
Collaborator

@quad341 quad341 commented May 7, 2026

What this changes

Final slice (4/4) of the PG-auth chain. After the resolver and bd_env
wiring land in #1792, this slice makes the result observable:

  • gc doctor postgres-auth check — for every PG-backed rig,
    attempts credential resolution and reports which source was used
    (env / password file / pgpass / managed-store / unresolved). Status
    branches: OK / WARNING / ERROR / SKIPPED / NOT_APPLICABLE.
  • pg.credential_resolved typed event — emitted by gc bd once
    per resolution. Payload carries the source label and a timestamp; the
    password is never in the event (locked by
    TestPostgresEventOmitsPassword).
  • humanSourceLabel — eight pgauth.Source.String() mappings for
    consistent operator messaging across doctor + trace + events.

The result: an operator running gc doctor learns whether PG-backed
rigs have resolvable credentials and which tier resolved them, and an
auditor reading gc trace sees the chosen source on every bd subprocess
spawn.

⚠️ Stacked on #1792 (PG-auth slices 1+2+3). Until #1792 merges,
this PR will show all nine PG-auth commits. The slice-4 delta is
three commits at the top: 65f1eb43 (main), 557093cf (lint
cleanup), c7fb142e (review fixup).

Review notes

  • The humanSourceLabel table is locked by TestHumanSourceLabel. A
    reviewer-noted test-smell about the test using a parallel
    humanSourceLabelByIndex shim was addressed in c7fb142e — the
    test now exercises production humanSourceLabel directly.
  • File-descriptor leak addressed in c7fb142e: the password-file
    reader's os.File is closed in all paths.
  • Event payload is registered via events.RegisterPayload and
    satisfies the TestEveryKnownEventTypeHasRegisteredPayload
    invariant.
  • Out-of-scope per design §7: warm-up alert (mail-to-mayor) deferred
    to its own design slice (ga-perl18, routed to architect) — no
    dolt-side warm-up Notify/SendMail/Mayor path exists today, so
    the alert wiring is meaningful only after that lands.

Test plan

  • go build ./..., go vet ./..., golangci-lint run clean (full repo).
  • Targeted suites pass:
    TestPostgresAuth* (10/10), TestHumanSourceLabel,
    TestPostgresEventOmitsPassword, internal/events (1.19s),
    internal/pgauth (114ms).
  • Test merge into current origin/main succeeded with no conflicts.
  • Release gate: release-gates/ga-j65i3c-pg-auth-slice-4-gate.md

🤖 Deployed by actual-factory (stacked on #1792)


View in Codesmith
Need help on this PR? Tag @codesmith with what you need.

  • Let Codesmith autofix CI failures and bot reviews

Claude Code (gascity/builder) and others added 10 commits May 6, 2026 18:05
…tauth

Slice 2/4 of the PG-auth chain (parent design ga-dga2; design ga-3hay).
Adds the new internal/pgauth package that resolves a Postgres password
for a (scope, host, port, user) tuple via a deterministic 7-tier chain:

  1. envMap["GC_POSTGRES_PASSWORD"]
  2. envMap["BEADS_POSTGRES_PASSWORD"]
  3. os.Getenv("GC_POSTGRES_PASSWORD")
  4. <scope>/.beads/.env BEADS_POSTGRES_PASSWORD (chmod-checked)
  5. os.Getenv("BEADS_POSTGRES_PASSWORD")
  6. $BEADS_CREDENTIALS_FILE [host:port] section (chmod-checked, parse-checked)
  7. ~/.config/beads/credentials [host:port] section (chmod-checked, parse-checked)

Public surface mirrors the design's pinned contract:
- Endpoint{Host,Port,User}, Resolved{User,Password,Source}, Source enum
  with stable iota ordering (consumers may persist the int value).
- Source.String() returns the eight snake_case identifiers slice 4's
  event payload and gc trace will read verbatim ("none",
  "projected_gc", "projected_beads", "process_env_gc", "scope_file",
  "process_env_beads", "credentials_file_env", "credentials_file_home").
- ResolveFromEnv(envMap, scopeRoot, endpoint) (Resolved, error). Pass
  envMap == nil to skip tiers 1+2.
- ReadStoreLocalPassword(scopeRoot) — slice 4's doctor probe; applies
  the same chmod predicate as tier 4.

Errors are typed for slice 4's errors.As discrimination:
- ErrNoPasswordResolvable (sentinel; chain exhaustion wraps with %w).
- *PermissivePermissionError on tiers 4/6/7 when mode & 0177 != 0.
  STOPS the chain — does not fall through, since silent fall-through
  would hide the operator's intent.
- *CredentialsParseError on tiers 6/7 for malformed credentials files.
  Closed reason vocabulary: unterminated section header, empty section
  header, missing '=' in key/value line. Unknown keys inside a matching
  section are silently ignored per design §7.

Whitespace-only values count as empty (mirrors doltauth). Quoted
values in credentials files honour strconv.Unquote escapes.

Tests cover every tier (positive), exhaustion, NilEnv-skips-1+2, all
three chain-termination paths (perm scope file, perm credentials file,
malformed credentials file), per-reason parse-error variants, absent
file fall-through, non-matching section fall-through, whitespace
emptiness, quoted password, unknown-key tolerance, endpoint fields
echoed in error wording, Source.String() table, and stable iota
ordering. ReadStoreLocalPassword is exercised on happy/permissive/
absent/empty-scope paths.

CI guard TestNoOsGetenvOutsidePgauth walks cmd/+internal/ and asserts
no other package reads BEADS_POSTGRES_* or GC_POSTGRES_* via os.Getenv,
keeping resolution centralised before slices 3 and 4 wire it up.

Coverage: 88.4% of statements; remaining gap is Windows APPDATA branch
in DefaultCredentialsPath plus os.Stat / os.Open error paths that can't
be triggered after Stat passes — genuinely platform-conditional.

Bead: ga-vt6q
Source design: ga-3hay
Reference impl: internal/doltauth (mirror by structure)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 1/4 of the PG-auth design (ga-dga2). Foundation for slices 2-4
(pgauth, bd_env wiring, doctor checks).

- Extend MetadataState with PostgresHost, PostgresPort, PostgresUser,
  PostgresDatabase plus JSON tags on all eight fields. Existing Dolt
  scopes round-trip with no diff (omitempty on every backend-specific
  field).
- Add LoadMetadataState (fs, path) -> (state, ok, err) with the
  deterministic short-circuit specified by the design (E1 -> E3 -> E2
  -> E4 -> E5). Validation failures are wrapped in *MetadataParseError
  so slice 4 can discriminate parse vs IO errors via errors.As.
- Extend EnsureCanonicalMetadata with the four PG keys in defaults and
  a backend-discriminated cross-backend scrub (dolt drops postgres_*,
  postgres drops dolt_*, empty backend preserves both). The
  json.Unmarshal swallow at the canonicaliser stays untouched.
- 14 fixtures under internal/beads/contract/testdata/metadata/ cover
  every valid and rejected case plus the three mixedBackendField
  selector branches.

Tests cover loader idempotence, byte-equality round-trip on every
valid_*.json, the sibling regression
TestEnsureCanonicalMetadataPreservesExistingPostgresHostWhenStateOmitsIt,
empty-object behaviour, and the *MetadataParseError typed-error
contract.

Bead: ga-pnqg
Design: ga-0nmb
Architecture: ga-dga2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Misspell linter (revive/misspell) flags 'behaviour' as a US-spelling
violation in CI Preflight. Single-character comment-only fix.

Reviewer finding F1 on PR gastownhall#1727 (ga-cwq1).

Bead: ga-pnqg

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 3/4 of the PG-auth chain. When a scope's
MetadataState.Backend == "postgres", project the resolver's output
into the bd subprocess env; Dolt-backed scopes flow through unchanged
under the same dispatch arm.

Wrapper-side wiring:
- Rename applyCanonicalScopeDoltEnv -> applyCanonicalScopeBackendEnv;
  the Dolt body lives under case "", "dolt"; postgres takes a new
  arm calling applyResolvedScopePostgresEnv; default returns
  "unsupported backend %q for scope %s".
- Add applyResolvedScopePostgresEnv (calls pgauth.ResolveFromEnv,
  projects six PG keys), mirrorBeadsPostgresEnv, and
  projectedPostgresEnvKeys.
- Extend mergeRuntimeEnv strip list with all six PG keys so a stale
  parent BEADS_POSTGRES_* never leaks into a child.
- bdCommandRunnerWithManagedRetry gains a PG branch BEFORE the
  bdTransportRetryableError check: PG transport errors are wrapped
  with "postgres at %s:%s is unreachable; gc does not manage external
  PG endpoints" and never trigger recoverManagedBDCommand.

Tests:
- TestApplyResolvedScopePostgresEnv_HappyPath /
  TestApplyResolvedScopePostgresEnv_NoPasswordResolvable
- TestBdRuntimeEnvForRig_PostgresBackend
- TestMergeRuntimeEnvScrubsPostgresKeys
- TestPGTransportError_NoManagedRecovery / _MarkerCoverage
- TestMixedBackendCity_PerScopeDispatch
- TestProjectedKeysCoverage symmetry guard
- TestApplyCanonicalScopeBackendEnv_UnsupportedBackend

Refs: ga-wvka, design ga-4qvs, architecture ga-dga2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Validation-gate grep for applyCanonicalScopeDoltEnv must return zero
hits in cmd/gc/. The doc comment retained the old name as a "Renamed
from" historical marker, which both tripped the gate and violated the
project's "no comments for removed functionality" rule.

Refs: ga-wvka, design ga-4qvs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cleared by the deployer's gate FAIL on ga-wvka @ ea9af92:

- cmd/gc/bd_env.go: Centralised → Centralized (misspell)
- cmd/gc/bd_env_test.go: rename unused env param to _ (revive)
- internal/pgauth/auth.go: add doc comment on Source const block
  (revive exported-const) + gofumpt re-aligns trailing comments
- internal/pgauth/auth_test.go: drop unused homeDir/credPath
  returns from pinHome (unparam); all four call sites discard

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ved event (slice 4/4)

Slice 4/4 of the PG-auth chain. Closes ga-yih2 (design ga-5c4x).

Adds operator-facing observability for Postgres credential resolution:

- gc doctor reports per-scope auth status across five branches (OK,
  fragile-from-env, no-creds, permissive-file, malformed-file) with
  verbatim wording from design §3.3.
- gc doctor --explain-postgres-auth prints the seven-tier resolution
  table per scope with [YES]/[no]/[skip]/[ERR] tokens. Status right-
  aligned at column 70; footer line carries the Source identifier and
  position. Honest skip-after-winner semantics — tiers after the
  winner are [skip], never [no].
- pg.credential_resolved is registered as a typed events.Payload with
  six string fields (scope_kind, scope_name, source, host, port, user).
  Slice-3's applyResolvedScopePostgresEnv now emits the event after a
  successful resolve. Best-effort recorder; failures do NOT propagate.
  No password field anywhere.
- TestPostgresEventOmitsPassword regression test asserts the payload,
  the full event envelope, and execenv.RedactText all keep the password
  literal off every observation surface. Negative-control sub-test
  guards against a regression that breaks emission entirely.

Pinning notes:

- Check lives at internal/doctor/postgres_auth.go (flat layout). Design
  §3.1 named internal/doctor/checks/postgres_auth.go but no checks/
  sub-package exists in this codebase — every existing check is flat
  next to types.go. Following existing patterns per project rules.
- The explain-table renderer wires through a new opt-in Renderer
  interface in internal/doctor/types.go (design §4.1 alternative b —
  smaller diff than threading a writer through CheckContext).
- The §7 warm-up alert deliverable is deferred — no dolt-side warm-up
  mail-to-mayor path exists in cmd/gc/cmd_start.go today. Filed as
  ga-perl18 (needs-design routed to gascity/architect) per design §7.3.
- Slice-3's applyResolvedScopePostgresEnv signature gains cityPath. The
  slice-3 forward-compat seam (no new parameters) was explicit about
  slice 4 lifting it; the helper needs cityPath to find the city's
  events.Recorder.

Validation:

- go test ./internal/doctor/ -run TestPostgresAuth -count=1 — PASS.
- go test ./internal/pgauth/ -run TestPostgresEventOmitsPassword -count=1 — PASS.
- go test ./internal/events/ -count=1 — PASS (registry coverage holds).
- go test ./internal/api/ -count=1 — PASS (TypedEventEnvelopeUnionsCoverKnownEventTypes
  includes pg.credential_resolved; TestOpenAPISpecInSync clean).
- go vet ./... — clean.
- grep -rn '"password"' internal/pgauth/events.go — zero hits.
- grep -n PostgresCredentialResolved internal/events/events.go — present
  in KnownEventTypes.
- ZFC: no role names anywhere in the diff.

Refs: ga-yih2, design ga-5c4x, architecture ga-dga2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-emptive cleanup before slice 4 reaches the deployer's gate.
Same class of findings the deployer hit on slice 3 (ga-wvka):

- internal/api/event_payloads.go: blank line before pgauth blank-import
  comment block (goimports)
- internal/doctor/postgres_auth.go: anglicized → US spellings
  (unrecognised, recognise) + rename unused cityPath param to _
- internal/doctor/types.go: honours → honors
- internal/pgauth/events_redact_test.go: localises → localizes
- internal/doctor/postgres_auth_test.go: drop the redundant
  type-alias conversion in humanSourceLabelByIndex's switch + drop
  the always-"devpw" password param from writePGScopeEnv (4 callers)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three findings from gascity/reviewer on ga-j65i3c at HEAD 557093c:

1. cmd/gc/bd_env.go: emitPostgresCredentialResolved leaked the
   FileRecorder fd because Record() returned without closing the
   underlying *os.File. Add `defer rec.Close()` (best-effort, errcheck
   silenced) so emission stays a no-fail side effect on long-running
   processes (gc supervise, mayor sessions).

2. internal/doctor/postgres_auth_test.go: TestHumanSourceLabel was
   driving a parallel humanSourceLabelByIndex shim instead of the
   production humanSourceLabel(pgauth.Source). Drop the shim, import
   pgauth, drive the production function with the actual Source enum
   values. Adds a none_returns_empty case so the SourceNone branch
   is locked too.

3. internal/doctor/postgres_auth.go: drop the leading `_ string`
   placeholder from loadPostgresAuthScope's signature; both call
   sites were passing cityPath even though the body never read it.

Validation gates re-run after the fixes:
- go test ./internal/doctor/ -run TestPostgresAuth -count=1 — PASS
- go test ./internal/doctor/ -run TestHumanSourceLabel -count=1 — PASS
  (8 sub-tests incl. the new SourceNone case)
- go test ./internal/pgauth/ -count=1 — PASS
- go test ./internal/events/ ./internal/api/ -count=1 — PASS
- go test ./cmd/gc/ -run "TestBd|TestPG|TestMergeRuntimeEnv|TestApply|TestMixedBackend|TestProjectedKeysCoverage" -count=1 — PASS
- go vet ./... — clean
- golangci-lint run ./... — 0 issues
- spec-ci no drift

Pre-existing tree-state failures noted in the original packet
(BeadsStoreCheck environment fixtures + frontend.env fixture) still
fail on a clean origin/main checkout.

Refs: ga-j65i3c (review verdict: request-changes), ga-yih2 (closed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verdict: PASS at HEAD c7fb142 on quad341:builder/ga-yih2-1.
Slice 4/4 of PG-auth chain. Stacked on PR gastownhall#1792 (slices 1+2+3).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@quad341 quad341 requested a review from julianknutsen as a code owner May 7, 2026 10:25
@github-actions github-actions Bot added the status/needs-triage Inbox — we haven't looked at it yet label May 7, 2026
@codecov-commenter
Copy link
Copy Markdown

@quad341 quad341 marked this pull request as draft May 7, 2026 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/needs-triage Inbox — we haven't looked at it yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants