phase 13.5: cognition drift extension #24

Merged
nah414 merged 8 commits into main from phase-13-5-cognition-drift-extension
Jun 5, 2026

@nah414 nah414 commented Jun 5, 2026

Owner

Summary

Wires cognition substrate signals (classifier verdict distribution, confidence stats, cognition wobble disagreement, provider behavior, latency, disposition mix) into the existing phoenix/verification/drift_detector.py::MLStatisticalChecker via its already-shipped feature_provider callback seam. Per-Phoenix-version baseline with schema versioning. Two admin endpoints for capture/get. Auto-capture helper for refreshing baseline after N healthy cycles. Decision 17's three-checker aggregation rule preserved (no fourth checker).

What ships

  • New primitive: phoenix/verification/cognition_drift_features.py: CognitionDriftFeatures dataclass + CognitionFeatureProvider + _VECTOR_FIELDS ordered tuple (single source of truth for vector dimension; module-level assertion catches accidental drift).
  • Baseline storage: phoenix/verification/cognition_drift_baseline.py: CognitionDriftBaseline with per-Phoenix-version JSON storage + FEATURE_SCHEMA_VERSION=1 schema versioning + weighted-L2 distance computation.
  • Admin endpoints: phoenix/admin/cognition_drift_admin.py: POST /v1/admin/drift/cognition-baseline/capture + GET /v1/admin/drift/cognition-baseline. Auth chain mirrors Phase 13.x.7's encryption_admin pattern exactly. 11 granular per-error audit event types.
  • New permission: ActorPermissions.can_capture_drift_baseline (default deny; granted to bootstrap actors).
  • Extended MLStatisticalChecker: consumes the baseline via 3 new kwargs (cognition_baseline, phoenix_version, distance_threshold); PHOENIX_DRIFT_COGNITION_DISTANCE_THRESHOLD env-var overrides 0.5 default. Existing CheckerResult shape preserved; new reason tokens embedded into the existing summary field.
  • Daemon startup wiring: get_detector() builds the cognition provider + baseline + wires them into the ML checker. Graceful fallback to default checker list on wiring failures.
  • Auto-capture helper: maybe_auto_capture_baseline() refreshes baseline after N consecutive healthy cycles. Shipped as a callable; full integration into DriftDetector.run_cycle is a v1.1.x followup.

Privacy contract

Feature provider reads only aggregate fields (verdict, classification, cognition_provenance, cognition_disagreement_metric, prompt_disposition, axis). It does NOT access prompt_verbatim or prompt_encrypted payload fields. Whitelist enforced by _extract_aggregate_fields + pinned by a dedicated test_privacy_whitelist_contains_only_expected_fields test that asserts the literal whitelist against the approved frozenset.

NOT shipped (deferred follow-ups)

  • Per-provider drift attribution
  • Drift-triggered cognition rerouting (router consumes signal)
  • Full integration of maybe_auto_capture_baseline into DriftDetector.run_cycle (helper is shipped; auto-cycle wiring deferred to v1.1.x followup)
  • Replacing the Tier-1 checker or ml/drift_ensemble.py

Tests added

32 new tests across 6 test files:

  • tests/cognition/test_cognition_drift_features.py (10) — primitive + privacy
  • tests/cognition/test_cognition_drift_baseline.py (7) — storage + schema versioning + distance
  • tests/cognition/test_ml_checker_cognition.py (5) — ML checker integration
  • tests/cognition/test_drift_detector_auto_capture.py (3) — auto-capture helper
  • tests/integration/test_admin_cognition_drift_baseline.py (5) — endpoint integration
  • tests/unit/test_permissions_phase13_5.py (2) — permission flag

Project-wide pytest: 1348 passed, 43 skipped, 0 failures. mypy --strict clean on 5 source files. ruff check + format clean on all 11 touched files.

Spec / plan

  • Design: docs/superpowers/specs/2026-05-28-phase-13.5-cognition-drift-extension-design.md (a18c0be on main)
  • Plan: docs/superpowers/plans/2026-06-05-phase-13.5-cognition-drift-extension.md (2a28738 on main)

Test plan

  • Reviewer confirms pytest tests/cognition/test_cognition_drift_features.py tests/cognition/test_cognition_drift_baseline.py tests/cognition/test_ml_checker_cognition.py tests/cognition/test_drift_detector_auto_capture.py tests/integration/test_admin_cognition_drift_baseline.py tests/unit/test_permissions_phase13_5.py -v all green
  • Reviewer confirms mypy --strict clean on the 5 touched modules
  • Reviewer eyeballs the privacy whitelist + the _VECTOR_FIELDS ordering discipline
  • Reviewer confirms CHANGELOG entry under [1.1.0.dev0] below the existing 13.x.4 and 13.x.7 entries

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Summary by Sourcery

Integrate cognition drift signals into the drift detector by introducing a versioned cognition baseline, wiring it into the ML statistical checker, and exposing admin controls and helpers to manage and auto-capture baselines.

New Features:

  • Introduce cognition feature extraction from cognition ledger entries into a fixed 14-dimensional feature vector for use by the ML drift checker.
  • Add a versioned cognition drift baseline store with weighted-distance computation to detect drift relative to known-healthy behavior.
  • Expose admin APIs to capture and retrieve cognition drift baselines for the current Phoenix version.
  • Provide an auto-capture helper to refresh cognition baselines after a configurable number of healthy detector cycles.

Enhancements:

  • Extend the MLStatisticalChecker to optionally use cognition baselines with configurable distance thresholds while preserving legacy behavior when no baseline is configured.
  • Wire cognition feature collection and baseline loading into DriftDetector construction with graceful fallback when dependencies are unavailable.
  • Refine ML checker result summaries and metadata to include structured reason tokens and cognition distance metrics.

Documentation:

  • Document Phase 13.5 cognition drift extension and changelog entry, including privacy guarantees and non-shipped follow-ups.

Tests:

  • Add unit and integration tests covering cognition feature extraction, baseline storage and distance calculation, ML checker-baseline integration, auto-capture behavior, admin endpoints, and the new permission flag.

Chores:

  • Introduce a dedicated can_capture_drift_baseline permission with default deny semantics and bootstrap grants for admin-tier actors, and register the new admin module for import.

nah414 and others added 8 commits June 5, 2026 12:11
… tests

Adds POST /v1/admin/drift/cognition-baseline/capture and
GET /v1/admin/drift/cognition-baseline, both gated on the standard
admin auth chain (extract_or_bootstrap + verify_request +
require_admin) plus the dedicated can_capture_drift_baseline
permission flag (Phase 13.5 Task 3). Mirrors the
encryption_admin.py shape from Phase 13.x.7 -- same per-error
audit-event granularity, same dict[str, Any] return convention.

Baseline file path is overridable via
PHOENIX_COGNITION_DRIFT_BASELINE_PATH env var; default lives at
~/.phoenix/runtime/cognition_drift_baseline.json (via
CognitionDriftBaseline default).

5 integration tests cover: happy-path capture, alice-403,
insufficient-data-409, missing-baseline-404, get-after-capture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sourcery-ai Bot commented Jun 5, 2026

Reviewer's Guide

Phase 13.5 wires cognition drift signals into the existing MLStatisticalChecker using a new cognition feature provider and per-version baseline with weighted-L2 distance, adds admin endpoints and permissions to capture/read baselines, integrates wiring at DriftDetector startup with graceful fallback, and ships an auto-capture helper plus comprehensive tests and changelog updates.
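
For readers unfamiliar with the term, one common form of a weighted-L2 distance is sketched here. The PR does not spell out its formula, so the choice of dividing each difference by a per-feature scale is an assumption, as are the function name and the two example features.

```python
import math
from typing import Sequence


def weighted_l2_distance(
    current: Sequence[float],
    baseline: Sequence[float],
    scales: Sequence[float],
) -> float:
    """Compute sqrt(sum(((c_i - b_i) / s_i) ** 2)) over aligned vectors."""
    if not (len(current) == len(baseline) == len(scales)):
        raise ValueError("vectors and scales must share one dimension")
    if any(s <= 0 for s in scales):
        raise ValueError("scales must be positive")
    return math.sqrt(
        sum(((c - b) / s) ** 2 for c, b, s in zip(current, baseline, scales))
    )


# Two hypothetical features: a rate (scale 0.1) and a latency in ms (scale 100).
distance = weighted_l2_distance([0.33, 580.0], [0.30, 500.0], [0.1, 100.0])
```

Scaling first keeps a large-magnitude feature such as latency from swamping small-magnitude rates; here the result is roughly 0.85, which would exceed a 0.5 threshold.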

Sequence diagram for MLStatisticalChecker cognition baseline path

sequenceDiagram
    participant DriftDetector
    participant MLStatisticalChecker
    participant CognitionFeatureProvider
    participant CognitionDriftBaseline

    DriftDetector->>MLStatisticalChecker: run
    MLStatisticalChecker->>CognitionFeatureProvider: __call
    CognitionFeatureProvider-->>MLStatisticalChecker: np_ndarray_features
    MLStatisticalChecker->>CognitionDriftBaseline: read_baseline_for_version
    alt baseline_missing
        MLStatisticalChecker-->>DriftDetector: CheckerResult no_baseline
    else baseline_loaded
        MLStatisticalChecker->>CognitionDriftBaseline: compute_distance
        MLStatisticalChecker-->>DriftDetector: CheckerResult drifting_flag
    end

Sequence diagram for admin cognition drift baseline capture endpoint

sequenceDiagram
    participant AdminActor
    participant AdminAPI
    participant _admin_authn
    participant StateBackend
    participant CognitionFeatureProvider
    participant CognitionDriftBaseline

    AdminActor->>AdminAPI: POST /v1/admin/drift/cognition-baseline/capture
    AdminAPI->>_admin_authn: _admin_authn
    _admin_authn-->>AdminAPI: actor
    AdminAPI->>StateBackend: get_state_backend
    AdminAPI->>CognitionFeatureProvider: compute
    CognitionFeatureProvider-->>AdminAPI: CognitionDriftFeatures_or_None
    alt insufficient_data
        AdminAPI-->>AdminActor: HTTP 409
    else sufficient_data
        AdminAPI->>CognitionDriftBaseline: write_current
        AdminAPI-->>AdminActor: HTTP 200 baseline_summary
    end

File-Level Changes

Change: Extend MLStatisticalChecker and DriftDetector wiring to consume cognition drift baselines and support auto-capture.
Details:
  • Add optional cognition_baseline, phoenix_version, and distance_threshold kwargs to MLStatisticalChecker and derive threshold from PHOENIX_DRIFT_COGNITION_DISTANCE_THRESHOLD or default 0.5
  • Implement _run_with_baseline to read per-version baseline, reconstruct CognitionDriftFeatures from feature vector, compute weighted-L2 distance, and emit CheckerResult with new summary/metadata while preserving existing result shape
  • Update run() to branch between baseline path, legacy ML path, and explicit no-baseline states, tightening summary reason tokens
  • Wire CognitionFeatureProvider and CognitionDriftBaseline into MLStatisticalChecker in get_detector(), using PHOENIX_COGNITION_DRIFT_BASELINE_PATH and falling back to default checker list on wiring failures
  • Introduce maybe_auto_capture_baseline helper that, after N healthy cycles, reconstructs features from provider and writes a new baseline for the current Phoenix version
Files:
  • phoenix/verification/drift_detector.py

Change: Introduce cognition drift feature vector computation with strict privacy boundary and schema versioning.
Details:
  • Define CognitionDriftFeatures dataclass with 14 numeric metrics plus sample_size and an as_vector method
  • Define _VECTOR_FIELDS as the single ordered source of truth for vector layout, guarded by a module-level assertion against the dataclass fields
  • Implement CognitionFeatureProvider that queries recent cognition ledger entries from the state backend over a rolling window, enforces min_sample_size, and returns the feature vector
  • Enforce a privacy contract via _AGGREGATE_FIELDS_ALLOWED and _extract_aggregate_fields so only aggregate fields are read, with tests pinning the whitelist literal
Files:
  • phoenix/verification/cognition_drift_features.py
  • tests/cognition/test_cognition_drift_features.py

Change: Add per-version cognition drift baseline storage with schema versioning and weighted-L2 distance computation.
Details:
  • Implement CognitionDriftBaseline to persist CognitionDriftFeatures plus schema_version, phoenix_version, and captured_unix in a JSON file at a conventional runtime path or env override
  • Implement read_baseline_for_version that returns None on missing/mismatched versions, raises BaselineSchemaVersionMismatch on schema mismatch, and tolerates malformed files
  • Define per-feature drift scales and construct a weight vector aligned with _VECTOR_FIELDS, guarded by an assertion
  • Expose compute_distance to return a weighted-L2 distance between current and baseline feature vectors
Files:
  • phoenix/verification/cognition_drift_baseline.py
  • tests/cognition/test_cognition_drift_baseline.py

Change: Expose admin API surface for capturing and reading cognition drift baselines with dedicated permissioning and auditing.
Details:
  • Register new cognition drift admin module in phoenix.admin package init
  • Implement POST /v1/admin/drift/cognition-baseline/capture to run CognitionFeatureProvider, enforce min_sample_size, write baseline via CognitionDriftBaseline, and return version, captured_unix, and feature summary
  • Implement GET /v1/admin/drift/cognition-baseline to load the baseline for the running Phoenix version and handle not-found and schema-mismatch cases with appropriate HTTP codes
  • Implement shared _admin_authn that mirrors encryption_admin’s chain plus checks ActorPermissions.can_capture_drift_baseline and emits granular audit events for all error/success paths
  • Add integration tests that run endpoints against an isolated runtime with a real TestClient and synthetic cognition ledger data
Files:
  • phoenix/admin/__init__.py
  • phoenix/admin/cognition_drift_admin.py
  • tests/integration/test_admin_cognition_drift_baseline.py

Change: Extend permissions model and changelog to cover cognition drift baseline capture capability.
Details:
  • Add can_capture_drift_baseline flag to ActorPermissions with default False and document it alongside existing flags
  • Grant can_capture_drift_baseline to bootstrap admin actors in _default_permissions_for
  • Add unit tests pinning default-deny and explicit-grant behavior for the new permission
  • Document Phase 13.5 cognition drift extension, new files, permissions, privacy contract, auto-capture helper, and non-shipped items in CHANGELOG under 1.1.0.dev0
Files:
  • phoenix/safety/permissions.py
  • tests/unit/test_permissions_phase13_5.py
  • CHANGELOG.md

Change: Add tests exercising ML checker cognition integration and the auto-capture helper.
Details:
  • Test MLStatisticalChecker behavior for missing provider, provider returning None, missing baseline, distance below threshold, and distance above threshold using a temporary baseline file and synthetic feature vectors
  • Test maybe_auto_capture_baseline existence and behavior, including only capturing after the configured number of healthy cycles and never capturing when state is non-healthy
Files:
  • tests/cognition/test_ml_checker_cognition.py
  • tests/cognition/test_drift_detector_auto_capture.py
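
The auto-capture behavior those tests pin down (capture only after the configured number of healthy cycles, never on a non-healthy state) can be illustrated with a small counter. This is a sketch, not `maybe_auto_capture_baseline` itself: the class name, the `"healthy"` and `"degraded"` state strings, and the reset-after-capture rule are all assumptions.

```python
from typing import Callable


class AutoCaptureCounter:
    """Hypothetical stand-in for the streak state an auto-capture helper tracks."""

    def __init__(self, required_healthy_cycles: int) -> None:
        if required_healthy_cycles < 1:
            raise ValueError("required_healthy_cycles must be >= 1")
        self._required = required_healthy_cycles
        self._streak = 0

    def observe(self, state: str, capture: Callable[[], None]) -> bool:
        """Record one cycle; call capture() and return True on the Nth healthy one."""
        if state != "healthy":
            self._streak = 0  # any non-healthy cycle breaks the streak
            return False
        self._streak += 1
        if self._streak < self._required:
            return False
        capture()
        self._streak = 0  # assumed: start counting afresh after a capture
        return True


captured: list[str] = []
counter = AutoCaptureCounter(required_healthy_cycles=3)
cycles = ["healthy", "healthy", "degraded", "healthy", "healthy", "healthy"]
results = [counter.observe(s, lambda: captured.append("baseline")) for s in cycles]
```

In this run only the sixth cycle triggers a capture, because the degraded cycle resets the streak after the first two healthy ones.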

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • Both MLStatisticalChecker._run_with_baseline and maybe_auto_capture_baseline duplicate the _VECTOR_FIELDS → CognitionDriftFeatures reconstruction logic; consider extracting a small shared helper to keep the vector/dataclass mapping consistent in one place.
  • In capture_cognition_drift_baseline, the audit event reaches into CognitionFeatureProvider._min_sample_size; if you want to keep that attribute private, consider exposing a read-only property or constant instead of accessing the underscore-prefixed field directly.
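
Both suggestions are small enough to sketch together. The helper name `features_from_vector`, the two metric names, and the constructor signature of the provider are hypothetical; only `_VECTOR_FIELDS`, `CognitionDriftFeatures`, `CognitionFeatureProvider`, and `_min_sample_size` come from the review text.

```python
from dataclasses import dataclass
from typing import Sequence

_VECTOR_FIELDS = ("mean_confidence", "disagreement_rate")  # hypothetical names


@dataclass(frozen=True)
class CognitionDriftFeatures:
    mean_confidence: float
    disagreement_rate: float
    sample_size: int


def features_from_vector(
    vector: Sequence[float], sample_size: int
) -> CognitionDriftFeatures:
    """Hypothetical shared helper: the one place a vector is mapped back."""
    if len(vector) != len(_VECTOR_FIELDS):
        raise ValueError(f"expected {len(_VECTOR_FIELDS)} values, got {len(vector)}")
    named = {name: float(value) for name, value in zip(_VECTOR_FIELDS, vector)}
    return CognitionDriftFeatures(sample_size=sample_size, **named)


class CognitionFeatureProvider:
    def __init__(self, min_sample_size: int) -> None:
        self._min_sample_size = min_sample_size

    @property
    def min_sample_size(self) -> int:
        """Read-only view so callers need not touch the private attribute."""
        return self._min_sample_size
```

With a helper like this, the checker and the auto-capture path would both call one function, and the audit event could read `provider.min_sample_size` instead of the underscore-prefixed field.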
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Both `MLStatisticalChecker._run_with_baseline` and `maybe_auto_capture_baseline` duplicate the `_VECTOR_FIELDS` → `CognitionDriftFeatures` reconstruction logic; consider extracting a small shared helper to keep the vector/dataclass mapping consistent in one place.
- In `capture_cognition_drift_baseline`, the audit event reaches into `CognitionFeatureProvider._min_sample_size`; if you want to keep that attribute private, consider exposing a read-only property or constant instead of accessing the underscore-prefixed field directly.

## Individual Comments

### Comment 1
<location path="phoenix/admin/cognition_drift_admin.py" line_range="160" />
<code_context>
+    # the grant-prompt-verbatim sibling endpoint family is not currently
+    # wired for this flag (admin-tier construction is the only grant
+    # path in v1.1).
+    perms = get_permissions_registry().get(actor.name)
+    if not perms.can_capture_drift_baseline:
+        emit_admin_audit(
</code_context>
<issue_to_address>
**issue (bug_risk):** Handle missing permission records to avoid attribute access on None.

If `get_permissions_registry().get(actor.name)` returns `None`, `perms.can_capture_drift_baseline` will raise `AttributeError` and return a 500 instead of a 403. Please handle the `None` case explicitly (e.g., default to `ActorPermissions()` or deny with a 403) before accessing `can_capture_drift_baseline`, and adjust the audit event to match the chosen behavior.
</issue_to_address>

# the grant-prompt-verbatim sibling endpoint family is not currently
# wired for this flag (admin-tier construction is the only grant
# path in v1.1).
perms = get_permissions_registry().get(actor.name)

issue (bug_risk): Handle missing permission records to avoid attribute access on None.

If get_permissions_registry().get(actor.name) returns None, perms.can_capture_drift_baseline will raise AttributeError and return a 500 instead of a 403. Please handle the None case explicitly (e.g., default to ActorPermissions() or deny with a 403) before accessing can_capture_drift_baseline, and adjust the audit event to match the chosen behavior.
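
One way to apply this suggestion is to treat a missing record as default-deny. The sketch below uses a plain mapping in place of the real permissions registry and a hypothetical `Forbidden` exception in place of whatever the endpoint raises to produce a 403; audit emission is omitted.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ActorPermissions:
    can_capture_drift_baseline: bool = False  # default deny


class Forbidden(Exception):
    """Stand-in for the error the endpoint maps to HTTP 403."""

    status_code = 403


def require_capture_permission(
    registry: Mapping[str, ActorPermissions], actor_name: str
) -> ActorPermissions:
    perms: Optional[ActorPermissions] = registry.get(actor_name)
    if perms is None:
        # Missing record: fall back to default-deny instead of AttributeError.
        perms = ActorPermissions()
    if not perms.can_capture_drift_baseline:
        raise Forbidden(f"{actor_name} lacks can_capture_drift_baseline")
    return perms
```

Falling back to `ActorPermissions()` keeps a single denial path, so an unknown actor and a known actor without the flag produce the same 403 and can share one audit event type.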

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee2ef1d044

Comment on lines +202 to +203
rows = self._backend.list_ledger_entries(
since_unix=since_unix, limit=self._max_entries_read

P2: Sample the latest ledger rows for drift features

When the 24h window contains more than max_entries_read total ledger rows, StateBackend.list_ledger_entries returns the oldest rows first, and cognition filtering happens only after this capped read. In a busy deployment, recent cognition rows can sit past the first 10,000 rows, causing the provider to return insufficient_data or compare stale early-window features while current cognition drift is invisible; fetch enough rows to cover cognition entries or select the most recent rows before computing the vector.
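
The second remedy (filter first, then keep the most recent rows) is illustrated below on in-memory data. The `kind` and `unix` keys are hypothetical, and a real fix would more likely push the filter and ordering into the backend query than read everything into memory.

```python
import heapq
from typing import Any, Iterable, Mapping


def latest_cognition_rows(
    rows: Iterable[Mapping[str, Any]], limit: int
) -> list[Mapping[str, Any]]:
    """Filter to cognition rows first, then keep the `limit` newest, oldest-first."""
    cognition = (r for r in rows if r.get("kind") == "cognition")
    newest = heapq.nlargest(limit, cognition, key=lambda r: r["unix"])
    return sorted(newest, key=lambda r: r["unix"])


# Ten synthetic rows with timestamps 0..9; odd timestamps are cognition rows.
ledger = [{"kind": "cognition" if i % 2 else "other", "unix": i} for i in range(10)]
recent = latest_cognition_rows(ledger, limit=3)
```

Applying the cap after filtering means non-cognition traffic can no longer crowd the newest cognition entries out of the sample.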

# ----- Phase 13.x.7 extension (encryption admin) -----
can_rotate_encryption_key: bool = False
# ----- Phase 13.5 extension (cognition drift baseline) -----
can_capture_drift_baseline: bool = False

P2: Backfill the new permission for persisted bootstrap actors

For installations that already have actor_permissions.json entries for adam or ash, deserialization fills this newly added field from the dataclass default (False) instead of using the bootstrap default grant, so previously persisted bootstrap admins will get 403 from the new baseline endpoints after upgrade. Add a load-time migration/backfill for missing can_capture_drift_baseline on bootstrap/admin records so the documented bootstrap grant remains true across upgrades.
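
A load-time backfill along these lines might look as follows. The layout assumed for actor_permissions.json (a mapping from actor name to a flat flag dict) and the function name are guesses; the actor names adam and ash are the ones mentioned in the comment above.

```python
from typing import Any, Mapping

_BOOTSTRAP_ACTORS = frozenset({"adam", "ash"})  # names taken from the comment above
_NEW_FLAG = "can_capture_drift_baseline"


def backfill_permissions(
    raw: Mapping[str, Mapping[str, Any]]
) -> dict[str, dict[str, Any]]:
    """Return a copy in which records missing the new flag receive a default."""
    migrated: dict[str, dict[str, Any]] = {}
    for actor, record in raw.items():
        updated = dict(record)
        if _NEW_FLAG not in updated:
            # Bootstrap actors get the documented grant; everyone else stays denied.
            updated[_NEW_FLAG] = actor in _BOOTSTRAP_ACTORS
        migrated[actor] = updated
    return migrated
```

Only records that lack the key are touched, so an explicit False set by an operator after the upgrade is preserved rather than overwritten.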

@nah414 nah414 merged commit d49a49f into main Jun 5, 2026
10 checks passed