Add unified CLI and handoff/SARIF support for pii-scan and verify #1
Merged
…erify
Closes the middle of the three-point GTM flow (scan → generate → verify) by
adding the missing bridge layers while keeping the public repo free of any
proprietary generation-API surface.
@certifieddata/cli (new)
- Unified top-level command: pii-scan, generate, verify, registry
- generate is web-handoff only: it never uploads datasets and never embeds
hosted-API details. It reads a sanitized handoff or a local file, builds a
continue-generation URL with aggregate counts only, and opens a browser.
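As a rough illustration of the "aggregate counts only" rule, a continue-generation URL could be assembled along these lines. This is a minimal sketch, not the package's actual code: the function name `continueGenerationUrl`, the `HandoffCounts` shape, the query parameter names, and the `example.invalid` base URL are all assumptions made for the example.

```typescript
// Hypothetical shape: the only four values the URL is allowed to carry.
interface HandoffCounts {
  riskLevel: "low" | "medium" | "high";
  findingCount: number;
  columnCount: number;
  rowCount: number;
}

// Build a deeplink that carries aggregate counts and a risk label only.
// No column names, file paths, or cell values are ever passed in, so none
// of them can leak into the URL.
function continueGenerationUrl(baseUrl: string, counts: HandoffCounts): string {
  const url = new URL(baseUrl);
  // Parameter names below are illustrative assumptions.
  url.searchParams.set("risk", counts.riskLevel);
  url.searchParams.set("findings", String(counts.findingCount));
  url.searchParams.set("columns", String(counts.columnCount));
  url.searchParams.set("rows", String(counts.rowCount));
  return url.toString();
}

// Usage with a placeholder base URL.
const exampleUrl = continueGenerationUrl("https://example.invalid/continue", {
  riskLevel: "high",
  findingCount: 3,
  columnCount: 12,
  rowCount: 5000,
});
console.log(exampleUrl);
```

Taking a narrow counts object as input, rather than the full scan result, keeps the privacy rule structural: the function has nothing sensitive to serialize.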
@certifieddata/pii-scan
- --emit-handoff / --output-handoff <path> — sanitized handoff JSON
(pii-scan.handoff.v1). No raw values, no redacted samples.
- --open-generate — opens the continue-generation URL; URL carries only
risk level, finding count, column count, row count.
- --sarif — SARIF 2.1.0 output for GitHub Code Scanning.
- Library exports: buildHandoff, handoffContinueUrl, buildSarif.
- Tests lock in the privacy rule: raw row values never appear in handoff
artifacts, SARIF logs, or deeplink URLs.
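For orientation, a SARIF 2.1.0 log for column-level findings might look roughly like the sketch below. It is not the output of the real buildSarif: the `SarifFinding` type, the `sketchSarif` name, the `pii/<kind>` rule IDs, and the risk-to-level mapping are assumptions. Only the top-level `version`, `runs`, `tool.driver`, and `results` layout comes from the SARIF format itself.

```typescript
type SarifRisk = "low" | "medium" | "high";

// Hypothetical per-column finding: aggregate information only, no cell values.
interface SarifFinding {
  column: string;
  kind: string; // e.g. "email"
  risk: SarifRisk;
  matchCount: number;
}

// Assumed mapping from risk label to SARIF result level.
const LEVEL: Record<SarifRisk, "note" | "warning" | "error"> = {
  low: "note",
  medium: "warning",
  high: "error",
};

function sketchSarif(findings: SarifFinding[], scannedFile: string) {
  // One rule per distinct PII kind.
  const ruleIds = Array.from(new Set(findings.map((f) => `pii/${f.kind}`)));
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: {
          driver: {
            name: "pii-scan",
            rules: ruleIds.map((id) => ({
              id,
              shortDescription: { text: `Possible PII of kind ${id.slice(4)}` },
            })),
          },
        },
        // Messages mention column names and counts, never sampled values.
        results: findings.map((f) => ({
          ruleId: `pii/${f.kind}`,
          level: LEVEL[f.risk],
          message: {
            text: `Column "${f.column}" matched ${f.kind} in ${f.matchCount} rows`,
          },
          locations: [
            { physicalLocation: { artifactLocation: { uri: scannedFile } } },
          ],
        })),
      },
    ],
  };
}

const sarifLog = sketchSarif(
  [
    { column: "email", kind: "email", risk: "high", matchCount: 42 },
    { column: "contact", kind: "email", risk: "low", matchCount: 3 },
  ],
  "customers.csv",
);
console.log(JSON.stringify(sarifLog, null, 2));
```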
@certifieddata/verify
- verifyManifestFile(path, opts) — offline manifest-file verification.
- verifyBundleDirectory(dir, opts) — offline unpacked-bundle verification.
- verifyBundleZip(zipPath, opts) — offline zipped-bundle verification.
- Minimal built-in STORE+DEFLATE zip reader (no new runtime deps).
- Fulfills the "if certifieddata.io goes away, the zip still verifies"
promise end-to-end.
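To make the "minimal built-in STORE+DEFLATE zip reader" idea concrete, here is a sketch of how such a reader can walk the central directory of an archive. It is an illustration under assumptions, not the code in this PR: the names `readZip`, `ZipEntry`, and `Inflate` are hypothetical, the DEFLATE step is injected by the caller, and ZIP64, encryption, and CRC checks are left out.

```typescript
// Caller-supplied raw-DEFLATE decompressor (hypothetical injection point).
type Inflate = (raw: Uint8Array) => Uint8Array;

interface ZipEntry {
  name: string;
  data: Uint8Array;
}

function readZip(bytes: Uint8Array, inflateRaw?: Inflate): ZipEntry[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Scan backwards for the end-of-central-directory record (min. 22 bytes).
  let eocd = -1;
  for (let i = bytes.length - 22; i >= 0; i--) {
    if (view.getUint32(i, true) === 0x06054b50) {
      eocd = i;
      break;
    }
  }
  if (eocd < 0) throw new Error("end-of-central-directory record not found");

  const count = view.getUint16(eocd + 10, true); // total entries
  let ptr = view.getUint32(eocd + 16, true); // central directory offset
  const decoder = new TextDecoder();
  const entries: ZipEntry[] = [];

  for (let n = 0; n < count; n++) {
    if (view.getUint32(ptr, true) !== 0x02014b50) {
      throw new Error("bad central directory header");
    }
    const method = view.getUint16(ptr + 10, true);
    const compressedSize = view.getUint32(ptr + 20, true);
    const nameLen = view.getUint16(ptr + 28, true);
    const extraLen = view.getUint16(ptr + 30, true);
    const commentLen = view.getUint16(ptr + 32, true);
    const localOffset = view.getUint32(ptr + 42, true);
    const name = decoder.decode(bytes.subarray(ptr + 46, ptr + 46 + nameLen));

    // The local header has its own name/extra lengths; data follows them.
    if (view.getUint32(localOffset, true) !== 0x04034b50) {
      throw new Error("bad local file header");
    }
    const start =
      localOffset + 30 +
      view.getUint16(localOffset + 26, true) +
      view.getUint16(localOffset + 28, true);
    const raw = bytes.subarray(start, start + compressedSize);

    let data: Uint8Array;
    if (method === 0) {
      data = raw; // STORE: bytes are kept as-is
    } else if (method === 8) {
      if (!inflateRaw) throw new Error("DEFLATE entry but no inflater given");
      data = inflateRaw(raw); // DEFLATE: delegate to the injected inflater
    } else {
      throw new Error(`unsupported compression method ${method}`);
    }
    entries.push({ name, data });
    ptr += 46 + nameLen + extraLen + commentLen;
  }
  return entries;
}
```

In Node, the injected function could be `(raw) => zlib.inflateRawSync(raw)` from the built-in `node:zlib` module, which keeps the reader free of third-party runtime dependencies.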
Docs
- docs/pricing.md — local vs. hosted boundary, evaluation-without-account.
- docs/compliance.md — SOC 2 / GDPR / HIPAA / CCPA / EU AI Act crosswalk.
Support-only framing, no certification claims.
- README hero flow updated to lead with the unified CLI three-step.
- llms.txt updated with the new CLI surface.
All builds clean, 93 tests passing (22 verify, 44 pii-scan, 21 schemas,
6 cli). Lint clean across the workspace.
https://claude.ai/code/session_01V7ARryoR769vHVQpnUFnqH
Summary
This PR introduces a unified command-line interface (@certifieddata/cli) and extends @certifieddata/pii-scan with sanitized handoff summaries and SARIF 2.1.0 output support. The changes enable a complete local-first workflow: detect PII → hand off to generation → verify certificates, all while maintaining strict privacy guarantees (no raw data transmission).

Key Changes

New Package: @certifieddata/cli
- Unified top-level command: pii-scan, generate, verify, registry
- pii-scan subcommand wraps @certifieddata/pii-scan with additional output formats
- generate subcommand implements a browser-based handoff workflow (no file uploads, only aggregate counts in URL params)
- verify subcommand wraps @certifieddata/verify for offline certificate validation

@certifieddata/pii-scan Extensions
- handoff.ts: Builds sanitized summary artifacts containing only aggregate counts, column names, and risk labels, never raw samples or values
  - buildHandoff() function creates a HandoffSummary with schema version pii-scan.handoff.v1
  - handoffContinueUrl() helper generates deeplinks with only counts and risk (no column names in URL)
- sarif.ts: Generates SARIF 2.1.0-compliant logs for GitHub Code Scanning integration
- cli.ts:
  - --emit-handoff flag to print sanitized handoff JSON
  - --output-handoff <path> to write handoff to disk
  - --open-generate to launch browser with continue URL
  - --sarif flag for SARIF output
  - --base-url override for handoff generation
- Library exports: buildHandoff, handoffContinueUrl, buildSarif

@certifieddata/verify Extensions
- bundle.ts: Offline verification of certificate bundles in three formats
  - verifyManifestFile(): Verify a manifest JSON against a PEM public key
  - verifyBundleDirectory(): Verify an unpacked bundle directory with auto-discovery of manifest and key files
  - verifyBundleZip(): Verify a zipped bundle with support for stored and deflate compression
- zip.ts: Custom implementation supporting STORE (0) and DEFLATE (8) compression

Documentation
- docs/compliance.md: Crosswalk showing how tooling supports SOC 2, GDPR, and HIPAA workflows
- docs/pricing.md: Clear delineation of what is local/free vs. hosted/account-required
- README.md with three-step workflow example
- llms.txt with CLI package description

Notable Implementation Details
- pii-scan and verify subcommands make zero network calls; generate opens a browser but transmits only …

https://claude.ai/code/session_01V7ARryoR769vHVQpnUFnqH
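The sanitized handoff described in the summary (aggregate counts, column names, and risk labels, never raw samples) could be sketched as follows. Only the schema string pii-scan.handoff.v1 comes from the PR text. The `ScanFinding` and `HandoffSketch` types, their field names, and the `sketchHandoff` function are hypothetical and do not reflect the real buildHandoff signature.

```typescript
type Risk = "low" | "medium" | "high";

// Hypothetical scanner output: note that it still carries raw samples.
interface ScanFinding {
  column: string;
  kind: string;
  risk: Risk;
  matchCount: number;
  samples: string[]; // raw cell values; must never reach the handoff
}

interface ColumnSummary {
  name: string;
  kinds: string[];
  risk: Risk;
}

interface HandoffSketch {
  schema: "pii-scan.handoff.v1";
  riskLevel: Risk;
  findingCount: number;
  columnCount: number;
  rowCount: number;
  columns: ColumnSummary[];
}

const ORDER: Risk[] = ["low", "medium", "high"];

function sketchHandoff(
  findings: ScanFinding[],
  columnCount: number,
  rowCount: number,
): HandoffSketch {
  const byColumn = new Map<string, ColumnSummary>();
  let overall: Risk = "low"; // assumption: an empty scan reports "low"
  for (const f of findings) {
    const entry: ColumnSummary =
      byColumn.get(f.column) ?? { name: f.column, kinds: [], risk: "low" };
    if (entry.kinds.indexOf(f.kind) === -1) entry.kinds.push(f.kind);
    // Keep the highest risk seen per column and overall.
    if (ORDER.indexOf(f.risk) > ORDER.indexOf(entry.risk)) entry.risk = f.risk;
    if (ORDER.indexOf(f.risk) > ORDER.indexOf(overall)) overall = f.risk;
    byColumn.set(f.column, entry);
  }
  // Fields are copied one by one; `samples` is deliberately never read.
  return {
    schema: "pii-scan.handoff.v1",
    riskLevel: overall,
    findingCount: findings.length,
    columnCount,
    rowCount,
    columns: Array.from(byColumn.values()),
  };
}

const handoffExample = sketchHandoff(
  [
    { column: "email", kind: "email", risk: "high", matchCount: 42, samples: ["alice@example.com"] },
    { column: "notes", kind: "phone", risk: "medium", matchCount: 5, samples: ["555-0100"] },
  ],
  12,
  5000,
);
console.log(JSON.stringify(handoffExample, null, 2));
```

Building the summary by explicit field copy, rather than deleting keys from the scan result, means a new raw-value field added to the scanner later cannot slip into the artifact by default.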