fix(provenance): domain-separated length-prefixed hash; full-field coverage#88

Merged
hyperpolymath merged 1 commit into main from fix/provenance-hash-domain-separation
May 14, 2026

@hyperpolymath
Owner

Summary

Lands all four V-L2-C* provenance-hash fixes together (they necessarily ship as one change):

API note: `compute_hash` signature changes (4 args → 7 args). It's `pub`, so this is technically breaking, but the old signature was wrong by construction and no external callers exist.

Closes

Test plan

  • Linux CI: `cargo clippy --all-targets -- -D warnings` is clean
  • Linux CI: `cargo test` passes (abi unit tests + `tests/integration_test.rs`)
  • Local: AppControl/WDAC blocks the `libsqlite3-sys` build on Windows, so no local run; Linux CI is the source of truth

The five new tests + the flipped wontfix assertion give a 4×7 mutation matrix covering every (entry, field) combination across the chain.
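
A minimal sketch of what such a mutation matrix looks like, assuming a toy `Entry` type rather than the crate's real `ProvenanceEntry`. The struct layout, the `mutate` helper, the `toy-prov` tag, and the use of the standard library's non-cryptographic `DefaultHasher` are all stand-ins chosen so the example has no dependencies. It builds a 4-entry chain, tampers with each of the 7 fields of each entry in turn, and counts how many tampered chains still verify.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy entry with the seven preimage fields named in this PR (types assumed).
#[derive(Clone)]
struct Entry {
    previous_hash: u64,
    entity_id: String,
    operation: String,
    actor: String,
    before_snapshot: String,
    transformation: String,
    timestamp_secs: i64,
    hash: u64,
}

impl Entry {
    // Stand-in hash: DefaultHasher is NOT cryptographic; illustration only.
    fn compute_hash(&self) -> u64 {
        let mut h = DefaultHasher::new();
        h.write(b"toy-prov\0"); // hypothetical domain tag for this sketch
        h.write(&self.previous_hash.to_le_bytes());
        for s in [&self.entity_id, &self.operation, &self.actor,
                  &self.before_snapshot, &self.transformation] {
            h.write(&(s.len() as u64).to_le_bytes());
            h.write(s.as_bytes());
        }
        h.write(&self.timestamp_secs.to_le_bytes());
        h.finish()
    }
}

// A chain verifies if every link matches and every stored hash recomputes.
fn verify(chain: &[Entry]) -> bool {
    let mut prev = 0u64;
    for e in chain {
        if e.previous_hash != prev || e.hash != e.compute_hash() {
            return false;
        }
        prev = e.hash;
    }
    true
}

fn build_chain(n: usize) -> Vec<Entry> {
    let mut chain = Vec::new();
    let mut prev = 0u64;
    for i in 0..n {
        let mut e = Entry {
            previous_hash: prev,
            entity_id: format!("entity-{}", i),
            operation: "update".to_string(),
            actor: "alice".to_string(),
            before_snapshot: format!("snap-{}", i),
            transformation: "normalize".to_string(),
            timestamp_secs: 1_700_000_000 + i as i64,
            hash: 0,
        };
        e.hash = e.compute_hash();
        prev = e.hash;
        chain.push(e);
    }
    chain
}

// Tamper with exactly one field, selected by index 0..7.
fn mutate(e: &mut Entry, field: usize) {
    match field {
        0 => e.previous_hash ^= 1,
        1 => e.entity_id.push('x'),
        2 => e.operation.push('x'),
        3 => e.actor.push('x'),
        4 => e.before_snapshot.push('x'),
        5 => e.transformation.push('x'),
        6 => e.timestamp_secs += 1,
        _ => unreachable!("only seven fields"),
    }
}

// Number of (entry, field) mutations that verify() fails to detect.
fn surviving_mutations(chain: &[Entry]) -> usize {
    let mut survivors = 0;
    for entry in 0..chain.len() {
        for field in 0..7 {
            let mut tampered = chain.to_vec();
            mutate(&mut tampered[entry], field);
            if verify(&tampered) {
                survivors += 1;
            }
        }
    }
    survivors
}
```

Calling `surviving_mutations(&build_chain(4))` exercises all 28 cells. A field that was left out of the preimage would show up as a non-zero survivor count.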

Closes #27, closes #28, closes #29, closes #30.

`ProvenanceEntry::compute_hash` hashed
`previous_hash + entity_id + operation + timestamp` as a raw byte
concatenation, which had three problems:

  1. Three fields — `actor`, `before_snapshot`, `transformation` —
     were *not* in the preimage. Tampering with any of them left
     `verify()` returning `true`. The integration test
     `test_provenance_chain_integrity_multi_step` *codified the hole*
     (#30 / V-L2-C4): "Actor is not part of hash — tamper to actor
     alone is invisible".
  2. No domain separation. Preimage fields were concatenated without
     length prefixes or a domain tag, so a future field reorder /
     addition / removal could silently produce a colliding digest
     for distinct inputs. There was no version tag either, so a
     migration to a different hash algorithm had no way to mark old
     vs new entries.
  3. Timestamp encoded as `to_rfc3339()`. Two valid RFC3339 strings
     for the same instant (sub-second precision, `Z` vs `+00:00`,
     etc.) produce two different hashes.
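
To make the second problem concrete, here is a small sketch that compares preimage bytes directly, so it holds for any hash function. The helper names `raw_concat` and `length_prefixed` are hypothetical and do not come from the crate. With raw concatenation, shifting a byte across a field boundary leaves the preimage unchanged, while a `u64` little-endian length prefix keeps the two inputs distinct.

```rust
/// Old-style preimage: fields concatenated as raw bytes, no framing.
fn raw_concat(fields: &[&str]) -> Vec<u8> {
    fields.iter().flat_map(|f| f.bytes()).collect()
}

/// Framed preimage: each field is preceded by its length as u64 little-endian.
fn length_prefixed(fields: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for f in fields {
        out.extend_from_slice(&(f.len() as u64).to_le_bytes());
        out.extend_from_slice(f.as_bytes());
    }
    out
}
```

For example, `raw_concat(&["ab", "c"])` and `raw_concat(&["a", "bc"])` are both the bytes of `"abc"`, whereas the `length_prefixed` outputs for the same two inputs differ in their first byte.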

This commit lands all four V-L2-C* fixes together because they
necessarily ship as one change:

  - **#27 / V-L2-C1** — new `compute_hash` signature accepts all
    seven preimage fields and prepends `b"verisim-prov-v1\0"` as
    the domain tag. Variable-length fields are length-prefixed with
    `u64_le(len)` for canonical encoding.
  - **#28 / V-L2-C2** — timestamp encoded as
    `i64_le(secs) || u32_le(nanos)` via `chrono::DateTime::timestamp()`
    + `.timestamp_subsec_nanos()`. Different RFC3339 strings for the
    same instant now produce the same hash. RFC3339 is kept for
    display surfaces (JSON, status output) but is no longer part of
    the hash preimage.
  - **#29 / V-L2-C3** — five new tests in `abi::tests`:
    `test_provenance_tamper_actor`,
    `test_provenance_tamper_before_snapshot`,
    `test_provenance_tamper_transformation`,
    `test_provenance_timestamp_canonical_encoding`,
    `test_provenance_mutation_matrix_breaks_verification`
    (4-entry chain × 7 fields; asserts that mutating any single
    field of any entry breaks `verify()`).
  - **#30 / V-L2-C4** — the wontfix assertion in
    `tests/integration_test.rs` is flipped to assert
    `!tampered.verify()` with the updated comment.
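
The encoding described in the first two bullets can be sketched roughly as follows. This builds only the preimage bytes and stops before hashing. The `Fields` struct, the field types, the field order, and the `put_var` helper are assumptions made for illustration; only the domain tag, the `u64_le(len)` prefix, and the `i64_le(secs) || u32_le(nanos)` timestamp layout come from the description above.

```rust
/// Domain tag from this PR, including the trailing NUL byte.
const DOMAIN_TAG: &[u8] = b"verisim-prov-v1\0";

/// Hypothetical field set: names follow the PR text, types are assumed.
#[derive(Clone, Copy)]
struct Fields<'a> {
    previous_hash: &'a [u8],
    entity_id: &'a str,
    operation: &'a str,
    actor: &'a str,
    before_snapshot: &'a [u8],
    transformation: &'a str,
    /// (seconds since the Unix epoch, sub-second nanoseconds)
    timestamp: (i64, u32),
}

/// Append a variable-length field as u64_le(len) followed by the bytes.
fn put_var(out: &mut Vec<u8>, bytes: &[u8]) {
    out.extend_from_slice(&(bytes.len() as u64).to_le_bytes());
    out.extend_from_slice(bytes);
}

/// Build the preimage: tag, framed fields, fixed-width timestamp.
/// The field order here is an assumption, not taken from the crate.
fn preimage(f: &Fields<'_>) -> Vec<u8> {
    let mut out = Vec::from(DOMAIN_TAG);
    put_var(&mut out, f.previous_hash);
    put_var(&mut out, f.entity_id.as_bytes());
    put_var(&mut out, f.operation.as_bytes());
    put_var(&mut out, f.actor.as_bytes());
    put_var(&mut out, f.before_snapshot);
    put_var(&mut out, f.transformation.as_bytes());
    // Fixed-width fields need no length prefix.
    out.extend_from_slice(&f.timestamp.0.to_le_bytes()); // i64_le(secs)
    out.extend_from_slice(&f.timestamp.1.to_le_bytes()); // u32_le(nanos)
    out
}
```

Because the timestamp enters as two fixed-width integers, any two textual renderings of the same instant map to the same 12 bytes, and the preimage passed to the hash function is identical.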

API note: `compute_hash` signature changed (4 args → 7 args). The
function is `pub`, so this is a breaking change. It's a security
fix and the previous signature was wrong by construction; the
callers inside the crate are updated. No external callers known.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hyperpolymath hyperpolymath merged commit 4ee90c8 into main May 14, 2026
1 of 18 checks passed
@hyperpolymath hyperpolymath deleted the fix/provenance-hash-domain-separation branch May 14, 2026 15:45