4 changes: 4 additions & 0 deletions .github/workflows/rust-ci.yml
@@ -11,6 +11,10 @@ on:
push:
branches: [main, master]

concurrency:
group: rust-ci-${{ github.ref }}
cancel-in-progress: true

permissions:
contents: read

27 changes: 0 additions & 27 deletions CODE_OF_CONDUCT.md

This file was deleted.

8 changes: 6 additions & 2 deletions Justfile
@@ -51,8 +51,12 @@ assail:
@command -v panic-attack >/dev/null 2>&1 && panic-attack assail . || echo "panic-attack not found — install from https://github.com/hyperpolymath/panic-attacker"

# --- Domain-Specific Recipes (verisimiser) ---

# Augment a database with VeriSimDB octad\naugment DB_URL:\n cargo run -- augment {{DB_URL}}\n\n# Check octad layer completeness\ncheck-octad DB_URL:\n cargo run -- check-octad {{DB_URL}}\n\n# Generate migration scripts\nmigrate DB_URL:\n cargo run -- migrate {{DB_URL}}
#
# (Reserved.) Recipes for clap subcommands like `augment`, `check-octad`,
# and `migrate` were removed per ADR-0003: they wrapped subcommands that
# don't exist in src/main.rs (the real subcommands are `init`, `generate`,
# `start`, `drift`, `provenance`, `history`, `status`, `octad`).
# Re-add wrappers here when their underlying subcommands ship.

# Run contractile checks
contractile-check:
255 changes: 147 additions & 108 deletions README.adoc

Large diffs are not rendered by default.

116 changes: 85 additions & 31 deletions ROADMAP.adoc
@@ -4,35 +4,78 @@
:toc:
:icons: font

The phases below are stated in terms of the *concerns* octad
(Data, Metadata, Provenance, Lineage, Constraints, AccessControl,
Temporal, Simulation) per `docs/decisions/ADR-0001-octad-ontology.adoc`.
Tier 2 *modalities* (graph, vector, tensor, semantic, document, spatial)
are independent overlay representations layered on top.

== Phase 0: Scaffold (COMPLETE)
* [x] RSR template with full CI/CD (17 workflows)
* [x] CLI with subcommands (init, start, drift, provenance, history, status, octad)
* [x] Manifest parser (verisimiser.toml with tier1/tier2 config)
* [x] Tier 1 data types (DriftReport, ProvenanceRecord, TemporalVersion)
* [x] ABI module stubs (Idris2 + Zig FFI)
* [x] README with two-tier architecture and honest framing

== Phase 1: PostgreSQL Tier 1 MVP

* [x] RSR template with full CI/CD
* [x] CLI with subcommands (init, generate, start, drift, provenance,
history, status, octad)
* [x] Manifest parser (`verisimiser.toml` with `[octad]` toggles +
legacy `tier1`/`tier2` back-compat)
* [x] ABI types in Rust (`src/abi/mod.rs`) + Idris2 declarations
(`src/interface/abi/`) + Zig FFI stubs (`src/interface/ffi/`)
* [x] Codegen for sidecar overlay schema and query interceptor SQL
* [x] README + ADRs covering octad ontology, verification tree,
Justfile recipes

== Phase 1: SQLite Tier 1 MVP

The shortest end-to-end loop: SQLite target, SQLite sidecar, provenance
+ temporal concerns. Sequencing follows the bottom-up issue plan in
`docs/decisions/` (forthcoming).

* [ ] SQLite interception via `sqlite3_update_hook`
* [ ] Provenance sidecar — write-path observer, SHA-256 hash chain
covering operation + actor + before-snapshot + transformation (not
just operation + ts)
* [ ] Temporal sidecar — version history, point-in-time read,
rollback, partial-unique-index enforcement of "exactly one current"
* [ ] Property tests for hash-chain integrity, version ordering, and
sidecar isolation (Tier 1 never writes to target)
* [ ] `verisimiser doctor` + `verisimiser validate` subcommands
* [ ] Structured logging (`tracing`), `--log-format=json|pretty`
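
The provenance item above asks for a hash chain that covers operation, actor, before-snapshot, and transformation rather than only operation and timestamp. A minimal sketch of that shape follows. The names `ProvenanceLink`, `append`, `verify`, and `GENESIS` are hypothetical and do not come from the repository, the chain is assumed to belong to a single entity, and `DefaultHasher` from the standard library stands in for SHA-256 only to keep the example dependency-free. It is not cryptographic and must not be used for real integrity guarantees.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed starting value for the first link's `previous_hash`.
pub const GENESIS: u64 = 0;

/// One link in a per-entity provenance chain (illustrative fields only).
#[derive(Debug, Clone)]
pub struct ProvenanceLink {
    pub entity_id: String,
    pub operation: String,
    pub actor: String,
    pub before_snapshot: String,
    pub transformation: String,
    pub previous_hash: u64,
    pub hash: u64,
}

/// Stand-in digest over every covered field plus the previous hash.
/// A real implementation would use SHA-256 here instead.
fn digest(link: &ProvenanceLink) -> u64 {
    let mut hasher = DefaultHasher::new();
    (
        &link.entity_id,
        &link.operation,
        &link.actor,
        &link.before_snapshot,
        &link.transformation,
        link.previous_hash,
    )
        .hash(&mut hasher);
    hasher.finish()
}

/// Append a new link whose `previous_hash` is the hash of the current tail.
pub fn append(
    chain: &mut Vec<ProvenanceLink>,
    entity_id: &str,
    operation: &str,
    actor: &str,
    before_snapshot: &str,
    transformation: &str,
) {
    let previous_hash = chain.last().map_or(GENESIS, |prev| prev.hash);
    let mut link = ProvenanceLink {
        entity_id: entity_id.to_string(),
        operation: operation.to_string(),
        actor: actor.to_string(),
        before_snapshot: before_snapshot.to_string(),
        transformation: transformation.to_string(),
        previous_hash,
        hash: 0,
    };
    link.hash = digest(&link);
    chain.push(link);
}

/// Recompute every digest and check that each link points at its predecessor.
pub fn verify(chain: &[ProvenanceLink]) -> bool {
    let mut expected_previous = GENESIS;
    for link in chain {
        if link.previous_hash != expected_previous || link.hash != digest(link) {
            return false;
        }
        expected_previous = link.hash;
    }
    true
}
```

Because the actor, before-snapshot, and transformation all feed the digest, editing any of them in an earlier link makes `verify` return `false` for the whole chain, which is the property the "not just operation + ts" wording is after.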

== Phase 2: PostgreSQL Tier 1

* [ ] PostgreSQL logical replication interception
* [ ] Provenance sidecar (SQLite) — write-path observer
* [ ] SHA-256 hash-chain integrity for provenance records
* [ ] Temporal versioning sidecar — point-in-time queries
* [ ] Cross-modal drift detection — read-path observer
* [ ] Drift index with 8-category classification
* [ ] Idris2 ABI proofs: sidecar isolation, hash-chain integrity, version ordering
* [ ] Zig FFI bridge: database connection, overlay operations, VCL-total queries
* [ ] End-to-end test: PostgreSQL -> verisimiser overlay -> VCL-total query

== Phase 2: Multi-Backend Support
* [ ] SQLite interception via sqlite3_update_hook / WAL monitoring
* [ ] Provenance + temporal sidecars against PG target
* [ ] Idris2 ABI proofs: sidecar isolation, hash-chain integrity,
version ordering
* [ ] Zig FFI bridge: database connection, overlay operations

== Phase 3: Multi-Backend Support

* [ ] MongoDB interception via change streams
* [ ] Redis interception via keyspace notifications
* [ ] MySQL interception via binlog CDC
* [ ] Application-level middleware / ORM hooks
* [ ] Backend-agnostic interception trait abstraction
* [ ] Per-backend integration tests
* [ ] MySQL (binlog CDC) / Redis (keyspace notifications) — only if
there is real demand; the manifest enum currently excludes them.

== Phase 4: Constraints / Drift Detection

* [ ] Per-category drift definition (one ADR per category)
* [ ] First implemented category: Temporal drift (version skew —
cheapest to define and observe)
* [ ] Drift index storage + query API
* [ ] `verisimiser drift` subcommand wired to real measurements

== Phase 5: AccessControl + Lineage

* [ ] AccessControl model ADR: principals, role composition, deny vs
allow precedence, view interaction
* [ ] Typed policy condition language (replace free-form SQL TEXT)
* [ ] Lineage DAG enforcement: self-edge CHECK + cycle prevention
ADR
* [ ] Lineage traversal subcommand (upstream/downstream)

== Phase 6: Tier 2 Modality Overlays

== Phase 3: Tier 2 Overlays
* [ ] Graph overlay (RDF triples / property graph edges)
* [ ] Vector overlay (HNSW embedding similarity search)
* [ ] Tensor overlay (ndarray multi-dimensional numeric data)
@@ -41,26 +84,37 @@
* [ ] Spatial overlay (R-tree geospatial coordinates)
* [ ] Independent enable/disable per overlay via manifest

== Phase 4: VCL-total Integration
== Phase 7: Simulation

* [ ] Branching semantics ADR (isolation, merge policy, conflict
resolution)
* [ ] FK enforcement on `simulation_branches.parent_branch`
* [ ] `verisimiser simulate` subcommand

== Phase 8: VCL-total Integration

* [ ] VCL-total type-safe query parsing
* [ ] Cross-tier queries (Tier 1 + Tier 2 in single query)
* [ ] Cross-concern queries (Tier 1 + Tier 2 in single query)
* [ ] TypedQLiser integration for compile-time query validation
* [ ] Query planner for multi-sidecar operations
* [ ] Performance benchmarks: overhead of augmentation layer

== Phase 5: Production Hardening
* [ ] Retention policies (auto-prune temporal history)
== Phase 9: Production Hardening

* [ ] Retention policies (`[retention]` section in manifest)
* [ ] Sidecar compaction and garbage collection
* [ ] Concurrent access safety (multi-writer provenance chains)
* [ ] Concurrent access safety (multi-writer provenance chains —
per-entity serialisation + UNIQUE(entity_id, previous_hash))
* [ ] Backup and restore for sidecars
* [ ] Monitoring and alerting integration
* [ ] Error recovery and graceful degradation
* [ ] Shell completions (bash, zsh, fish)

== Phase 6: Ecosystem
* [ ] PanLL panel for drift monitoring dashboard
== Phase 10: Ecosystem

* [ ] BoJ-server cartridge (MCP integration)
* [ ] SqueakWell integration (database recovery via cross-modal constraint propagation)
* [ ] Migration tooling: Tier 1 -> Tier 2 -> full VeriSimDB
* [ ] SqueakWell integration (database recovery via cross-concern
constraint propagation)
* [ ] Migration tooling: Tier 1 → Tier 2 → full VeriSimDB
* [ ] Publish to crates.io
* [ ] Chainguard container image
21 changes: 0 additions & 21 deletions SECURITY.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/architecture/TOPOLOGY.md
@@ -12,7 +12,7 @@ verisimiser/
│ ├── src/manifest/ — TOML manifest parsing (verisimiser.toml)
│ ├── src/tier1/ — Tier 1 piggyback data types
│ │ ├── drift.rs — DriftReport, DriftCategory (8 categories)
│ │ ├── provenance.rs — ProvenanceRecord, SHA-256 hash chain
│ │ ├── provenance.rs — re-exports abi::ProvenanceEntry; future write-path helpers (V-L1-C1)
│ │ └── temporal.rs — TemporalVersion, point-in-time snapshots
│ ├── src/tier2/ — Tier 2 overlay stubs (graph, vector, tensor, semantic, document, spatial)
│ ├── src/intercept/ — Per-backend interception strategies
102 changes: 102 additions & 0 deletions docs/decisions/ADR-0001-octad-ontology.adoc
@@ -0,0 +1,102 @@
// SPDX-License-Identifier: PMPL-1.0-or-later
// Copyright (c) 2026 Jonathan D.A. Jewell (hyperpolymath) <j.d.a.jewell@open.ac.uk>
= ADR-0001: Canonical octad ontology — concerns, not modalities
:revdate: 2026-05-13
:status: Accepted

== Status

Accepted — 2026-05-13.

Resolves: https://github.com/hyperpolymath/verisimiser/issues/19[V-L1-A1].

Closes as wontfix: https://github.com/hyperpolymath/verisimiser/issues/21[V-L1-A3] (the modalities-first refactor).

Unblocks: https://github.com/hyperpolymath/verisimiser/issues/20[V-L1-A2] (the README rewrite).

== Context

Two competing ontologies have lived in this repository under the same name
("octad"):

Modalities octad (README §"VeriSimDB's Octad: Eight Modalities")::
Graph · Vector · Tensor · Semantic · Document · Temporal · Provenance · Spatial.
These are *representations* of an entity — how the same data is stored in
different shapes. The README's eight cross-modal drift categories
(structural, semantic, temporal, statistical, referential, provenance,
spatial, embedding) presuppose this ontology.

Concerns octad (`src/abi/mod.rs::OctadDimension`, `src/manifest/mod.rs::OctadConfig`, `src/main.rs::print_octad`)::
Data · Metadata · Provenance · Lineage · Constraints · AccessControl ·
Temporal · Simulation. These are *concerns/aspects* of data — what you
want to know or enforce about it.

The code commits to the concerns octad; the README leads with the modalities
octad. A user cannot answer "what does an octad-augmented entity look like?"
without picking one.

== Decision

The *concerns* octad is canonical.

The eight dimensions of the verisimiser octad are:

. **Data** — the original entity as stored in the target database.
. **Metadata** — schema and type information.
. **Provenance** — SHA-256 hash-chain tracking of who did what and when.
. **Lineage** — directed-edge graph of data derivation (schematically a DAG; see ADR-0004 when written).
. **Constraints** — cross-dimensional invariant enforcement, including
drift detection between Data + Metadata + active overlays.
. **AccessControl** — policy-based row/column-level access permissions.
. **Temporal** — version history with point-in-time queries and rollback.
. **Simulation** — what-if branching and sandbox query execution.

Modalities (Graph, Vector, Tensor, Semantic, Document, Spatial) are
*Tier 2 overlays* — independent representational projections that a user
can enable per-entity for similarity search, full-text search, geospatial
indexing, etc. They are not "the octad" and not co-equal with the eight
concerns. Provenance and Temporal in the modalities list collapse onto
the same-named concerns.
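
As an illustration of the split described above, the following sketch models the eight concerns and the six Tier 2 overlays as separate Rust enums. The variant lists are taken from this ADR, but the actual definition of `OctadDimension` in `src/abi/mod.rs` is not reproduced here and may differ. `Tier2Overlay`, `Resolved`, and `resolve_legacy_modality` are hypothetical names introduced only for this example.

```rust
/// The eight concerns, in the order this ADR lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum OctadDimension {
    Data,
    Metadata,
    Provenance,
    Lineage,
    Constraints,
    AccessControl,
    Temporal,
    Simulation,
}

impl OctadDimension {
    /// Every concern, for iteration or completeness checks.
    pub const ALL: [OctadDimension; 8] = [
        OctadDimension::Data,
        OctadDimension::Metadata,
        OctadDimension::Provenance,
        OctadDimension::Lineage,
        OctadDimension::Constraints,
        OctadDimension::AccessControl,
        OctadDimension::Temporal,
        OctadDimension::Simulation,
    ];
}

/// Tier 2 overlays (hypothetical type name): representations, not concerns.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Tier2Overlay {
    Graph,
    Vector,
    Tensor,
    Semantic,
    Document,
    Spatial,
}

/// Where a name from the legacy modalities list ends up under this ADR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Resolved {
    Concern(OctadDimension),
    Overlay(Tier2Overlay),
}

/// Map a legacy modality name onto the concerns ontology.
/// Provenance and Temporal collapse onto the same-named concerns;
/// the other six become Tier 2 overlays. Unknown names yield `None`.
pub fn resolve_legacy_modality(name: &str) -> Option<Resolved> {
    match name.to_ascii_lowercase().as_str() {
        "provenance" => Some(Resolved::Concern(OctadDimension::Provenance)),
        "temporal" => Some(Resolved::Concern(OctadDimension::Temporal)),
        "graph" => Some(Resolved::Overlay(Tier2Overlay::Graph)),
        "vector" => Some(Resolved::Overlay(Tier2Overlay::Vector)),
        "tensor" => Some(Resolved::Overlay(Tier2Overlay::Tensor)),
        "semantic" => Some(Resolved::Overlay(Tier2Overlay::Semantic)),
        "document" => Some(Resolved::Overlay(Tier2Overlay::Document)),
        "spatial" => Some(Resolved::Overlay(Tier2Overlay::Spatial)),
        _ => None,
    }
}
```

Keeping the two enums distinct makes the ADR's point at the type level: an overlay can never be passed where a concern is expected, so the two ontologies cannot be confused under one name again.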

The eight "cross-modal drift categories" become *symptoms observed by
the Constraints concern* when Data, Metadata, and the active overlays
disagree:

[cols="1,2"]
|===
| Drift category (legacy framing) | Where it lives in the concerns ontology

| Structural | Constraints (Data vs Metadata schema agreement)
| Semantic | Constraints across overlays
| Temporal | Constraints between Temporal versions across overlays
| Statistical | Constraints over Tier 2 vector/tensor overlay drift
| Referential | Constraints between Tier 2 graph overlay and Data
| Provenance | Constraints over Provenance chain integrity
| Spatial | Constraints over Tier 2 spatial overlay
| Embedding | Constraints between Tier 2 vector overlay and source documents
|===

== Consequences

. README and ROADMAP must be rewritten to drop the modalities octad table
and reframe the drift categories under Constraints. Tracked as V-L1-A2.
. The modalities-first refactor (V-L1-A3) is *not* done. Closed as wontfix.
. Tier 2 design (V-L1-F, V-L1-G, V-L1-H) continues to use modality terms
for overlays — they are just no longer presented as "the octad."
. `OctadDimension` enum and `OctadConfig` fields are stable; no source-level
rename triggered by this ADR.

== Alternatives considered

Modalities octad as canonical::
Rejected. Would have required rewriting `abi`, `manifest`, `codegen`, tests,
and example manifests — a multi-week change for pre-alpha framing. The
codebase had already converged on the concerns ontology; the only cost of
keeping it is doc updates.

"Both, with one as primary"::
Considered but rejected. Two ontologies sharing a name is the bug;
ranking them doesn't fix it.

Renaming "octad" to something else (e.g. "octave")::
Out of scope here. The brand survives; the contents are defined.