SevensRequiem/trivium-paper

Trivium: A Multi-Branch Distributed Knowledge System with Institutional Consensus Governance

Author: Requiem
Date: February 20, 2026
Version: 0.1.0 (Draft)
License: CC BY-SA 4.0


Abstract

Trivium is a distributed wiki system that replaces the single-canonical-truth model of centralized encyclopedias with a multi-branch consensus architecture governed by academic institutions. Instead of maintaining one authoritative version of each article, Trivium maintains multiple independently governed canonical branches. Convergence between branches serves as a measurable trust signal: when all branches agree, confidence is high; when branches diverge, the divergence itself is informative and presented transparently to readers.

Governance authority is anchored to verified academic institutions. Anti-capture mechanisms include VRF-based random committee selection, encrypted commit-reveal voting, mandatory monthly key rotation, and cross-country hash verification for data integrity. The system is designed to survive the compromise of any individual node, institution, or country, and includes a fork protocol that allows mirror operators to reconstitute the network if all authoritative nodes are lost.


Table of Contents

  1. Problem Statement
  2. Design Philosophy
  3. Architecture Overview
  4. Node Tiers
  5. Data Model
  6. Branch Model
  7. Consensus and Voting
  8. Reputation System
  9. Security Model
  10. Cross-Country Verification
  11. Peer-to-Peer Distribution Layer
  12. Threat Model
  13. Reader Experience
  14. Implementation Strategy
  15. Future Work

1. Problem Statement

Centralized knowledge platforms, most notably Wikipedia and its emerging competitors, suffer from structural governance failures that cannot be resolved through policy reform alone. These failures are architectural.

1.1 Single-Canon Fragility

Centralized wikis maintain a single authoritative version of each article. This creates a winner-take-all dynamic in editorial disputes where one perspective must prevail, even on topics where genuine scholarly disagreement exists. The resulting article presents a false consensus that obscures the actual state of knowledge.

1.2 Editorial Capture

In systems where editorial authority is earned through participation volume rather than domain expertise, a small number of highly active editors accumulate disproportionate influence. A determined procedural editor can override domain experts through sustained engagement with the platform's bureaucratic processes. Knowledge of the platform's rules ends up mattering more than knowledge of the subject.

1.3 Centralized Points of Failure

A single-server knowledge platform is vulnerable to censorship, DDoS attacks, infrastructure failure, and organizational capture. When one entity controls the canonical database, that entity becomes a single point of failure for the entire knowledge base.

1.4 Opacity of Disagreement

When experts disagree on a topic, readers benefit from understanding the disagreement: what the competing positions are, who holds them, and why. Single-canon systems flatten this disagreement into one narrative, depriving readers of the information they need to form their own assessment.

1.5 Jurisdictional Vulnerability

A centralized platform operating under one legal jurisdiction can be compelled by that jurisdiction's government to alter content. There is no structural mechanism for detecting or resisting such alterations.


2. Design Philosophy

Trivium rests on a handful of principles that inform every design decision in the protocol.

Where genuine disagreement exists, presenting multiple well-sourced perspectives transparently is more honest and more useful than manufacturing a single "neutral" version. The system favors multiple truths over false consensus.

When independently governed branches arrive at the same conclusion without coordination, that convergence carries more weight than any single editorial process could. Convergence is the primary trust signal.

Academic institutions (universities, research institutes, libraries) have existing reputations, domain expertise, and accountability structures that make them suitable trust anchors. Their participation anchors the governance model.

Anti-capture mechanisms must be architectural, not procedural. Policies can be circumvented; protocol-level constraints cannot. VRF-based committee selection, encrypted voting, and mandatory key rotation are structural features, not guidelines.

The knowledge base must survive the failure, compromise, or censorship of any individual node, institution, or country. Distribution is resilience. Any mirror with a full state replica can fork the network and continue.

All votes, reputation scores, branch states, and editorial histories are publicly auditable. Transparency is the default.


3. Architecture Overview

Trivium consists of three logical layers.

3.1 Governance Layer

The governance layer is composed of verified academic institutions that participate in VRF-sampled committee voting on article edits. Institutions are assigned to canonical branches. For each proposal, a random committee is selected from the eligible institutions on the relevant branch. Committee members cast encrypted votes that are revealed only after the vote window closes. This layer produces signed state roots that represent the canonical state of each branch's knowledge base.

3.2 Distribution Layer

The distribution layer is a peer-to-peer network built on libp2p that propagates article content, edit proposals, and vote results across all participating nodes. This layer handles content discovery, replication, and availability. Any participant can run a distribution node.

3.3 Verification Layer

The verification layer enables cross-regional and cross-country auditing of governance decisions. Regional governance clusters publish their state roots and vote logs, which other regions can independently replay and verify. Because state transitions are deterministic (given the same votes, any node computes the same result), discrepancies between a published state root and the independently computed root indicate data integrity violations.


4. Node Tiers

4.1 Authoritative Nodes

Authoritative nodes are operated by verified academic institutions. They hold a two-tier keypair: SLH-DSA (SPHINCS+) for long-term identity (cold storage, HSM-backed) and ML-DSA (Dilithium) for operational signing (rotated monthly). They participate in committee-based voting when selected via VRF and produce signed state roots after each vote window closes.

Authoritative nodes are discoverable via a hardcoded bootstrap list rather than through the DHT, preventing impersonation. They must maintain a full replica of their assigned branches, run the deterministic state transition engine, and participate in cross-node hash verification.

4.2 Mirror Nodes

Mirror nodes are operated by any willing participant. They maintain a full replica of all branches and serve content to readers via a local API. Mirrors verify state transitions by replaying vote logs against authoritative signed state roots. They have no voting power and cannot propose edits directly (proposals must be sponsored by an authoritative node).

Critically, mirrors must retain the full vote log. This is not optional. The vote log is what enables fork survival: if all authoritative nodes are compromised, any mirror with a complete state replica and vote log can reconstitute the network (see Section 9.7).

4.3 Light Nodes

Light nodes maintain a partial replica, caching only articles that local users access along with their merkle proofs. They verify article integrity against signed state roots but do not replay the full vote log. They have the lowest resource requirements and are suitable for mobile or embedded deployments.

4.4 Self-Hosted Interface

All node tiers expose a local interface on localhost. Node operators can build and deploy custom frontends, allowing personalized presentation of Trivium content and integration with existing sites.


5. Data Model

5.1 Semantic Document Model

Articles in Trivium are not flat text blobs. They are trees of typed blocks, which enables per-block voting and fine-grained divergence detection between branches.

Article
├── Metadata (title, categories, created_at, branch_id)
├── Section
│   ├── Heading
│   ├── Paragraph
│   ├── Citation
│   └── Figure
├── Section
│   ├── Heading
│   ├── Paragraph
│   └── SubSection
│       ├── Heading
│       └── Paragraph
└── References
    ├── Citation
    └── Citation

Each block carries:

  • block_id: content-addressed hash of (block_type || content || children_hashes). Deterministic; same content produces the same ID.
  • block_type: one of paragraph, heading, citation, figure, section, reference_list, etc.
  • content: UTF-8 text for text blocks, content-addressed ID for binary assets (figures, diagrams).
  • children: ordered list of child block IDs (for container blocks like sections).
  • version: monotonically increasing per-block version number within a branch.

This structure means an edit to a single paragraph only requires re-hashing the path from that block to the article root. Divergence between branches can be detected at the block level, not just the article level.
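As a minimal sketch of the content addressing described above, the following computes a block_id with SHA3-256 over (block_type || content || children_hashes). The length-prefixed framing of each field is an assumption made here to keep field boundaries unambiguous; the text only fixes the field order.

```python
import hashlib


def block_id(block_type: str, content: bytes, children: list[bytes]) -> bytes:
    """Content-addressed ID: SHA3-256 over (block_type || content || children_hashes)."""
    h = hashlib.sha3_256()
    # Assumption: each field is length-prefixed so boundaries cannot be confused.
    for field in (block_type.encode("utf-8"), content, b"".join(children)):
        h.update(len(field).to_bytes(8, "big"))
        h.update(field)
    return h.digest()


heading = block_id("heading", b"Boiling point", [])
para_v1 = block_id("paragraph", b"Water boils at 100 C.", [])
para_v2 = block_id("paragraph", b"Water boils at 100 C at sea level.", [])

# Editing one paragraph changes the parent section's ID; the heading ID is reused.
section_v1 = block_id("section", b"", [heading, para_v1])
section_v2 = block_id("section", b"", [heading, para_v2])
```

Because the ID of a container depends on its children's IDs, two branches that disagree on a single paragraph will share every block ID except those on the path from that paragraph to the article root.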

5.2 Merkle Tree Structure

Articles are organized in a merkle tree using SHA3-256:

Global State Root
├── Branch 1 Root
│   ├── Category A Root
│   │   ├── Article 1 Root (hash of block tree)
│   │   └── Article 2 Root
│   └── Category B Root
│       └── Article 3 Root
├── Branch 2 Root
│   └── ...
└── Branch 3 Root
    └── ...

Article root = hash of its top-level block tree. Category root = hash of sorted article roots. Branch root = hash of sorted category roots. Global state root = hash of all branch roots. When an article is edited, only the path from the modified leaf to the root must be recomputed and re-signed, allowing efficient incremental verification.
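The root computation can be sketched as follows, assuming each parent root is simply the SHA3-256 hash of its sorted child roots concatenated together. The helper names h and root_of are illustrative, not part of the protocol.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def root_of(children: list[bytes]) -> bytes:
    """Hash of the sorted child roots, so insertion order does not matter."""
    return h(b"".join(sorted(children)))


article_1, article_2, article_3 = h(b"article 1"), h(b"article 2"), h(b"article 3")
category_a = root_of([article_1, article_2])
category_b = root_of([article_3])
branch_root = root_of([category_a, category_b])

# Editing article 2 only requires recomputing category A and the branch root;
# category B is reused unchanged.
edited_category_a = root_of([article_1, h(b"article 2, edited")])
edited_branch_root = root_of([edited_category_a, category_b])
```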

5.3 Vote Record

VoteRecord {
    proposal_id:     hash(proposal_content)
    voter_id:        institutional_public_key
    branch_id:       uint8
    vote:            enum { APPROVE, REJECT, ABSTAIN, REJECT_SPAM }
    committee_proof: VRF_proof (proves voter was selected for this committee)
    window_type:     enum { FAST, STANDARD, EXTENDED }
    epoch:           uint64
    timestamp:       int64
    signature:       ML-DSA signature over all above fields
}

The committee_proof field is critical. It contains the VRF output and proof that demonstrates the voter was legitimately selected for this committee. Any node can verify this proof without trusting the voter's claim.

The REJECT_SPAM vote type carries a heavier reputation penalty for the proposer than a regular REJECT. Unanimous spam rejection triggers automatic suspension.

5.4 Edit Proposal

EditProposal {
    proposal_id:     hash(all fields below)
    article_id:      content hash of target article root
    branch_id:       uint8
    proposer_id:     institutional_public_key
    sponsor_id:      institutional_public_key (if proposed by non-authoritative)
    edit_type:       enum { BLOCK_ADD, BLOCK_MODIFY, BLOCK_DELETE, BLOCK_MOVE }
    target_blocks:   []block_id
    new_blocks:      []Block
    diff_summary:    human-readable description
    window_class:    enum { FAST, STANDARD, EXTENDED }
    epoch:           uint64
    signature:       ML-DSA signature
}

Edit proposals operate on blocks, not on the article as a whole. The window_class is auto-classified from the diff: character-level changes below a threshold are classified as FAST, block-level structural changes as EXTENDED, and everything else as STANDARD (see Section 7.1).
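One way the auto-classification could work is sketched below. The threshold value, the use of difflib to count changed characters, and the structural flag supplied by the caller are all assumptions; the protocol text does not specify how the diff is measured.

```python
import difflib

FAST_CHAR_THRESHOLD = 40  # hypothetical value; the real threshold is a protocol parameter


def changed_chars(old: str, new: str) -> int:
    """Count characters touched by non-equal diff operations."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed


def classify(old: str, new: str, structural: bool) -> str:
    """Return the window class for an edit to a single block."""
    if structural:
        return "EXTENDED"
    if changed_chars(old, new) < FAST_CHAR_THRESHOLD:
        return "FAST"
    return "STANDARD"
```

For example, classify("Teh cat sat.", "The cat sat.", structural=False) falls in the FAST class, while any edit flagged as structural is EXTENDED regardless of its size.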

Non-authoritative nodes can draft proposals, but an authoritative node must sponsor them before they enter the voting pipeline.

5.5 Institution Identity

InstitutionRecord {
    institution_id:       hash(name || country || founding_public_key)
    name:                 string
    country:              ISO 3166-1 alpha-2
    affiliation_group:    optional(group_id)
    identity_key:         SLH-DSA public key (long-term, cold storage / HSM)
    signing_key:          ML-DSA public key (operational, rotatable monthly)
    onboarded_epoch:      uint64
    branch_assignments:   []uint8
    status:               enum { ACTIVE, SUSPENDED, DEPARTED }
    verification_proofs:  []ExternalVerification
    ratification_votes:   []VoteRecord
}

The two-tier key hierarchy is a deliberate design choice. The SLH-DSA identity key uses the most conservative post-quantum security assumptions and is stored offline (HSM or equivalent). It is used infrequently: key rotation endorsement, emergency revocation, self-suspension. The ML-DSA signing key is used for day-to-day operations (proposals, votes, state roots) and rotates monthly.


6. Branch Model

6.1 Multiple Canonical Branches

Trivium maintains three independently governed canonical branches. Each branch represents a complete, internally consistent version of the knowledge base.

Branches are not forks in the version control sense. They are parallel editorial processes operating on the same set of articles. On non-contested topics, branches will naturally converge to identical content. On contested topics, branches may diverge, and this divergence is presented to readers as meaningful information.

The branch count of three is a protocol parameter. Changing it requires a governance proposal approved by 80% of all authoritative nodes (bypassing committee sampling). Three was chosen as a starting point because it is the minimum needed to distinguish majority agreement from full disagreement while keeping the system manageable during early operation.

6.2 Branch Assignment

Institutions are randomly assigned to 1-2 branches. Assignment is determined by a verifiable random function (VRF) seeded with the institution's public key and the current epoch identifier. This ensures assignments are deterministic and auditable but not predictable in advance.

6.3 Reassignment and Cooldown

Institutions may request reassignment to a different branch. Reassignment is subject to a cooldown of 3 epochs (roughly 3 weeks), during which the institution cannot participate in committees on the new branch. This prevents rapid branch-hopping to exert influence across multiple branches simultaneously.

If a branch drops below minimum participation (10% of total authoritative nodes), forced random rebalancing triggers across all branches, overriding the normal cooldown. Each branch must also maintain institutions from at least 3 countries; violation of this diversity requirement triggers mandatory rebalancing as well.
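The two rebalancing triggers can be expressed as a simple predicate. This is an illustrative sketch; the function name and the representation of a branch as a list of country codes are assumptions.

```python
def needs_rebalancing(
    branch_countries: list[str],
    total_authoritative: int,
    min_share: float = 0.10,
    min_countries: int = 3,
) -> bool:
    """branch_countries holds one ISO country code per institution on the branch."""
    below_participation = len(branch_countries) < min_share * total_authoritative
    below_diversity = len(set(branch_countries)) < min_countries
    return below_participation or below_diversity
```

With 200 authoritative nodes in total, a branch of 30 institutions spread over three countries is healthy, but the same 30 institutions concentrated in one country would trigger rebalancing.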

6.4 Single Voting Entity

Affiliated institutions (e.g., campuses within a university system) are treated as a single voting entity. They share one voting slot on committees and must internally agree on their vote. Identity verification includes affiliation mapping to prevent a single institutional network from claiming multiple independent votes.


7. Consensus and Voting

7.1 Epoch Structure

Time is divided into epochs. The default epoch duration is 1 week (governable). Each epoch contains one or more vote windows, tiered by edit significance:

  • Fast windows (2 hours) handle typo fixes, citation additions, and formatting changes. Classification is automatic: if the character-level diff falls below a threshold, the proposal is classified as fast.
  • Standard windows (48 hours) handle substantive content edits and section rewrites.
  • Extended windows (7 days) handle new article creation, article deletion, structural changes, institution onboarding, and governance proposals.

The article database is read-only outside of vote windows. During a vote window, the only accepted write operations are signed vote transactions from verified committee members.

7.2 VRF-Based Committee Selection

At the target scale of 200-1000+ institutions, having every node vote on every edit is impractical. Instead, a random committee of 15-30 members is selected per vote window using a Verifiable Random Function.

For each vote window, each authoritative node on the relevant branch computes:

vrf_output, vrf_proof = VRF_eval(signing_key, epoch || window_id || branch_id)

If vrf_output falls below a selection threshold (calibrated for the target committee size), the node is selected. The node publishes vrf_proof alongside their vote, and any other node can verify that the selection was legitimate.
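The threshold test can be illustrated with the sketch below. Note the substitution: stand_in_vrf_output is a plain keyed hash, not a real VRF. It produces no proof and cannot be verified against a public key, so it only demonstrates how a threshold can be calibrated for a target committee size. All function names are hypothetical.

```python
import hashlib


def stand_in_vrf_output(node_secret: bytes, epoch: int, window_id: int, branch_id: int) -> int:
    """NOT a real VRF: a keyed hash standing in for vrf_output as a 256-bit integer."""
    message = (
        epoch.to_bytes(8, "big") + window_id.to_bytes(8, "big") + branch_id.to_bytes(1, "big")
    )
    return int.from_bytes(hashlib.sha3_256(node_secret + message).digest(), "big")


def selection_threshold(target_committee_size: int, eligible_nodes: int) -> int:
    """Threshold giving each node a selection probability of target / eligible."""
    return (2**256 * target_committee_size) // eligible_nodes


def is_selected(output: int, threshold: int) -> bool:
    return output < threshold


eligible = 300
threshold = selection_threshold(20, eligible)
committee = [
    n
    for n in range(eligible)
    if is_selected(
        stand_in_vrf_output(f"node-{n}".encode(), epoch=12, window_id=3, branch_id=1),
        threshold,
    )
]
```

Because each node is selected independently, the committee size varies around the target rather than hitting it exactly, which is one reason the text gives a range of 15-30 members.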

The key property here is self-certification: nobody learns who is on the committee until members publish their votes. An attacker cannot target committee members before the vote because the committee membership is unknown until it is revealed. This is the primary defense against pre-vote coercion.

A diversity constraint ensures no single country holds more than 40% of a committee's seats. If VRF selection produces a non-diverse committee, additional members are drawn from underrepresented jurisdictions.

7.3 Vote Privacy

Votes are concealed during the window to prevent real-time coercion or bandwagon effects:

EncryptedVote {
    encrypted_payload:  AES-256-GCM(VoteRecord, ephemeral_key)
    commitment:         hash(VoteRecord || ephemeral_key)
}

After the window closes, committee members reveal their ephemeral keys. Anyone can then decrypt each vote, verify its VRF proof and signature, and tally independently. If a member fails to reveal their key, their vote is counted as an abstention.

This commit-reveal scheme means that even if an attacker can observe the network in real time, they cannot determine how any committee member voted until after the window has closed and the result is already determined.
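The commitment half of the scheme can be sketched with the standard library alone. The AES-256-GCM encryption step is omitted here, the vote record is treated as opaque bytes, and the fixed 32-byte key length is an assumption.

```python
import hashlib
import hmac
import secrets


def commit(vote_record: bytes, ephemeral_key: bytes) -> bytes:
    """commitment = hash(VoteRecord || ephemeral_key), using SHA3-256."""
    return hashlib.sha3_256(vote_record + ephemeral_key).digest()


def verify_reveal(commitment: bytes, vote_record: bytes, ephemeral_key: bytes) -> bool:
    """Check that a revealed record and key match the commitment published earlier."""
    return hmac.compare_digest(commitment, commit(vote_record, ephemeral_key))


# During the window: the member publishes only the commitment (and the ciphertext).
key = secrets.token_bytes(32)  # assumption: fixed-length 32-byte ephemeral key
record = b"proposal=abc;vote=APPROVE"
published_commitment = commit(record, key)

# After the window: the key is revealed and anyone can check the opened vote.
honest_reveal_ok = verify_reveal(published_commitment, record, key)
swapped_vote_ok = verify_reveal(published_commitment, b"proposal=abc;vote=REJECT", key)
```

The commitment binds the member to a single vote: revealing a different record under the same key fails verification.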

7.4 Vote Window Lifecycle

The full lifecycle of a proposal:

  1. Proposal submitted, gossiped on /trivium/proposals/<branch>.
  2. Vote window opens (duration depends on window class).
  3. Committee members evaluate the proposal and cast encrypted votes.
  4. Encrypted votes gossiped on /trivium/votes/<branch>.
  5. Window closes.
  6. Committee members reveal ephemeral keys.
  7. Any node tallies: decrypt votes, verify VRF proofs and signatures.
  8. If quorum is met, the state transition is applied deterministically.
  9. New state root computed and signed by committee members.
  10. Signed state root gossiped on /trivium/state/<branch>.
  11. Mirrors and light nodes verify the state root signatures and replay.

7.5 Quorum Requirements

Standard proposals pass with 2/3 committee approval. Institution onboarding requires 3/4 committee approval on an extended window. Governance proposals (protocol parameter changes, branch count changes) bypass committee sampling entirely and require 80% of all authoritative nodes.

Abstention is explicitly supported and does not count against quorum.
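A tally under these rules might look like the sketch below. It adopts one reading of the text, which is an assumption: abstentions are excluded from the denominator, and a proposal passes when approvals reach at least the stated fraction of the remaining votes. Exact fractions avoid rounding errors at the 2/3 boundary.

```python
from collections import Counter
from fractions import Fraction

THRESHOLDS = {"standard": Fraction(2, 3), "onboarding": Fraction(3, 4)}


def passes(votes: list[str], kind: str = "standard") -> bool:
    """votes holds revealed vote values: APPROVE, REJECT, ABSTAIN or REJECT_SPAM."""
    counts = Counter(votes)
    approve = counts["APPROVE"]
    against = counts["REJECT"] + counts["REJECT_SPAM"]
    decided = approve + against  # abstentions are left out of the denominator
    if decided == 0:
        return False
    return Fraction(approve, decided) >= THRESHOLDS[kind]
```

Under this reading, 10 approvals against 5 rejections passes a standard proposal whether or not other members abstain, but the same split fails the 3/4 bar for institution onboarding.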

7.6 Tiered Delegation

To manage editorial volume, institutions may delegate review authority for lower-stakes edits. Fast-window proposals (typos, formatting) can be reviewed by department-appointed delegates such as graduate students or postdocs. Standard and extended proposals require faculty-level review.

Delegation is signed. The institution signs a delegation certificate granting review authority to specific individuals for specific domain categories and window classes.


8. Reputation System

8.1 Reputation Trail

Every authoritative institution builds a public reputation history, computed entirely from the append-only vote log. There is no central reputation authority; reputation is an emergent property of the auditable record. Any node can independently compute any institution's reputation.

The tracked metrics include: proposals submitted and their outcomes (approved, rejected, flagged as spam), committee selection count versus actual participation count, votes cast, edits later reverted by subsequent proposals, key rotations completed, and missed rotation deadlines.

From these raw metrics, derived scores are computed. Participation rate is the ratio of committees actually served to committees selected for. Proposal quality is the ratio of approved proposals to total submissions. A composite reliability score weights these factors together.

8.2 Enforcement

Institutions whose participation rate drops below 50% over a rolling 12-epoch window are suspended from committee selection until they recover. Institutions with more than 5 spam-flagged proposals in a single epoch receive an automatic 2-epoch suspension from proposing.

These thresholds are protocol parameters, governable through the standard governance proposal process.
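The derived scores from Section 8.1 and the participation rule above can be sketched as follows. The per-epoch history format is an assumption, as is the choice to leave an institution that was never selected unsuspended. The composite reliability score is not sketched because its weights are not specified.

```python
from typing import Optional

PARTICIPATION_FLOOR = 0.5  # 50% threshold from Section 8.2
ROLLING_WINDOW = 12        # epochs, from Section 8.2


def participation_rate(history: list[tuple[int, int]]) -> Optional[float]:
    """history holds (committees_selected, committees_served) per epoch, oldest first."""
    recent = history[-ROLLING_WINDOW:]
    selected = sum(sel for sel, _ in recent)
    served = sum(srv for _, srv in recent)
    return served / selected if selected else None


def proposal_quality(approved: int, submitted: int) -> Optional[float]:
    """Ratio of approved proposals to total submissions."""
    return approved / submitted if submitted else None


def suspended_from_committees(history: list[tuple[int, int]]) -> bool:
    rate = participation_rate(history)
    # Assumption: an institution never selected in the window is not suspended.
    return rate is not None and rate < PARTICIPATION_FLOOR
```

Since every input comes from the public vote log, any node can recompute these values and check a claimed suspension.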

8.3 Decay and Recovery

Reputation scores decay over time without active participation. An institution cannot rest on a historically good record. Recovery from reputation damage requires sustained accurate voting over subsequent epochs.


9. Security Model

9.1 Read-Only Default

Authoritative node databases are read-only outside of vote windows. During vote windows, the only accepted write operations are signed vote transactions from verified committee members, processed through the deterministic state transition function. Direct database writes are never permitted.

9.2 Deterministic State Transitions

Given the same set of votes, any node must compute the same resulting state. This is the foundation of the entire verification model. If an authoritative node's post-vote state root differs from what other nodes independently compute from the published vote set, the divergent node is flagged and frozen.

9.3 Post-Quantum Cryptography

All cryptographic operations use post-quantum algorithms:

Identity keys use SLH-DSA (SPHINCS+), which has the most conservative security assumptions among post-quantum signature schemes because its security rests only on hash functions. Signatures are larger (roughly 8-50 KB, depending on the parameter set) but identity keys are used infrequently: key rotation endorsement, emergency revocation, and similar high-stakes operations. These keys are stored in HSMs or TPM-backed storage and kept offline.

Operational signing keys use ML-DSA (Dilithium), which offers smaller signatures (roughly 2.4 KB at the smallest parameter set) and faster verification. These keys handle day-to-day operations: proposals, votes, and state root signing.

All hashing uses SHA3-256. All signed messages include an algorithm identifier field, providing crypto agility for future migration without requiring a hard fork.

9.4 Mandatory Key Rotation

Institutional signing keys rotate monthly. The rotation procedure:

KeyRotation {
    institution_id:       hash
    old_signing_pubkey:   ML-DSA public key
    new_signing_pubkey:   ML-DSA public key
    rotation_epoch:       uint64

    rotation_proof:       ML-DSA sig over (new_key || epoch)
                          signed by OLD signing key
    identity_endorsement: SLH-DSA sig over (new_key || epoch)
                          signed by long-term identity key
    retired_key_hash:     SHA3-256(old_signing_pubkey)
}

Both the old signing key and the cold identity key must endorse the new key. The old key's hash is appended to a retired key log. Messages signed by a retired key are rejected network-wide, and a RetiredKeyUsageAlert is gossiped on /trivium/security.

If an institution hasn't rotated within 30 days plus a 7-day grace period, their voting capability is suspended until rotation completes. If a signing key is compromised before its scheduled rotation, the cold SLH-DSA identity key can immediately revoke and replace it. Institutions can also self-suspend via a SuspensionNotice signed with their identity key.
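The rotation deadline reduces to a comparison against two time limits. A minimal sketch, with hypothetical status names:

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=30)
GRACE_PERIOD = timedelta(days=7)


def rotation_status(last_rotation: datetime, now: datetime) -> str:
    """Return CURRENT, GRACE or SUSPENDED based on the age of the signing key."""
    age = now - last_rotation
    if age <= ROTATION_PERIOD:
        return "CURRENT"
    if age <= ROTATION_PERIOD + GRACE_PERIOD:
        return "GRACE"
    return "SUSPENDED"  # voting capability suspended until rotation completes


rotated_at = datetime(2026, 1, 1, tzinfo=timezone.utc)
status_day_35 = rotation_status(rotated_at, rotated_at + timedelta(days=35))
status_day_38 = rotation_status(rotated_at, rotated_at + timedelta(days=38))
```

The 37-day limit is the same bound that caps the exposure window of a stolen signing key in Section 12.3.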

9.5 Compromise Detection and Recovery

After each vote window closes:

  1. Every authoritative node publishes its computed state root.
  2. Nodes cross-verify by independently computing the expected state root from the published vote set.
  3. If a node's published root does not match, it is automatically frozen pending out-of-band institutional re-verification.
  4. The append-only vote log allows full replay to pinpoint the exact point of divergence.
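Steps 1 to 3 can be sketched as below. The apply_votes function is a toy stand-in for the real state transition engine, an assumption made for illustration: it sorts the votes before hashing so the result cannot depend on arrival order.

```python
import hashlib


def apply_votes(prev_root: bytes, votes: list[bytes]) -> bytes:
    """Toy deterministic transition: the same vote set always yields the same root."""
    h = hashlib.sha3_256(prev_root)
    for vote in sorted(votes):
        h.update(hashlib.sha3_256(vote).digest())
    return h.digest()


def nodes_to_freeze(prev_root: bytes, votes: list[bytes], published: dict[str, bytes]) -> list[str]:
    """Return the nodes whose published root differs from the recomputed one."""
    expected = apply_votes(prev_root, votes)
    return sorted(node for node, root in published.items() if root != expected)


prev = b"\x00" * 32
votes = [b"vote-1", b"vote-2"]
good_root = apply_votes(prev, votes)
published_roots = {
    "uni-a": good_root,
    "uni-b": good_root,
    "uni-c": hashlib.sha3_256(b"tampered").digest(),
}
frozen = nodes_to_freeze(prev, votes, published_roots)
```

Here frozen contains only "uni-c", the one node whose published root cannot be reproduced from the vote set.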

When a node is confirmed compromised: the node is removed from the active authoritative set, all votes from the compromised node during the suspected window are marked tainted, affected vote windows are replayed excluding tainted votes, and the institution must complete a new key ceremony to rejoin.

9.6 Registry Cross-Verification

Non-authoritative nodes verify their local institution registry replica on a randomized schedule (every 3-7 days). They query 3 or more authoritative nodes from different branches and compare registry roots. If all agree and the local copy matches, no action is taken. If all agree but differ from local, the node fetches the diff and verifies each change has valid rotation proofs or onboarding votes. If authoritatives disagree among themselves, the node queries additional authoritatives. If no consensus emerges, the node enters conservative mode: serving only cached content and refusing to relay votes until the inconsistency is resolved.
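The decision logic above can be written as a small function. The action names and the more_available flag (whether further authoritative nodes can still be queried) are hypothetical.

```python
def registry_action(local_root: bytes, remote_roots: list[bytes], more_available: bool) -> str:
    """Decide what a node does after comparing registry roots with authoritatives."""
    if len(remote_roots) < 3:
        raise ValueError("query at least 3 authoritative nodes")
    if len(set(remote_roots)) == 1:
        if remote_roots[0] == local_root:
            return "NO_ACTION"
        # Authoritatives agree with each other but not with us: fetch the diff and
        # check rotation proofs or onboarding votes for every change.
        return "FETCH_AND_VERIFY_DIFF"
    # Authoritatives disagree among themselves.
    return "QUERY_MORE" if more_available else "CONSERVATIVE_MODE"
```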

9.7 Network Fork Protocol

This is the doomsday mechanism. If all authoritative nodes are compromised, any mirror with a full state replica and complete vote log can fork the network:

ForkDeclaration {
    fork_id:            random UUID
    fork_epoch:         uint64 (last known good epoch)
    fork_reason:        string
    state_snapshot:     map[branch_id -> state_root] at fork_epoch
    vote_log_hash:      hash(complete vote log up to fork_epoch)

    founding_mirrors: []MirrorPromotionClaim {
        peer_id:          libp2p peer ID
        new_identity_key: SLH-DSA public key
        new_signing_key:  ML-DSA public key
        state_proof:      merkle proof of valid state at fork_epoch
    }

    fork_signatures:    multi-sig from all founding mirrors
}

Mirrors coordinate out-of-band (mailing lists, social media, whatever works) to agree the main network is compromised. The fork carries the complete history and continues from the last known good epoch. Founding mirrors become the new authoritatives. Light nodes verify fork state proofs and choose which network to follow.

This is why mirrors must retain the full vote log. It is not a suggestion. The vote log is the fork survival mechanism.


10. Cross-Country Verification

10.1 Purpose

Cross-country verification prevents any single country's authoritative cluster from unilaterally spoofing or altering its canonical data. Countries verify each other using the same deterministic replay mechanism used for intra-country node verification.

10.2 Mechanism

After each vote cycle, each country's governance layer publishes the new state root (merkle root of all articles under that country's governance scope) and the complete vote log for that cycle.

Any other country's authoritative nodes can download the vote log, replay the votes through the deterministic state transition function, compare the computed state root against the published one, and flag discrepancies.

10.3 Legitimate Divergence vs. Integrity Violations

Different countries will legitimately produce different content on certain topics due to genuine editorial disagreement. This is expected and not flagged. What cross-country verification detects is data integrity violations: cases where the published state root is inconsistent with the published vote log. That means the data was altered outside the governance process.

10.4 Coercion Detection

A state-level actor could coerce domestic institutions to cast fraudulent votes. In this scenario, every individual signature is technically valid, but the voting pattern is anomalous. Detection signals include: all branches within a country suddenly converging on a politically convenient narrative, the converged narrative diverging significantly from other countries' treatment of the same topic, sudden coordinated voting shifts across previously independent branches, and domain experts voting outside their expertise areas in a coordinated pattern.

These signals are heuristic rather than cryptographically provable. But because votes and editorial histories are public, coercion that is undetectable at the cryptographic level can still be detected at the behavioral level. The VRF committee selection and vote privacy mechanisms (Section 7) make coercion harder to execute in the first place, since the coercing party does not know who will be on the committee or how they voted until the window closes.

10.5 Global Audit Chain

Country-level state roots are aggregated into a global merkle tree. The global root is signed by a rotating subset of cross-country auditor nodes, selected with probability inversely proportional to regional influence so that smaller countries receive proportionally more auditing slots. Historical global roots form an append-only chain, providing a tamper-evident record of the entire knowledge base's state over time.


11. Peer-to-Peer Distribution Layer

11.1 Protocol Foundation

The distribution layer is built on libp2p, using Kademlia DHT for peer discovery and content routing, GossipSub for real-time propagation of proposals, votes, and state root announcements, and bitswap-style content exchange for article retrieval. NAT traversal and hole punching are supported for nodes behind residential networks. The primary transport is QUIC, with TCP as fallback and WebSocket/WebTransport for browser-based light nodes.

11.2 Topic Channels

GossipSub topics:

  • /trivium/proposals/<branch_id> for edit and article proposals
  • /trivium/votes/<branch_id> for vote broadcasts during active windows
  • /trivium/state/<branch_id> for state root announcements after vote windows close
  • /trivium/membership for institution onboarding proposals and departures
  • /trivium/governance for protocol parameter change proposals
  • /trivium/security for security alerts (retired key usage, anomalous behavior)

Institutions subscribe to topics corresponding to their assigned branches. Mirror and light nodes subscribe to topics relevant to the content they serve.

11.3 Content Propagation

Articles propagate through the network using content-addressed retrieval. A node requests an article by its content hash. The DHT routes the request to nodes that have advertised possession of that content. The requesting node verifies the received content against the authoritative signed hash (using a merkle proof from the article root to the branch root). Verified content is cached locally and advertised to the DHT.
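The proof check in the third step can be sketched as follows, assuming each parent root is the SHA3-256 hash of its sorted child roots (as in Section 5.2) and that a proof lists the sibling roots at each level. Verifying the signature on the branch root is omitted, and the function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def root_of(children: list[bytes]) -> bytes:
    return h(b"".join(sorted(children)))


def verify_inclusion(
    article_root: bytes, sibling_levels: list[list[bytes]], signed_branch_root: bytes
) -> bool:
    """Recompute the path from an article root up to the branch root."""
    current = article_root
    for siblings in sibling_levels:
        current = root_of(siblings + [current])
    return current == signed_branch_root


a1, a2, a3 = h(b"article 1"), h(b"article 2"), h(b"article 3")
category_a, category_b = root_of([a1, a2]), root_of([a3])
branch_root = root_of([category_a, category_b])

# Proof for article 1: its sibling in category A, then category A's sibling.
proof_for_a1 = [[a2], [category_b]]
genuine_ok = verify_inclusion(a1, proof_for_a1, branch_root)
forged_ok = verify_inclusion(h(b"forged article"), proof_for_a1, branch_root)
```

A light node only needs the article, the sibling roots along its path, and the signed branch root to run this check, which is what lets it skip replaying the full vote log.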

Popular articles will naturally have high availability due to widespread caching. Mirror nodes that commit to full replication ensure a baseline availability level for obscure articles.

11.4 Handshake

When two nodes connect, they exchange identity proofs. Capabilities are never self-declared; they are derived from verified identity against the replicated institution registry. An authoritative node presents its institution ID, signing public key, and a signed challenge nonce. The receiving node looks up the institution in its local registry replica, verifies the key matches, and verifies the nonce signature. If verification succeeds, the node is treated as authoritative with the capabilities defined by its registry entry. If verification fails or the node does not claim institutional identity, it is treated as a mirror or light node.

11.5 Message Serialization

Trivium uses Protocol Buffers for all inter-node message serialization, which provides efficient binary encoding, schema evolution support, and cross-language compatibility.


12. Threat Model

12.1 Branch Starvation

If a government pressures all institutions in its jurisdiction to leave a specific branch, that branch could drop below minimum participation and become unable to process edits. The protocol addresses this with a minimum participation threshold: if a branch drops below 10% of total authoritative nodes, forced random rebalancing triggers across all branches, overriding cooldown periods. A geographic diversity requirement also applies (each branch must span at least 3 countries). The attack is partly self-defeating because the rebalancing is random, meaning attackers may end up reassigned to the branch they tried to kill.

12.2 Institutional Sybil Attack

An actor creating fictitious institutions to gain voting power faces multiple hurdles. Onboarding requires external verification (accreditation records, .edu domain verification, government higher-education registries) combined with a ratification vote by existing members, which requires 75% approval over an extended vote window. New institutions cannot participate in committees for their first 3 epochs (probationary period). Affiliation detection flags institutions sharing infrastructure, IP ranges, or suspiciously similar registration timing, and groups them as a single voting entity.

12.3 Key Compromise

A stolen operational signing key has a bounded exposure window. Mandatory monthly rotation means the maximum exposure is roughly 37 days (30-day rotation cycle plus 7-day grace period). The two-tier key hierarchy allows the compromised signing key to be immediately revoked using the cold identity key. The retired key log detects and alerts on any subsequent use of the old key. In the worst case, the institution can self-suspend via its identity key and complete a new key ceremony.

12.4 State-Level Coercion

A government coercing domestic institutions to vote in a coordinated pattern faces several structural obstacles. VRF committee selection means the coercing state does not know which institutions will be on any given committee. Vote privacy means the state cannot observe how institutions voted in real time. The 40% country cap on committee seats limits the impact even if every domestic institution complies. Multi-branch redundancy means capturing one branch still leaves two branches to surface the manipulation through divergence.

Even if coercion succeeds on one branch, the behavioral signatures are publicly visible: sudden coordinated voting shifts, domain experts voting outside their expertise, and anomalous convergence patterns. Coercion is survivable, detectable, and carries reputational consequences for the participating institutions.
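As a small illustration of the 40% country cap mentioned above, the following sketch checks a committee's composition. The function names are hypothetical, and rounding the cap down to a whole number of seats is an assumption; the text does not say how fractional seats are handled.

```go
package main

import "fmt"

const countryCapPercent = 40 // from the text: no country may hold more than 40% of seats

// MaxSeatsPerCountry returns the seat limit for one country.
// Assumption: the cap is rounded down to a whole seat.
func MaxSeatsPerCountry(committeeSize int) int {
	return committeeSize * countryCapPercent / 100
}

// WithinCountryCap checks a committee given as the country of each member.
func WithinCountryCap(memberCountries []string) bool {
	limit := MaxSeatsPerCountry(len(memberCountries))
	counts := make(map[string]int)
	for _, c := range memberCountries {
		counts[c]++
		if counts[c] > limit {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(MaxSeatsPerCountry(15)) // 6
	fmt.Println(MaxSeatsPerCountry(30)) // 12

	committee := []string{"US", "US", "US", "DE", "FR", "JP", "BR", "IN", "KE", "SE"}
	fmt.Println(WithinCountryCap(committee)) // 3 US seats against a limit of 4
}
```

With the 15 to 30 member committees described in the glossary, even full compliance by one country's institutions leaves it short of the 2/3 needed to decide a standard proposal.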

12.5 Node Infrastructure Compromise

An attacker who gains control of an authoritative node's infrastructure cannot silently alter the database. The database is read-only outside vote windows. State transitions are deterministic, so post-vote state roots are cross-verified by all other authoritative nodes. A divergent node is automatically frozen. The append-only vote log allows full forensic replay to identify exactly what happened.

12.6 Network Partitioning and Eclipse Attacks

An attacker attempting to isolate nodes and feed them a false view of state faces several defenses. Authoritative nodes use a hardcoded bootstrap list (not DHT discovery), reducing their eclipse attack surface. Multiple transport protocols reduce single-protocol vulnerability. State roots are published through multiple channels (P2P network, HTTPS endpoints, optionally DNS TXT records or public blockchains) so that state information remains available even if one distribution channel is compromised. Registry cross-verification on a randomized schedule (Section 9.6) catches inconsistencies.

12.7 Proposal Spam

A compromised authoritative node flooding the network with garbage proposals is rate-limited by a per-institution proposal cap per epoch. Proposals rejected with over 90% opposition incur reputation penalties. Unanimous REJECT_SPAM verdicts apply heavier penalties and can trigger automatic suspension.
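One possible shape for the per-epoch cap and the 90% opposition rule is sketched below. The `ProposalLimiter` type is hypothetical, the cap value is a parameter because the text does not fix one, and the sketch assumes epochs only move forward.

```go
package main

import "fmt"

// ProposalLimiter enforces a per-institution proposal cap per epoch.
type ProposalLimiter struct {
	limit  int
	epoch  uint64
	counts map[string]int
}

// NewProposalLimiter creates a limiter with the given per-epoch cap.
func NewProposalLimiter(limit int) *ProposalLimiter {
	return &ProposalLimiter{limit: limit, counts: make(map[string]int)}
}

// Allow records a proposal and returns false once the institution has
// reached its cap. Counters reset whenever the epoch changes.
func (l *ProposalLimiter) Allow(institutionID string, epoch uint64) bool {
	if epoch != l.epoch {
		l.epoch = epoch
		l.counts = make(map[string]int)
	}
	if l.counts[institutionID] >= l.limit {
		return false
	}
	l.counts[institutionID]++
	return true
}

// IncursPenalty reports whether a rejection drew more than 90% opposition.
func IncursPenalty(rejectVotes, totalVotes int) bool {
	return totalVotes > 0 && rejectVotes*100 > totalVotes*90
}

func main() {
	lim := NewProposalLimiter(2) // cap of 2 is illustrative only
	fmt.Println(lim.Allow("inst-001", 7)) // first proposal this epoch
	fmt.Println(lim.Allow("inst-001", 7)) // second proposal
	fmt.Println(lim.Allow("inst-001", 7)) // third is refused
	fmt.Println(lim.Allow("inst-001", 8)) // new epoch, counter reset

	fmt.Println(IncursPenalty(19, 20)) // 95% opposition
	fmt.Println(IncursPenalty(18, 20)) // exactly 90% is not "over 90%"
}
```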

12.8 Content Poisoning

Subtle factual alterations in proposals are surfaced by the semantic block-level diff system, which makes even single-word changes visible in full context. Cross-branch detection means poisoning on one branch appears as divergence in the reader view. The full revision history is append-only and auditable. Institutions whose edits are frequently reverted accumulate negative reputation.


13. Reader Experience

13.1 Unified View

When a reader accesses an article, they are not shown a single branch's version. Instead, the system computes a unified view from all three branches at the block level.

Blocks where all three branches are identical are rendered directly with no annotation. This is the vast majority of content. Blocks where two of three branches agree are rendered with the majority version, with a subtle gutter annotation indicating that one branch dissents. The reader can click to see the dissenting version. Blocks where all three branches differ are rendered with a prominent annotation and an expandable comparison panel showing all three versions.

There is no "default" or "primary" branch. On contested content, the reader sees the disagreement itself.
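The three rendering cases above reduce to a comparison of one block's content hash across the three branches. The sketch below is illustrative only: the names are hypothetical, and content hashes are represented as plain strings.

```go
package main

import "fmt"

// Agreement is the level of cross-branch agreement for a single block.
type Agreement int

const (
	Unanimous Agreement = iota // all three branches identical
	Majority                   // two of three agree
	Contested                  // all three differ
)

func (a Agreement) String() string {
	return [...]string{"Unanimous", "Majority", "Contested"}[a]
}

// ClassifyBlock compares one block's content hash across the three branches.
// It returns the agreement level and the hash to render by default; the
// hash is empty for contested blocks, where all versions are shown.
func ClassifyBlock(a, b, c string) (Agreement, string) {
	switch {
	case a == b && b == c:
		return Unanimous, a
	case a == b || a == c:
		return Majority, a
	case b == c:
		return Majority, b
	default:
		return Contested, ""
	}
}

func main() {
	level, render := ClassifyBlock("h1", "h1", "h1")
	fmt.Println(level, render)

	level, render = ClassifyBlock("h1", "h2", "h2")
	fmt.Println(level, render)

	level, render = ClassifyBlock("h1", "h2", "h3")
	fmt.Println(level, render == "")
}
```

Note that the function never consults a branch ordering to break ties, which reflects the rule that no branch is primary.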

13.2 Article Confidence Score

Each article displays a confidence score computed as the ratio of unanimous blocks to total blocks. An article where 95% of blocks are identical across all branches has a high confidence score. An article where 30% of blocks diverge has a low one. This gives readers an immediate sense of how settled the content is.
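The score is a simple ratio, shown here as a short sketch. The function name is hypothetical, and returning 0 for an article with no blocks is an assumption rather than something the text specifies.

```go
package main

import "fmt"

// ConfidenceScore returns the share of blocks that are identical across
// all branches. Assumption: an article with no blocks scores 0.
func ConfidenceScore(unanimousBlocks, totalBlocks int) float64 {
	if totalBlocks <= 0 {
		return 0
	}
	return float64(unanimousBlocks) / float64(totalBlocks)
}

func main() {
	fmt.Printf("%.2f\n", ConfidenceScore(190, 200)) // 0.95: 95% of blocks unanimous
	fmt.Printf("%.2f\n", ConfidenceScore(140, 200)) // 0.70: 30% of blocks diverge
}
```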

13.3 Comparison View

Readers can also view all branch versions side-by-side, with differences highlighted. This comparison view includes branch-by-branch article text, the convergence percentage, which institutions govern each branch, and reputation scores of those institutions.

13.4 Divergence Reports

After each epoch, a divergence analysis runs per article, producing a structured report:

DivergenceReport {
    article_id:       hash
    epoch:            uint64
    branch_versions:  map[branch_id -> article_root_hash]
    status:           enum { UNANIMOUS, MAJORITY, CONTESTED, PARTIAL }
    block_diffs:      []BlockDivergence {
        block_id:         hash
        block_type:       enum
        branch_content:   map[branch_id -> content_hash]
        agreement:        float (0.0 to 1.0)
    }
}

These reports are available through the content serving API and are useful for researchers, journalists, and fact-checkers who want to track how knowledge consensus evolves over time.
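For consumers of these reports, a Go mirror of the structure might look like the sketch below. It is an assumption-laden illustration: hashes are represented as strings, the status enum is a plain string, `block_type` is omitted because its values are not listed, and the agreement values in the example are made up for demonstration.

```go
package main

import "fmt"

// BlockDivergence mirrors one entry of block_diffs (block_type omitted).
type BlockDivergence struct {
	BlockID       string
	BranchContent map[string]string // branch_id -> content_hash
	Agreement     float64           // 0.0 to 1.0
}

// DivergenceReport mirrors the per-article report structure above.
type DivergenceReport struct {
	ArticleID      string
	Epoch          uint64
	BranchVersions map[string]string // branch_id -> article_root_hash
	Status         string            // "UNANIMOUS", "MAJORITY", "CONTESTED", or "PARTIAL"
	BlockDiffs     []BlockDivergence
}

// DisputedBlocks returns the IDs of blocks whose agreement is below 1.0,
// in report order.
func DisputedBlocks(r DivergenceReport) []string {
	var ids []string
	for _, b := range r.BlockDiffs {
		if b.Agreement < 1.0 {
			ids = append(ids, b.BlockID)
		}
	}
	return ids
}

func main() {
	report := DivergenceReport{
		ArticleID: "article-hash",
		Epoch:     12,
		Status:    "MAJORITY",
		BlockDiffs: []BlockDivergence{
			{BlockID: "block-1", Agreement: 1.0},  // illustrative value
			{BlockID: "block-2", Agreement: 0.66}, // illustrative value
		},
	}
	fmt.Println(DisputedBlocks(report))
}
```

A researcher tracking an article over time could run the same filter on each epoch's report and compare the resulting block lists.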


14. Implementation Strategy

14.1 Technology Stack

The implementation uses Go with libp2p for peer-to-peer networking, Protocol Buffers for wire format, SHA3-256 for all hashing, SLH-DSA (SPHINCS+) for long-term identity keys, ML-DSA (Dilithium) for operational signing, and VRF for committee selection and branch assignment. The primary transport is QUIC with TCP fallback and WebSocket support for browser-based light nodes.

14.2 Development Phases

Phase 1 is protocol specification: formalizing message types, handshake protocol, state transition rules, and vote processing logic as a versioned specification document.

Phase 2 is a single-node prototype: implementing the core data model, merkle tree, vote processing, and deterministic state transition engine as a standalone application. Property-based testing validates determinism (same votes in, same state out, every time).

Phase 3 is a multi-node testnet: deploying multiple nodes simulating authoritative, mirror, and light node roles. This phase validates P2P propagation, committee formation, vote privacy, and cross-node verification.

Phase 4 is an institutional pilot: onboarding 5-15 institutional participants for a limited-scope pilot covering a defined set of articles.

Phase 5 is public launch: opening mirror and light node participation, publishing the protocol specification and node software as open source under AGPL-3.0.

14.3 Bootstrapping

The cold-start problem (institutions won't join until the system has legitimacy, and the system has no legitimacy without institutions) is addressed through a small founding set of institutional participants recruited through direct outreach, a bootstrap phase with relaxed quorum requirements (simple majority instead of 2/3), and a protocol that functions with as few as 10-20 founding institutions.
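The difference between the bootstrap and steady-state thresholds can be sketched as follows. The function name is hypothetical, and two interpretations are assumed: the threshold is applied to approvals out of the full committee, and "simple majority" means strictly more than half.

```go
package main

import "fmt"

// Passes reports whether a proposal's approvals meet the decision threshold.
// Steady state: at least 2/3 of the committee.
// Bootstrap phase: simple majority (strictly more than half).
func Passes(approvals, committeeSize int, bootstrap bool) bool {
	if committeeSize <= 0 {
		return false
	}
	if bootstrap {
		return approvals*2 > committeeSize
	}
	return approvals*3 >= committeeSize*2
}

func main() {
	fmt.Println(Passes(8, 15, true))   // 8 of 15 during bootstrap
	fmt.Println(Passes(8, 15, false))  // 8 of 15 in steady state
	fmt.Println(Passes(10, 15, false)) // 10 of 15 in steady state
}
```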

The article database starts empty. There is no seed content. Every article is proposed and voted through the standard pipeline from day one, giving every piece of content a complete audit trail from its creation.


15. Future Work

Several areas are deferred from the initial protocol version.

Domain-weighted voting, where an institution's vote on a physics article carries more weight if the institution has a strong physics department, is the most significant deferred feature. The first version uses equal weights for all institutional votes. The data model already supports domain tagging and is designed to accommodate this extension, but calibrating the weighting formula (which public signals to use, how to prevent gaming) requires careful analysis that should not block the initial deployment.

Other future work includes formal game-theoretic analysis of the multi-branch voting system, AI-assisted review for automated detection of vandalism and plagiarism before proposals reach human reviewers, multilingual coordination mechanisms for cross-language article equivalence, formal verification of the deterministic state transition function, optimized mobile light node implementations, and a public API for programmatic access to convergence and divergence data.


Appendix A: Glossary

Branch is an independently governed canonical version of the knowledge base. The initial configuration is three branches.

Convergence is the degree to which multiple branches agree on an article's content, measured at the block level.

Vote window is a defined time period during which committee members can cast votes on proposals. Windows are tiered: fast (2h), standard (48h), and extended (7d).

Epoch is the fundamental time unit of the protocol (default 1 week). Branch assignments, committee selection, and reputation calculations are anchored to epochs.

State root is the SHA3-256 merkle root hash representing the complete state of a branch.

Committee is the randomly selected subset of authoritative nodes (15-30 members) that votes on a specific proposal, chosen via VRF.

Quorum is the minimum proportion of committee votes required for a valid decision (2/3 for standard proposals, 3/4 for onboarding, 80% of all nodes for governance).

Cooldown is the mandatory waiting period (3 epochs) before an institution can participate in committees on a newly assigned branch.

Passive branch is a branch that has fallen below minimum participation and no longer accepts edits until rebalancing restores adequate institutional coverage.


This document is a living specification. Contributions, critiques, and revisions are welcome.
