v0.1: Migrate to stateless MSK-derived TEE key architecture #9

@hanwencheng

Summary

Replace the current per-user random wallet key storage in the Heima TEE worker with a Master Secret Key (MSK) derivation model, where all user wallet keys are deterministically derived from a single TEE-held MSK + user identity. This eliminates per-user key blobs from TEE sealed storage, prevents duplicate key material, and enables seamless MSK rotation without on-chain state changes.

Target: after Stage 8 production hardening is complete, as part of the v0.1 Heima migration.

Full design notes: wiki/blockchain-tee-architecture.md

Problem

The current Heima TEE worker (tee-worker/omni-executor) stores per-user custodial wallet private keys as individually generated, independently sealed blobs:

Current: N users → N sealed key blobs in TEE storage
  user_1_privkey → sealed blob
  user_2_privkey → sealed blob
  ...
  user_N_privkey → sealed blob

This creates:

  • N copies of sensitive key material in sealed storage (attack surface scales linearly with users)
  • Complex recovery — must restore all N sealed blobs from backup; if backup is stale, keys are lost
  • Complex scaling — each TEE instance needs all N blobs or a sharding scheme
  • Complex migration — TEE hardware upgrade requires copying all N blobs
  • Costly key rotation: there is no MSK to rotate, so keys can only be rotated individually, and each rotation requires a per-user chain update

Proposed design

MSK-derived keys

Proposed: 1 MSK → derive any user key on demand

TEE sealed storage:
  MSK (one value, ~32 bytes)

User key derivation (on demand, not stored):
  user_privkey = KDF(MSK, H(identity_info))
  user_pubkey  = user_privkey × G

Child key derivation (soft derivation, publicly verifiable):
  child_pubkey = soft_derive(user_pubkey, "/agent-alias/generation")
  child_privkey = soft_derive(user_privkey, "/agent-alias/generation")
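
The user-key step above can be sketched as follows. This is a minimal illustration, not the Heima implementation: the issue does not name a KDF or hash, so HKDF (RFC 5869) with HMAC-SHA256 and SHA-256 for H are assumptions, as are the function names and the `user-wallet-key/v1` context label. A real implementation would also have to map the 32-byte output to a valid scalar for the chosen curve.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with HMAC-SHA256: extract a pseudorandom key, then expand it."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(msk: bytes, identity_info: bytes) -> bytes:
    """user_privkey = KDF(MSK, H(identity_info)); nothing is stored per user."""
    identity_hash = hashlib.sha256(identity_info).digest()  # H(identity_info), assumed SHA-256
    # Hypothetical context label, used for domain separation.
    info = b"user-wallet-key/v1" + identity_hash
    return hkdf_sha256(ikm=msk, salt=bytes(32), info=info)
```

Calling `derive_user_key` twice with the same MSK and identity returns the same 32 bytes, which is what allows the TEE to discard the key after each operation and recompute it on the next one.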

Properties

  1. Single key storage. One MSK instead of N per-user blobs. The user's private key exists only in TEE memory during the operation, then is discarded. No sealed per-user blobs. No duplicates. Exfiltration surface = 1 value, not N.

  2. Seamless MSK rotation. User-facing OmniAccount addresses are identity-derived (OmniAccountConverter::convert(&identity, &client_id)), NOT key-derived. So when MSK rotates:

    • Derived wallet keys change (inside the TEE)
    • OmniAccount addresses do NOT change (identity-derived)
    • Credential blobs do NOT change (encrypted to shielding key, not wallet key)
    • Audit events do NOT change (reference addresses, not pubkeys)
    • Zero on-chain state changes needed

  3. Public keys NOT stored on chain. The TEE derives public keys on demand and includes them transiently in extrinsics; they are not persisted as on-chain state. This is what makes MSK rotation seamless: there are no stored pubkeys to update.

  4. Soft derivation is safe in TEE-only custody. With any additive soft derivation scheme, a leaked child private key combined with the public derivation data reveals the parent private key. That risk is neutralized here because all private keys live inside the TEE: the only way to compromise a child key is to compromise the TEE, which also exposes the MSK. The child→parent attack is therefore a subset of a strictly worse compromise.

  5. TEE partitioning. Different MSKs for different jurisdictions or use cases:

    • TEE-China (MSK_china, paymaster-sponsored)
    • TEE-Global (MSK_global, self-pay)
    • TEE-Enterprise (MSK_enterprise, custom billing)

     All partitions share the same chain. Users are cryptographically isolated between partitions.

  6. Public key verifiability (external, optional). Public key → identity hash relationships CAN be stored externally (not on chain) for third-party verification if needed. The derivation is deterministic: given user_pubkey + path, anyone can recompute child_pubkey via soft derivation, and the identity holder can show that a given identity_hash corresponds to their identity. This is a future requirement, not a v0.1 blocker.
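
Properties 4 and 6 can be illustrated with a small additive derivation over secp256k1. This is a sketch under stated assumptions: the issue does not specify the curve or the soft derivation scheme, so the additive tweak `H(parent_pub || path)` and all function names below are hypothetical stand-ins chosen to show the algebra, not the scheme the TEE worker uses.

```python
import hashlib

# secp256k1 domain parameters (field prime, group order, generator)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mul(k, point):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def tweak(parent_pub, path: str) -> int:
    """Public tweak: depends only on the parent public key and the path."""
    x, y = parent_pub
    encoded = bytes([2 + (y & 1)]) + x.to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(encoded + path.encode()).digest(), "big") % N


def soft_derive_priv(parent_priv: int, path: str) -> int:
    return (parent_priv + tweak(scalar_mul(parent_priv, G), path)) % N


def soft_derive_pub(parent_pub, path: str):
    # Needs no private key, so any third party can run it.
    return point_add(parent_pub, scalar_mul(tweak(parent_pub, path), G))


def recover_parent_priv(child_priv: int, parent_pub, path: str) -> int:
    # The child-compromise-reveals-parent risk: subtract the public tweak.
    return (child_priv - tweak(parent_pub, path)) % N
```

In this sketch, `soft_derive_pub(user_pubkey, path)` yields the same point as multiplying the derived child private key by G, which is the public verifiability in property 6. `recover_parent_priv` shows why a leaked child private key would be fatal outside the TEE, which is the risk property 4 argues is contained.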

Design decisions from the analysis

| Decision | Rationale |
| --- | --- |
| Unpair disabled | The key relationship is a mathematical derivation and can't be "undone." Access control is handled via TEE-side suspend (see issue #7). |
| Path recycling disabled | Reusing a path for a different agent produces the same key, leaks old credentials, and breaks recovery. |
| Generation suffix for key rotation | /agent-alias/0, /agent-alias/1, etc. Monotonically increasing integer. See issue #8. |
| No public keys on chain | Keeps the chain lean. Enables seamless MSK rotation. Public verification is available externally if needed. |
| On-chain suspend for revocation | One suspend event per revoked child path. The only per-child chain state. See issue #7. |
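
The path rules in the table can be sketched as a small registry. This is a hypothetical illustration: the class and method names are invented here, and whether an older generation stays usable until it is explicitly suspended is an assumption, since that behaviour is left to issues #7 and #8.

```python
class ChildPathRegistry:
    """Tracks child paths of the form /<alias>/<generation>."""

    def __init__(self):
        self._generation = {}    # alias -> current generation number
        self._issued = set()     # every path ever handed out
        self._suspended = set()  # paths revoked via suspend

    def register(self, alias: str) -> str:
        # Path recycling disabled: an alias can never be registered twice.
        if alias in self._generation:
            raise ValueError(f"alias {alias!r} already used; paths are never recycled")
        self._generation[alias] = 0
        return self._issue(alias)

    def rotate(self, alias: str) -> str:
        # Key rotation bumps the generation suffix; it only ever increases.
        self._generation[alias] += 1
        return self._issue(alias)

    def suspend(self, path: str) -> None:
        # Revocation is a suspend on one path; there is no unpair or delete.
        if path not in self._issued:
            raise KeyError(path)
        self._suspended.add(path)

    def is_active(self, path: str) -> bool:
        return path in self._issued and path not in self._suspended

    def _issue(self, alias: str) -> str:
        path = f"/{alias}/{self._generation[alias]}"
        self._issued.add(path)
        return path
```

The registry deliberately has no method that removes an alias or lowers a generation, mirroring the "unpair disabled" and "path recycling disabled" decisions.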

Deliverables

TEE worker modifications

  • Add MSK generation and sealed storage (replace per-user key generation)
  • Implement KDF(MSK, H(identity_info)) derivation for user wallet keys
  • Implement soft derivation for child keys at paths with generation suffix
  • Remove per-user sealed key blob storage
  • Add on-demand key derivation in the credential read/sign paths
  • Ensure derived keys are wiped from memory after use (zeroize)
  • MSK rotation procedure: generate new MSK, seal, begin deriving from new MSK
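
The on-demand derivation, wipe-after-use, and rotation items above could fit together roughly as follows. This is a sketch with several assumptions: `MskKeystore`, `wipe`, and `user_key` are hypothetical names, HMAC-SHA256 stands in for the unspecified KDF, sealing is omitted, and overwriting a `bytearray` in Python is only a best-effort analogue of zeroize in the Rust worker.

```python
import hashlib
import hmac
import secrets
from contextlib import contextmanager


def wipe(buf: bytearray) -> None:
    """Best-effort overwrite of key material in place."""
    for i in range(len(buf)):
        buf[i] = 0


class MskKeystore:
    """Holds a single MSK; user keys are derived on demand and never stored."""

    def __init__(self):
        self._msk = bytearray(secrets.token_bytes(32))
        self.version = 1

    def rotate(self) -> None:
        # Generate a new MSK, switch to it, then wipe the old one.
        old, self._msk = self._msk, bytearray(secrets.token_bytes(32))
        wipe(old)
        self.version += 1

    @contextmanager
    def user_key(self, identity_info: bytes):
        identity_hash = hashlib.sha256(identity_info).digest()
        # HMAC-SHA256 used as a stand-in PRF for KDF(MSK, H(identity_info)).
        key = bytearray(hmac.new(self._msk, identity_hash, hashlib.sha256).digest())
        try:
            yield key  # caller signs or decrypts with the key here
        finally:
            wipe(key)  # the key is zeroed as soon as the operation ends
```

A caller would write `with keystore.user_key(identity) as key: ...` around a sign or credential read. After `rotate()`, the same identity yields a different key, while anything derived from the identity alone is unaffected.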

Chain / pallet modifications

  • Remove any pallet state that stores user public keys (if any exists)
  • Add current_generation: u32 per child path for key rotation tracking
  • Ensure OmniAccount addresses remain identity-derived (no dependency on wallet pubkey)

Migration from current model

  • Re-derive all existing user wallet keys from MSK + their identity info
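  • Verify that user-facing addresses are unchanged after the switch (OmniAccount addresses are identity-derived); MSK-derived wallet keys will differ from the old random keys, so migrate any address that depends on a wallet pubkey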
  • Remove old sealed key blobs after successful migration verification

Documentation

  • Update wiki/blockchain-tee-architecture.md with finalized MSK architecture
  • Update docs/spec/plans/development-stages.md Stage 9 with implementation plan
  • Document MSK rotation procedure for operators

Sequencing

After Stage 8 production hardening, before or during v0.1 Heima migration.

This is a TEE architecture change, not a product feature. It should ship as part of the Heima TEE worker integration work, not as a standalone milestone. The v0 mock backend is unaffected (it doesn't have a TEE).

Dependencies:

  • Stage 8 (production hardening) should be complete first — the MSK work modifies the same TEE key management code that Stage 8's zeroize/memfd_secret work touches
  • Heima TEE worker must be accessible for modification (blocked on Kai coordination)

Labels: enhancement (New feature or request), v0.1 (development plan for blockchain backend integration)