v0.1: TEE-side per-session read rate limit (abuse defense) #4

@hanwencheng

Description

Summary

Add a per-session credential-read rate limit enforced by the TEE (v0.1) or daemon (v0 mock), with a configurable cap and a clear rate-limit error path. This is a general abuse defense that is independently valuable, and is a prerequisite for the Pattern 4 audit submission design (issue #) because Pattern 4 relies on a paymaster-funded audit flow that is vulnerable to unbounded request volume without an upstream rate limiter.

Why this is needed

Right now, there is nothing stopping a buggy or abusive agent from calling agentkeys.get_credential in a tight loop, thousands of times per second. In v0 this drains the backend SQLite; in v0.1 this drains the paymaster that subsidizes audit extrinsics and creates audit log spam that makes real compromise patterns impossible to spot. Neither is acceptable.

Putting the rate limit at the credential-read layer (not at the audit-submission layer) defends everything downstream simultaneously: if you can't do 10,000 reads/second, you can't cause 10,000 audit submissions/second, you can't exfiltrate credentials 10,000 times/second, and you can't drain the paymaster 10,000 fees/second.

This is also useful regardless of which audit submission pattern ships. Even under the current cold-first-read plan, rate limiting is a general DoS defense and belongs in Stage 8.

Design

Policy

  • Default: 100 reads / minute / session.
  • Configurable per-session at creation. The session-creation extrinsic takes an optional read_rate_limit: Option<u32> field. If unset, default applies. If set, must be ≤ a hard cap (e.g., 10,000/min) to prevent abuse of the config itself.
  • Token bucket algorithm. Each session gets a bucket of capacity rate_limit, refilled linearly at rate_limit / 60 tokens per second. Each read_credential consumes one token. Bucket starts full.
  • Excess reads return a structured error that agents can handle: `{ "code": "rate_limit_exceeded", "retry_after_secs": <integer> }`. The retry_after_secs field tells the agent when the next token will be available.
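The token-bucket policy above can be sketched as follows. This is a minimal illustration, not the final API: the struct name `TokenBucket` comes from the deliverables list, but the field and method names here are assumptions.

```rust
use std::time::Instant;

/// Per-session token bucket. Starts full, refills linearly at
/// rate_limit / 60 tokens per second, one token per credential read.
pub struct TokenBucket {
    capacity: f64,       // rate_limit, in tokens
    tokens: f64,         // current balance
    refill_per_sec: f64, // rate_limit / 60
    last_refill: Instant,
}

impl TokenBucket {
    pub fn new(rate_limit: u32) -> Self {
        TokenBucket {
            capacity: rate_limit as f64,
            tokens: rate_limit as f64, // bucket starts full
            refill_per_sec: rate_limit as f64 / 60.0,
            last_refill: Instant::now(),
        }
    }

    /// Credit tokens accrued since the last call, capped at capacity.
    fn refill(&mut self) {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Consume one token, or return the seconds until the next token
    /// is available (the retry_after_secs value in the error payload).
    pub fn try_consume(&mut self) -> Result<(), u64> {
        self.refill();
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            let deficit = 1.0 - self.tokens;
            Err((deficit / self.refill_per_sec).ceil() as u64)
        }
    }
}
```

Lazy refill (computing the credit on each call rather than running a timer) keeps the bucket a plain value type, which matters inside a TEE where background tasks are awkward.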

Where it lives

  • v0 (mock backend): rate limit lives in the mock backend's handlers::credential::read_credential path, stored in an in-memory HashMap<SessionToken, TokenBucket>. SQLite-backed persistence across server restarts is optional — in-memory is fine for the mock.
  • v0.1 (Heima TEE): rate limit lives in the TEE worker's credential-serving path, stored in TEE-internal state. Persistence across TEE restarts is desirable but not critical for the security property — a restart that resets buckets just gives each session one "free" burst, which is the same as session creation. The TEE already holds per-session state; adding a bucket counter is ~5 lines.
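For the v0 mock, the per-session bucket map might look like the sketch below. `SessionToken` is assumed to be a string-like type, and the `TokenBucket` here is reduced to a bare counter for brevity; the Mutex is needed because the server's handlers run concurrently.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

type SessionToken = String;

// Reduced placeholder for the full refilling bucket described under Policy.
struct TokenBucket {
    tokens: f64,
}

/// Hypothetical slice of the mock backend's SharedState: one bucket per
/// session, created lazily (and full) on a session's first read.
struct RateLimitState {
    buckets: Mutex<HashMap<SessionToken, TokenBucket>>,
}

impl RateLimitState {
    /// Returns true if the read is allowed; runs before any DB work.
    fn check_rate_limit(&self, session: &str, capacity: f64) -> bool {
        let mut buckets = self.buckets.lock().unwrap();
        let bucket = buckets
            .entry(session.to_string())
            .or_insert(TokenBucket { tokens: capacity });
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Keying the map by session token (not globally) is what makes the limit per-session: exhausting one session's bucket leaves every other session untouched.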

Telemetry

Every rate-limit rejection should emit an audit event — rate_limit_exceeded with session_id, service, timestamp, attempted_rate. These events go to the regular audit log path (wherever that is under the active pattern — batched, per-read, paymaster-relayed, etc.) so operators can spot abusive agents via the normal usage-query flow (agentkeys usage).

Override for legitimate high-volume use

Some workloads legitimately need >100 reads/minute (e.g., an agent that makes many parallel API calls). The session-creation read_rate_limit field lets the master configure a higher cap for specific agents at pair time. This is a conscious grant by the master, not a default behavior, so abuse requires both a compromised session and a compromised session-creation flow, raising the bar.
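Resolving the optional field against the default and the hard cap is a few lines. This sketch assumes the function name and error type; rejecting over-cap values outright (rather than silently clamping) gives the caller a clear signal, and zero is rejected because it would make the session unusable.

```rust
const DEFAULT_READ_RATE_LIMIT: u32 = 100; // reads / minute / session
const HARD_CAP_READ_RATE_LIMIT: u32 = 10_000; // upper bound on the config itself

/// Resolve the effective per-session limit from the optional
/// session-creation field. Name and error type are illustrative.
fn resolve_read_rate_limit(requested: Option<u32>) -> Result<u32, String> {
    match requested {
        None => Ok(DEFAULT_READ_RATE_LIMIT),
        Some(0) => Err("read_rate_limit must be at least 1".to_string()),
        Some(n) if n > HARD_CAP_READ_RATE_LIMIT => Err(format!(
            "read_rate_limit {} exceeds hard cap {}",
            n, HARD_CAP_READ_RATE_LIMIT
        )),
        Some(n) => Ok(n),
    }
}
```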

Deliverables

Mock backend (v0)

  • TokenBucket struct in crates/agentkeys-mock-server/src/state.rs with refill + consume methods
  • Per-session bucket stored in SharedState alongside the SQLite handle
  • Rate-limit check at the top of handlers::credential::read_credential before any DB work
  • Structured rate_limit_exceeded error variant in AppError
  • Session-creation extrinsic accepts optional read_rate_limit: Option<u32> field (default 100)
  • Audit log write for rate-limit rejections (emit as a rate_limit_exceeded event, not a successful read)
  • Tests:
    • credential::rate_limit_default_100_per_minute — 100 reads in quick succession succeed, 101st fails
    • credential::rate_limit_refills_linearly — after waiting 6s, 10 more reads succeed (100/min ÷ 60 ≈ 1.67 tokens/s, so 6s ≈ 10 tokens)
    • credential::rate_limit_per_session_not_global — two sessions each get their own bucket, neither affects the other
    • credential::rate_limit_configurable_at_creation — creating a session with read_rate_limit: 500 allows 500 reads in a burst
    • credential::rate_limit_emits_audit_event — rejected read writes an audit row with action rate_limit_exceeded

CLI

  • agentkeys read and agentkeys run surface a clear error message when rate-limited: `"Error: RATE_LIMIT. Session 0x... has exceeded 100 reads/minute. Retry after <retry_after_secs> seconds."`
  • `agentkeys run` specifically should treat the rate-limit error as retryable (wait `retry_after_secs`, then retry) up to 3 attempts before giving up, since agents running long tasks can legitimately hit temporary bursts.
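The retry behavior for `agentkeys run` could look like the following. `ReadError` and the `read_once` closure are stand-ins for the real client types; the key points are that only the rate-limit variant is retryable, the wait comes from the server's retry_after_secs, and the loop gives up after a bounded number of attempts.

```rust
use std::thread;
use std::time::Duration;

/// Illustrative error type for a credential read.
enum ReadError {
    RateLimited { retry_after_secs: u64 },
    Other(String),
}

/// Retry a read on rate-limit errors, waiting as long as the server
/// says, up to `max_attempts` total attempts. Other errors fail fast.
fn read_with_retry<F>(mut read_once: F, max_attempts: u32) -> Result<String, String>
where
    F: FnMut() -> Result<String, ReadError>,
{
    for attempt in 1..=max_attempts {
        match read_once() {
            Ok(value) => return Ok(value),
            Err(ReadError::RateLimited { retry_after_secs }) if attempt < max_attempts => {
                // Back off exactly as long as the server indicated.
                thread::sleep(Duration::from_secs(retry_after_secs));
            }
            Err(ReadError::RateLimited { .. }) => {
                return Err("rate limit exceeded after retries".to_string());
            }
            Err(ReadError::Other(msg)) => return Err(msg),
        }
    }
    Err("no attempts made".to_string())
}
```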

Daemon (v0)

  • Daemon proxies rate-limit errors from the backend to the MCP client without modification.
  • Daemon-internal audit log records the rate-limit event even though the backend also records it (redundancy for detection).

Documentation

  • Update `docs/manual-test-stage4.md` with a rate-limit verification test
  • Update `wiki/key-security.md` with a brief note that rate limiting is part of the security story (abuse defense layer)

Acceptance criteria

  • All mock-backend tests pass
  • A burst of 101 reads from one session within a minute results in 100 successes and 1 rejection
  • Two sessions with default rate limit do not affect each other
  • `agentkeys run` tolerates a rate-limit error and retries
  • `agentkeys usage` surfaces rate-limit events as distinguishable rows in the audit output

Effort estimate

1-2 days. Small, self-contained, well-bounded.

Priority

Must-have for v0.1 because it gates Pattern 4 (sponsored audit submission). Should-have for v0 because it closes an obvious abuse vector in the mock backend. Recommended to slot into Stage 8 as part of production hardening.
