Summary
Add a per-session credential-read rate limit enforced by the TEE (v0.1) or daemon (v0 mock), with a configurable cap and a clear rate-limit error path. This is a general abuse defense that is independently valuable, and is a prerequisite for the Pattern 4 audit submission design (issue #) because Pattern 4 relies on a paymaster-funded audit flow that is vulnerable to unbounded request volume without an upstream rate limiter.
Why this is needed
Right now, nothing stops a buggy or abusive agent from calling `agentkeys.get_credential` in a tight loop, thousands of times per second. In v0 this hammers the backend SQLite store; in v0.1 it drains the paymaster that subsidizes audit extrinsics and creates audit-log spam that makes real compromise patterns impossible to spot. Neither is acceptable.
Putting the rate limit at the credential-read layer (not at the audit-submission layer) defends everything downstream simultaneously: if you can't do 10,000 reads/second, you can't cause 10,000 audit submissions/second, you can't exfiltrate credentials 10,000 times/second, and you can't drain the paymaster 10,000 fees/second.
This is also useful regardless of which audit submission pattern ships. Even under the current cold-first-read plan, rate limiting is a general DoS defense and belongs in Stage 8.
Design
Policy
- Default: 100 reads / minute / session.
- Configurable per-session at creation. The session-creation extrinsic takes an optional `read_rate_limit: Option<u32>` field. If unset, the default applies. If set, it must be ≤ a hard cap (e.g., 10,000/min) to prevent abuse of the config itself.
- Token bucket algorithm. Each session gets a bucket of capacity `rate_limit`, refilled linearly at `rate_limit / 60` tokens per second. Each `read_credential` consumes one token. The bucket starts full.
- Excess reads return a structured error that agents can handle: `{"code": "rate_limit_exceeded", "retry_after_secs": <integer>}`. The `retry_after_secs` field tells the agent when the next token will be available.
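A minimal Rust sketch of this policy; the struct name and method signatures are illustrative, not the actual crate layout:

```rust
use std::time::Instant;

/// Token bucket: capacity = rate_limit, refilled at rate_limit / 60 tokens/sec.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    pub fn new(rate_limit_per_min: u32) -> Self {
        let capacity = rate_limit_per_min as f64;
        TokenBucket {
            capacity,
            tokens: capacity, // bucket starts full
            refill_per_sec: capacity / 60.0,
            last_refill: Instant::now(),
        }
    }

    /// Try to consume one token. On failure, returns Err(retry_after_secs).
    pub fn try_consume(&mut self) -> Result<(), u64> {
        // Lazy linear refill: add tokens for the time elapsed since last call.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            // Seconds until one full token is available, rounded up.
            let deficit = 1.0 - self.tokens;
            Err((deficit / self.refill_per_sec).ceil() as u64)
        }
    }
}
```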
Where it lives
- v0 (mock backend): the rate limit lives in the mock backend's `handlers::credential::read_credential` path, stored in an in-memory `HashMap<SessionToken, TokenBucket>`. SQLite-backed persistence across server restarts is optional — in-memory is fine for the mock.
- v0.1 (Heima TEE): rate limit lives in the TEE worker's credential-serving path, stored in TEE-internal state. Persistence across TEE restarts is desirable but not critical for the security property — a restart that resets buckets just gives each session one "free" burst, which is the same as session creation. The TEE already holds per-session state; adding a bucket counter is ~5 lines.
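A sketch of the per-session map for the mock backend. For brevity the bucket here is a plain countdown with no refill (the real bucket refills linearly per the Policy section), and `SessionToken`-as-`String` is an assumption:

```rust
use std::collections::HashMap;

type SessionToken = String;

struct Bucket {
    remaining: u32,
}

/// In-memory per-session buckets, consulted before any DB work.
#[derive(Default)]
struct RateLimiter {
    buckets: HashMap<SessionToken, Bucket>,
}

impl RateLimiter {
    /// Returns true if this read is allowed, false if rate-limited.
    fn allow(&mut self, session: &str, rate_limit: u32) -> bool {
        let bucket = self
            .buckets
            .entry(session.to_string())
            .or_insert(Bucket { remaining: rate_limit });
        if bucket.remaining > 0 {
            bucket.remaining -= 1;
            true
        } else {
            false
        }
    }
}
```

Keying the map by session token gives each session its own bucket, so one abusive agent cannot starve another.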
Telemetry
Every rate-limit rejection should emit an audit event — `rate_limit_exceeded` with `session_id`, `service`, `timestamp`, `attempted_rate`. These events go to the regular audit log path (wherever that is under the active pattern — batched, per-read, paymaster-relayed, etc.) so operators can spot abusive agents via the normal usage-query flow (`agentkeys usage`).
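As an illustration, the event could serialize to a flat JSON object like this (built with std only here; a real implementation would likely use serde — only the field names come from the list above):

```rust
/// Build the hypothetical rate_limit_exceeded audit event as a JSON string.
fn rate_limit_event(session_id: &str, service: &str, timestamp: u64, attempted_rate: u32) -> String {
    format!(
        "{{\"action\":\"rate_limit_exceeded\",\"session_id\":\"{}\",\"service\":\"{}\",\"timestamp\":{},\"attempted_rate\":{}}}",
        session_id, service, timestamp, attempted_rate
    )
}
```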
Override for legitimate high-volume use
Some workloads legitimately need more than 100 reads/minute (e.g., an agent that makes many parallel API calls). The session-creation `read_rate_limit` field lets the master configure a higher cap for specific agents at pair time. This is a conscious grant by the master, not a default behavior, so abuse requires both a compromised session and a compromised session-creation flow, raising the bar.
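The hard-cap validation at session creation could look like this (constant names and the string error type are assumptions for the sketch):

```rust
const DEFAULT_RATE_LIMIT: u32 = 100;
const HARD_CAP: u32 = 10_000;

/// Resolve the effective per-session read rate limit at session creation.
/// Unset falls back to the default; a set value must be positive and ≤ the hard cap.
fn resolve_rate_limit(requested: Option<u32>) -> Result<u32, String> {
    match requested {
        None => Ok(DEFAULT_RATE_LIMIT),
        Some(0) => Err("read_rate_limit must be positive".to_string()),
        Some(r) if r > HARD_CAP => Err(format!(
            "read_rate_limit {} exceeds hard cap {}",
            r, HARD_CAP
        )),
        Some(r) => Ok(r),
    }
}
```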
Deliverables
Mock backend (v0)
- `TokenBucket` struct in `crates/agentkeys-mock-server/src/state.rs` with refill + consume methods
- Per-session bucket map stored in `SharedState` alongside the SQLite handle
- Rate-limit check in `handlers::credential::read_credential` before any DB work
- New `rate_limit_exceeded` error variant in `AppError`
- Session-creation request accepts the `read_rate_limit: Option<u32>` field (default 100)
- A rejected read is recorded as a `rate_limit_exceeded` event, not a successful read
- Tests:
  - `credential::rate_limit_default_100_per_minute` — 100 reads in quick succession succeed, the 101st fails
  - `credential::rate_limit_refills_linearly` — after waiting 6s, 10 more reads succeed (100/min ÷ 60 = ~1.66/s, 6s ≈ 10 tokens)
  - `credential::rate_limit_per_session_not_global` — two sessions each get their own bucket; neither affects the other
  - `credential::rate_limit_configurable_at_creation` — creating a session with `read_rate_limit: 500` allows 500 reads in a burst
  - `credential::rate_limit_emits_audit_event` — a rejected read writes an audit row with action `rate_limit_exceeded`
CLI
- `agentkeys read` and `agentkeys run` surface a clear error message when rate-limited: `"Error: RATE_LIMIT. Session 0x... has exceeded 100 reads/minute. Retry after <retry_after_secs> seconds."`
Daemon (v0)
Documentation
Acceptance criteria
- All mock-backend tests pass
- A burst of 101 reads from one session within a minute results in 100 successes and 1 rejection
- Two sessions with default rate limit do not affect each other
- `agentkeys run` tolerates a rate-limit error and retries
- `agentkeys usage` surfaces rate-limit events as distinguishable rows in the audit output
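The retry behavior above could be handled on the client side roughly as follows (a hedged sketch: `ReadError` and the closure-based RPC stand-in are hypothetical, not the real CLI internals):

```rust
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ReadError {
    RateLimited { retry_after_secs: u64 },
    Other(String),
}

/// Call the credential-read operation, sleeping for the server-provided
/// retry_after_secs hint between rate-limited attempts.
fn read_with_retry<F>(mut read_credential: F, max_attempts: u32) -> Result<String, ReadError>
where
    F: FnMut() -> Result<String, ReadError>,
{
    let mut last_err = ReadError::Other("no attempts made".to_string());
    for _ in 0..max_attempts {
        match read_credential() {
            Ok(secret) => return Ok(secret),
            Err(ReadError::RateLimited { retry_after_secs }) => {
                // Honor the server's hint before retrying.
                thread::sleep(Duration::from_secs(retry_after_secs));
                last_err = ReadError::RateLimited { retry_after_secs };
            }
            Err(e) => return Err(e), // non-rate-limit errors are not retried
        }
    }
    Err(last_err)
}
```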
Effort estimate
1-2 days. Small, self-contained, well-bounded.
Priority
Must-have for v0.1 because it gates Pattern 4 (sponsored audit submission). Should-have for v0 because it closes an obvious abuse vector in the mock backend. Recommended to slot into Stage 8 as part of production hardening.
References