Keep live KV reusable when clients strip transient metadata blocks #378

Open
adv0r wants to merge 1 commit into antirez:main from adv0r:tb-fix-transient-kv-mismatch

@adv0r adv0r commented Jun 10, 2026

Dear maintainer: this is an AI-authored PR by Claude Fable 5, submitted under @adv0r. Methodology and opt-out are at tokens-for-good.
A one-line "no thanks" triggers an automatic apology, an automatic close, and a permanent blacklist. A silent close is treated the same way. Your time matters more than this contribution.

What: keep the live KV checkpoint reusable when clients strip transient metadata blocks (<environment_details>, <system-reminder>) from historical user messages.
Why: those blocks are fossilized into the live KV, so every user turn token-mismatches at the stripped span and pays a full reprefill. Related to #364.
Verified: make and ./ds4_test --server on Apple Silicon / Metal; three new unit tests cover the stripping helper, the remember gate, and the stripped-key-is-a-byte-prefix-of-the-next-render invariant.
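
To make the stripping step concrete, here is a minimal sketch of a helper that drops the two known block types from a message. It is not the PR's code: the names `sketch_tags`, `transient_span`, and `strip_transient` are hypothetical, and it assumes blocks are written as plain `<tag>...</tag>` pairs with no attributes and no nesting. Whether the real helper also trims surrounding whitespace is not stated above, so this sketch leaves it alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tag table. The PR mentions a static two-entry array
 * (transient_block_tags); its actual layout is not shown, so this
 * is only an assumed shape. */
static const char *const sketch_tags[] = {
    "environment_details",
    "system-reminder",
};
#define SKETCH_TAG_COUNT (sizeof(sketch_tags) / sizeof(sketch_tags[0]))

/* If `p` starts with "<tag>" for a known tag, return the number of bytes
 * up to and including the matching "</tag>". Return 0 if no known block
 * starts here, or if the block is never closed. */
static size_t transient_span(const char *p) {
    if (*p != '<') return 0;
    for (size_t i = 0; i < SKETCH_TAG_COUNT; i++) {
        const char *tag = sketch_tags[i];
        size_t n = strlen(tag);
        if (strncmp(p + 1, tag, n) != 0 || p[1 + n] != '>') continue;
        const char *q = p + n + 2;              /* first byte after "<tag>" */
        while ((q = strstr(q, "</")) != NULL) {
            if (strncmp(q + 2, tag, n) == 0 && q[2 + n] == '>')
                return (size_t)(q + n + 3 - p); /* include "</tag>" */
            q += 2;
        }
        return 0;                               /* unclosed: leave text as-is */
    }
    return 0;
}

/* Copy `in` to `out` with transient blocks removed. Returns the output
 * length, or (size_t)-1 if `out` (capacity `cap`, NUL included) is too small. */
static size_t strip_transient(const char *in, char *out, size_t cap) {
    size_t o = 0;
    while (*in) {
        size_t skip = transient_span(in);
        if (skip) { in += skip; continue; }
        if (o + 1 >= cap) return (size_t)-1;
        out[o++] = *in++;
    }
    if (cap == 0) return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

Unknown tags and unclosed blocks pass through unchanged. That is the conservative choice for a cache key: a key that strips too little only costs a reprefill, while one that strips too much could match a prompt it should not.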

This is the same shape as hidden thinking — live state richer than the visible replay — so the fix reuses the existing visible-key continuation instead of adding a new mechanism: after a finished turn (final answer or tool call), remember the transcript the next request is expected to render, with transient spans stripped, keyed to the live frontier via thinking_live. The next turn then continues from live KV and tokenizes only the new suffix. No KV rewrite, no change to what the model sees, and clients that replay the blocks verbatim never match the key and keep exact token-prefix matching. The stripped key also flows into the disk-cache text key through kv_cache_store_current(), so recovery after restart aligns too.
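
The continuation check described above can be sketched as a byte-prefix test against the remembered key. This is an illustration under assumptions, not the PR's implementation: `visible_key`, its fields, and `continuation_suffix` are hypothetical names, and the real code keys the entry to the live frontier via `thinking_live`, which is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record remembered after a finished turn. */
typedef struct {
    const char *key;    /* transcript the next request is expected to render,
                           with transient spans already stripped */
    size_t key_len;     /* length of `key` in bytes */
    size_t live_tokens; /* assumed: tokens already held in live KV */
} visible_key;

/* If the rendered prompt begins with the remembered key, return a pointer
 * to the new suffix, which is the only part that still needs tokenizing.
 * Return NULL otherwise, so the caller falls back to exact token-prefix
 * matching. */
static const char *continuation_suffix(const visible_key *vk,
                                       const char *prompt,
                                       size_t prompt_len) {
    if (vk->key == NULL || prompt_len < vk->key_len) return NULL;
    if (memcmp(prompt, vk->key, vk->key_len) != 0) return NULL;
    return prompt + vk->key_len;
}
```

A client that replays the blocks verbatim produces a prompt that diverges from the stripped key at the first block, so the check returns NULL and that client keeps the existing behavior.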

The known tags live in a static two-entry array (transient_block_tags); happy to wire a flag or rework the approach if you prefer.

adv0r commented Jun 10, 2026

Author

In this other PR I pointed out that maybe Fable doesn't work with LLM-dev tasks, and there is no way to find out. Heads up.
