perf(toolpath-desktop): perf tracer + buildTree memo (net effect: faster preview open) #54
Merged
eliothedeman merged 4 commits into main on Apr 23, 2026
Clicking a session in Quick View or the Browse "Select →" button used to run derive synchronously on the UI path, producing a noticeable pause. This adds an in-memory + on-disk cache (`src/cache.rs`, `TraceCache`) that the tray poller warms after every 30s scan for each recent claude/pi session. Both the popover's `tray_open_trace` and the main-window `derive_claude` / `derive_pi` IPC commands route through the same cache via `derive_claude_impl` / `derive_pi_impl`, so cached hits short-circuit before any derive work.

- Memory tier: `HashMap`, 32-entry LRU, rejected when a warmer is already in flight for the same key.
- Disk tier: `<temp_dir>/toolpath-desktop/trace-cache/<fnv1a64>.json`, atomic writes (.tmp + rename), 200-entry cap pruned oldest-first on startup, corrupt files deleted on read.
- Freshness: keyed on the source session's `last_activity`. Warmer passes overwrite stale entries; user-initiated derives backfill with an empty timestamp and get replaced on the next poll.

Tests rise from 17 to 32 unit tests (new cache-tier tests, cache-hit short-circuits for both providers, prewarm provider routing).
To isolate where perceived click latency actually lives (Rust derive vs. Svelte/dagre render), the store and Preview now emit perf marks at every checkpoint in the flow:

dispatch → invoke-start → invoke-end → model-updated → preview-mounted → viz-rendered (or dom-painted in chat mode)

The popover's `trace:opened` event path also starts its own trace so the overlay can show post-derive render time in isolation (Rust has already finished).

Every completed trace logs a phase-delta summary to the devtools console. To also show the phase-bar overlay in the bottom-right of the main window, set `localStorage.perf = "1"` and reload. Scope is read-only: no behavioural change to derive or caching.
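The checkpoint flow above can be sketched roughly as follows. This is a minimal illustration, not the code in `perf.svelte.ts`: the function names follow the PR, but the signatures, the `Mark` / `Delta` shapes, and the `setClock` hook are assumptions made for the example.

```typescript
// Sketch only: function names follow the PR (perfStart / perfMark / perfEnd);
// signatures, the Mark/Delta shapes, and setClock are assumptions.
type Mark = { phase: string; t: number };
type Delta = { from: string; to: string; ms: number };

let marks: Mark[] | null = null;
let now: () => number = () => performance.now();

// Hypothetical test hook: swap the clock for a deterministic one.
function setClock(fn: () => number): void {
  now = fn;
}

// Begin a trace; the first checkpoint is named by the caller (e.g. "dispatch").
function perfStart(phase: string): void {
  marks = [{ phase, t: now() }];
}

// Record an intermediate checkpoint; a no-op when no trace is active.
function perfMark(phase: string): void {
  marks?.push({ phase, t: now() });
}

// Record the final checkpoint, log the per-phase deltas, and reset.
function perfEnd(phase: string): Delta[] {
  if (!marks) return [];
  const all = marks;
  all.push({ phase, t: now() });
  const deltas: Delta[] = [];
  for (let i = 1; i < all.length; i++) {
    const prev = all[i - 1];
    const cur = all[i];
    deltas.push({ from: prev.phase, to: cur.phase, ms: cur.t - prev.t });
  }
  console.log("[perf]", deltas.map((d) => `${d.from} → ${d.to}: ${d.ms}ms`).join(", "));
  marks = null;
  return deltas;
}
```

Reporting deltas between adjacent marks, rather than absolute timestamps, is what makes it possible to attribute latency to one phase (for example `invoke-start → invoke-end` for the Rust derive) independently of the others.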
The perf tracer showed the real bottleneck sits between model-update and component-mount (~205ms of Svelte render work) rather than in the Rust derive (~80ms), so the two-tier cache added in the earlier commit was optimising the wrong thing. Stripping it removes a lot of complexity (cache.rs, disk persistence, prewarm threading, in-flight slots, LRU eviction) that wasn't buying the user anything.

Kept: the perf tracer, the overlay, the `buildTree` / `flattenChatHead` marks. These are what let us now see exactly where the remaining time goes.

Also fix `state_unsafe_mutation` thrown when `perfMark` is called from inside a `$derived` (which happens when ChatView's `turns` derivation runs `buildTree` + `flattenChatHead`): defer `perf.latest` writes to a microtask so mutation never happens during derivation.
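The microtask deferral can be illustrated in isolation. In this sketch a plain object stands in for the reactive `perf.latest` state, and `recordLatest` and `computeTurns` are hypothetical names; the only point shown is that the write lands after the synchronous computation has returned.

```typescript
// Sketch only: in the real code `perf.latest` is presumably Svelte reactive
// state; a plain object stands in here, and recordLatest is a hypothetical name.
const perf: { latest: string | null } = { latest: null };

function recordLatest(phase: string): void {
  // The write is queued, so it runs only after the current synchronous
  // call stack (for example a derivation body) has finished.
  queueMicrotask(() => {
    perf.latest = phase;
  });
}

// Stand-in for a derivation that marks a phase while computing its value.
function computeTurns(steps: number[]): number {
  recordLatest("buildTree");
  return steps.length;
}
```

Immediately after `computeTurns` returns, `perf.latest` is still `null`; it becomes `"buildTree"` once the microtask queue drains, which is why no state is mutated while the derivation itself is running.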
…tity

Both StepTree.svelte and ChatView.svelte independently called `buildTree(doc)` from their own `$derived` blocks, so every preview open paid the normalize + flatten cost twice. Add WeakMap memos keyed by `doc` / `norm` identity: callers always pass `store.m.preview.doc`, which is a stable reference across renders, and the WeakMap lets the old entry get collected when a new derive replaces the doc.

Measured end-to-end click-to-painted on a 1737-step / 609-turn Claude session: 593ms → 472ms (buildTree dedupe + JIT-warm flattenChatHead).
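A minimal sketch of an identity-keyed WeakMap memo, under stated assumptions: `Doc`, `Tree`, `buildTreeUncached`, and `buildCount` are hypothetical stand-ins, and the real `buildTree` in `tree.ts` may be structured differently.

```typescript
// Sketch only: Doc, Tree, buildTreeUncached and buildCount are hypothetical
// stand-ins used to show the memo pattern, not the project's real types.
type Doc = { steps: string[] };
type Tree = { nodes: string[] };

// Keys are held weakly, so an entry becomes collectable once its doc
// is no longer referenced anywhere else.
const treeMemo = new WeakMap<Doc, Tree>();

// Counts how often the expensive path actually runs.
let buildCount = 0;

function buildTreeUncached(doc: Doc): Tree {
  buildCount++;
  return { nodes: doc.steps.map((s) => s.toUpperCase()) };
}

// Memoised by object identity: the same doc reference yields the same tree.
function buildTree(doc: Doc): Tree {
  const hit = treeMemo.get(doc);
  if (hit) return hit;
  const tree = buildTreeUncached(doc);
  treeMemo.set(doc, tree);
  return tree;
}
```

Because the key is the object reference rather than its contents, a second caller passing the same `doc` gets the cached tree, while a freshly derived doc with identical contents still triggers a rebuild. That matches the case described above, where both components receive the same stable reference.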
🔍 Preview deployed: https://1f2fb1a4.toolpath.pages.dev
Summary
Measuring before optimising. Added a click→derive→render perf tracer that records timestamps at each phase and can show them as an always-visible overlay; used it to find that the real cost of opening a trace on a large Claude session was Svelte render work (not Rust derive), then landed a tiny WeakMap memo that removes a duplicate `buildTree` computation.

End-to-end, for a 1737-step / 609-turn Claude session: 593ms → 472ms.
Backend is unchanged vs. main. An earlier commit on the branch prototyped a two-tier pre-derive cache (memory + disk); after the measurement showed derive was only ~40% of the click latency and the render work dominated, the cache was reverted as added complexity without a proportional win. The revert is included here so the branch tells the full story. Net backend diff vs. main: zero.
What's in here
- Perf tracer (`frontend/src/lib/perf.svelte.ts`, `PerfOverlay.svelte`): `perfStart` / `perfMark` / `perfEnd` record checkpoints; the store marks `dispatch`, `invoke-start`, `invoke-end`, `model-updated`; Preview marks `preview-mounted`, `viz-rendered` / `dom-painted`; `buildTree` and `flattenChatHead` mark their own timings with step/turn counts. The summary always logs to the devtools console; set `localStorage.perf = "1"` and reload to also show a phase-bar overlay in the bottom-right.
- WeakMap memos in `tree.ts` for `buildTree` and `flattenChatHead`: both StepTree and ChatView independently called `buildTree(doc)` from their own `$derived` blocks, doubling the normalize + flatten cost on every preview open. Keyed on `doc` / `norm` identity so old entries get GC'd when a new derive replaces the doc.
- `perf.latest` writes deferred via `queueMicrotask`: fixes `state_unsafe_mutation` when `perfMark` is called from inside a `$derived` (ChatView's `turns` reads `buildTree` + `flattenChatHead`, which now call `perfMark`).

Out of scope
- `read_conversation` + `derive_path` and consider streaming.

Test plan
- `cargo test -p toolpath-desktop`: 17 passing (unchanged vs. main)
- `cargo clippy -p toolpath-desktop --all-targets -- -D warnings`: clean
- `bun run check` + `bun run build`: svelte-check clean, Vite build succeeds
- `Select →` on a long Claude session: perf log prints to console; overlay appears when `localStorage.perf = "1"`; no `state_unsafe_mutation` runtime error.
- `localStorage.perf`; reload; open a couple of sessions to confirm the memo hits on the second component (look for `buildTree cache-hit` in the console).