chore(toolpath-desktop): add synthetic preview benchmarks #47

Merged
eliothedeman merged 1 commit into main from eliot/issue-41-preview-benchmark
Apr 22, 2026

@eliothedeman
Collaborator

Summary

  • Adds gen_synthetic_path binary in toolpath-cli that emits a synthetic Path JSON at configurable step counts (default --steps 1000), with a realistic ~70% text turns / ~20% Edit or Write / ~10% MultiEdit mix. Deterministic seed.
  • Adds crates/toolpath-desktop/frontend/src/lib/__bench__/preview.bench.ts, runnable as bun run bench, covering the Preview's pure-TS hot paths: normalize, buildTree, flattenChatHead, classify, matchesFilter (keystroke simulation).
  • Adds crates/toolpath-desktop/frontend/BENCHMARKS.md — describes how to generate fixtures, run the bench, and the manual Chrome DevTools procedure for render-time + memory (the parts a Tauri webview dominates and an agent can't script). Includes a 2026-04-22 baseline table with real pure-TS numbers and empty rows for the manual measurements.
  • Bumps toolpath-cli 0.3.1 → 0.4.0 (additive public change: new binary). Workspace root Cargo.toml does not list toolpath-cli as a workspace dep, so only Cargo.toml + site/_data/crates.json + CHANGELOG.md needed updates per CLAUDE.md's checklist.
  • .gitignores /bench/fixtures/ — fixtures are regenerated locally, not committed (~5 MB at 10k steps).
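
The deterministic step mix described in the first bullet can be illustrated with a small seeded generator. This is a minimal TypeScript sketch of the idea only: the real generator is a Rust binary, and every name below (`StepKind`, `pickKind`, `generateKinds`) is hypothetical, as is the choice of the mulberry32 PRNG.

```typescript
// Hypothetical sketch: none of these names come from gen_synthetic_path.
type StepKind = "text" | "edit_or_write" | "multi_edit";

// mulberry32: a small seeded PRNG that returns floats in [0, 1).
// The same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Map a uniform draw onto the ~70% / ~20% / ~10% mix from the summary.
function pickKind(r: number): StepKind {
  if (r < 0.7) return "text";
  if (r < 0.9) return "edit_or_write";
  return "multi_edit";
}

// Produce `steps` step kinds from a fixed seed.
function generateKinds(steps: number, seed: number): StepKind[] {
  const rand = mulberry32(seed);
  return Array.from({ length: steps }, () => pickKind(rand()));
}

// Usage: tally the mix for a 1000-step run with an arbitrary seed.
const kinds = generateKinds(1000, 42);
const counts: Record<StepKind, number> = { text: 0, edit_or_write: 0, multi_edit: 0 };
for (const kind of kinds) counts[kind]++;
console.log(counts);
```

Fixing the seed is what makes regenerated fixtures comparable across machines and reruns, which matters here because the fixtures are gitignored rather than committed.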

Addresses #41. Not Closes — the issue asks for a rerun after #38 and #39 land, so it stays open as a tracker.

Baseline numbers (Apple M4 Pro, bun 1.3.5)

Pure-TS, 10 iterations each:

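| Size | JSON.parse | normalize | buildTree (median) | buildTree (p95) | keystroke filter | flattenChatHead | classify × all |
|------|------------|-----------|--------------------|-----------------|------------------|-----------------|----------------|
| 1k   | 1.16 ms    | 0.23 ms   | 3.98 ms            | 7.5 ms          | 0.08 ms          | 0.23 ms         | 0.14 ms        |
| 5k   | 3.17 ms    | 0.79 ms   | 82.2 ms            | 113 ms          | 0.43 ms          | 1.20 ms         | 0.47 ms        |
| 10k  | 6.32 ms    | 2.15 ms   | 579 ms             | 1830 ms         | 1.12 ms          | 5.34 ms         | 1.49 ms        |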

At 10k steps, buildTree (579 ms median, 1830 ms p95) is well above the 200 ms keystroke budget from the issue. This is expected: it is what #39 is meant to address.
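
As a rough reading of the buildTree column, going from 1k to 10k steps (10x the input) raises the median from 3.98 ms to 579 ms, roughly 145x. The sketch below computes the implied empirical exponent and checks each size against the 200 ms budget. The timings are copied from the baseline table; the function name and the power-law framing are assumptions for illustration, and three data points are not a complexity analysis.

```typescript
// Median buildTree timings from the baseline table: [steps, milliseconds].
const buildTreeMedianMs: Array<[number, number]> = [
  [1000, 3.98],
  [5000, 82.2],
  [10000, 579],
];

// Keystroke budget from the issue, in milliseconds.
const BUDGET_MS = 200;

// Empirical exponent k, assuming time grows like steps^k between two points.
// (Hypothetical helper; the power-law model is an assumption.)
function scalingExponent(n1: number, t1: number, n2: number, t2: number): number {
  return Math.log(t2 / t1) / Math.log(n2 / n1);
}

// Exponent between each adjacent pair of fixture sizes.
for (let i = 1; i < buildTreeMedianMs.length; i++) {
  const [n1, t1] = buildTreeMedianMs[i - 1];
  const [n2, t2] = buildTreeMedianMs[i];
  console.log(`${n1} -> ${n2} steps: k = ${scalingExponent(n1, t1, n2, t2).toFixed(2)}`);
}

// Sizes whose median exceeds the budget.
const overBudget = buildTreeMedianMs.filter(([, ms]) => ms > BUDGET_MS);
console.log("over budget:", overBudget.map(([steps]) => steps));
```

An exponent well above 1 indicates superlinear growth, which fits the PR's expectation that #39 is the place to look.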

Tauri webview measurements (TFP, keystroke DOM-updated, memory delta) are left blank in BENCHMARKS.md with a manual procedure — those need a human at a running cargo tauri dev session with DevTools open.

Test plan

  • cargo build -p toolpath-cli --bin gen_synthetic_path
  • cargo run -p toolpath-cli --bin gen_synthetic_path -- --steps 1000 --out bench/fixtures/synthetic-1k.path.json (and 5k / 10k)
  • cargo run -p toolpath-cli --bin path -- validate --input bench/fixtures/synthetic-1k.path.json (passes)
  • cd crates/toolpath-desktop/frontend && bun install && bun run bench (produces the numbers above)
  • bun run check (0 errors, 4 pre-existing warnings)
  • bun run build (clean)
  • cargo clippy -p toolpath-cli --bin gen_synthetic_path -- -D warnings (clean)
  • cargo test -p toolpath-cli --bin path (152 pass, 0 fail)
  • Maintainer runs the manual DevTools procedure and fills the Tauri rows in BENCHMARKS.md
  • Rerun bun run bench after #38 (toolpath-desktop: memoize markdown rendering per chat turn) and #39 (toolpath-desktop: buildTree re-normalizes on any preview mutation) merge; append the comparison rows

Collaborator Author

@eliothedeman eliothedeman left a comment

Summary

Solid execution of the benchmarking brief: a working fixture generator, a pure-TS bench that actually produces comparable numbers, and a BENCHMARKS.md that is honest about what was measured vs. left for manual. Scope stays clean — no perf fixes smuggled in, no svelte-virtual-list, no speculative sub-issues filed.

Notes

  • crates/toolpath-cli/src/bin/gen_synthetic_path.rs:293 — real tool, deterministic seed, clap CLI, writes valid Path docs (PR confirms path validate passes). Mix (~70/20/10) matches the ask.
  • .gitignore:5 — fixtures correctly gitignored under /bench/fixtures/, not checked in. Good.
  • crates/toolpath-cli/Cargo.toml:51 — default-run = "path" preserves cargo run -p toolpath-cli behavior; second [[bin]] added cleanly.
  • crates/toolpath-desktop/frontend/src/lib/__bench__/preview.bench.ts:574 — plain performance.now() + median/p95/max over 10 iterations. No Vitest dep added, which fits the minimal-surface-area vibe. KEYSTROKE_QUERIES cycles realistic prefixes.
  • BENCHMARKS.md:489 — baseline table has real numbers for pure-TS ops; Tauri rows left blank with an explicit manual procedure. Matches the PR body's honesty claim.
  • BENCHMARKS.md:509 — narrative correctly attributes the 10k buildTree regression (579 ms median, 1.8 s p95) to #39 territory and flags renderMarkdown as the likely keystroke culprit for #38, without doing the fix here.
  • BENCHMARKS.md:551 — explicit "don't file sub-issues for known wins from #38/#39" — respects the scope-creep guardrail from the issue.
  • Version bumps: toolpath-cli 0.3.1 → 0.4.0 in Cargo.toml, Cargo.lock, CHANGELOG.md:19, site/_data/crates.json:84. Workspace root [workspace.dependencies] does not list toolpath-cli (confirmed — it's a leaf binary crate), so that's correctly skipped. Checklist followed.
  • tsconfig.json:765 excludes __bench__/** from svelte-check — sensible; the bench uses Node APIs.
  • Minor: for the bench/fixtures/synthetic-10k.path.json TFP row, step 3 of the manual procedure candidly admits that the paste-into-DevTools route is awkward and recommends using a real long session instead. Honest, if slightly hand-wavy; fine as a starting procedure.
  • No CI checks reported on the branch (gh pr checks 47 → "no checks reported"), but the PR body lists the local verification matrix.
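
The measurement approach noted above for preview.bench.ts (plain performance.now() with median/p95/max over 10 iterations, cycling keystroke prefixes) can be sketched roughly as follows. This is not the bench file's code: `summarize`, `bench`, `prefixes`, the nearest-rank percentile method, and the sample data are all assumptions for illustration.

```typescript
interface Stats {
  median: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Reduce raw samples (milliseconds) to median / p95 / max.
function summarize(samples: number[]): Stats {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { median, p95: percentile(sorted, 95), max: sorted[sorted.length - 1] };
}

// Time `fn` for a fixed number of iterations. The clock is injectable so the
// harness can be exercised deterministically; it defaults to performance.now().
function bench(
  fn: () => void,
  iterations: number = 10,
  now: () => number = () => performance.now(),
): Stats {
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = now();
    fn();
    samples.push(now() - start);
  }
  return summarize(samples);
}

// Hypothetical stand-in for KEYSTROKE_QUERIES: growing prefixes of one word,
// mimicking a user typing a filter one character at a time.
function prefixes(word: string): string[] {
  return Array.from({ length: word.length }, (_, i) => word.slice(0, i + 1));
}

// Usage: cycle through the prefixes while timing a simple substring filter.
const haystack = ["edit file", "write file", "plain text turn"];
const queries = prefixes("edit");
let tick = 0;
const stats = bench(() => {
  const query = queries[tick++ % queries.length];
  haystack.filter((s) => s.includes(query));
});
console.log(stats);
```

One caveat with this sketch: under nearest-rank, p95 over only 10 samples coincides with the max, so the two columns separate only with an interpolating percentile or more iterations.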

Verdict

Approve. Delivers exactly the benchmarking scaffolding the issue asked for, with no scope creep and honest accounting of measured vs. pending numbers. Safe to merge; issue stays open as the rerun tracker post-#38/#39.

eliothedeman force-pushed the eliot/issue-41-preview-benchmark branch from bf9cb74 to 85b3525 on April 22, 2026 21:03

github-actions Bot commented Apr 22, 2026

🔍 Preview deployed: https://25cf3266.toolpath.pages.dev

Adds a fixture generator (new `gen_synthetic_path` binary in toolpath-cli)
and a pure-TS bench script covering the Preview's hot paths:
`normalize`, `buildTree`, `flattenChatHead`, `classify`, `matchesFilter`.
`bun run bench` reports median/p95/max over 10 iterations on fixtures at
1k / 5k / 10k steps.

BENCHMARKS.md captures the 2026-04-22 baseline on Apple M4 Pro. Notable:
`buildTree` at 10k steps is ~579 ms median (p95 1.8 s) — well over the
200 ms keystroke budget, and the primary thing #39 should improve. Manual
Tauri webview procedure (render time, memory) is documented with an empty
template for a human to fill after a DevTools session.

Bumps toolpath-cli to 0.4.0 (additive public change: new binary).
Fixture files are gitignored — regenerate locally.

Addresses #41

eliothedeman force-pushed the eliot/issue-41-preview-benchmark branch from 85b3525 to cea63c9 on April 22, 2026 21:07
eliothedeman merged commit 2b99bcd into main on Apr 22, 2026
1 check passed