feat: JSONL streaming format for Path documents #19

Open
eliothedeman wants to merge 4 commits into main from eliot/gallant-darwin-0840d3

@eliothedeman
Collaborator

Summary

Implements docs/RFC-jsonl.md end-to-end. Producers (notably track) can now persist Path documents incrementally via a line-oriented JSONL format instead of rewriting a whole JSON blob on every step. Readers accept either format transparently.

What's in here

toolpath 0.2.0 — new v1::jsonl module

  • JsonlLine enum + body types (PathOpen, Step, ActorDef, Signature, PathMeta, Head, PathClose) matching the RFC wire shape.
  • Strict reader: Path::from_jsonl_reader<R: BufRead> / from_jsonl_str. Per-line-number diagnostics for every fatal case the RFC specifies (empty stream, non-PathOpen first line, duplicate PathOpen, malformed JSON, orphan step signature, ambiguous head, lines after PathClose). Unknown variants are skipped with a stderr warning so older readers stay forward-compatible.
  • Deterministic writer: Path::to_jsonl_writer<W: Write> / to_jsonl_string. Normalized emission order per RFC (sorted ActorDefs, step followed by its sigs, then path sigs, then Head, then PathClose). Step bodies have their signatures stripped out so the following Signature lines are the sole source on round-trip.
  • Schema change: PathIdentity.graph_ref: Option<String>, additive and serialized only when Some so existing documents and signatures remain byte-stable.
  • 37 inline tests including JSON→JSONL→JSON round trips over linear, dead-ended, signed (path and step-level), and graph_ref-bearing paths.
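The strict-reader rules listed above can be illustrated with a small, std-only sketch. It is not the crate's implementation: `SketchError`, `variant_tag`, and `read_tags` are hypothetical names, and the sketch assumes each line is a JSON object whose first key names the variant (for example `{"PathOpen":{...}}`). The actual wire shape is whatever docs/RFC-jsonl.md specifies. It covers only a subset of the fatal cases (empty stream, non-PathOpen first line, duplicate PathOpen, lines after PathClose, unparseable line) and skips unknown variants with a stderr warning.

```rust
use std::io::BufRead;

/// Fatal conditions, each carrying the 1-based line number where it occurred.
/// Illustrative names only; these are not the crate's `JsonlError` variants.
#[derive(Debug, PartialEq)]
enum SketchError {
    EmptyStream,
    FirstLineNotPathOpen { line: usize },
    DuplicatePathOpen { line: usize },
    LineAfterPathClose { line: usize },
    Malformed { line: usize },
}

/// Variant names taken from the PR description.
const KNOWN: [&str; 7] = [
    "PathOpen", "Step", "ActorDef", "Signature", "PathMeta", "Head", "PathClose",
];

/// ASSUMPTION: a line looks like `{"Variant": ...}` and the first key is the tag.
/// This is a crude stand-in for real JSON parsing.
fn variant_tag(line: &str) -> Option<&str> {
    let rest = line.trim_start().strip_prefix('{')?;
    let rest = rest.trim_start().strip_prefix('"')?;
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// Reads a stream line by line and returns the known variant tags in order,
/// or the first fatal error together with its line number.
fn read_tags<R: BufRead>(reader: R) -> Result<Vec<String>, SketchError> {
    let mut tags = Vec::new();
    let mut seen_open = false;
    let mut closed = false;

    for (idx, line) in reader.lines().enumerate() {
        let n = idx + 1; // diagnostics use 1-based line numbers
        let line = line.map_err(|_| SketchError::Malformed { line: n })?;

        // Nothing may follow PathClose.
        if closed {
            return Err(SketchError::LineAfterPathClose { line: n });
        }

        let tag = variant_tag(&line).ok_or(SketchError::Malformed { line: n })?;

        if !seen_open {
            // The first line must open the path.
            if tag != "PathOpen" {
                return Err(SketchError::FirstLineNotPathOpen { line: n });
            }
            seen_open = true;
        } else if tag == "PathOpen" {
            return Err(SketchError::DuplicatePathOpen { line: n });
        }

        // Unknown variants are skipped with a warning, not rejected.
        if !KNOWN.iter().any(|known| *known == tag) {
            eprintln!("warning: line {n}: skipping unknown variant {tag:?}");
            continue;
        }

        if tag == "PathClose" {
            closed = true;
        }
        tags.push(tag.to_string());
    }

    if !seen_open {
        return Err(SketchError::EmptyStream);
    }
    Ok(tags)
}
```

Passing `input.as_bytes()` works as the reader because `&[u8]` implements `BufRead`. A real reader would deserialize each line body and also enforce the orphan-step-signature and ambiguous-head checks, which need parsed content rather than tags alone.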

toolpath-cli 0.4.0 — extension-based format routing

  • New io::read_document_auto dispatches *.path.jsonl to the JSONL reader and everything else to the canonical JSON reader. Wired into validate, render dot, render md, query *, and merge. Stdin paths stay JSON-only for this release.
  • track sessions now persist as .path.jsonl streams. The tooling bookkeeping (buffer_cache, seq_to_step, step_counter, etc.) lives in path.meta.extra["track"] and is stripped on export/close so sealed output is clean. Strict append-only writes are deferred as a future optimization; current semantics do atomic full-file rewrites, which preserved all 61 existing track tests verbatim.
  • 5 new CLI integration tests covering .path.jsonl input across validate/render/query/merge plus malformed-JSONL rejection.
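As a rough illustration of the extension-based routing described above, the dispatch decision can be made from the file name alone. This is a sketch, not the body of `io::read_document_auto`: `DocFormat` and `detect_format` are hypothetical names, and the match is assumed to be case-sensitive.

```rust
use std::path::Path;

/// Which reader a file should be handed to (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum DocFormat {
    Json,
    Jsonl,
}

/// Routes `*.path.jsonl` to the JSONL reader and everything else to the
/// canonical JSON reader, mirroring the rule stated in the PR description.
/// ASSUMPTION: the suffix check is case-sensitive and looks only at the file name.
fn detect_format(path: &Path) -> DocFormat {
    let is_jsonl = path
        .file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| name.ends_with(".path.jsonl"));

    if is_jsonl {
        DocFormat::Jsonl
    } else {
        DocFormat::Json
    }
}
```

Under this rule a plain `.jsonl` file without the `.path` part still falls through to the JSON reader, which matches the "everything else" wording above.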

Example corpus — adopt the RFC's two-part extension convention

  • Renamed examples/path-*.json → examples/path-*.path.json.
  • Generated examples/path-*.path.jsonl siblings via a one-off cargo run -p toolpath-cli --example convert-path-to-jsonl helper.
  • Updated every reference: site eleventy config, playground JS, CLI integration tests, insta snapshot headers.

Why

track (the Claude Code-facing session recorder) currently rewrites a full JSON document on every keystroke-derived step. JSONL lets it append a single line per event, which matters once sessions get large or are being tailed by another process. Having the format as a peer of canonical JSON in the core crate (not bolted on downstream) means every tool that already consumes Paths gets streaming support transparently.
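To make the trade-off concrete, here is a minimal std-only sketch contrasting the two write strategies. It is not `track`'s code: `rewrite_atomically` and `append_line` are hypothetical helpers, and the `.tmp` sibling name is an assumption. The first mirrors the atomic full-file rewrite this PR keeps for now; the second shows the one-line append that the JSONL format makes possible and that the PR defers as a future optimization.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Full-file rewrite: write every line to a sibling temp file, then rename it
/// over the target so readers never observe a half-written document.
/// ASSUMPTION: the temp file lives next to the target (same filesystem).
fn rewrite_atomically(target: &Path, lines: &[String]) -> io::Result<()> {
    let tmp = target.with_extension("tmp"); // hypothetical temp name
    {
        let mut file = fs::File::create(&tmp)?;
        for line in lines {
            writeln!(file, "{line}")?;
        }
        file.sync_all()?; // flush to disk before the rename publishes the file
    }
    fs::rename(&tmp, target)
}

/// Append-only write: add a single line for a single event, leaving the
/// existing contents untouched.
fn append_line(target: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(target)?;
    writeln!(file, "{line}")
}
```

The rewrite costs time proportional to the whole session on every step, while the append costs time proportional to one event, which is the difference the paragraph above is pointing at.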

Test plan

  • cargo build --workspace clean
  • cargo test --workspace all green (37 new jsonl tests, 61 track tests under the new format, 17 CLI integration tests, plus all pre-existing tests)
  • cargo clippy --workspace -- -D warnings clean
  • All 20 example files validate under their format-appropriate path via path validate --input
  • path render md output byte-identical between .path.json and .path.jsonl for the same logical path (cross-checked for path-03-signed-pr)
  • cd site && pnpm run build (pnpm not available in this worktree; filename references are updated — please confirm on review)

Implements docs/RFC-jsonl.md end-to-end so producers (notably `track`) can
persist Path documents incrementally instead of rewriting a full JSON blob
on every step.

toolpath 0.2.0:
- New `v1::jsonl` module with `JsonlLine` + body types, a strict line-by-line
  reader (`Path::from_jsonl_reader` / `from_jsonl_str`), and a deterministic
  normalized writer (`Path::to_jsonl_writer` / `to_jsonl_string`).
- `JsonlError` enum with per-line-number diagnostics for every RFC-specified
  fatal condition (empty stream, non-PathOpen first line, duplicate PathOpen,
  malformed JSON, orphan step signature, ambiguous head, lines after close).
- Unknown line variants are skipped with a stderr warning for forward
  compatibility.
- Additive schema change: `PathIdentity.graph_ref: Option<String>`, serialized
  only when Some, so existing documents and signatures remain byte-stable.

toolpath-cli 0.4.0:
- New `io::read_document_auto` helper routes `*.path.jsonl` through the JSONL
  reader and everything else through the canonical JSON reader. Wired into
  validate, render dot, render md, query *, and merge.
- `track` sessions now persist as `.path.jsonl` streams with the tooling
  bookkeeping (buffer_cache, seq_to_step, step_counter) stored in
  `path.meta.extra["track"]`; export/close strip the extra from the sealed
  output. Strict append-only writes are deferred as a future optimization.
- New integration tests cover `.path.jsonl` input across validate/render/
  query/merge plus malformed-JSONL rejection.

Examples adopt the RFC's two-part extension convention:
- examples/path-*.json renamed to examples/path-*.path.json
- new examples/path-*.path.jsonl siblings generated via a one-off
  `cargo run -p toolpath-cli --example convert-path-to-jsonl` helper
- all references in site config, playground JS, integration tests, and
  insta snapshot headers updated to the new filenames.

@github-actions

github-actions Bot commented Apr 17, 2026

🔍 Preview deployed: https://5f3e3449.toolpath.pages.dev

Resolves conflicts with the toolpath-pi / toolpath-convo 0.6.0 changes
that landed on main in parallel:

- Cargo.toml: keep both bumps — toolpath 0.2.0 (this branch) and
  toolpath-convo 0.6.0 (main).
- CHANGELOG.md: keep both sections in chronological order.
- crates/toolpath-convo/src/derive.rs: add `graph_ref: None` to the
  `PathIdentity` literal introduced by the derive_path move.

Resolves conflicts with the conversation-projection changes (PR #22) that
landed on main while this PR was in review:

- Cargo.toml workspace deps: keep toolpath 0.2.0 (this branch) alongside
  the toolpath-claude 0.7.0 bump from main.
- crates/toolpath-cli/Cargo.toml and site/_data/crates.json: keep
  toolpath-cli 0.4.0 (higher bump supersedes main's 0.3.1).
- Cargo.lock: take main's version and regenerate via cargo build.
- Add `graph_ref: None` to four new PathIdentity struct literals
  introduced by main:
  - crates/toolpath-cli/src/cmd_incept.rs
  - crates/toolpath-cli/src/cmd_project.rs
  - crates/toolpath-cli/tests/roundtrip.rs
  - crates/toolpath-convo/src/extract.rs

Resolves conflicts with the recent wave of provider crates (toolpath-gemini,
toolpath-codex, toolpath-opencode) and toolpath-desktop that landed on main:

- Cargo.toml: single conflict marker — keep toolpath 0.2.0 (this branch)
  alongside toolpath-convo 0.7.0 (main's Write-before-state work).
- Add `graph_ref: None` to three new PathIdentity struct literals
  introduced by the new provider crates (only surfaces at compile time):
  - crates/toolpath-gemini/src/derive.rs:109
  - crates/toolpath-codex/src/derive.rs:162
  - crates/toolpath-opencode/src/derive.rs:158