feat: JSONL streaming format for Path documents #19
Open
eliothedeman wants to merge 4 commits into main
Implements docs/RFC-jsonl.md end-to-end so producers (notably `track`) can persist Path documents incrementally instead of rewriting a full JSON blob on every step.

toolpath 0.2.0:
- New `v1::jsonl` module with `JsonlLine` + body types, a strict line-by-line reader (`Path::from_jsonl_reader` / `from_jsonl_str`), and a deterministic normalized writer (`Path::to_jsonl_writer` / `to_jsonl_string`).
- `JsonlError` enum with per-line-number diagnostics for every RFC-specified fatal condition (empty stream, non-PathOpen first line, duplicate PathOpen, malformed JSON, orphan step signature, ambiguous head, lines after close).
- Unknown line variants are skipped with a stderr warning for forward compatibility.
- Additive schema change: `PathIdentity.graph_ref: Option<String>`, serialized only when Some, so existing documents and signatures remain byte-stable.

toolpath-cli 0.4.0:
- New `io::read_document_auto` helper routes `*.path.jsonl` through the JSONL reader and everything else through the canonical JSON reader. Wired into validate, render dot, render md, query *, and merge.
- `track` sessions now persist as `.path.jsonl` streams with the tooling bookkeeping (buffer_cache, seq_to_step, step_counter) stored in `path.meta.extra["track"]`; export/close strip the extra from the sealed output. Strict append-only writes are deferred as a future optimization.
- New integration tests cover `.path.jsonl` input across validate/render/query/merge plus malformed-JSONL rejection.

Examples adopt the RFC's two-part extension convention:
- examples/path-*.json renamed to examples/path-*.path.json
- new examples/path-*.path.jsonl siblings generated via a one-off `cargo run -p toolpath-cli --example convert-path-to-jsonl` helper
- all references in site config, playground JS, integration tests, and insta snapshot headers updated to the new filenames.
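The stream-level fatal conditions listed above (empty stream, non-PathOpen first line, duplicate PathOpen, lines after close) can be illustrated with a small ordering check. This is a minimal sketch, not the crate's `JsonlError` or reader: `LineKind`, `OrderError`, and `check_order` are hypothetical names, the lines are assumed to be already classified, and the JSON-level and signature/head checks are left out. Treating an unknown variant after the close line as fatal is also an assumption.

```rust
// Hypothetical sketch: these names are illustrative, not the toolpath API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineKind {
    PathOpen,
    Step,
    PathClose,
    Unknown, // a variant this reader does not recognize
}

#[derive(Debug, PartialEq)]
enum OrderError {
    EmptyStream,
    FirstLineNotPathOpen { line: usize },
    DuplicatePathOpen { line: usize },
    LineAfterClose { line: usize },
}

/// Checks ordering rules over already-classified lines.
/// Returns how many lines were accepted; unknown lines are skipped with a
/// warning on stderr and are not counted.
fn check_order(kinds: &[LineKind]) -> Result<usize, OrderError> {
    if kinds.is_empty() {
        return Err(OrderError::EmptyStream);
    }
    let mut closed = false;
    let mut accepted = 0;
    for (idx, kind) in kinds.iter().enumerate() {
        let line = idx + 1; // 1-based line numbers for diagnostics
        if idx == 0 {
            if *kind != LineKind::PathOpen {
                return Err(OrderError::FirstLineNotPathOpen { line });
            }
            accepted += 1;
            continue;
        }
        if closed {
            // Assumption: anything after the close line is fatal.
            return Err(OrderError::LineAfterClose { line });
        }
        match kind {
            LineKind::PathOpen => return Err(OrderError::DuplicatePathOpen { line }),
            LineKind::PathClose => {
                closed = true;
                accepted += 1;
            }
            LineKind::Unknown => {
                eprintln!("warning: skipping unknown line variant at line {line}");
            }
            LineKind::Step => accepted += 1,
        }
    }
    Ok(accepted)
}
```

Carrying the line number in every error variant is what makes per-line diagnostics possible without a second pass over the stream.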
🔍 Preview deployed: https://5f3e3449.toolpath.pages.dev
Resolves conflicts with the toolpath-pi / toolpath-convo 0.6.0 changes that landed on main in parallel:
- Cargo.toml: keep both bumps: toolpath 0.2.0 (this branch) and toolpath-convo 0.6.0 (main).
- CHANGELOG.md: keep both sections in chronological order.
- crates/toolpath-convo/src/derive.rs: add `graph_ref: None` to the `PathIdentity` literal introduced by the derive_path move.
Resolves conflicts with the conversation-projection changes (PR #22) that landed on main while this PR was in review:
- Cargo.toml workspace deps: keep toolpath 0.2.0 (this branch) alongside the toolpath-claude 0.7.0 bump from main.
- crates/toolpath-cli/Cargo.toml and site/_data/crates.json: keep toolpath-cli 0.4.0 (higher bump supersedes main's 0.3.1).
- Cargo.lock: take main's version and regenerate via cargo build.
- Add `graph_ref: None` to four new PathIdentity struct literals introduced by main:
  - crates/toolpath-cli/src/cmd_incept.rs
  - crates/toolpath-cli/src/cmd_project.rs
  - crates/toolpath-cli/tests/roundtrip.rs
  - crates/toolpath-convo/src/extract.rs
Resolves conflicts with the recent wave of provider crates (toolpath-gemini, toolpath-codex, toolpath-opencode) and toolpath-desktop that landed on main:
- Cargo.toml: single conflict marker; keep toolpath 0.2.0 (this branch) alongside toolpath-convo 0.7.0 (main's Write-before-state work).
- Add `graph_ref: None` to three new PathIdentity struct literals introduced by the new provider crates (only surfaces at compile time):
  - crates/toolpath-gemini/src/derive.rs:109
  - crates/toolpath-codex/src/derive.rs:162
  - crates/toolpath-opencode/src/derive.rs:158
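The repeated `graph_ref: None` fixes follow from how Rust struct literals work: a literal must name every field, so each new `PathIdentity { ... }` on main stops compiling once the field exists. The sketch below illustrates that, along with the "serialized only when Some" behavior. It is a simplified stand-in, not the crate's code: the `id` field, `quote`, and `to_json` are hypothetical, and the serialization is hand-rolled so the example needs no dependencies. With serde the same effect is commonly achieved with `#[serde(skip_serializing_if = "Option::is_none")]`, though which mechanism the crate uses is not shown here.

```rust
// Simplified stand-in for the real `PathIdentity`; only `graph_ref` is taken
// from the PR, the `id` field is a hypothetical placeholder.
#[derive(Debug, Clone, PartialEq)]
struct PathIdentity {
    id: String,
    graph_ref: Option<String>, // the additive field
}

/// Minimal JSON string quoting (escapes backslash and double quote only;
/// control characters are not handled in this sketch).
fn quote(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Hand-rolled serialization that omits `graph_ref` when it is `None`,
/// so documents without the field produce the same bytes as before.
fn to_json(identity: &PathIdentity) -> String {
    let mut out = format!("{{\"id\":{}", quote(&identity.id));
    if let Some(graph_ref) = &identity.graph_ref {
        out.push_str(&format!(",\"graph_ref\":{}", quote(graph_ref)));
    }
    out.push('}');
    out
}
```

Because the `None` case emits no key at all, any signature computed over the serialized bytes of an older document would stay valid under this scheme.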
Summary

Implements docs/RFC-jsonl.md end-to-end. Producers (notably `track`) can now persist `Path` documents incrementally via a line-oriented JSONL format instead of rewriting a whole JSON blob on every step. Readers accept either format transparently.

What's in here
toolpath 0.2.0: new `v1::jsonl` module

- `JsonlLine` enum + body types (`PathOpen`, `Step`, `ActorDef`, `Signature`, `PathMeta`, `Head`, `PathClose`) matching the RFC wire shape.
- `Path::from_jsonl_reader<R: BufRead>` / `from_jsonl_str`. Per-line-number diagnostics for every fatal case the RFC specifies (empty stream, non-PathOpen first line, duplicate PathOpen, malformed JSON, orphan step signature, ambiguous head, lines after PathClose). Unknown variants are skipped with a stderr warning so older readers stay forward-compatible.
- `Path::to_jsonl_writer<W: Write>` / `to_jsonl_string`. Normalized emission order per RFC (sorted ActorDefs, step followed by its sigs, then path sigs, then Head, then PathClose). Step bodies have their signatures stripped out so the following `Signature` lines are the sole source on round-trip.
- `PathIdentity.graph_ref: Option<String>`, additive and serialized only when `Some` so existing documents and signatures remain byte-stable.
- … `graph_ref`-bearing paths.

toolpath-cli 0.4.0: extension-based format routing

- `io::read_document_auto` dispatches `*.path.jsonl` to the JSONL reader and everything else to the canonical JSON reader. Wired into `validate`, `render dot`, `render md`, `query *`, and `merge`. Stdin paths stay JSON-only for this release.
- `track` sessions now persist as `.path.jsonl` streams. The tooling bookkeeping (`buffer_cache`, `seq_to_step`, `step_counter`, etc.) lives in `path.meta.extra["track"]` and is stripped on export/close so sealed output is clean. Strict append-only writes are deferred as a future optimization; current semantics do atomic full-file rewrites, which preserved all 61 existing `track` tests verbatim.
- New integration tests cover `.path.jsonl` input across validate/render/query/merge plus malformed-JSONL rejection.

Example corpus: adopt the RFC's two-part extension convention

- `examples/path-*.json` → `examples/path-*.path.json`.
- New `examples/path-*.path.jsonl` siblings generated via a one-off `cargo run -p toolpath-cli --example convert-path-to-jsonl` helper.

Why

`track` (the Claude Code-facing session recorder) currently rewrites a full JSON document on every keystroke-derived step. JSONL lets it append a single line per event, which matters once sessions get large or are being tailed by another process. Having the format as a peer of canonical JSON in the core crate (not bolted on downstream) means every tool that already consumes Paths gets streaming support transparently.

Test plan

- `cargo build --workspace` clean
- `cargo test --workspace` all green (37 new jsonl tests, 61 track tests under the new format, 17 CLI integration tests, plus all pre-existing tests)
- `cargo clippy --workspace -- -D warnings` clean
- `path validate --input` …
- `path render md` output byte-identical between `.path.json` and `.path.jsonl` for the same logical path (cross-checked for `path-03-signed-pr`)
- `cd site && pnpm run build` (pnpm not available in this worktree; filename references are updated, please confirm on review)
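
For reviewers, the extension-based routing described under toolpath-cli 0.4.0 might look roughly like the sketch below. It is not the actual `io::read_document_auto`: `DocFormat` and `detect_format` are hypothetical names, and the case-sensitive suffix match is an assumption. The one rule taken from the description is that only `*.path.jsonl` goes to the JSONL reader and everything else falls through to canonical JSON.

```rust
use std::path::Path;

// Hypothetical names for illustration; not part of toolpath-cli.
#[derive(Debug, PartialEq)]
enum DocFormat {
    Json,
    Jsonl,
}

/// Picks a reader based on the file name's two-part extension.
/// Assumption: the match is case-sensitive and looks only at the file name.
fn detect_format(path: &Path) -> DocFormat {
    let is_jsonl = path
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.ends_with(".path.jsonl"))
        .unwrap_or(false);
    if is_jsonl {
        DocFormat::Jsonl
    } else {
        DocFormat::Json
    }
}
```

Under this rule `examples/path-03-signed-pr.path.jsonl` selects the JSONL reader, while `examples/path-03-signed-pr.path.json` and a bare `session.jsonl` both fall back to the JSON reader.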