Merged
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -208,3 +208,4 @@ Build the site after changes: `cd site && pnpm run build` (should produce 7 page
- Pi provider: `toolpath-pi` reads Pi session JSONL from `~/.pi/agent/sessions/`. Sessions use a tree (id/parentId) in a single file, and may link to a parent file via `parentSession` in the header. The tree is preserved as a DAG in the derived `Path`.
- Codex provider: `toolpath-codex` reads Codex CLI rollout files from `~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl`. Sessions are date-bucketed (not project-keyed). File-change fidelity is excellent — Codex's `patch_apply_end` events carry either the unified diff (for updates) or the full file content (for adds), so the derived `Path` gets a real `raw` perspective on every file artifact. See `docs/agents/formats/codex.md` for the full format reference.
- opencode provider: `toolpath-opencode` reads a SQLite database at `~/.local/share/opencode/opencode.db` (opened read-only). Each session's messages and 12 typed part variants (text, reasoning, tool, step-start/-finish, snapshot, patch, file, agent, subtask, retry, compaction) land as one step per message with tool invocations attached. File diffs come from a sibling bare git repo at `snapshot/<project-id>/[<sha1(worktree)>]/` via `git2` tree↔tree diffs — opencode respects the user's `.gitignore`, so changes under gitignored paths fall back to tool-input-derived structural changes with no `raw` perspective. Project id is the SHA of the repo's first root commit. See `docs/agents/formats/opencode.md` for the full format reference.
- Format references for the agent on-disk formats we derive from live at `docs/agents/formats/`. The Claude Code format (`~/.claude/projects/…` JSONL) gets the deepest treatment — twelve focused docs at `docs/agents/formats/claude-code/` covering envelope, entry types, tools, session chains, compaction, writing-compatible JSONL, a linear walkthrough, and a version-keyed changelog. Sibling single-file references: `codex.md`, `gemini.md`, `opencode.md`. Keep them in sync with their derive crates when fields or behaviors change.
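As a rough illustration of the Pi note above, entries keyed by `id`/`parentId` can be grouped into a parent-to-children map. This is a minimal sketch, not the `toolpath-pi` implementation: the `Entry` struct, the `build_children` function, and the sample ids are all hypothetical, and real entries carry more than these two fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical, stripped-down session entry: only the two fields the note
/// above mentions (`id` and `parentId`).
struct Entry {
    id: &'static str,
    parent_id: Option<&'static str>,
}

/// Parent id -> child ids, in file order.
type Children = BTreeMap<&'static str, Vec<&'static str>>;

/// Split entries into roots (no parent) and a parent-to-children map.
fn build_children(entries: &[Entry]) -> (Vec<&'static str>, Children) {
    let mut roots = Vec::new();
    let mut children = Children::new();
    for entry in entries {
        match entry.parent_id {
            Some(parent) => children.entry(parent).or_default().push(entry.id),
            None => roots.push(entry.id),
        }
    }
    (roots, children)
}

/// Made-up sample: one root with two branches, one of which continues.
fn sample_entries() -> Vec<Entry> {
    vec![
        Entry { id: "a", parent_id: None },
        Entry { id: "b", parent_id: Some("a") },
        Entry { id: "c", parent_id: Some("a") },
        Entry { id: "d", parent_id: Some("b") },
    ]
}

fn main() {
    let (roots, children) = build_children(&sample_entries());
    println!("roots: {:?}", roots);
    for (parent, kids) in &children {
        println!("{} -> {:?}", parent, kids);
    }
}
```

Keeping every child of a parent, rather than following a single chain, is what lets branches in the session survive into the derived structure.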
2 changes: 2 additions & 0 deletions README.md
@@ -206,6 +206,8 @@ let md_string = render(&doc, &RenderOptions::default());
- [CHANGELOG.md](CHANGELOG.md) -- Release history
- [schema/toolpath.schema.json](schema/toolpath.schema.json) -- JSON Schema
- [examples/](examples/) -- 11 example documents covering steps, paths, and graphs
- [docs/agents/formats/](docs/agents/formats/README.md) -- Reference for the on-disk
formats emitted by agents we derive from (Claude Code today; more as they land)

## Requirements

6 changes: 6 additions & 0 deletions crates/toolpath-claude/README.md
@@ -16,6 +16,12 @@ Reads Claude Code conversation data from `~/.claude/projects/` and provides:
- **Derivation**: Map conversations to Toolpath Path documents
- **Watching**: Monitor conversation files for live updates (feature-gated)

For the on-disk format itself — envelope fields, entry types, session chains,
compaction, and the empirical rules for writing Claude-compatible JSONL — see
[`docs/agents/formats/claude-code/`](https://github.com/empathic/toolpath/tree/main/docs/agents/formats/claude-code).
That directory is the authoritative reference; this crate is its reference
implementation.

## Derivation

Convert Claude conversations into Toolpath documents:
16 changes: 10 additions & 6 deletions crates/toolpath-claude/src/derive.rs
@@ -1687,22 +1687,26 @@ mod tests {
 // steps[0] = assistant turn, steps[1] = tool step (siblings).
 let tool_step = &path.steps[1];
 let ch = &tool_step.change["/src/login.rs"];
-let raw = ch.raw.as_deref().expect("edit tool should emit unified diff");
+let raw = ch
+    .raw
+    .as_deref()
+    .expect("edit tool should emit unified diff");
 // Leading `/` is stripped from the header so `a/`/`b/` don't double up
 // (git-style prefixes already denote the repo root). See #36.
 assert!(raw.contains("--- a/src/login.rs"), "{}", raw);
 assert!(raw.contains("+++ b/src/login.rs"), "{}", raw);
-assert!(!raw.contains("a//"), "header should not double-slash: {}", raw);
+assert!(
+    !raw.contains("a//"),
+    "header should not double-slash: {}",
+    raw
+);
 assert!(raw.contains("-validate_token()"), "{}", raw);
 assert!(raw.contains("+validate_token_v2()"), "{}", raw);

 // Sanity-check the parent wiring that the chat view relies on:
 // the tool step's parent is the assistant step, and they share
 // the same `entry.uuid` root so the frontend splice works.
-assert_eq!(
-    tool_step.step.parents,
-    vec![path.steps[0].step.id.clone()]
-);
+assert_eq!(tool_step.step.parents, vec![path.steps[0].step.id.clone()]);
 }

 // ── tool result assembly ──────────────────────────────────────────
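
The header assertions in the test above can be read against a small standalone sketch. It assumes a hypothetical `diff_header` helper (not a function from `toolpath-claude`) and strips every leading `/`, which is one plausible reading of the comment in the test.

```rust
/// Hypothetical helper: build the two file-header lines of a git-style
/// unified diff for `path`.
fn diff_header(path: &str) -> String {
    // Drop leading slashes so "a/" + "/src/x" does not become "a//src/x".
    let rel = path.trim_start_matches('/');
    format!("--- a/{rel}\n+++ b/{rel}\n")
}

fn main() {
    // An absolute path and a relative path yield the same header.
    print!("{}", diff_header("/src/login.rs"));
    print!("{}", diff_header("src/login.rs"));
}
```

The `a/` and `b/` prefixes already stand for the repository root, so keeping the leading slash would only produce a doubled separator.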
11 changes: 5 additions & 6 deletions crates/toolpath-cli/src/bin/gen_synthetic_path.rs
@@ -60,11 +60,7 @@ const LOREM: &[&str] = &[
 "sed ut perspiciatis unde omnis iste natus error sit voluptatem",
 ];

-const TOOLS: &[(&str, f64)] = &[
-    ("Edit", 0.50),
-    ("Write", 0.30),
-    ("MultiEdit", 0.20),
-];
+const TOOLS: &[(&str, f64)] = &[("Edit", 0.50), ("Write", 0.30), ("MultiEdit", 0.20)];

 const FILES: &[&str] = &[
 "src/main.rs",
@@ -101,7 +97,10 @@ fn pick_tool(rng: &mut StdRng) -> &'static str {

 fn synth_diff(rng: &mut StdRng, path: &str) -> String {
 let lines = rng.random_range(3..12);
-let mut s = format!("--- a/{}\n+++ b/{}\n@@ -1,{} +1,{} @@\n", path, path, lines, lines);
+let mut s = format!(
+    "--- a/{}\n+++ b/{}\n@@ -1,{} +1,{} @@\n",
+    path, path, lines, lines
+);
 for i in 0..lines {
 if rng.random_bool(0.5) {
 s.push_str(&format!("-old_line_{} = value;\n", i));
5 changes: 5 additions & 0 deletions crates/toolpath-cli/src/cmd_incept.rs
@@ -1,5 +1,10 @@
//! `path incept` — project a toolpath document into a Claude session
//! that Claude Code can load and resume.
//!
//! Format rules this command obeys are documented at
//! `docs/agents/formats/claude-code/writing-compatible-jsonl.md`. When a new
//! empirical constraint is discovered here, capture it there in the same
//! change.

use anyhow::Result;
use std::io::Read;
9 changes: 6 additions & 3 deletions crates/toolpath-convo/src/derive.rs
@@ -326,7 +326,10 @@ fn file_write_change(
 extra.insert("edits".to_string(), serde_json::Value::Array(edits.clone()));
 }

-(file_write_diff(&tool.name, input, path, before_state), extra)
+(
+    file_write_diff(&tool.name, input, path, before_state),
+    extra,
+)
 }

 /// Compute a unified diff string for a file-write tool invocation, given the
@@ -675,8 +678,8 @@ mod tests {
 "file_path": "hello.txt",
 "content": "hi\nthere\n",
 });
-let raw = file_write_diff("Write", &input, "hello.txt", None)
-    .expect("write should emit diff");
+let raw =
+    file_write_diff("Write", &input, "hello.txt", None).expect("write should emit diff");
 assert!(raw.contains("+hi"));
 assert!(raw.contains("+there"));
// No `-` lines — nothing was there before.
17 changes: 6 additions & 11 deletions crates/toolpath-desktop/src/tray.rs
@@ -362,11 +362,7 @@ pub fn tray_open_trace(
 basename_slug(&project),
 short(&session_id)
 );
-(
-    value,
-    format!("Claude: {}", basename(&project)),
-    filename,
-)
+(value, format!("Claude: {}", basename(&project)), filename)
 }
 "pi" => {
 let value = crate::commands::derive::derive_pi(
@@ -380,11 +376,7 @@ pub fn tray_open_trace(
 basename_slug(&project),
 short(&session_id)
 );
-(
-    value,
-    format!("pi.dev: {}", basename(&project)),
-    filename,
-)
+(value, format!("pi.dev: {}", basename(&project)), filename)
 }
 // Not wired up in the desktop backend yet. The popover disables
 // rows for these, but we still reject politely if one slips through.
@@ -574,7 +566,10 @@ mod tests {
 // produce a well-formed snapshot with all five provider slots.
 let s = collect_stats();
 let providers: Vec<_> = s.counts.iter().map(|c| c.provider).collect();
-assert_eq!(providers, vec!["claude", "gemini", "codex", "opencode", "pi"]);
+assert_eq!(
+    providers,
+    vec!["claude", "gemini", "codex", "opencode", "pi"]
+);
 }

 #[test]
64 changes: 64 additions & 0 deletions docs/agents/formats/README.md
@@ -0,0 +1,64 @@
# Agent session formats

This directory holds our working reference for the on-disk formats emitted by
coding agents whose sessions we derive `toolpath` documents from. These are the
documents we would like external consumers (other toolpath crates, workshop,
etc.) to be able to trust without having to reverse-engineer the format
themselves from a sampled `~/.claude/projects/…` directory.

The goal is a **practitioner-grade reference**: exactly which fields appear, what
they mean, where the format has quirks or bugs, and how our own code copes with
them. It is not a spec, because we don't own any of these formats, but it should
be close enough that a new contributor can add a derivation or a projector
without a week of cargo-culting.

## Contents

- **[`claude-code/`](claude-code/README.md)** — Claude Code
(`~/.claude/projects/…` JSONL). Split into focused docs covering
directory layout, the JSONL line envelope, entry types, the `message`
object and content parts, tool invocation lifecycle, session chains
and compaction, peripheral files, writing-compatible JSONL, known
issues, a line-by-line walkthrough, and a version-keyed format
changelog. Each revision of the reference carries a date stamp at
the top of the subdirectory's README.
- **[`codex.md`](codex.md)** — Codex CLI rollout files under
`~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl`. Single-file reference
covering the date-bucketed session format and the `patch_apply_end`
events that drive file-change fidelity.
- **[`gemini.md`](gemini.md)** — Gemini CLI chats under
`~/.gemini/tmp/<project>/chats/`, including the main-file + sibling
sub-agent UUID directory layout.
- **[`opencode.md`](opencode.md)** — opencode's SQLite database
(`~/.local/share/opencode/opencode.db`), its 12 typed message-part
variants, and the sibling bare-git snapshot repo used for file diffs.
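
To make the Codex layout listed above concrete, here is a minimal sketch that checks whether a path relative to `~/.codex/sessions/` has the `YYYY/MM/DD/rollout-*.jsonl` shape. The function names are hypothetical and this is not how `toolpath-codex` discovers sessions; it only checks digit counts and the file name pattern, not whether the date is a real calendar date.

```rust
use std::path::Path;

/// True when `s` is exactly `len` ASCII digits.
fn is_all_digits(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| b.is_ascii_digit())
}

/// Hypothetical check: does `rel` (relative to the sessions directory) look
/// like `YYYY/MM/DD/rollout-*.jsonl`?
fn looks_like_rollout(rel: &Path) -> bool {
    // Reject paths with any non-UTF-8 component instead of skipping it.
    let parts: Option<Vec<&str>> = rel.iter().map(|c| c.to_str()).collect();
    match parts.as_deref() {
        Some([year, month, day, file]) => {
            is_all_digits(year, 4)
                && is_all_digits(month, 2)
                && is_all_digits(day, 2)
                && file.starts_with("rollout-")
                && file.ends_with(".jsonl")
        }
        _ => false,
    }
}

fn main() {
    // Made-up file names, used only to exercise the check.
    for p in ["2025/01/31/rollout-abc.jsonl", "my-project/rollout-abc.jsonl"] {
        println!("{p}: {}", looks_like_rollout(Path::new(p)));
    }
}
```

Because the buckets are dates rather than project keys, nothing in the path itself says which project a session belongs to.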

The Claude Code reference is the most detailed because it's the
longest-standing provider and has the most moving parts (JSONL
envelope variants, session chaining, compaction, sidechains, and the
loader's own undocumented strictness about what it will accept). Each of the
other three sits in a single file because its format is simple enough to cover
in one document.

## Conventions used in these docs

- **"In the wild"** = observed in real JSONL files on disk, not just in types
we've defined.
- **Field tables** show the name as it appears in JSON (so `parentUuid`, not
`parent_uuid`), its shape, and whether it's optional. "Optional" means we've
seen entries without it; "required" means we've never seen an entry missing
it (not that the format promises it'll always be there).
- **Citations** point either to files under this repo (`crates/<name>/src/…`)
or to external sources (marked with URLs). Repo citations dominate — we
trust our own parsers and tests more than we trust blog posts.
- **Version numbers**, when quoted (e.g. "Claude Code 2.1.90"), are the versions
we've seen in sample data, not versions at which Anthropic has officially
announced a format change.
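
The naming convention for field tables can be illustrated with a small sketch that maps a Rust-style `snake_case` name to the camelCase spelling that appears in the JSON. The `to_camel_case` function is hypothetical and exists only to show the relationship between the two spellings; it makes no claim about how the parsers in this repo map fields.

```rust
/// Hypothetical helper: convert a snake_case identifier to camelCase, so
/// `parent_uuid` becomes `parentUuid`.
fn to_camel_case(snake: &str) -> String {
    let mut out = String::with_capacity(snake.len());
    let mut upper_next = false;
    for ch in snake.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize whatever follows it.
            upper_next = true;
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    // The field tables use the right-hand spelling.
    println!("parent_uuid -> {}", to_camel_case("parent_uuid"));
}
```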

## Maintenance

When `toolpath-claude` (or its siblings) learns about a new field, entry type,
or edge case, update the corresponding doc here in the same change. The point
of this directory is to be the single place where format knowledge
accumulates; if the knowledge only lives in code comments or commit messages,
it effectively doesn't exist.