feat: add chainweaver profile <traces...> CLI for bottleneck analysis (#147) #159
Open
dgenio wants to merge 1 commit into
Closes #147.

Single-trace mode answers "which step is slow?" from one ExecutionResult JSON file. Multi-trace mode answers "is it always slow?" by aggregating p50 / p95 / p99 / mean / stdev per step across N traces.

Usage:
    chainweaver profile path/to/trace.json
    chainweaver profile path/to/trace.json --top 5
    chainweaver profile path/to/*.trace.json --format json

Single trace (table):
- Total wall-clock + sum-of-step ms + orchestration overhead.
- Per-step ASCII bar chart sorted by duration_ms descending.
- --top N (default 10) truncates and surfaces "M more step(s) not shown".

Multi trace (table):
- Verifies all traces share the same flow_name and step count (exits 1 with a clear message otherwise).
- Per-step p50 / p95 / p99 / mean / stdev via stdlib statistics.
- Consistency warning when a step's stdev > 50% of its mean.

JSON format (both modes) is a stable machine-readable shape suitable for CI gates.

Exit codes:
- 0: analysis ran (a failed flow still exits 0 because profile is read-only; failure is signalled in the per-step rows).
- 1: malformed trace, mixed flow names, mismatched step counts, or invalid --top.
- 2: file not found.

Implementation:
- chainweaver/viz.py: private `_render_step_bar_chart()` helper using unicode block characters; scales bars to the longest row, truncates tool names that exceed `name_width`.
- chainweaver/cli.py: `profile_command` + two private helpers (`_load_execution_result`, `_percentiles`) + a small linear-interp quantile (`_quantile`) so a single p95/p99 read doesn't require allocating the full list of percentile cut points.
- No new runtime dependency: pure stdlib statistics + viz string ops.

Tests: 12 new cases in tests/test_cli_profile.py covering:
- Single trace: table happy path, JSON shape, --top truncation, --top validation, failed-step marker, missing file (exit 2), malformed trace (exit 1).
- Multi trace: percentile output, table aggregation, mixed flow names (exit 1), mismatched step counts (exit 1), consistency-warning surfaces.
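For reviewers unfamiliar with the linear-interpolation quantile mentioned under Implementation, here is a minimal sketch of the idea. It is not the PR's `_quantile`; the function name `quantile`, its signature, and the sample durations are assumptions made for illustration. It interpolates between the two neighbouring order statistics and is compared against `statistics.quantiles(..., method="inclusive")`.

```python
import math
import statistics


def quantile(sorted_values, q):
    """Linear-interpolation quantile over pre-sorted data.

    Hypothetical stand-in for the PR's private helper; the real
    signature may differ.
    """
    if not sorted_values:
        raise ValueError("quantile() requires at least one value")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    # Fractional index into the sorted list: q=0 is the min, q=1 the max.
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low, high = sorted_values[lower], sorted_values[upper]
    return low + (high - low) * fraction


# Made-up step durations in milliseconds.
durations_ms = sorted([12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 90.0, 12.5])

p95 = quantile(durations_ms, 0.95)
# statistics.quantiles returns all 99 cut points; index 94 is the 95th percentile.
reference = statistics.quantiles(durations_ms, n=100, method="inclusive")
print(math.isclose(p95, reference[94]))
```

The single-value version does one index computation per call, whereas `statistics.quantiles(..., n=100)` builds every cut point before one can be read.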
Verification:
    $ ruff check chainweaver/ tests/ examples/     # All checks passed
    $ ruff format --check chainweaver/ tests/ ...  # 50 files already formatted
    $ python -m mypy chainweaver/ tests/           # Success: no issues
    $ python -m pytest tests/ -q --no-cov          # 491 passed in 1.86s

Stacked on top of #158 (run CLI subcommand); the base auto-retargets to #157 → main as those merge.

https://claude.ai/code/session_01QcSJ3NWhe5B4k1EP25Hx3n
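The commit message describes a bar-chart helper that scales bars to the longest row and truncates long tool names. The sketch below shows one way such a renderer could work. It is not the code in `chainweaver/viz.py`: the function name, the `(name, duration_ms)` row shape, the `bar_width` parameter, and the trailing ellipsis on truncated names are all assumptions.

```python
def render_step_bar_chart(rows, name_width=12, bar_width=20):
    """Render (tool_name, duration_ms) rows as a text bar chart, slowest first.

    Hypothetical sketch; the row shape and parameters are assumed.
    """
    if not rows:
        return ""
    # Sort by duration descending so the slowest step is the first row.
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    longest = ordered[0][1]
    lines = []
    for name, duration_ms in ordered:
        # Truncate names that exceed the name column (ellipsis is an assumption).
        if len(name) > name_width:
            name = name[: name_width - 1] + "…"
        # Scale each bar relative to the longest row.
        filled = round(bar_width * duration_ms / longest) if longest > 0 else 0
        bar = "█" * filled
        lines.append(f"{name:<{name_width}} {bar:<{bar_width}} {duration_ms:8.1f} ms")
    return "\n".join(lines)


# Made-up rows for illustration.
rows = [("fetch_user", 120.0), ("normalize_address_fields", 30.0), ("score", 60.0)]
chart = render_step_bar_chart(rows)
print(chart)
```

Scaling against the longest row keeps the chart width fixed regardless of absolute durations, which is why the slowest step always fills the whole bar column.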
Summary
Adds `chainweaver profile <traces...>` so operators can answer "which step is slow? is it always slow?" from `ExecutionResult` JSON files, without writing custom Python. Self-hosted, dep-free.

Stacked on top of #158 (run CLI); base auto-retargets through #158 → #157 → main as those merge.

Closes #147.
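To make the "is it always slow?" question concrete, here is a rough sketch of per-step aggregation across several traces, including the stdev > 50% of mean consistency check. It is not the PR's implementation: `summarize_step`, the `inconsistent` key, and the sample data are hypothetical, the sketch uses `statistics.quantiles` for brevity rather than a hand-rolled quantile, and it assumes sample standard deviation (`statistics.stdev`).

```python
import statistics


def summarize_step(durations_ms):
    """Aggregate one step's durations across traces (needs at least 2 values).

    Hypothetical helper; the key names are assumptions, not the PR's JSON shape.
    """
    cuts = statistics.quantiles(durations_ms, n=100, method="inclusive")
    mean = statistics.fmean(durations_ms)
    stdev = statistics.stdev(durations_ms)  # sample stdev (assumption)
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "mean": mean,
        "stdev": stdev,
        # Flag steps whose spread exceeds half of their mean duration.
        "inconsistent": mean > 0 and stdev > 0.5 * mean,
    }


# Made-up durations: one list per step, one entry per trace.
per_step = {
    "fetch": [100.0, 102.0, 98.0, 101.0],
    "score": [20.0, 200.0, 25.0, 22.0],
}
summary = {step: summarize_step(values) for step, values in per_step.items()}
for step, stats in summary.items():
    flag = "  (inconsistent)" if stats["inconsistent"] else ""
    print(f"{step}: p50={stats['p50']:.1f} p95={stats['p95']:.1f}{flag}")
```

In this made-up data, `fetch` is steady across traces while `score` has one outlier run, so only `score` trips the consistency check.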
Changes
- `chainweaver/viz.py`: private `_render_step_bar_chart(rows, ...)` helper using unicode block characters; scales bars to the longest row and right-truncates tool names that exceed `name_width`.
- `chainweaver/cli.py`: new `profile_command`, three private helpers (`_load_execution_result`, `_percentiles`, `_quantile`), single-trace and multi-trace renderers, module-docstring update.
- `tests/test_cli_profile.py`: new test file (12 cases).

Behavior
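Single trace:
- Total wall-clock + sum-of-step ms + orchestration overhead.
- Per-step ASCII bar chart sorted by `duration_ms` descending.
- `--top N` (default 10) truncates and surfaces "M more step(s) not shown".

Multi trace:
- Verifies all traces share the same `flow_name` and step count (exits 1 with a clear message otherwise).
- Per-step p50 / p95 / p99 / mean / stdev via stdlib `statistics`.
- Consistency warning when a step's stdev > 50% of its mean.

`--format json` emits a stable, machine-readable shape (single-trace + multi-trace branches differ by `trace_count`).

Exit codes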
- `0`: analysis ran. Note: a failed flow still exits `0` because `profile` is read-only; the failure is signalled in the per-step rows (`ERR`).
- `1`: malformed trace, mixed flow names, mismatched step counts across traces, or invalid `--top`.
- `2`: file not found.

Testing
- `ruff check chainweaver/ tests/ examples/`
- `ruff format --check chainweaver/ tests/ examples/`
- `python -m mypy chainweaver/ tests/`

Diff stat: 3 files changed, 599 insertions(+), 2 deletions(-).

Related Issues
Closes #147. Sister tool to the upcoming `chainweaver diff` (#148).

Checklist
- … `AGENTS.md` and `docs/agent-context/`)
- `_render_step_bar_chart` is intentionally private (single in-package consumer)

Tradeoffs / risks
- New test file (`tests/test_cli_profile.py`) rather than appending to `tests/test_cli.py`. Justification: the existing `test_cli.py` is now 778 lines after `run`; splitting per verb keeps each file tractable. Owner-mode scope-delta call.
- `_quantile` instead of `statistics.quantiles(..., n=100)`: cheaper for single p95/p99 reads (avoids allocating the 99-element list of percentile cut points). Linear interpolation matches the `inclusive` method semantics.
- `profile` on a failed flow exits 0: the analysis itself succeeded; the failure shows up as `ERR` in the rows. This matches the read-only contract of `inspect`/`viz` and avoids conflating "couldn't analyze" with "the analyzed flow failed."

Scope notes
Closes #147 only. Adjacent items deferred: `chainweaver diff` (#148, next PR in this stack), example-trace fixture under `examples/` (`tests/fixtures/`-style fixtures suffice for the in-test coverage), and the planned MkDocs site (`docs/cli.md`, depends on #133).

https://claude.ai/code/session_01QcSJ3NWhe5B4k1EP25Hx3n
Generated by Claude Code