feat: bugfix workflow redesign with new /fix command by maxritter · Pull Request #135 · maxritter/pilot-shell

maxritter · 2026-04-29T08:14:46Z

Summary

Adds /fix as the standard bugfix command — investigate, RED test, fix, audit, done — with mandatory end-to-end verification. Slims the existing /spec bugfix lane substantially. Addresses the issue where two simple bugs took 21+ minutes and 400k tokens halfway through.

/fix — new user-invocable command

New skill at pilot/skills/fix/ with six steps (investigate → RED → fix → audit → quality → finalise)
Always quick: stops cleanly and tells the user to re-invoke with /spec when complexity exceeds the quick lane (no --full flag, no auto-routing between commands — honour the user's command choice)
Iron Law docs: reorder setup with less context switching #4: never claim done from unit tests alone. Step 4.3 mandates running the actual program with concrete evidence
- UI: Claude Code Chrome → Chrome DevTools MCP → playwright-cli → agent-browser
- CLI / API / library / background job: real invocation with captured output
Step 1.4 adds explicit Instrument-when-needed guidance with SPEC-DEBUG: markers (cleanup grep enforced at audit)
Registered as user-invocable, model: opus

Slimmed full bugfix lane (/spec)

Codex dropped from spec-bugfix-plan (was step 7) — adversarial review on a plan with no code yet was overhead. Codex still runs at spec-verify time.
Codex-once rule across spec-plan + spec-verify: a session-scoped sentinel file prevents re-runs across plan iterations within one /spec invocation
spec-bugfix-verify: 11 → 8 steps. Dropped redundant plan-verify, runtime-verification, post-merge-verify
Step 2 (Behavior Contract audit): tightened from ~6.4KB to ~3.8KB. Step 2.6 now enforces mandatory E2E verification with concrete evidence — no claim-VERIFIED-on-tests-alone
spec-bugfix-plan: tightened red-flags step with Common Rationalizations and User Signals tables (from superpowers/systematic-debugging). Step 3.3 adds concrete multi-layer instrumentation example. Step 4 fix-approach defaults to picking the obvious fix.
spec-implement Task 2 (bugfix lane): drops the per-task full-suite run — single biggest token sink in bundled bugfix plans. Full suite now runs once at the Quality Gate.

Iteration cap

spec-bugfix-verify Step 8 caps fix iterations at 3. After three failed verify cycles, AskUserQuestion presents Continue/Pivot/Abandon — stops the fix-on-fix-on-fix death spiral.

Dispatcher / model defaults / Console

/spec dispatcher: clean split — /spec routes bugs to spec-bugfix-plan, /fix is a separate command for the quick lane
launcher/model_config.py: /fix and /benchmark added to default skill models (opus). Console useSettings.ts mirrors.
Console Settings page: General Rows now lists /fix and /benchmark with descriptive labels + sub-text. SPEC_PHASE_ROWS rendered with sub-text.
launcher/banner.py + installer/steps/finalize.py: /fix and /benchmark added to quick-start tips and post-install workflow list
pilot/settings.json: companyAnnouncements line lists /fix; spinner tip updated

Docs

New Docusaurus page docs/workflows/fix.md with workflow diagram, Common Issues table, mandatory E2E section, /spec vs /fix decision table
/spec page trimmed of bugfix-only content (now points at /fix)
README: new /fix section with How /fix works + How /spec handles bugs details blocks
Marketing site WhatsInside.tsx updated to mention /fix

Other-session changes incorporated

Console: parsePlanContent.ts + tests for spec section rendering and plan annotator persistence (Bug A + Bug B fixes from a parallel session)
pilot/hooks/tool_redirect.py + tests
pilot/.mcp.json + hooks/hooks.json + scripts/*.cjs minor updates
pilot/rules/development-practices.md + mcp-servers.md updates
installer/downloads.py minor
pilot/ui/* compiled bundles rebuilt
.devcontainer/* tweaks

Verification

✅ 1690 Python tests pass (launcher + hooks + installer)
✅ Console: 1438 tests pass + tsc --noEmit clean
✅ ruff check clean on all files I touched

Summary by CodeRabbit

Release Notes

New Features
- Pilot Bot: 24/7 automation agent with scheduled job execution.
- /benchmark workflow: Quantitative before/after evaluation with graded assertions.
- /fix workflow: Streamlined bugfix process with RED→fix→verify discipline.
- Customization framework: Team-wide Pilot Shell workflow and configuration overrides.
- Chrome DevTools MCP: Enhanced browser automation as enterprise fallback.
- Context-mode: Sandbox routing for large outputs with significant token savings.
Improvements
- Resilient download retries with targeted HTTP 5xx backoff.
- Native SQLite support and Claude CLI plugin marketplace integration.
- Refined skill artifact handling and generation.

… filters, and context-mode integration - Redesign Console dashboard: 8 clickable stat cards (Projects, Sessions, Active, Memories, Extensions, Requirements, Specifications, Changes), Recent Specifications and Recent Memories cards - Flatten sidebar navigation: single nav list with inline ProjectFilter dropdowns per view - Add TopbarSpecs: active spec pills in top bar across all projects - Add notification navigation with project switching and clear-all support - Add session resume workflow: copy session ID, /resume deep-linking - Memory cards: show linked session, click to navigate, remove clutter - Cross-project APIs: /api/plans/active/all, /api/extensions?all=true - Project list from project_roots table (filtered to real git repos) - Update branding: Make Claude Code production-ready - Add context-mode integration for token-optimized context management - Update README, Docusaurus docs, and marketing site BREAKING CHANGE: Console UI completely restructured — sidebar flattened, project selector removed from sidebar and replaced with per-view inline filters, dashboard cards redesigned, PlanStatus and GitStatus widgets removed

…able)

…s, background jobs, and optional Telegram

# [8.0.0](v7.11.4...v8.0.0) (2026-04-08) ### Bug Fixes * pin Bun to 1.3.11 (1.2.15 lacks compression APIs, latest is unstable) ([2372ea7](2372ea7)) ### Features * add Pilot Bot — persistent automation agent with scheduled tasks, background jobs, and optional Telegram ([cab87f6](cab87f6)) * complete Console overhaul — v8.0.0 ([#126](#126)) ([bfa261f](bfa261f)) ### BREAKING CHANGES * Console UI completely restructured — sidebar flattened, project selector removed from sidebar and replaced with per-view inline filters, dashboard cards redesigned, PlanStatus and GitStatus widgets removed * fix: remove duplicate Help slide, update hooks count to 21 * fix: pin Bun to 1.2.15 in CI (latest has execSync regression on Linux)

…t skills

## [8.0.1](v8.0.0...v8.0.1) (2026-04-08) ### Bug Fixes * use correct CronCreate parameter name (cron, not schedule) in bot skills ([ad7f4af](ad7f4af))

…down, session filtering, spec tab consistency

## [8.0.2](v8.0.1...v8.0.2) (2026-04-09) ### Bug Fixes * console dashboard overhaul — 2x2 recent cards, usage model breakdown, session filtering, spec tab consistency ([3648a2d](3648a2d))

## [8.0.3](v8.0.2...v8.0.3) (2026-04-09) ### Bug Fixes * Improved Code Reviewer Stability, Codegraph Usage and Context Mode ([e78d68b](e78d68b))

## [8.0.4](v8.0.3...v8.0.4) (2026-04-09) ### Bug Fixes * Improved Dependency Installation Speed for Installer and Updater ([e426d3c](e426d3c)) * Improved PRD and Spec Sharing / Annotation UI on pilot-shell.com ([afc32f7](afc32f7))

## [8.0.5](v8.0.4...v8.0.5) (2026-04-09) ### Bug Fixes * skip CodeGraph indexing in non-git directories ([b4281df](b4281df))

## [8.0.6](v8.0.5...v8.0.6) (2026-04-09) ### Bug Fixes * walk up directory tree for git repo detection in CodeGraph guard ([f028b57](f028b57))

…ortal ssr leak

## [8.0.7](v8.0.6...v8.0.7) (2026-04-10) ### Bug Fixes * add tool output compression + sandboxed content cards to Usage view ([46127a1](46127a1)) * hook venv sync, console annotation writes, codegraph native sqlite ([4e9d264](4e9d264)) * restore dom globals after terminal-preview-xss test to prevent portal ssr leak ([77eda81](77eda81)) * stop incomplete child_process mocks poisoning CI test runs ([7e30074](7e30074))

…ier, fix auto-mode flag and docs - Add chrome-devtools-mcp plugin installation to installer (marketplace + parallel install) - Update browser automation from 3-tier to 4-tier: Chrome → Chrome DevTools MCP → playwright-cli → agent-browser - Update rules (browser-automation, verification, testing), skills (spec-verify, spec-bugfix-verify, spec-implement) - Update docs: Docusaurus (open-source-tools, installation, mcp-servers, rules, spec workflow), site, README - Remove deprecated --enable-auto-mode flag from wrapper (fixes user-reported launch errors) - Fix project filtering to support git repo subdirectories in Console - Update hooks documentation count and footer links

## [8.0.10](v8.0.9...v8.0.10) (2026-04-14) ### Bug Fixes * codegraph native SQLite repair, /spec new-branch option, session prompt display ([21fb7b6](21fb7b6))

…anced UI - Add SessionJsonlService to parse Claude Code JSONL session files - Extract token usage, tool breakdown, model distribution, duration, cost - Cost calculation with up-to-date pricing for Opus 4.5+/4, Sonnet, Haiku tiers - New GET /api/sessions/:id/stats and POST /api/sessions/costs endpoints - Session detail modal with stats grid, prompt, tool usage (2-col layout) - Session cards show cost and total messages instead of observations - Sortable session list columns (date, cost, messages, prompts) - Dashboard shows cost per session, cross-project navigation - 15 unit tests with JSONL fixtures

# [8.1.0](v8.0.10...v8.1.0) (2026-04-15) ### Features * add session details with JSONL stats, cost calculation, and enhanced UI ([0590783](0590783))

- Add customization packs: pilot customize install/update/remove/status CLI commands for teams on Team/Enterprise tiers to overlay custom skills, rules, hooks, and agents on top of core Pilot via a git repository. Auto-reapplied on pilot update and installer runs. Security: symlink rejection, license validation, atomic manifest writes, stale file cleanup on re-apply. - Add Opus 4.7 model support across launcher, console, and documentation. - Support custom plan statuses in Console and statusline — teams can write arbitrary Status: values in plan files and they display as-is with neutral styling instead of being hidden. - Never skip Codex adversarial review in spec-plan/spec-verify/spec-bugfix-plan — wait for the task completion notification, never read the output file to determine completion. - Docs: new customization page, updated pricing, extensions cross-ref, README.

# [8.2.0](v8.1.0...v8.2.0) (2026-04-16) ### Features * customization packs, Opus 4.7 support, and Codex bugfixes ([af711a4](af711a4))

…rtifacts untracked

## [8.2.1](v8.2.0...v8.2.1) (2026-04-17) ### Bug Fixes * customization overrides, skill decomposition polish, generated artifacts untracked ([ec1c268](ec1c268)) * Removed static effort for skills/commands to adjust based on global ([4fd9050](4fd9050))

## [8.2.2](v8.2.1...v8.2.2) (2026-04-17) ### Bug Fixes * Updated inconsistent steps in spec workflow ([a1e6118](a1e6118))

… skills

## [8.2.3](v8.2.2...v8.2.3) (2026-04-17) ### Bug Fixes * drop skill banner, normalize step heading levels across workflow skills ([05d37fa](05d37fa))

…n can't leak workers Fixes #129. ## Problem The SessionEnd hook does two side-effects: 1. POST session-complete to Console (localhost:41777) 2. Stop the bun worker-service.cjs if this is the last active session Both currently run synchronously in the hook process: - `_complete_session` calls `urllib.request.urlopen(..., timeout=5)` inline - `main` calls `subprocess.run(["bun", ...], timeout=15)` inline When Claude Code's harness cancels the SessionEnd hook (which it does after a short internal deadline), the hook process is terminated before either side-effect completes. Symptoms: - Terminal shows "SessionEnd hook [...] failed: Hook cancelled" - bun worker-service.cjs daemons accumulate as orphans across sessions - Orphans hold open file handles to ~/.pilot/memory/pilot-memory.db (WAL) but serve no traffic (only the newest is bound to :41777) Reproduced on Arch/Manjaro with Claude Code 2.x — observed three running daemons (two orphaned) after a handful of sessions. ## Fix Hand both side-effects off to detached subprocesses (`start_new_session=True`) before returning from `main`. The parent hook becomes essentially instantaneous (two `Popen` calls, no waiting), so harness cancellation no longer races the work. Ordering: worker-stop first, Console POST second. The worker-stop is the leakier side effect (holds file descriptors + network port); prioritising it means that even in a pathological case where the second `Popen` fails, the critical resource release still happens. Timeouts inside the detached workers are preserved (Console POST 5s, no explicit timeout on `bun stop` — unchanged from previous behaviour because it now runs in its own session and doesn't tie up the hook). The Console POST logic is inlined as a string constant (`_COMPLETE_SESSION_WORKER`) passed to `python -c`. This avoids creating a new file in the repo and keeps the worker self-contained; the subprocess startup cost (~30ms) is the only overhead when Console is down. ## Verification - `python -m py_compile` on the modified file: passes - `main()` with no env vars: returns 0 cleanly (short-circuits on missing CLAUDE_PLUGIN_ROOT) - Inlined worker script compiles and executes cleanly when invoked with a live Console (tested with `python3 -c <script> <url> <sid>` → POST 200) - Behavioural smoke test: run a session, exit, observe no "Hook cancelled" message and no orphan `bun worker-service.cjs --daemon` processes in `ps` - Data safety: workers are stateless HTTP fronts over shared SQLite-WAL; killing the old synchronous path's leaked workers with SIGTERM had no data-integrity impact (PRAGMA integrity_check = ok) ## Why not just increase the hook timeout? The hook timeout is controlled by the harness, not this repo. Even if it were raised, network stalls / bun startup latency / loaded-disk fsync spikes can all push a synchronous implementation past any fixed deadline. Detaching removes the class of failure entirely.

…coped annotations Bundles several independent fixes under one release. ## Vector DB disk exhaustion (closes #130) hnswlib's persisted `link_lists.bin` could grow to 11.5 TB logical / 859 GB physical, filling a 1 TB WSL disk. Adds three independent layers so the file cannot grow unbounded: - `VectorDbSizeGuard` module with both logical (`st.size`) and physical (`st.blocks * 512`) measurement so sparse-file bloat is detected. - Boot-time guard in `DatabaseManager` evicts the dir before Chroma starts (defaults: 2 GB physical / 50 GB logical, configurable via `CLAUDE_PILOT_VECTOR_DB_MAX_PHYSICAL_MB` / `CLAUDE_PILOT_VECTOR_DB_MAX_LOGICAL_MB`). - Pre-write gate in `ChromaSync.assertSafeToWrite()` — every `chroma_add_documents` / `chroma_create_collection` call checks size first (tighter caps: 1 GB physical / 10 GB logical). hnswlib never receives a write that would extend an already-oversized file. - Post-retention guard in `RetentionScheduler` re-checks after every cleanup and closes Chroma if tripped. - Auto-disable after 2 evictions within 7 days atomically flips `CLAUDE_PILOT_CHROMA_ENABLED=false` in `settings.json`. Search falls back to SQLite via existing `SearchOrchestrator` path (same escape hatch as claude-mem #991). - `/api/vector-db/health` now reports physical size, largest file, and last-eviction marker so operators can see what happened. ## Project-scoped annotations in Changes view Annotations were visible across projects — two concurrent "Changes" sessions showed each other's annotations on the same screen. Scopes `AnnotationRoutes` queries by project and filters the Changes view accordingly. ## Model routing settings UI Surface per-role model selection (opus/sonnet/haiku) in the console Settings view. Launcher `model_config.py` + `settings_injector.py` read `~/.pilot/config.json` and inject model preferences into plugin settings on startup. ## Spec workflow refinements - spec-bugfix-plan: expanded red-flags step, tightened root-cause plan + write-plan steps. - spec-bugfix-verify: behaviour-contract audit in verify-fix and plan-verify steps. - spec-implement: TDD-loop guidance updated. - docs: workflows/spec.md and features/{console,model-routing, statusline}.md updated to match. ## Statusline widgets Widget formatter updates with corresponding test coverage. ## Installer `claude_files.py` gains ~216 lines for non-destructive settings/skill merges on re-install; tests updated accordingly. ## Bundled assets `pilot/scripts/*.cjs` and `pilot/ui/*.js` regenerated from `console/src/` to match the fixes above. ## Verification - Bun: 1349 pass / 0 fail (22 new tests for `VectorDbSizeGuard`) - Python: 1574 pass / 0 fail - typecheck clean

## [8.2.4](v8.2.3...v8.2.4) (2026-04-21) ### Bug Fixes * **hooks:** make SessionEnd fully non-blocking so harness cancellation can't leak workers ([3cb965a](3cb965a)) * prevent vector-db disk exhaustion + model routing UI + project-scoped annotations ([1a066d7](1a066d7)), closes [#130](#130) [#991](https://github.com/maxritter/pilot-shell/issues/991)

…e task cards - statusline: read rate_limits from Claude Code stdin (cross-platform, no OAuth), single-percentage pacing arrow, fall back to lines+git when the field is absent (API/Enterprise users). Drop token-count suffix on the context bar. - installer: vendor installer/skill_builder.py and build SKILL.md in-process during install; no longer shells out to the pilot binary. Drift-guard test keeps it byte-identical to launcher/skill_builder.py. Bump parallel download workers 8 -> 16. - launcher/wrapper: generic capability probe caches claude --help once and appends --thinking-display summarized when supported. - launcher/codegraph: silence redundant native-SQLite repair message. - console/Topbar: spec pills now show hover cards with parsed task lists (completed + next pending) and bugfix/feature icons. - console/Changes: prefer main-branch plans over worktree plans when auto-selecting active plan so annotations match the current checkout; CodeReviewPanel filters diffs by path. - skills: prd/03-clarify skips questions already answered upstream; spec-bugfix-plan gains the git log -S pattern for token-change forensics; verify steps tightened. - settings: enable showThinkingHint. - ignore local claude-pace/ reference clone.

# [8.3.0](v8.2.4...v8.3.0) (2026-04-23) ### Features * cross-platform statusline usage, in-process skill build, console task cards ([9f394be](9f394be))

Primary feature: /benchmark — quantitative before/after measurement for Claude Code skills and rules. Isolated per-run sandboxes, crash-proof global-rule auto-hiding, parallel workers, and an LLM grader that emits structured per-assertion verdicts plus improvement suggestions. - pilot/skills/benchmark/: orchestrator + 6 step files (intake → target discovery → author evals → execute → present findings → improvement plan), runner.py with parallel workers and per-run filesystem isolation, aggregate_benchmark.py, isolation.py, progress.py, utils.py, agents/grader.md, and ~1800 LOC of unit test coverage - docs/docusaurus/docs/workflows/benchmark.md: workflow documentation Also in this commit: - /prd skill: new 02b-ideate step, orchestrator and 8 steps updated - console: pilot-spawn utility + tests, license routes tests, user-message handler fixes, worker-service refactor, Dashboard/Requirements/Topbar viewer updates - installer: finalize step + claude_files test updates - launcher: banner and statusline formatter updates - CI/release: release-dev and release workflow tweaks, pre-commit hook - docs/site: ConsoleSection, InstallSection, WorkflowSteps, PRD and spec workflow pages, intro, statusline feature page - misc: README, .gitignore, pyproject.toml

# [8.4.0](v8.3.0...v8.4.0) (2026-04-24) ### Features * add /benchmark skill and evaluation framework ([843ef85](843ef85))

console/usage: - Replace external ccusage CLI with native Claude session analytics - New pricing engine (LiteLLM fetch + bundled fallback + model aliases) filters out <synthetic>/unknown internal markers - New /api/usage/{daily,monthly,models,yield,plan} endpoints; cross-view parity ensures Sessions and Usage agree on cost - Restructured Usage view: equal-height pairs (Cost-by-Model + Routing, Code-Quality + Tool-Output-Compression), Plan progress below - Cost-by-Model groups by family (Opus/Sonnet/Haiku) with distinct colors - Code-Quality card: Day/Week/Month toggle, commits-shipped headline, cost-per-shipped-commit, productive-spend %, degraded-state surfacing - Plan tracking via SettingsRoutes (Pro/Max/Custom + month-end forecast) - Yield uses execFile + per-call timeouts + ref validation + HEAD-sha cache ccusage scrub: - Drop install_ccusage from installer + tests, uninstall hint, and the open-source-tools docs row; codeburn/ source repo deleted pilot/hooks: - New tool_redirect hook (with tests) to redirect deferred sub-agent flows pilot/rules: - New documentation-sync rule; tweaks to development-practices, testing, and verification pilot/skills: - benchmark: runner/utils tightening, review-iterate + improvement-plan step rewrites, new test coverage - setup-rules: split out new sync-agents step, renumber summary 11 -> 12 docs: - Refreshed benchmark, setup-rules, statusline, rules, intro pages - Site SEO/sitemap/indexnow plumbing, hero copy, Schema.org logo fix - Stddev/pass-rate disambiguation and removal of phantom Step 6 refs in benchmark.md (CodeRabbit findings)

## [8.4.1](v8.4.0...v8.4.1) (2026-04-27) ### Bug Fixes * native usage analytics, ccusage scrub, hook + rules + site updates ([c7cd604](c7cd604))

Adds /fix as the standard bugfix command — investigate, RED test, fix, audit, done — with mandatory end-to-end verification. Slims the existing /spec bugfix lane substantially. The workflow change addresses the issue where two simple bugs took 21+ minutes and 400k tokens halfway through. ## /fix — new user-invocable command - New skill at pilot/skills/fix/ with six steps (investigate → RED → fix → audit → quality → finalise). Always quick: stops cleanly and tells the user to re-invoke with /spec when complexity exceeds the quick lane. - Iron Law #4: never claim done from unit tests alone. Step 4.3 is mandatory end-to-end verification — UI bugs must run browser automation via Claude Code Chrome / Chrome DevTools MCP / playwright-cli / agent-browser; CLI/API/library bugs must be exercised with real invocations and concrete evidence captured in the completion report. - Step 1.4 adds explicit Instrument-when-needed guidance with SPEC-DEBUG markers (cleanup grep enforced at audit). - Registered as user-invocable, model: opus. ## Slimmed full bugfix lane (/spec) - Dropped Codex from spec-bugfix-plan entirely (was step 7) — adversarial review on a plan with no code yet was overhead. Codex still runs at spec-verify time. - Codex-once rule across spec-plan + spec-verify: a session-scoped sentinel file prevents re-runs across plan iterations within one /spec invocation (was re-running on every annotate/verify cycle). - spec-bugfix-verify: 11 steps → 8. Dropped redundant plan-verify (no-op), runtime-verification (foldable into quality), post-merge-verify (squash merges with no conflicts are safe). - spec-bugfix-verify Step 2 (Behavior Contract audit): tightened from ~6.4KB to ~3.8KB. Sub-steps 2.5/2.6 merged into 2.4. Step 2.6 now enforces mandatory E2E verification with concrete evidence — no claim-VERIFIED-on-tests-alone. - spec-bugfix-plan: tightened red-flags step with Common Rationalizations and User Signals tables (inspired by superpowers/systematic-debugging). Step 3.3 adds a concrete multi-layer instrumentation example. Step 4 fix-approach-selection defaults to picking the obvious fix rather than manufacturing fake alternatives. - spec-implement Task 2 (bugfix lane): drops the per-task full-suite run. For bundled bugfix plans this was the single biggest token sink — full suite now runs once at the Quality Gate, targeted module test runs between fix iterations. ## Iteration cap - spec-bugfix-verify Step 8 caps fix iterations at 3. After three failed verify cycles, AskUserQuestion presents Continue/Pivot/Abandon — stops the fix-on-fix-on-fix death spiral that previously had no bound. ## Dispatcher / model defaults / Console - /spec dispatcher restored to clean split: /spec routes bugs to spec-bugfix-plan (full lane), /fix is a separate user-facing command for the quick lane. No --full flag, no auto-routing between commands. - launcher/model_config.py: /fix and /benchmark added to default skill models (opus). Console useSettings.ts mirrors. - Console Settings page: General Rows now lists /fix and /benchmark with descriptive labels + sub-text. SPEC_PHASE_ROWS rendered with sub-text. - launcher/banner.py + installer/steps/finalize.py: /fix and /benchmark added to quick-start tips and post-install workflow list. Spec-Driven feature description updated to mention both /spec and /fix. - pilot/settings.json: companyAnnouncements line now lists /fix; spinner tip updated. ## Docs - New Docusaurus page docs/workflows/fix.md with workflow diagram, Common Issues table, mandatory E2E section, /spec vs /fix decision table. Sidebars updated. - /spec page trimmed of bugfix-only content (now points at /fix). - README: new /fix section with How /fix works + How /spec handles bugs details blocks. - Marketing site WhatsInside.tsx updated to mention /fix. ## Other-session changes incorporated - Console: parsePlanContent.ts + tests for spec section rendering and plan annotator persistence (Bug A + Bug B fixes). - pilot/hooks/tool_redirect.py + tests: extended. - pilot/.mcp.json + hooks/hooks.json + scripts/*.cjs: minor updates. - pilot/rules/development-practices.md + mcp-servers.md: updates. - installer/downloads.py: minor. - pilot/ui/* compiled bundles: rebuilt. - .devcontainer/*: minor adjustments.

vercel · 2026-04-29T08:14:51Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
pilot-shell	Ignored		Apr 29, 2026 8:14am

coderabbitai · 2026-04-29T08:14:59Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR represents version 8.4.1, repositioning Pilot Shell as a "Claude Code Engineering Platform" with major documentation overhaul, new workflows (/benchmark, /fix), platform features (Pilot Bot, customization system), substantial installer refactoring (skill-builder, plugin installers, customization reapplication), and comprehensive site/SEO metadata updates.

Changes

Cohort / File(s)	Summary
Infrastructure & Environment `.devcontainer/Dockerfile`, `.devcontainer/devcontainer.json`, `.githooks/pre-commit`, `.gitignore`	Added OS-level dependencies (bubblewrap, iptables, ipset), persistent Claude config volume mounting, benchmark test detection in pre-commit, and expanded .gitignore patterns for generated artifacts (benchmarks, skill outputs, local configs).
CI/CD Workflows `.github/workflows/release-dev.yml`, `.github/workflows/release.yml`	Extended pytest to include benchmark tests; pinned Bun version to 1.3.11 (replacing `latest`) for consistent console test/build toolchain.
Version & Changelog `CHANGELOG.md`, `installer/__init__.py`	Added extensive changelog sections for versions 8.4.1→8.0.0 covering new features (benchmark framework, Pilot Bot, customization), bug fixes, and improvements; updated installer version from 7.11.4 to 8.4.1.
Installer Core Logic `installer/skill_builder.py`, `installer/downloads.py`, `installer/steps/claude_files.py`, `installer/steps/config_migration.py`, `installer/steps/dependencies.py`, `installer/steps/finalize.py`	Introduced skill_builder module for normalized SKILL.md generation; enhanced download retry logic with HTTP status differentiation; added skill SKILL.md materialization with fragment recovery and customization reapplication; refactored dependencies to use Claude CLI plugins (context-mode, Codex, Chrome DevTools MCP), better-sqlite3, and parallel task execution; updated config migration to preserve worktree choices; refined finalize UI for new workflows.
Installer Tests `installer/tests/unit/steps/test_claude_files.py`, `installer/tests/unit/steps/test_config_migration.py`, `installer/tests/unit/steps/test_dependencies.py`, `installer/tests/unit/steps/test_dependencies_playwright.py`, `installer/tests/unit/steps/test_finalize.py`, `installer/tests/unit/test_*.py`, `installer/tests/unit/test_skill_builder.py`	Expanded test coverage for SKILL.md building, customization reapplication, config migration with worktree preservation, new plugin installers, better-sqlite3, codegraph git detection, parallel task execution, browser ARM64 installs, and skill_builder module vendoring integrity.
Documentation - Platform Messaging `README.md`, `docs/docusaurus/docs/intro.md`, `installer/ui.py`	Repositioned as "Claude Code Engineering Platform" with plan→implement→verify workflow, persistent context, quality hooks, and local dashboard visibility; removed prior generic "What's Inside" sections.
Documentation - Workflows `docs/docusaurus/docs/workflows/benchmark.md`, `docs/docusaurus/docs/workflows/fix.md`, `docs/docusaurus/docs/workflows/prd.md`, `docs/docusaurus/docs/workflows/spec.md`, `docs/docusaurus/docs/workflows/create-skill.md`, `docs/docusaurus/docs/workflows/setup-rules.md`	Added `/benchmark` (quantitative evaluation with assertions) and `/fix` (RED→GREEN bugfix discipline); updated `/prd` to emphasize brainstorming/divergent ideation; enhanced `/spec` with bugfix routing, branch strategies, and Chrome DevTools MCP for E2E; refined `/setup-rules` phases and `/create-skill` messaging.
Documentation - Features `docs/docusaurus/docs/features/bot.md`, `docs/docusaurus/docs/features/cli.md`, `docs/docusaurus/docs/features/console.md`, `docs/docusaurus/docs/features/context-optimization.md`, `docs/docusaurus/docs/features/customization.md`, `docs/docusaurus/docs/features/extensions.md`, `docs/docusaurus/docs/features/hooks.md`, `docs/docusaurus/docs/features/mcp-servers.md`, `docs/docusaurus/docs/features/model-routing.md`, `docs/docusaurus/docs/features/open-source-tools.md`, `docs/docusaurus/docs/features/remote-control.md`, `docs/docusaurus/docs/features/rules.md`, `docs/docusaurus/docs/features/statusline.md`	Introduced Pilot Bot documentation with skills and scheduled-task automation; added Customization guide for team-wide config/skill overlays; updated MCP servers to document context-mode sandbox routing and Chrome DevTools MCP; refined console views (Requirements, Dashboard, Sessions, Memories, Specifications); updated model routing for Opus 4.7; expanded rules with documentation-sync and context-mode; revised statusline for rate_limits-driven pacing display.
Documentation - Getting Started `docs/docusaurus/docs/getting-started/installation.md`, `docs/docusaurus/docs/getting-started/prerequisites.md`, `docs/docusaurus/docs/getting-started/permission-modes.md`	Updated browser automation flow to include Chrome DevTools MCP as enterprise fallback; repositioned Codex from optional to included with auto-setup; updated Auto Mode requirements to Claude Sonnet 4.6 or Opus 4.7.
Docusaurus Configuration `docs/docusaurus/docusaurus.config.ts`, `docs/docusaurus/sidebars.ts`	Updated tagline and added trailingSlash control plus social/SEO metadata (site image, keywords, Twitter card, Open Graph); extended sidebar with new workflow/feature entries.
Site Build & Distribution `docs/site/build-all.sh`, `docs/site/vercel.json`	Updated build script to handle trailingSlash: false routing (docs.html, search.html, sitemap placement under docs/); added Vercel redirects from claude-pilot.com domains to pilot-shell.com.
Site Pages & SEO `docs/site/index.html`, `docs/site/public/manifest.json`, `docs/site/public/robots.txt`, `docs/site/public/0bd196f90bbc9ec8113bd78de2507fb2.txt`	Extensive SEO overhaul: updated title, description, keywords; added schema.org (WebSite, SoftwareApplication, HowTo, FAQPage); added dns-prefetch hints, IndexNow verification, noscript fallback; updated robots.txt with explicit crawler policies; updated manifest description to platform tagline; added IndexNow key verification file.
Site Components - Messaging & Navigation `docs/site/src/components/HeroSection.tsx`, `docs/site/src/components/Logo.tsx`, `docs/site/src/components/NavBar.tsx`, `docs/site/src/components/Footer.tsx`, `docs/site/src/components/SEO.tsx`, `docs/site/src/components/InstallSection.tsx`	Updated hero section with "Pilot Shell" headline and new feature highlights; changed badge labels (Remote Control→LSP Servers, Reusable Skills→Pilot Bot); updated docs link normalization; refreshed taglines and default SEO metadata; changed /prd label from requirements to brainstorm.
Site Components - Feature Content `docs/site/src/components/ConsoleSection.tsx`, `docs/site/src/components/DeepDiveSection.tsx`, `docs/site/src/components/WhatsInside.tsx`, `docs/site/src/components/WorkflowSteps.tsx`	Reordered console slides with updated descriptions; updated Hooks Pipeline stage/file names and hook count; replaced MCP server names; added context-mode content emphasis; refactored WhatsInside with href links and new Customization feature; expanded WorkflowSteps with /benchmark workflow and /prd emphasis on brainstorming.
Site Components - FAQ & Testimonials `docs/site/src/components/FAQSection.tsx`, `docs/site/src/components/TestimonialsSection.tsx`, `docs/site/src/components/PricingSection.tsx`	Updated FAQ model default to Opus 4.7 and added customization FAQ; refreshed testimonials with new quotes and unified role labels; updated pricing tiers (Solo: Pilot Bot 24/7 automation, Team: Customization).
Site Annotation & Sharing `docs/site/src/components/feedback/AnnotationToolbar.tsx`, `docs/site/src/components/feedback/BlockRenderer.tsx`, `docs/site/src/components/feedback/FeedbackSidebar.tsx`, `docs/site/src/components/feedback/index.ts`, `docs/site/src/lib/annotation/types.ts`, `docs/site/src/lib/annotation/useAnnotation.ts`, `docs/site/src/lib/annotation/index.ts`, `docs/site/src/lib/sharing/types.ts`, `docs/site/src/pages/Shared.tsx`	Removed AnnotationToolbar component and pending-selection state management; updated BlockRenderer styling (Amber→Blue palette) and removed onBlockMouseUp prop; simplified FeedbackSidebar empty state UI; added contentType field to SharePayload for requirement vs specification labeling; refactored Shared page to remove selection-based submission plumbing.
Site Plugins & Configuration `docs/site/vite-plugin-indexnow.ts`, `docs/site/vite-plugin-sitemap.ts`, `docs/site/vite.config.ts`, `docs/site/tsconfig.app.json`	Added IndexNow plugin for URL indexing submission; updated sitemap plugin to generate sitemap-pages.xml with priority/changefreq and sitemap.xml index; integrated plugins into Vite config; removed baseUrl from tsconfig.
Install Script `install.sh`	Added conditional restart message based on PILOT_RESTART_BOT_MODE environment variable.
Platform Utilities `installer/platform_utils.py`	Removed redundant blank line.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

feat: replace Teams with Skillshare-based Share system, streamline spec workflow and hooks #95: Modifies config_migration.py's worktree logic, directly related to this PR's preservation of user worktree choices.
feat: complete Console overhaul — v8.0.0 #126: Overlapping changes to installer dependencies (context-mode, better-sqlite3), hooks/docs/statusline updates, and config migration behavior.
feat: plan annotations, code review mode, E2E test scenarios in spec workflow #121: Touches the same annotation UI/library code paths (AnnotationToolbar, BlockRenderer, annotation hooks) that are refactored in this PR.

Suggested labels

released

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.66% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: introduction of a new /fix command with a redesigned bugfix workflow, which aligns with the PR's core objective.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch dev

⚔️ Resolve merge conflicts

Resolve merge conflict in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

maxritter and others added 30 commits April 8, 2026 13:01

fix: remove duplicate Help slide, update hooks count to 21

12fdb50

fix: pin Bun to 1.2.15 in CI (latest has execSync regression on Linux)

1e9d5c9

fix: pin Bun to 1.3.11 (1.2.15 lacks compression APIs, latest is unst…

fd5af4e

…able)

feat: add Pilot Bot — persistent automation agent with scheduled task…

559e664

…s, background jobs, and optional Telegram

fix: use correct CronCreate parameter name (cron, not schedule) in bo…

22a70af

…t skills

chore(release): 8.0.1 [skip ci]

c52ca83

## [8.0.1](v8.0.0...v8.0.1) (2026-04-08) ### Bug Fixes * use correct CronCreate parameter name (cron, not schedule) in bot skills ([ad7f4af](ad7f4af))

fix: console dashboard overhaul — 2x2 recent cards, usage model break…

77d9603

…down, session filtering, spec tab consistency

chore(release): 8.0.2 [skip ci]

c7a221e

## [8.0.2](v8.0.1...v8.0.2) (2026-04-09) ### Bug Fixes * console dashboard overhaul — 2x2 recent cards, usage model breakdown, session filtering, spec tab consistency ([3648a2d](3648a2d))

chore: Updated specifications images

fd5e005

fix: Improved Code Reviewer Stability, Codegraph Usage and Context Mode

94557e7

chore(release): 8.0.3 [skip ci]

ff22fa4

## [8.0.3](v8.0.2...v8.0.3) (2026-04-09) ### Bug Fixes * Improved Code Reviewer Stability, Codegraph Usage and Context Mode ([e78d68b](e78d68b))

fix: Improved Dependency Installation Speed for Installer and Updater

bf13133

fix: Improved PRD and Spec Sharing / Annotation UI on pilot-shell.com

9755abe

chore(release): 8.0.4 [skip ci]

7ff43d4

## [8.0.4](v8.0.3...v8.0.4) (2026-04-09) ### Bug Fixes * Improved Dependency Installation Speed for Installer and Updater ([e426d3c](e426d3c)) * Improved PRD and Spec Sharing / Annotation UI on pilot-shell.com ([afc32f7](afc32f7))

chore: Improved Readme

6b099f2

chore: Updated Readme

4db01ff

fix: skip CodeGraph indexing in non-git directories

5663e44

chore(release): 8.0.5 [skip ci]

61e867d

## [8.0.5](v8.0.4...v8.0.5) (2026-04-09) ### Bug Fixes * skip CodeGraph indexing in non-git directories ([b4281df](b4281df))

fix: walk up directory tree for git repo detection in CodeGraph guard

676f08c

chore(release): 8.0.6 [skip ci]

c6abc8c

## [8.0.6](v8.0.5...v8.0.6) (2026-04-09) ### Bug Fixes * walk up directory tree for git repo detection in CodeGraph guard ([f028b57](f028b57))

fix: hook venv sync, console annotation writes, codegraph native sqlite

6c58138

fix: add tool output compression + sandboxed content cards to Usage view

f10098d

fix: stop incomplete child_process mocks poisoning CI test runs

9ddfd4a

fix: restore dom globals after terminal-preview-xss test to prevent p…

f75ce31

…ortal ssr leak

chore: Updated Readme

78ebfee

chore: Updated Demo Gif

7f026da

semantic-release-bot and others added 26 commits April 29, 2026 10:13

chore(release): 8.0.10 [skip ci]

98ab698

## [8.0.10](v8.0.9...v8.0.10) (2026-04-14) ### Bug Fixes * codegraph native SQLite repair, /spec new-branch option, session prompt display ([21fb7b6](21fb7b6))

chore(release): 8.1.0 [skip ci]

1cafc8e

# [8.1.0](v8.0.10...v8.1.0) (2026-04-15) ### Features * add session details with JSONL stats, cost calculation, and enhanced UI ([0590783](0590783))

chore(release): 8.2.0 [skip ci]

75d1bc4

# [8.2.0](v8.1.0...v8.2.0) (2026-04-16) ### Features * customization packs, Opus 4.7 support, and Codex bugfixes ([af711a4](af711a4))

fix: customization overrides, skill decomposition polish, generated a…

e3991e8

…rtifacts untracked

fix: Removed static effort for skills/commands to adjust based on global

a654a39

fix: Updated inconsistent steps in spec workflow

c113ab6

chore(release): 8.2.2 [skip ci]

c899407

## [8.2.2](v8.2.1...v8.2.2) (2026-04-17) ### Bug Fixes * Updated inconsistent steps in spec workflow ([a1e6118](a1e6118))

chore: reframe customization docs and rework site hero/pillars

ec1e00f

fix: drop skill banner, normalize step heading levels across workflow…

781f0c9

… skills

chore(release): 8.2.3 [skip ci]

7041fb1

## [8.2.3](v8.2.2...v8.2.3) (2026-04-17) ### Bug Fixes * drop skill banner, normalize step heading levels across workflow skills ([05d37fa](05d37fa))

chore(release): 8.3.0 [skip ci]

4c4d019

# [8.3.0](v8.2.4...v8.3.0) (2026-04-23) ### Features * cross-platform statusline usage, in-process skill build, console task cards ([9f394be](9f394be))

chore(release): 8.4.0 [skip ci]

3a9080e

# [8.4.0](v8.3.0...v8.4.0) (2026-04-24) ### Features * add /benchmark skill and evaluation framework ([843ef85](843ef85))

chore(release): 8.4.1 [skip ci]

50b0588

## [8.4.1](v8.4.0...v8.4.1) (2026-04-27) ### Bug Fixes * native usage analytics, ccusage scrub, hook + rules + site updates ([c7cd604](c7cd604))

chore: Updated Readme

3b1e663

chore: Updated Website SEO

8bbd0d3

chore: Change to Vercel Config

44300e9

maxritter closed this Apr 29, 2026

maxritter deleted the dev branch April 29, 2026 08:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: bugfix workflow redesign with new /fix command#135

feat: bugfix workflow redesign with new /fix command#135
maxritter wants to merge 60 commits intomainfrom
dev

maxritter commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

vercel Bot commented Apr 29, 2026

Uh oh!

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading

Review failed

❌ Failed checks (1 warning)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

maxritter commented Apr 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

/fix — new user-invocable command

Slimmed full bugfix lane (/spec)

Iteration cap

Dispatcher / model defaults / Console

Docs

Other-session changes incorporated

Verification

Summary by CodeRabbit

Release Notes

Uh oh!

vercel Bot commented Apr 29, 2026

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

maxritter commented Apr 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading