
Ignore subagent hook events fired after a turn's Stop (#199) #208

Merged
dhilgaertner merged 1 commit into main from feature/crow-199-recap-stuck-working
Apr 24, 2026

Conversation

@dhilgaertner
Contributor

Summary

Fixes #199. Claude Code 2.1.108's awaySummaryEnabled "session recap" generates a SubagentStop hook event minutes after the user's turn has ended. Crow's old TaskCreated/TaskCompleted/SubagentStop arm unconditionally drove claudeState back to .working, leaving the sidebar dot stuck on working until the session ended.

The fix tracks lastTopLevelStopAt: Date? on SessionHookState. When set, the dispatcher suppresses state elevation in the SubagentStart and TaskCreated/TaskCompleted/SubagentStop arms — those events are treated as background. The flag is cleared on UserPromptSubmit (next real turn), SessionStart, and SessionEnd, so legitimate mid-turn subagents (/explore, sub-tasks, etc.) are unaffected.

This generalizes beyond recap: any future "Claude wakes up after a turn ends" scenario (background telemetry, async cleanup hooks) will be correctly ignored.
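
The guard described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not Crow's actual dispatcher: the `HookEvent` and `ClaudeState` enums, the `dispatch` function, the `.idle` case, and the states assigned on SessionStart and SessionEnd are hypothetical. Only `SessionHookState`, `lastTopLevelStopAt`, `claudeState`, and the event names come from this PR. The two guarded arms are merged into a single case here for brevity.

```swift
import Foundation

// Hypothetical stand-ins; the real Crow types are not shown in this PR.
enum ClaudeState { case idle, working, done }

enum HookEvent {
    case sessionStart, sessionEnd, userPromptSubmit, stop
    case subagentStart, subagentStop, taskCreated, taskCompleted
    case notification
}

struct SessionHookState {
    var claudeState: ClaudeState = .idle
    // Set when a top-level Stop arrives; nil while a turn is live.
    var lastTopLevelStopAt: Date? = nil
}

func dispatch(_ event: HookEvent, into state: inout SessionHookState, now: Date = Date()) {
    switch event {
    case .sessionStart, .sessionEnd:
        // Assumption: both reset to .idle; the PR only says the flag is cleared.
        state.lastTopLevelStopAt = nil
        state.claudeState = .idle
    case .userPromptSubmit:
        // A real turn begins, so later subagent events count again.
        state.lastTopLevelStopAt = nil
        state.claudeState = .working
    case .stop:
        state.lastTopLevelStopAt = now
        state.claudeState = .done
    case .subagentStart, .taskCreated, .taskCompleted, .subagentStop:
        // After a top-level Stop these are background events: no elevation.
        guard state.lastTopLevelStopAt == nil else { return }
        state.claudeState = .working
    case .notification:
        break
    }
}

// Replay the event order from the recap scenario.
var session = SessionHookState()
for event in [HookEvent.userPromptSubmit, .stop, .notification, .subagentStop] {
    dispatch(event, into: &session)
}
// session.claudeState is still .done: the late SubagentStop was suppressed.
```

Because the guard checks only whether a Stop has been seen since the last prompt, it does not need to know which background feature produced the late event.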

Evidence the fix works

Captured with the new CROW_HOOK_DEBUG=1 flag during a real session that produced a recap:

16:22:32  UserPromptSubmit  state=done→working      ← turn begins
16:22:45  Stop              state=working→done      ← lastTopLevelStopAt set
16:23:45  Notification      (no state change)
16:25:48  SubagentStop      payload=[agent_id, agent_transcript_path, agent_type, …]
          ↑ NO state-change line — guard suppressed it

Pre-fix: that SubagentStop would drive done → working and never recover. Post-fix: lastTopLevelStopAt != nil, so claudeState stays at .done. A separate session that was actively mid-turn during the same capture window correctly kept its state at .working while running TaskCreated/TaskCompleted events, confirming the guard is scoped to post-Stop.

Why hook events, not terminal text

Per the ticket: text-pattern matching ※ recap: would rot the next time Claude Code changes the recap glyph or copy. The fix lives in the hook dispatcher where the actual lifecycle bug is.

Also included

  • CROW_HOOK_DEBUG=1-gated [hook-event] NSLog stream — logs every event arrival + every ClaudeState transition. Off by default. Kept in for the next time Claude Code's hook schema shifts (it changes often). Documented in docs/troubleshooting.md.
  • Troubleshooting docs row covering the user-side workaround for older builds (awaySummaryEnabled = false / /config / CLAUDE_CODE_ENABLE_AWAY_SUMMARY=0).
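
An environment-gated debug stream of this kind might look roughly like the sketch below. The helper names `transitionLine` and `debugLog`, the exact log line layout, and the rule that only the literal value "1" enables the flag are assumptions for illustration; the PR confirms only the `CROW_HOOK_DEBUG=1` flag, the `[hook-event]` prefix, the use of NSLog, and 8-character session IDs via `shortID`.

```swift
import Foundation

// Read once at startup; assumes only the literal value "1" enables logging.
let hookDebugEnabled = ProcessInfo.processInfo.environment["CROW_HOOK_DEBUG"] == "1"

// Assumed implementation: keep the first 8 characters of the session ID.
func shortID(_ sessionID: String) -> String {
    return String(sessionID.prefix(8))
}

// Builds a transition line only when the state actually changed,
// so unchanged states add no noise to the log.
func transitionLine(session: String, event: String, before: String, after: String) -> String? {
    guard before != after else { return nil }
    return "[hook-event] \(shortID(session)) \(event) state=\(before)→\(after)"
}

// Emits the line through NSLog, and only when the flag is on.
func debugLog(_ line: String?) {
    guard hookDebugEnabled, let line = line else { return }
    NSLog("%@", line)
}
```

A caller would capture the state before handling an event, handle it, then pass both values to `transitionLine` and hand the result to `debugLog`.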

Test plan

  • All 31 existing tests pass (make test)
  • Reproduced the conditions that would have triggered the original bug (SubagentStop arrives about 3 minutes after the user-turn Stop)
  • Confirmed sidebar stays at done after recap renders
  • Confirmed mid-turn subagents still elevate state to working (regression check on a separate session in the same capture window)
  • Reviewer: try a session with /explore or another subagent-heavy slash command — sidebar should still show working during the run

🤖 Generated with Claude Code

Claude Code 2.1.108's awaySummaryEnabled "session recap" generates a
SubagentStop hook event ~minutes after the user's turn has ended. The
old TaskCreated/TaskCompleted/SubagentStop arm unconditionally drove
claudeState back to .working, trapping the sidebar dot until session
end.

Track lastTopLevelStopAt on SessionHookState; when set, suppress state
elevation in the SubagentStart and SubagentStop/TaskCreated/TaskCompleted
arms. Cleared on UserPromptSubmit (next real turn), SessionStart, and
SessionEnd.

Also adds a CROW_HOOK_DEBUG=1 env-gated [hook-event] NSLog stream
documenting event arrival + ClaudeState transitions, used to confirm
the fix and kept in for future hook-lifecycle bugs (Claude Code's hook
schema changes often). Documented in docs/troubleshooting.md alongside
the user-side workaround for older builds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dhilgaertner dhilgaertner requested a review from dgershman as a code owner April 24, 2026 21:34
Collaborator

@dgershman dgershman left a comment

Code & Security Review

Critical Issues

None.

Security Review

Strengths:

  • CROW_HOOK_DEBUG reads from the process environment once at startup and uses a simple Bool — no injection surface, no runtime overhead when disabled.
  • Debug logging truncates session IDs to 8 chars (shortID) — safe, but even full UUIDs aren't sensitive here since they're local-only.
  • Date() stored on lastTopLevelStopAt is never serialized or sent externally — pure in-memory state.

Concerns:

  • None identified. No new external inputs, no new persistence paths, no new network calls.

Code Quality

Well done:

  • The lastTopLevelStopAt guard is the right abstraction level — it's event-schema-agnostic, so it naturally handles future background-event scenarios without needing pattern updates per event type.
  • Clearing the guard on UserPromptSubmit, SessionStart, and SessionEnd covers all legitimate "turn is live" transitions. No gap I can find where a real mid-turn subagent would be incorrectly suppressed.
  • Capturing stateBefore before the switch block and logging the transition only when the state actually changed is clean — no noise in debug output.
  • The troubleshooting docs include the three workaround paths (settings.json, /config, env var) for users on older builds — good completeness.

One observation (not blocking):

  • lastTopLevelStopAt stores a Date but the value is only ever checked for nil vs non-nil — a plain Bool like hasStopped would be slightly simpler. That said, keeping the timestamp costs nothing extra and could be useful for future diagnostics (e.g. "how long ago did the turn end?"), so this is fine as-is.
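
As an illustration of the diagnostic use mentioned above, a stored timestamp allows a "how long ago did the turn end?" query that a Bool could not answer. The function name `secondsSinceStop` is hypothetical and not part of this PR.

```swift
import Foundation

// Returns the elapsed time since the last top-level Stop,
// or nil when no Stop has been recorded (a turn is live).
func secondsSinceStop(_ lastTopLevelStopAt: Date?, now: Date = Date()) -> TimeInterval? {
    guard let stoppedAt = lastTopLevelStopAt else { return nil }
    return now.timeIntervalSince(stoppedAt)
}

// Example using the gap from the captured session: Stop at 16:22:45,
// SubagentStop at 16:25:48, which is 183 seconds apart.
let stopTime = Date(timeIntervalSince1970: 0)
let lateEventTime = stopTime.addingTimeInterval(183)
let gap = secondsSinceStop(stopTime, now: lateEventTime)  // 183 seconds
```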

Summary Table

| Priority | Issue |
| --- | --- |
| 🟢 | lastTopLevelStopAt could be a Bool instead of Date?, but Date? is arguably better for future debug use |

Recommendation: Approve. The fix is well-scoped, correctly guarded, and all 112 CrowCore tests pass. The approach is resilient to future Claude Code hook schema changes since it keys off the turn lifecycle (Stop → UserPromptSubmit) rather than matching specific event names or payloads.

@dhilgaertner dhilgaertner merged commit b719f65 into main Apr 24, 2026
2 checks passed
@dhilgaertner dhilgaertner deleted the feature/crow-199-recap-stuck-working branch April 24, 2026 21:48
Development

Successfully merging this pull request may close these issues.

Session state sticks at "working" after Claude Code emits a session recap