Problem
When a container exits non-zero, the worker logs container stderr — which for Claude Code is typically just MCP server startup noise, not the actual error. The real failure reason (tool refusal, context overflow, task misunderstanding) is buried in the JSONL agent transcript and never surfaced.
Proposed solution
On non-zero container exit:
- Locate the agent JSONL transcript in the workspace (e.g. from the Deep Agents runtime output)
- Parse the last N assistant turns for error signals: refusals, result: "error" tool responses, stop_reason: "max_tokens"
- Log the extracted error context at logger.error level alongside the exit code
- Include the transcript path in the log so operators can inspect it directly
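The steps above could be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the record schema (top-level `stop_reason`, `result`, and `content` keys), the use of `stop_reason: "refusal"` to mark a refusal, and the names `extract_error_signals` and `log_container_failure` are all assumptions made for the example. As a simplification it also inspects the last N transcript records rather than filtering to assistant turns first.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("worker")  # hypothetical logger name


def extract_error_signals(transcript_path, last_n=5):
    """Return error signals found in the last `last_n` transcript records.

    Assumes one JSON object per line; the keys checked below are an
    assumed schema, not a documented transcript format.
    """
    records = []
    text = Path(transcript_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            # A crashed container may leave a truncated final line; skip it.
            continue
        if isinstance(obj, dict):
            records.append(obj)

    signals = []
    for rec in records[-last_n:]:
        stop_reason = rec.get("stop_reason")
        if stop_reason == "max_tokens":
            signals.append("stop_reason: max_tokens")
        elif stop_reason == "refusal":  # assumed marker for a refusal
            signals.append("refusal")
        if rec.get("result") == "error":
            detail = str(rec.get("content", ""))[:200]
            signals.append(f"tool error: {detail}")
    return signals


def log_container_failure(exit_code, transcript_path):
    """Log the exit code, transcript path, and any extracted signals."""
    try:
        signals = extract_error_signals(transcript_path)
    except OSError:
        # Transcript missing or unreadable: still log exit code and path.
        signals = []
    logger.error(
        "container exited with code %d; transcript=%s; signals=%s",
        exit_code,
        transcript_path,
        signals or "none found",
    )
    return signals
```

Skipping unparseable lines keeps the extraction best-effort, so a partially written transcript still yields whatever signals precede the truncation, and the transcript path is logged even when nothing is found.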
Reference
Learned from fullsend: internal/cli/run.go lines ~839–847 — parses transcript JSONL on non-zero exit and emits ::error:: annotations to the workflow log.