feat: surface agent transcript errors on container failure #79

@eshulman2

Problem

When a container exits non-zero, the worker logs container stderr — which for Claude Code is typically just MCP server startup noise, not the actual error. The real failure reason (tool refusal, context overflow, task misunderstanding) is buried in the JSONL agent transcript and never surfaced.

Proposed solution

On non-zero container exit:

  1. Locate the agent JSONL transcript in the workspace (e.g. from the Deep Agents runtime output)
  2. Parse the last N assistant turns, and the tool responses that follow them, for error signals: refusals, tool responses with result: "error", and stop_reason: "max_tokens"
  3. Log the extracted error context at logger.error level alongside the exit code
  4. Include the transcript path in the log so operators can inspect it directly
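
The four steps above could be sketched roughly as follows. This is a minimal illustration, not the worker's actual code. It assumes a flat transcript schema in which each JSONL line is an object with optional "role", "stop_reason", "result", and "content" keys, and it assumes the transcript is the most recently modified *.jsonl file under the workspace. The function names (find_transcript, extract_error_signals, log_container_failure), the "refusal" stop reason check, and the logger name are all hypothetical and would need adjusting to the real transcript format.

```python
import json
import logging
from pathlib import Path
from typing import List, Optional

logger = logging.getLogger("worker")  # hypothetical logger name


def find_transcript(workspace: Path) -> Optional[Path]:
    """Step 1: pick the newest *.jsonl under the workspace (assumed layout)."""
    candidates = sorted(workspace.rglob("*.jsonl"), key=lambda p: p.stat().st_mtime)
    return candidates[-1] if candidates else None


def _load_records(path: Path) -> List[dict]:
    """Read JSONL, skipping blank and malformed lines."""
    records = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # a killed container may leave a truncated last line
            if isinstance(obj, dict):
                records.append(obj)
    return records


def extract_error_signals(path: Path, last_n: int = 5) -> List[str]:
    """Step 2: scan from the last_n-th most recent assistant turn to the end."""
    if last_n < 1:
        return []
    records = _load_records(path)
    assistant_idx = [i for i, r in enumerate(records) if r.get("role") == "assistant"]
    if not assistant_idx:
        return []
    start = assistant_idx[-last_n:][0]
    signals = []
    for rec in records[start:]:
        stop = rec.get("stop_reason")
        if stop in ("max_tokens", "refusal"):  # assumed stop_reason values
            signals.append(f"stop_reason={stop}")
        if rec.get("result") == "error":  # assumed tool-response shape
            detail = str(rec.get("content", ""))[:200]
            signals.append(f"tool error: {detail}")
    return signals


def log_container_failure(exit_code: int, workspace: Path, last_n: int = 5) -> List[str]:
    """Steps 3 and 4: log the signals and the transcript path with the exit code."""
    transcript = find_transcript(workspace)
    signals = extract_error_signals(transcript, last_n) if transcript else []
    summary = "; ".join(signals) or "no error signals found"
    logger.error(
        "container exited with code %d: %s (transcript: %s)",
        exit_code, summary, transcript or "not found",
    )
    return signals
```

Scanning from the chosen assistant turn to the end of the file, rather than filtering to assistant records only, keeps tool responses that arrive after those turns in view. Malformed lines are skipped instead of raising, since a failing container may not flush its last record.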

Reference

Learned from fullsend: internal/cli/run.go lines ~839–847 — parses transcript JSONL on non-zero exit and emits ::error:: annotations to the workflow log.
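
For context, ::error:: is the GitHub Actions workflow command for error annotations, and it would only apply if the worker runs inside an Actions job. The sketch below is not fullsend's code; it only shows how a message could be formatted in that syntax. The helper name format_error_annotation is hypothetical. The escaping of %, carriage return, and newline follows the documented encoding for workflow command data.

```python
def format_error_annotation(message: str) -> str:
    """Format a message as a GitHub Actions ::error:: workflow command."""
    # Escape % first so the %0D and %0A sequences added next are not re-escaped.
    escaped = (
        message.replace("%", "%25")
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )
    return f"::error::{escaped}"
```

Printing the returned string to stdout is what makes the runner pick it up as an annotation.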

Labels: enhancement
