## Problem

Voice segments and annotations are listed separately in the output with independent timestamps. An AI agent must manually cross-reference `[00:14] Circle around X` with `[00:16] "this needs to have actual logos"` to understand that they're related. This should be automatic.
## Proposed Solution

When an annotation's timestamp is within ~3s of a voice segment, merge them into a single intent unit in the formatted output:
**Before:**

```
## Annotations
1. [00:14] Circle around span.agency-name

## Voice Transcript
[00:16] "this needs to have actual logos over here"
```

**After:**

```
## Feedback
1. [00:14] Circle around span.agency-name — "this needs to have actual logos over here"
   [screenshot]
```
## Implementation

- Formatter change in `src/shared/formatter.ts`
- Match annotations to voice segments by timestamp proximity (configurable window, default 3s)
- Match annotations to screenshots by `annotationIndex`
- Group into "feedback items" that combine element + voice + screenshot
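The proximity matching could look roughly like the sketch below. The types and the `mergeFeedback` name are hypothetical (the real shapes live in `src/shared/formatter.ts`); it just illustrates pairing each annotation with the nearest voice segment inside the window:

```typescript
// Hypothetical simplified shapes — placeholders for the formatter's real types.
interface Annotation { timestamp: number; description: string; }
interface VoiceSegment { timestamp: number; text: string; }
interface FeedbackItem { timestamp: number; description: string; voice?: string; }

const WINDOW_SECONDS = 3; // configurable proximity window (default 3s)

// For each annotation, attach the closest voice segment within the window, if any.
function mergeFeedback(
  annotations: Annotation[],
  segments: VoiceSegment[],
): FeedbackItem[] {
  return annotations.map((a) => {
    let best: VoiceSegment | undefined;
    let bestDelta = Infinity;
    for (const s of segments) {
      const delta = Math.abs(s.timestamp - a.timestamp);
      if (delta <= WINDOW_SECONDS && delta < bestDelta) {
        best = s;
        bestDelta = delta;
      }
    }
    return { timestamp: a.timestamp, description: a.description, voice: best?.text };
  });
}
```

With the example above, the annotation at `00:14` and the segment at `00:16` are 2s apart, so they merge into one feedback item; a segment 30s away would stay unmatched. Screenshot matching by `annotationIndex` would be a separate, exact-key join on top of this.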
🤖 Generated with Claude Code