feat(agents-server-ui): show per-response token usage in meta row by kevin-dp · Pull Request #4502 · electric-sql/electric

kevin-dp · 2026-06-04T11:51:27Z

Summary

Adds a token-usage label to the agent response meta row, e.g.
`Thinking · 12s · 1.2k ↑ 412 ↓` while streaming and
`✓ done in 12s · 1.2k ↑ 412 ↓ · 14:18` once settled. The counter updates at
step boundaries: for a single-turn LLM call it lands once at done;
for tool-using runs it jumps as each step completes. The LLM SDK only
emits usage at end-of-step, so the counter can't tick smoothly between
streamed tokens, though the elapsed-time ticker still ticks every second
alongside it.
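As a rough illustration of the label format described above, here is a minimal sketch of how the meta row string could be assembled. The helper names (`formatCount`, `tokenLabel`, `metaRow`) are hypothetical and not taken from the PR, and lowercasing the compact suffix to get `1.2k` is an assumption about how the displayed style is produced.

```typescript
// Hypothetical shapes and helpers; the real <TokenUsage> component may differ.
interface TokenTotals {
  input?: number
  output?: number
}

// Locale-aware compact formatting via Intl.NumberFormat.
// Lowercasing the suffix ("1.2K" to "1.2k") is an assumption made to match
// the style shown in the PR description.
function formatCount(n: number, locale: string = "en-US"): string {
  return new Intl.NumberFormat(locale, {
    notation: "compact",
    maximumFractionDigits: 1,
  })
    .format(n)
    .toLowerCase()
}

// Builds the "<input> ↑ <output> ↓" segment. Returns undefined when neither
// side is known, so the caller can omit the segment entirely.
function tokenLabel(tokens?: TokenTotals, locale: string = "en-US"): string | undefined {
  const parts: string[] = []
  if (tokens?.input != null) parts.push(`${formatCount(tokens.input, locale)} ↑`)
  if (tokens?.output != null) parts.push(`${formatCount(tokens.output, locale)} ↓`)
  return parts.length > 0 ? parts.join(" ") : undefined
}

// Joins the status, elapsed time, and optional token segment with " · ".
function metaRow(status: string, elapsed: string, tokens?: TokenTotals): string {
  return [status, elapsed, tokenLabel(tokens)]
    .filter((s): s is string => s !== undefined)
    .join(" · ")
}
```

Calling `metaRow("Thinking", "12s", { input: 1234, output: 412 })` would produce the streaming form of the label, while omitting `tokens` leaves the row as status and elapsed time only.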

Plumbing

The runtime already had the token data (`pi-adapter.ts:358-359`
extracts `tokenInput`/`tokenOutput` from the provider's per-step
usage payload), but the bridge silently dropped them before
persistence. This PR closes that gap and surfaces them all the way to
the UI:

- `StepValue` gains optional `input_tokens` / `output_tokens` columns
  (Zod + TS). Strictly additive: events recorded before this change
  still validate (both fields optional), so no migration is needed.
- `outbound-bridge.ts:onStepEnd` now persists the values it was
  already receiving from `pi-adapter.ts`.
- `IncludesStep` / `EntityTimelineStepItem` surface the new fields,
  and the three `.select()` blocks that materialize step rows include
  them.
- The cached `agent_response` section grows a
  `tokens?: { input?, output? }` summed across the run's steps at
  section-build time, and the `fingerprintRun` cache key includes
  per-step token deltas so a late-arriving `onStepEnd` invalidates a
  stale cached section.
- New `<TokenUsage>` component in `agents-server-ui` with
  `tabular-nums` so digits don't jitter, and locale-aware compact
  formatting via `Intl.NumberFormat`. Renders next to `<ElapsedTime>`
  in both the live and cached meta rows.
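The cache-invalidation point in the list above can be sketched as follows. This is not the PR's `fingerprintRun` implementation (which is not shown here); the step shape, the string-joined key, and the `getSection` cache wrapper are all assumptions chosen to show why token values need to be part of the fingerprint.

```typescript
// Hypothetical step row; field names follow the PR description.
interface StepRow {
  key: string
  status: string
  input_tokens?: number
  output_tokens?: number
}

// Illustrative fingerprint: folds each step's token values into the key so
// that a step gaining usage data later produces a different fingerprint.
function fingerprintRun(runKey: string, steps: StepRow[]): string {
  const parts = steps.map(
    (s) => `${s.key}:${s.status}:${s.input_tokens ?? "-"}:${s.output_tokens ?? "-"}`,
  )
  return [runKey, ...parts].join("|")
}

// Tiny section cache keyed by run; the section is rebuilt only when the
// fingerprint changes.
const sectionCache = new Map<string, { fingerprint: string; section: string }>()

function getSection(
  runKey: string,
  steps: StepRow[],
  build: (steps: StepRow[]) => string,
): string {
  const fingerprint = fingerprintRun(runKey, steps)
  const cached = sectionCache.get(runKey)
  if (cached && cached.fingerprint === fingerprint) return cached.section
  const section = build(steps)
  sectionCache.set(runKey, { fingerprint, section })
  return section
}
```

If the fingerprint ignored tokens, a section built before the step-end update arrived would keep being served without a token row, even after the step row gained its usage values.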

Test plan

- `pnpm typecheck` clean in `agents-runtime` + `agents-server-ui`
- `pnpm test` in `agents-server-ui` (66 passed)
- `pnpm test outbound-bridge use-chat entity-timeline` in
  `agents-runtime` (74 passed)
- Full `agents-runtime` test suite: my branch matches the same
  pre-existing 401 failures observed on clean `main` (unrelated
  permission-system breakage in the test harness, not introduced
  by this PR)
- Manual: launch a turn that uses tools and watch the counter
  jump at each step boundary
- Manual: pure-text turn, counter lands once at done

Notes

- Historical responses recorded before this change have no token data
  persisted (older `steps` rows lack the columns). The `tokens` field
  is conditional on at least one step reporting a number, so those
  sections continue to render with no token row instead of "0 / 0".
- Display format `1.2k ↑ 412 ↓` chosen for compactness in the meta
  row. Open to changing to `1.2k in / 412 out` or similar if the
  arrow direction is unclear (input goes up to the model, output
  comes down).
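The "conditional on at least one step reporting a number" rule from the first note can be illustrated with a small reducer. This is a sketch under assumed type names (`StepTokens`, `TokenTotals`) and an assumed function name (`sumStepTokens`); it is not the code from the PR.

```typescript
// Hypothetical shapes; column names follow the PR description.
interface StepTokens {
  input_tokens?: number | null
  output_tokens?: number | null
}

interface TokenTotals {
  input?: number
  output?: number
}

// Sums each side independently. A side stays absent unless at least one step
// reported a number for it, and the whole result is undefined when no step
// reported anything. That keeps historical runs from rendering "0 / 0".
function sumStepTokens(steps: StepTokens[]): TokenTotals | undefined {
  let input: number | undefined
  let output: number | undefined
  for (const step of steps) {
    if (typeof step.input_tokens === "number") input = (input ?? 0) + step.input_tokens
    if (typeof step.output_tokens === "number") output = (output ?? 0) + step.output_tokens
  }
  if (input === undefined && output === undefined) return undefined
  const totals: TokenTotals = {}
  if (input !== undefined) totals.input = input
  if (output !== undefined) totals.output = output
  return totals
}
```

Note that a step reporting a genuine `0` still counts as reported, so `{ input: 0 }` is a valid result and differs from `undefined`.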

🤖 Generated with Claude Code

Sums input/output tokens across every step of the run and renders them next to the elapsed-time ticker (e.g. `Thinking · 12s · 1.2k ↑ 412 ↓`). Counter updates at step boundaries: the LLM SDK only reports `usage` at end-of-step, so within a single text stream the value stays flat; tool-using runs see jumps as each step settles.

Token plumbing (additive, no migration):

- `StepValue` Zod + TS gains optional `input_tokens` / `output_tokens`
- `outbound-bridge.ts:onStepEnd` now persists the `tokenInput` / `tokenOutput` values it was already receiving but dropping
- `IncludesStep` / `EntityTimelineStepItem` and the three step `.select()` blocks surface the new fields
- The cached `agent_response` section gets a summed `tokens?: { input?, output? }`, and the section-cache fingerprint includes per-step token deltas so a late `onStepEnd` invalidates a stale section

github-actions · 2026-06-04T11:52:09Z

Electric Agents Desktop Builds

Build artifacts for commit 36ccc20.

Platform	Status	Artifact
macOS Apple Silicon	Passed	DMG
macOS Intel	Passed	DMG
Windows x64	Passed	Installer
Linux x64	Passed	AppImage / deb

Workflow run

codecov · 2026-06-04T11:52:42Z

Codecov Report

❌ Patch coverage is 49.78541% with 117 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@7709c9a). Learn more about missing BASE report.
⚠️ Report is 22 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
packages/agents-runtime/src/entity-timeline.ts	47.82%	84 Missing ⚠️
.../agents-server-ui/src/components/AgentResponse.tsx	0.00%	17 Missing ⚠️
...ges/agents-server-ui/src/components/TokenUsage.tsx	0.00%	14 Missing ⚠️
packages/agents-runtime/src/pi-adapter.ts	93.54%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4502   +/-   ##
=======================================
  Coverage        ?   56.56%           
=======================================
  Files           ?      359           
  Lines           ?    39243           
  Branches        ?    11028           
=======================================
  Hits            ?    22198           
  Misses          ?    16974           
  Partials        ?       71

Flag	Coverage Δ
packages/agents	`70.75% <ø> (?)`
packages/agents-mcp	`77.54% <ø> (?)`
packages/agents-mobile	`66.92% <ø> (?)`
packages/agents-runtime	`80.07% <57.42%> (?)`
packages/agents-server	`74.16% <ø> (?)`
packages/agents-server-ui	`6.20% <0.00%> (?)`
packages/electric-ax	`46.42% <ø> (?)`
packages/experimental	`87.73% <ø> (?)`
packages/react-hooks	`86.48% <ø> (?)`
packages/start	`82.83% <ø> (?)`
packages/typescript-client	`91.83% <ø> (?)`
packages/y-electric	`56.05% <ø> (?)`
typescript	`56.56% <49.78%> (?)`
unit-tests	`56.56% <49.78%> (?)`


github-actions · 2026-06-04T11:55:35Z

Electric Agents Mobile Build

Local mobile checks ran for commit 36ccc20.

The EAS Android preview build was skipped because the mobile-eas-build label is not present.
Add the mobile-eas-build label to this PR to produce an installable preview build.

Workflow run

samwillis

Interactive review with GPT.

Thanks for wiring this through. I traced the token usage path end-to-end:

pi-adapter message_end.usage
→ bridge.onStepEnd({ tokenInput, tokenOutput })
→ steps.update({ input_tokens, output_tokens })
→ timeline step rows
→ UI meta row.

Overall the stream write looks sound: token usage is attached to the step completion update, so it appears once at the end of a pure text generation step, and jumps at step boundaries for tool-using runs.

A couple of suggestions/questions:

Can we compute the per-run total in the timeline query/view model?

Right now AgentResponseLive subscribes to run.steps and sums input_tokens / output_tokens in React, while buildAgentSection separately performs the same aggregation for materialized sections.

If TanStack DB supports this shape cleanly, I think it would be better to expose a single per-run tokens field from createEntityTimelineQuery / the includes query, e.g. via a scoped aggregate over steps for the run. Then the UI only renders run.tokens / section.tokens, and the aggregation logic lives in one layer.

Since usage only lands at step end, not token-by-token, updating the parent run/timeline row at those boundaries seems acceptable to me.

Missing usage fields currently become real zeroes

In pi-adapter, when msg.usage exists, missing sides are coerced to 0:

...(usage && {
  tokenInput: usage.input ?? usage.inputTokens ?? 0,
  tokenOutput: usage.output ?? usage.outputTokens ?? 0,
}),

Now that these values are persisted and displayed, that can make an unknown side look like a real 0. If pi-ai guarantees both input and output are always present, a small regression test would be useful. Otherwise I’d preserve undefined for missing sides and only write numeric values.
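One way to preserve `undefined` for missing sides, as suggested above, is sketched below. The `UsageLike` shape mirrors the fields in the quoted snippet; `firstNumber` and `extractTokens` are hypothetical helper names, and the actual fix in `pi-adapter` may be structured differently.

```typescript
// Usage shape inferred from the quoted snippet; treat it as an assumption.
interface UsageLike {
  input?: number
  inputTokens?: number
  output?: number
  outputTokens?: number
}

// Returns the first candidate that is actually a number, else undefined.
function firstNumber(...candidates: unknown[]): number | undefined {
  for (const candidate of candidates) {
    if (typeof candidate === "number") return candidate
  }
  return undefined
}

// Only emits a field when the provider really reported that side, so an
// unknown side is never turned into a fabricated 0.
function extractTokens(
  usage: UsageLike | undefined,
): { tokenInput?: number; tokenOutput?: number } {
  const result: { tokenInput?: number; tokenOutput?: number } = {}
  if (!usage) return result
  const tokenInput = firstNumber(usage.input, usage.inputTokens)
  const tokenOutput = firstNumber(usage.output, usage.outputTokens)
  if (tokenInput !== undefined) result.tokenInput = tokenInput
  if (tokenOutput !== undefined) result.tokenOutput = tokenOutput
  return result
}
```

With this shape, a payload carrying only `input` yields an object without a `tokenOutput` key, while a payload that reports a real `0` still passes it through.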

Test coverage

I’d like to see a targeted regression test that proves token usage reaches the steps.update event, and another around the timeline/view-model total if we move the aggregation there. That would lock down the important stream contract introduced by this PR.

KyleAMathews · 2026-06-04T15:45:53Z

👍 yeah just showing total tokens for the run at the bottom makes sense to me

… layer

Move the input/output token summation out of `AgentResponseLive` and `buildAgentSection` into a single `leftJoin` against a per-run aggregate subquery (groupBy run_id, sum + count) in both `createEntityTimelineQuery` and `createEntityIncludesQuery`. Consumers now read `run.tokens` directly without re-summing step rows.

- `IncludesRun` and `EntityTimelineRunRow` gain `tokens?: { input?, output? }`.
- `buildIncludesRuns` (in-memory builder) computes the same shape from materialized steps.
- `fingerprintRun` hashes the resolved tokens instead of per-step deltas.
- `AgentResponseLive` drops the React-side step reducer; coerces TanStack DB's SQL-style `null` for absent sides to `undefined` so `TokenUsage`'s `!= null` checks stay correct.
- Adds 7 regression tests covering the live query path, in-memory builder, and section plumbing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
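To make the "groupBy run_id, sum + count, then left join" shape concrete, here is a plain in-memory emulation. It deliberately does not use the TanStack DB query API (whose exact calls are not shown in this conversation); the function and type names are hypothetical, and the use of `count` to decide whether a side is present is an interpretation of the commit message.

```typescript
// Hypothetical rows; null models SQL-style absence of a value.
interface StepRow {
  run_id: string
  input_tokens: number | null
  output_tokens: number | null
}

interface RunTokenAggregate {
  sumInput: number
  countInput: number
  sumOutput: number
  countOutput: number
}

interface RunTokens {
  input?: number
  output?: number
}

// Per-run aggregate: sum and count of non-null values for each side.
function aggregateTokensByRun(steps: StepRow[]): Map<string, RunTokenAggregate> {
  const byRun = new Map<string, RunTokenAggregate>()
  for (const step of steps) {
    const agg =
      byRun.get(step.run_id) ?? { sumInput: 0, countInput: 0, sumOutput: 0, countOutput: 0 }
    if (step.input_tokens !== null) {
      agg.sumInput += step.input_tokens
      agg.countInput += 1
    }
    if (step.output_tokens !== null) {
      agg.sumOutput += step.output_tokens
      agg.countOutput += 1
    }
    byRun.set(step.run_id, agg)
  }
  return byRun
}

// A side is present only if at least one step contributed to it.
function resolveTokens(agg: RunTokenAggregate | undefined): RunTokens | undefined {
  if (!agg || (agg.countInput === 0 && agg.countOutput === 0)) return undefined
  const tokens: RunTokens = {}
  if (agg.countInput > 0) tokens.input = agg.sumInput
  if (agg.countOutput > 0) tokens.output = agg.sumOutput
  return tokens
}

// Left join: every run is kept, with tokens undefined when nothing matched.
function attachTokens(
  runIds: string[],
  steps: StepRow[],
): Array<{ run_id: string; tokens: RunTokens | undefined }> {
  const byRun = aggregateTokensByRun(steps)
  return runIds.map((run_id) => ({ run_id, tokens: resolveTokens(byRun.get(run_id)) }))
}
```

Keeping the count alongside the sum is what lets the consumer tell "no step reported this side" apart from "steps reported a total of zero".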

…-adapter

`pi-adapter`'s `message_end` handler was coercing a missing `usage.input` / `usage.output` to `0` via `?? 0`, making an unreported side indistinguishable from a real zero-token step in the rendered meta row. With the per-run aggregate now landing in the query layer (see prior commit), a fabricated `0` would also poison `count(input_tokens)` / `count(output_tokens)`, marking absent sides as present.

Forward `undefined` for any side that doesn't arrive as a `number`; `onStepEnd` already conditionally writes those columns, so the `steps` row stays null on the missing side.

Adds regression tests for both ends of the contract:

- `outbound-bridge.test.ts`: `onStepEnd` with token fields produces a `steps.update` event whose `input_tokens` / `output_tokens` match, and omits a column when the corresponding arg is undefined.
- `pi-adapter.test.ts`: a synthetic `message_end` with a `Usage` payload routes through to the step update; deleting a side from the payload omits that column instead of writing `0`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
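The bridge-side half of that contract (write a column only when the argument is a number) might look roughly like the sketch below. `StepEndArgs`, `StepUpdate`, `buildStepUpdate`, and the `"completed"` status value are all hypothetical; the real `onStepEnd` in `outbound-bridge.ts` is not reproduced in this conversation.

```typescript
// Hypothetical argument and event shapes for illustration only.
interface StepEndArgs {
  stepKey: string
  tokenInput?: number
  tokenOutput?: number
}

interface StepUpdate {
  key: string
  status: "completed" // assumed status value
  input_tokens?: number
  output_tokens?: number
}

// Builds the update payload, omitting a token column entirely when the
// corresponding argument is not a number.
function buildStepUpdate(args: StepEndArgs): StepUpdate {
  const update: StepUpdate = { key: args.stepKey, status: "completed" }
  if (typeof args.tokenInput === "number") update.input_tokens = args.tokenInput
  if (typeof args.tokenOutput === "number") update.output_tokens = args.tokenOutput
  return update
}
```

A regression test along the lines the commit describes would then assert both that matching values appear on the update and that a missing side leaves no key behind.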

pi-ai's `Usage` splits the input side across three counters: `input` (new uncached tokens), `cacheRead` (prompt-cache hits: the system prompt + history once the cache is warm) and `cacheWrite` (tokens added to the cache this turn). The adapter was reading only `usage.input`, which on second+ turns of any cache-using provider (Anthropic, etc.) collapses to a handful of tokens because everything else hits the cache. The meta row was showing "3 ↑" regardless of how large the actual prompt was.

Sum all three input-side counters into `tokenInput` so the displayed total reflects the prompt volume the model actually saw. `inputTokens` remains as a legacy flat-field fallback for non-pi-ai providers that don't split the cache columns.

Adds a regression test that a `Usage` payload with all three counters populated yields the sum on the `steps.update` event.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
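The arithmetic in this commit can be sketched as below. The `CacheAwareUsage` shape uses the counter names from the commit message; the function name `resolveTokenInput` and the exact precedence (split counters first, `inputTokens` only when none of them is present) are assumptions, since the adapter code itself is not shown.

```typescript
// Counter names come from the commit message; the shape is an assumption.
interface CacheAwareUsage {
  input?: number // new uncached tokens
  cacheRead?: number // prompt-cache hits
  cacheWrite?: number // tokens added to the cache this turn
  inputTokens?: number // legacy flat field from providers without cache columns
}

// Sums whichever split counters are present. Falls back to the flat
// inputTokens field only when no split counter was reported, and returns
// undefined when nothing is known.
function resolveTokenInput(usage: CacheAwareUsage): number | undefined {
  const parts = [usage.input, usage.cacheRead, usage.cacheWrite].filter(
    (n): n is number => typeof n === "number",
  )
  if (parts.length > 0) return parts.reduce((sum, n) => sum + n, 0)
  return typeof usage.inputTokens === "number" ? usage.inputTokens : undefined
}
```

For example, a warm-cache turn reporting `input: 3`, `cacheRead: 11800`, and `cacheWrite: 197` resolves to 12000 rather than 3, which is the discrepancy the commit describes.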

netlify · 2026-06-08T09:55:35Z

✅ Deploy Preview for electric-next ready!

Name	Link
🔨 Latest commit	`36ccc20`
🔍 Latest deploy log	https://app.netlify.com/projects/electric-next/deploys/6a26912d2ee12f0008b8dbd8
😎 Deploy Preview	https://deploy-preview-4502--electric-next.netlify.app

samwillis reviewed Jun 4, 2026


kevin-dp and others added 3 commits June 8, 2026 11:17

kevin-dp requested a review from samwillis June 8, 2026 11:21

kevin-dp wants to merge 4 commits into main from kevin/agent-token-usage
