feat: agent-sdk observability + Kimi K2.6 + UI test buttons (beta.13)#134

Merged
luokerenx4 merged 5 commits into master from dev on Apr 21, 2026

@luokerenx4
Contributor

Summary

  • agent-sdk observability: logs full result-event metadata (model / usage / cost / sessionId) to pino, plus a one-line console.info per turn. The opt-in ALICE_SDK_DEBUG=1 flag turns on DEBUG_CLAUDE_AGENT_SDK and pipes the spawned CLI's stderr to logs/agent-sdk-debug.log, which is the only reliable way to verify routing when a fake-IP proxy sits in front of the network layer.
  • docs/agent-sdk-notes.md — integration contract for the @anthropic-ai/claude-agent-sdk backend: what env vars matter, what CLAUDE_CODE_SIMPLE actually does (strips CLI extras, not auth mode), which endpoints stay hardcoded to api.anthropic.com regardless of ANTHROPIC_BASE_URL, the error classifier, and a "cosplay" caveat for Kimi K2.x (model happily identifies as Claude under persona prompts; verify routing via debug log before concluding anything is misrouted).
  • Kimi K2.6 — bumped default model in the Kimi preset; K2.5 stays as a fallback option.
  • UI test buttons — every existing AI provider profile gets a one-click Test button in the list view, and the Edit modal gains a Test button alongside Save (mirrors the Create-modal flow).
  • Version 0.9.0-beta.13.

Test plan

  • npx tsc --noEmit clean
  • pnpm test — 1088 tests pass
  • Manual: sent a test-kimi message with ALICE_SDK_DEBUG=1, inspected logs/agent-sdk-debug.log — confirmed main generation hits api.moonshot.ai, only hardcoded management paths hit api.anthropic.com
  • Manual: verify Test buttons render correctly on both list card and Edit modal, for all preset types
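
The log inspection in the manual step could be made repeatable by tallying the hosts that appear in the debug log. The following is a rough sketch only: `tallyHosts` is a hypothetical helper that is not part of this PR, and it assumes the log contains plain `http(s)://` URLs, which the PR does not specify.

```typescript
// Hypothetical helper (not in this PR): count how often each host appears
// in the text of logs/agent-sdk-debug.log.
// Assumption: outbound URLs are written into the log as plain http(s) URLs.
function tallyHosts(logText: string): Map<string, number> {
  const counts = new Map<string, number>();
  // Capture the host part: everything after the scheme up to a slash,
  // colon, whitespace, quote, or bracket.
  const urlPattern = /https?:\/\/([^\/\s:"'<>)]+)/g;
  let match: RegExpExecArray | null;
  while ((match = urlPattern.exec(logText)) !== null) {
    const host = match[1].toLowerCase();
    counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return counts;
}
```

Reading the file with Node's `fs.readFileSync("logs/agent-sdk-debug.log", "utf8")` and passing the text in would give a per-host count, which makes it easier to see that generation traffic goes to api.moonshot.ai while only the hardcoded paths reach api.anthropic.com.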

🤖 Generated with Claude Code

Ame and others added 5 commits April 21, 2026 07:47
The lock flag `userScrolledUp` is now driven only by user-intent events
(wheel, touchmove, touchend, the floating scroll-to-bottom button).
onScroll is demoted to pure UI state — it updates showScrollBtn and
newMsgCount but no longer writes userScrolledUp.

Previously onScroll reset the flag whenever post-scroll distance fell
below the threshold, which during streaming undid the synchronous
wheel/touchmove lock before the next auto-scroll could see it. Moving
the unlock transition to a rAF-deferred distance check inside the
wheel-down / touchend handlers keeps all transitions on the same
user-input timeline.

Does not yet fully resolve the user-reported symptom — bug still
reproduces under streaming. Committed as a structural step before
further investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
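
The transition rules described in this commit message can be sketched roughly as below. This is an illustration under assumptions, not the code from the commit: `ScrollLock`, `Schedule`, `distanceFromBottom`, `shouldAutoScroll`, and the 40px threshold are hypothetical, and `newMsgCount` is left out for brevity. The scheduler is injected so the deferred check can be driven by `requestAnimationFrame` in a browser or by a plain queue elsewhere.

```typescript
// Defers a callback; in a browser this would wrap requestAnimationFrame.
type Schedule = (cb: () => void) => void;

class ScrollLock {
  // Lock flag: written only by user-intent handlers, never by onScroll.
  userScrolledUp = false;
  // Pure UI state: the only thing onScroll is allowed to update here.
  showScrollBtn = false;

  private readonly distanceFromBottom: () => number;
  private readonly schedule: Schedule;
  private readonly threshold: number;

  // threshold is an assumed pixel value, not taken from the commit.
  constructor(distanceFromBottom: () => number, schedule: Schedule, threshold = 40) {
    this.distanceFromBottom = distanceFromBottom;
    this.schedule = schedule;
    this.threshold = threshold;
  }

  onWheel(deltaY: number): void {
    if (deltaY < 0) {
      this.userScrolledUp = true; // scrolling up locks synchronously
    } else {
      this.unlockIfNearBottom(); // scrolling down may unlock, after layout settles
    }
  }

  onTouchMove(): void {
    this.userScrolledUp = true;
  }

  onTouchEnd(): void {
    this.unlockIfNearBottom();
  }

  onScrollToBottomClick(): void {
    this.userScrolledUp = false;
  }

  onScroll(): void {
    this.showScrollBtn = this.distanceFromBottom() > this.threshold;
  }

  shouldAutoScroll(): boolean {
    return !this.userScrolledUp;
  }

  // Deferred distance check: the unlock stays on the user-input timeline.
  private unlockIfNearBottom(): void {
    this.schedule(() => {
      if (this.distanceFromBottom() <= this.threshold) {
        this.userScrolledUp = false;
      }
    });
  }
}
```

The point of the split is that a scroll event fired by an auto-scroll during streaming can change `showScrollBtn` but cannot clear the lock that a wheel or touchmove event just set.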
Moonshot released Kimi K2.6 on 2026-04-13 (1T params, stronger coding +
agent planning than K2.5). Makes kimi-k2.6 the default; keeps kimi-k2.5
in the dropdown as a fallback option.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Creation already had a Test Connection step; editing and the profile
list didn't — once a profile was saved, the only way to verify it was
to send a real chat and wait. Adds two entry points:

- Each profile card in the list gets a one-click Test button with
  transient status (testing → OK / Failed → back to idle). Failure
  hover shows the error message.
- The Edit modal gains a Test button alongside Save, mirroring the
  Create modal's inline result display so edits can be verified before
  committing.

Both reuse the existing POST /api/config/profiles/test endpoint and
agentCenter.testWithProfile() path — no backend changes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
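
The transient status on each card (testing → OK / Failed → back to idle) can be modelled as a small state machine. The sketch below is illustrative and not taken from the commit: `TestState`, `TestEvent`, and `reduceTestState` are hypothetical names, and the UI wiring is omitted.

```typescript
// Hypothetical model of the per-card Test button status.
type TestState =
  | { status: "idle" }
  | { status: "testing" }
  | { status: "ok" }
  | { status: "failed"; error: string }; // error is what the hover would show

type TestEvent =
  | { type: "start" }
  | { type: "success" }
  | { type: "failure"; error: string }
  | { type: "reset" };

function reduceTestState(state: TestState, event: TestEvent): TestState {
  switch (event.type) {
    case "start":
      // Ignore repeated clicks while a test is already in flight.
      return state.status === "testing" ? state : { status: "testing" };
    case "success":
      return state.status === "testing" ? { status: "ok" } : state;
    case "failure":
      return state.status === "testing"
        ? { status: "failed", error: event.error }
        : state;
    case "reset":
      // Only a finished result falls back to idle.
      return state.status === "ok" || state.status === "failed"
        ? { status: "idle" }
        : state;
  }
}
```

In such a design, a component would dispatch `start` before calling POST /api/config/profiles/test, dispatch `success` or `failure` when the response arrives, and dispatch `reset` from a timer so the card returns to idle.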
… doc

Three changes that hang together as "be able to prove where a request
actually went":

- Result event metadata is now logged in full (model, usage, cost,
  sessionId) to pino, plus a one-liner console.info per turn:
    [agent-sdk] result: model=kimi-k2.6 subtype=success in=... out=...
  When the server doesn't echo a model field, the line reads
  model=(unreported) — useful as an early tell for proxy-shaped paths.

- ALICE_SDK_DEBUG=1 toggles a deeper debug path: injects
  DEBUG_CLAUDE_AGENT_SDK into the spawned CLI and streams its stderr
  into logs/agent-sdk-debug.log, prefixed with a per-request separator
  (timestamp / loginMethod / model / baseUrl). Surfaces every outbound
  URL the CLI hits, which is the only reliable way to verify routing
  when fake-IP proxies sit in front of the network layer.

- docs/agent-sdk-notes.md captures the integration contract:
  - what env vars the CLI actually honors (and what CLAUDE_CODE_SIMPLE
    really does — it strips CLI extras, not auth mode)
  - which endpoints stay hardcoded to api.anthropic.com regardless of
    ANTHROPIC_BASE_URL (telemetry, MCP discovery, org metrics, MCP
    proxy) — none of them do LLM inference, but metadata leaks are a
    known given, not a bug on our side
  - the error classifier's purpose and the debug workflow
  - a "cosplay" note for Kimi K2.x: the model happily identifies as
    Claude under persona prompts; verify routing via debug log before
    concluding anything is misrouted

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
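
The logging behaviour described in this commit message could look roughly like the sketch below. It is not the committed code. Assumptions: `ResultMeta`, `formatResultLine`, `buildCliEnv`, `DebugRequestInfo`, and `requestSeparator` are hypothetical names; `in=` / `out=` are taken to be token counts; the value `"1"` for DEBUG_CLAUDE_AGENT_SDK and the exact separator layout are guesses, since the message only names the fields.

```typescript
interface ResultMeta {
  model?: string; // absent when the server does not echo a model field
  subtype: string;
  inputTokens?: number; // assumed meaning of "in="
  outputTokens?: number; // assumed meaning of "out="
}

// One-line per-turn summary, falling back to "(unreported)" for the model.
function formatResultLine(meta: ResultMeta): string {
  const model = meta.model ? meta.model : "(unreported)";
  const inTokens = meta.inputTokens ?? "?";
  const outTokens = meta.outputTokens ?? "?";
  return `[agent-sdk] result: model=${model} subtype=${meta.subtype} in=${inTokens} out=${outTokens}`;
}

type Env = Record<string, string | undefined>;

// Copy the parent environment and add the SDK debug flag only when
// ALICE_SDK_DEBUG=1 is set. The value "1" is an assumption.
function buildCliEnv(parentEnv: Env): Env {
  const env: Env = { ...parentEnv };
  if (parentEnv.ALICE_SDK_DEBUG === "1") {
    env.DEBUG_CLAUDE_AGENT_SDK = "1";
  }
  return env;
}

interface DebugRequestInfo {
  timestamp: string;
  loginMethod: string;
  model: string;
  baseUrl: string;
}

// Per-request separator written ahead of the CLI's stderr in the debug log.
// The layout is illustrative; only the four fields come from the commit message.
function requestSeparator(info: DebugRequestInfo): string {
  return `===== ${info.timestamp} loginMethod=${info.loginMethod} model=${info.model} baseUrl=${info.baseUrl} =====`;
}
```

Keeping these as pure functions means the fallback to `model=(unreported)` and the opt-in behaviour of the debug flag can be checked without spawning the CLI.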
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@luokerenx4 luokerenx4 merged commit 0fd44ba into master Apr 21, 2026
2 checks passed