fix(mcp): discriminated oneOf on codedb_bundle ops items (#437) #438
Closed
The bundle inputSchema advertises ops items with required: ["tool"]
and arguments as a bare {type: "object"}, so function-calling LLMs
emit {tool, arguments: {}} as the minimum-valid payload. This test
asserts "arguments" is in items.required so models are forced to
populate it.
Also exposes tools_list as pub for the test to introspect.
Fails on main; fix follows in next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage 1 of the bundle-schema fix. Prior schema had ops items with
required: ["tool"] and arguments as a bare {type: "object"}, so
function-calling LLMs read {} as a valid arguments payload and
emitted {tool, arguments: {}} as the minimum-valid call. The empty
object then triggered the #424 inline-args fallback, which used the
op object itself as the args bag and surfaced as
"received keys: [tool, arguments]" from each sub-tool.
Adding "arguments" to items.required forces the model to populate
it. The runtime inline-args fallback at mcp.zig:1948 stays as a
backstop for non-conformant clients.
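To make the failure mode concrete, here is a minimal Python sketch. The server itself is written in Zig; the schema literals and the helpers `missing_required` and `resolve_args` are hypothetical stand-ins, not code from mcp.zig, and the fallback is modeled only from the description above. It shows that requiring `arguments` rejects an op that omits the key, while an empty object still passes the `required` check and is then handed to the fallback.

```python
# Hypothetical stand-ins for the ops items schema before and after Stage 1.
ITEMS_BEFORE = {
    "type": "object",
    "properties": {"tool": {"type": "string"}, "arguments": {"type": "object"}},
    "required": ["tool"],
}
ITEMS_STAGE1 = {**ITEMS_BEFORE, "required": ["tool", "arguments"]}


def missing_required(schema: dict, op: dict) -> list[str]:
    """Return the required keys that the op does not supply."""
    return [key for key in schema["required"] if key not in op]


def resolve_args(op: dict) -> dict:
    """Assumed shape of the inline-args fallback: an empty or absent
    `arguments` makes the op object itself the args bag."""
    args = op.get("arguments")
    return args if args else op


op_without_args = {"tool": "codedb_outline"}
op_empty_args = {"tool": "codedb_outline", "arguments": {}}

# Before Stage 1, omitting `arguments` is schema-valid; after, it is not.
print(missing_required(ITEMS_BEFORE, op_without_args))  # []
print(missing_required(ITEMS_STAGE1, op_without_args))  # ['arguments']

# An empty object still satisfies `required`, and the fallback then hands
# the sub-tool the op's own keys instead of real arguments.
print(missing_required(ITEMS_STAGE1, op_empty_args))    # []
print(sorted(resolve_args(op_empty_args)))              # ['arguments', 'tool']
```

The last line mirrors the "received keys: [tool, arguments]" symptom: the sub-tool sees the op's wrapper keys because nothing usable was supplied under `arguments`.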
Stage 2 (discriminated oneOf over tool, binding arguments to each
sub-tool's inputSchema) is a follow-up — it requires turning the
hand-rolled tools_list literal into a builder.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage 2 of the bundle-schema fix. Stage 1 (#434) made `arguments` required at the items level, but the field is still a bare {type: "object"}, so a schema-greedy model can satisfy `required` by emitting `arguments: {}`.
This test asserts that the items schema contains a discriminated `oneOf` with one branch per dispatchable codedb_* sub-tool, each pinning `tool` to a const and `arguments` to that sub-tool's actual inputSchema.
Adds a stub `buildAugmentedToolsList` that returns the unaugmented schema so the test fails at runtime instead of as a compile error. The real builder lands in the fix commit.
Fails on this branch; fix follows in next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage 2 of the bundle-schema fix. Stage 1 (#434) made `arguments` required at the items level but left it as a bare {type: "object"}; a schema-greedy model could still satisfy the required check by emitting `arguments: {}`.
Stage 2 binds the *contents* of arguments to each sub-tool's actual inputSchema via a discriminated oneOf on `tool` (const) → `arguments` (sub-tool inputSchema). Once a model picks `tool: "codedb_outline"`, the only matching branch requires arguments.path:string, so there is no schema-minimal escape.
`buildAugmentedToolsList` parses the existing tools_list literal once at server startup, mutates the bundle items to add the oneOf, and serializes back. There is no hand-maintained duplication: branches are generated from the per-sub-tool schemas already advertised. It falls back to the raw tools_list if augmentation fails (parse error / OOM) so clients still get a valid schema.
codedb_bundle (recursive) and codedb_edit (write op) are excluded from the oneOf since handleBundle rejects them at runtime anyway.
Schema payload roughly doubles (~12KB → ~24KB after augmentation, 19 branches across the dispatchable codedb_* tools).
Test: tests.zig now asserts the augmented schema contains oneOf with branches that pin tool to a const and preserve each sub-tool's required args (codedb_outline branch must require `path`).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
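The parse, mutate, serialize, and fall-back flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Zig implementation: the three-tool `TOOLS_LIST` excerpt, the `{"tools": [...]}` layout, the exact branch shape, and the name `build_augmented_tools_list` are all hypothetical.

```python
import json

# Hypothetical three-tool excerpt; the real tools_list literal is much larger.
TOOLS_LIST = json.dumps({"tools": [
    {"name": "codedb_outline",
     "inputSchema": {"type": "object",
                     "properties": {"path": {"type": "string"}},
                     "required": ["path"]}},
    {"name": "codedb_edit", "inputSchema": {"type": "object"}},
    {"name": "codedb_bundle",
     "inputSchema": {"type": "object",
                     "properties": {"ops": {"type": "array", "items": {
                         "type": "object",
                         "properties": {"tool": {"type": "string"},
                                        "arguments": {"type": "object"}},
                         "required": ["tool", "arguments"]}}},
                     "required": ["ops"]}},
]})

# Sub-tools that the bundle handler rejects at runtime get no branch.
EXCLUDED = {"codedb_bundle", "codedb_edit"}


def build_augmented_tools_list(raw: str) -> str:
    """Parse the list, add one oneOf branch per dispatchable sub-tool, and
    serialize back. On any failure, return the raw list unchanged so that
    clients still receive a valid schema."""
    try:
        doc = json.loads(raw)
        tools = doc["tools"]
        bundle = next(t for t in tools if t["name"] == "codedb_bundle")
        branches = [
            {"properties": {"tool": {"const": t["name"]},
                            "arguments": t["inputSchema"]},
             "required": ["tool", "arguments"]}
            for t in tools
            if t["name"].startswith("codedb_") and t["name"] not in EXCLUDED
        ]
        bundle["inputSchema"]["properties"]["ops"]["items"]["oneOf"] = branches
        return json.dumps(doc)
    except (ValueError, KeyError, TypeError, StopIteration, MemoryError):
        return raw


augmented = json.loads(build_augmented_tools_list(TOOLS_LIST))
```

Because each branch reuses the sub-tool's own `inputSchema`, a change to a sub-tool's schema shows up in the bundle schema without a second edit.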
justrach added a commit that referenced this pull request on May 6, 2026
…434, #436, #437) (#439)
* feat(explore): opt-in rerank-trace logger for offline tuning experiments
Adds a v0 trace logger that appends one JSON line per searchContent invocation when enabled via .codedbrc (rerank_trace = true). Captures {ts, query, results:[{path,line,score}]} so we can inspect the data and decide whether online learning-to-rank from agent traces is worth building.
Pure observation: does not change ranking behavior. Disabled by default. Caps query at 256 bytes, results at 50 entries, and rotates the file by truncate-clobber once it crosses 10 MB. All I/O errors are swallowed; logging never breaks a search.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(explore): always score in rerankAndFinalize, not just when len > 1
Before the fix, the multi-signal scoring loop only ran when result_list had more than one item, a micro-optimization that skipped sorting a single result. With the rerank-trace logger added in 54b6b72, this made single-result entries log score=0.0, making them indistinguishable from genuinely zero-confidence matches in offline analysis.
The fix runs scoring unconditionally and keeps the sort guarded behind len > 1. Cost: a few µs per single-result search, which is negligible. Caught by end-to-end binary verification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(mcp): issue-434 failing test for codedb_bundle ops schema
* fix(mcp): require arguments in codedb_bundle ops items schema (#434)
* test(mcp): issue-437 failing test for bundle items oneOf
* fix(mcp): discriminated oneOf on codedb_bundle ops items (#437)
* release: v0.2.5808 - codedb_bundle schema fix + rerank-trace logger (#434, #436, #437)
Bundles three PRs:
- #435 (Stage 1, #434): require `arguments` on bundle ops items
- #438 (Stage 2, #437): discriminated oneOf per sub-tool
- #436: opt-in rerank-trace logger for offline tuning + score-on-len-1 fix
An end-to-end run with Sonnet 4.6 verifies that the schema constraint flows through to model output: bundle calls now arrive with populated, correctly named `arguments` for each sub-op. 513/513 tests pass on the merged branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
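The rerank-trace logger described in the referenced commit could look roughly like the Python sketch below. The function name `log_rerank_trace`, the integer `ts` field, and reading "10 MB" as 10 * 1024 * 1024 bytes are assumptions; the caps, the truncate-clobber rotation, and the swallowed I/O errors follow the commit message.

```python
import json
import os
import time

MAX_QUERY_BYTES = 256
MAX_RESULTS = 50
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumed interpretation of "10 MB"


def log_rerank_trace(path: str, query: str, results: list[dict]) -> None:
    """Append one JSON line per search; never raise on I/O failure."""
    try:
        # Rotate by truncate-clobber once the file crosses the size cap.
        mode = "a"
        if os.path.exists(path) and os.path.getsize(path) > MAX_FILE_BYTES:
            mode = "w"
        # Cap the query by bytes, dropping any partially cut character.
        raw = query.encode("utf-8")[:MAX_QUERY_BYTES]
        record = {
            "ts": int(time.time()),
            "query": raw.decode("utf-8", "ignore"),
            "results": [
                {"path": r["path"], "line": r["line"], "score": r["score"]}
                for r in results[:MAX_RESULTS]
            ],
        }
        with open(path, mode, encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
    except OSError:
        pass  # logging must never break a search
```

One line per invocation keeps the file easy to stream through offline tools, and the size-triggered truncate avoids any rotation bookkeeping at the cost of discarding older traces.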
Summary
- Adds `buildAugmentedToolsList` to `src/mcp.zig`: parses `tools_list` at server startup and mutates the codedb_bundle ops items schema to include a discriminated `oneOf` array with one branch per dispatchable codedb_* sub-tool.
- Each branch pins `tool` to a `const` (e.g. `"codedb_outline"`) and binds `arguments` to that sub-tool's actual inputSchema, so once a model picks `tool: "codedb_outline"` the only matching branch requires `arguments.path: string`; there is no schema-minimal `arguments: {}` escape.
- Falls back to the raw `tools_list` if augmentation fails (parse error / OOM) so clients always get a valid schema.
Fixes #437. Builds on #435 (Stage 1).

Why Stage 2
Stage 1 made `arguments` a required field but left it as a bare `{type: "object"}`. A schema-greedy function-calling model can still satisfy `required: ["tool", "arguments"]` by emitting `{tool: "...", arguments: {}}`: the bug morphs from "no `arguments` key" into "empty `arguments` value." Stage 2 closes that loophole at the schema level.

What's excluded from the oneOf
- `codedb_bundle` (recursive; rejected at `mcp.zig:1922`)
- `codedb_edit` (write op; rejected at `mcp.zig:1927`)
Both are still present as standalone tools in `tools/list`; they're just not valid sub-ops of `codedb_bundle`.

Numbers
- Schema payload roughly doubles: ~12KB → ~24KB after augmentation.
- 19 branches across the dispatchable codedb_* tools.
- No hand-maintained duplication: branches are generated from the per-sub-tool schemas already advertised in the `tools_list` const.

Caveats
`oneOf` enforcement varies across providers. Anthropic accepts it but isn't strict the way OpenAI structured-output is, and some MCP clients pass schemas straight through to whichever LLM they target. The #424 runtime inline-args fallback at `mcp.zig:1948` stays as a backstop for non-conformant clients.

Branch contents
When stacked on top of `main`, this branch contains Stage 1 (#434/#435) and Stage 2 (#437):
1. d470e5b: test for "mcp: codedb_bundle ops schema permits empty arguments, function-calling LLMs emit {}" (#434)
2. 7fb1e87: fix for "mcp: codedb_bundle ops schema permits empty arguments, function-calling LLMs emit {}" (#434, Stage 1)
3. 15907ae: test for "mcp: codedb_bundle ops items schema needs discriminated oneOf to constrain arguments contents" (#437)
4. 8c85c24: fix for "mcp: codedb_bundle ops items schema needs discriminated oneOf to constrain arguments contents" (#437, Stage 2)
Once PR #435 merges, this PR will rebase to show only commits 3 and 4.

Test plan
- `zig build test`: 508/508 pass (was 507/508 with the Stage 2 test failing before the fix).
- `~/bin/codedb` confirms that the served bundle items schema includes `oneOf` with 19 branches.

🤖 Generated with Claude Code
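The properties that the test plan checks on the served schema can be expressed as a small Python checker. This is a hedged sketch, not the assertions in tests.zig: the function name `check_bundle_items`, the problem messages, and the branch layout (`properties.tool.const`, `properties.arguments`) are assumptions about how the augmented schema is shaped.

```python
def check_bundle_items(items: dict, expected_branches: int) -> list[str]:
    """Return a list of problems found in the bundle ops items schema."""
    problems = []
    branches = items.get("oneOf", [])
    if len(branches) != expected_branches:
        problems.append(
            f"expected {expected_branches} branches, got {len(branches)}")

    outline = None
    for branch in branches:
        props = branch.get("properties", {})
        tool = props.get("tool", {}).get("const")
        if not isinstance(tool, str):
            problems.append("branch does not pin tool to a const")
            continue
        if "arguments" not in props:
            problems.append(f"{tool}: arguments not bound to a schema")
        if tool == "codedb_outline":
            outline = props

    # The codedb_outline branch must preserve the sub-tool's required args.
    required = (outline or {}).get("arguments", {}).get("required", [])
    if "path" not in required:
        problems.append("codedb_outline branch must require path")
    return problems


# Usage with a hypothetical one-branch schema:
items = {"oneOf": [{
    "properties": {
        "tool": {"const": "codedb_outline"},
        "arguments": {"type": "object", "required": ["path"]},
    },
    "required": ["tool", "arguments"],
}]}
print(check_bundle_items(items, expected_branches=1))  # []
```

Against the real server, `expected_branches` would be 19 per the numbers above; an unaugmented schema with no `oneOf` reports both a branch-count problem and the missing `codedb_outline` requirement.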