fix(mcp): require arguments in codedb_bundle ops schema (#434) by justrach · Pull Request #435 · justrach/codedb

justrach · 2026-05-06T13:57:16Z

Summary

Adds "arguments" to the codedb_bundle ops items required array so function-calling LLMs are forced to populate it instead of emitting {tool, arguments: {}}.
Includes a failing test (issue-434) that introspects the tools_list schema and asserts arguments is required for ops items.

Fixes #434.

Background

The prior schema declared ops items with required: ["tool"] and arguments as a bare {type: "object"}. Function-calling models read the schema as authoritative and emitted the minimum-valid payload, {tool: "...", arguments: {}}, which was then misrouted through the #424 inline-args fallback (the empty {} triggers the fallthrough at mcp.zig:1948). The fallback uses the op object itself as the args bag, and that object's only keys are tool and arguments, so each sub-tool errored with received keys: [tool, arguments].

This is Stage 1 (a one-line schema change). Stage 2, a discriminated oneOf over tool that binds arguments to each sub-tool's actual inputSchema, is a follow-up issue: it requires turning the hand-rolled tools_list literal at src/mcp.zig:497-520 into a builder so per-sub-tool schemas can be composed at startup.

The runtime inline-args fallback at mcp.zig:1948 stays as a backstop for non-conformant clients.
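
The failure path described in this Background section can be sketched roughly as follows. This is a Python illustration, not the Zig code in mcp.zig: `resolve_op_args`, `dispatch`, and the `REQUIRED_ARG` table are hypothetical names, and the rule "a missing or empty `arguments` falls through to inline args" is an assumption drawn from the description above.

```python
# Hypothetical required-argument table. Only codedb_outline -> path is stated
# in this thread; codedb_search -> query is inferred from the example payloads.
REQUIRED_ARG = {"codedb_search": "query", "codedb_outline": "path"}


def resolve_op_args(op: dict) -> dict:
    """Pick the args bag for one bundle op (illustrative helper)."""
    nested = op.get("arguments")
    # Assumed rule: a populated `arguments` object wins.
    if isinstance(nested, dict) and nested:
        return nested
    # Inline fallback: the op object itself becomes the args bag.
    return op


def dispatch(op: dict) -> str:
    """Report what a sub-tool would see for this op."""
    args = resolve_op_args(op)
    needed = REQUIRED_ARG.get(op["tool"])
    if needed is not None and needed not in args:
        return f"error: missing {needed}; received keys: [{', '.join(args)}]"
    return f"ok: {op['tool']}"


print(dispatch({"tool": "codedb_search", "arguments": {"query": "Agent"}}))
print(dispatch({"tool": "codedb_search", "query": "Agent"}))
print(dispatch({"tool": "codedb_search", "arguments": {}}))
```

Under these assumptions the first two calls succeed (MCP-style and inline), while the third falls through to the op object and reports `received keys: [tool, arguments]`, matching the error quoted above.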

Commits

d470e5b — adds failing test, exposes tools_list as pub for the test to introspect.
7fb1e87 — applies the schema fix; updates the description text in the same line to reflect that arguments is now required.

Test plan

zig build test — 507/507 pass with the fix; was 506/507 (issue-434 failing) without it.
Confirm against a real MCP client (graff or Claude) that bundle calls now arrive with populated arguments.
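
The issue-434 test lives in the Zig test suite and is not reproduced here. As a rough illustration of the same check, the Python sketch below parses a trimmed stand-in for the advertised tool list and inspects `ops.items.required`. `bundle_tools_list` and `ops_items_required` are hypothetical helpers, and the schema is cut down to the fields that matter.

```python
import json


def bundle_tools_list(items_required):
    """Build a trimmed, hypothetical stand-in for the codedb_bundle entry."""
    return json.dumps([{
        "name": "codedb_bundle",
        "inputSchema": {
            "type": "object",
            "properties": {"ops": {"type": "array", "items": {
                "type": "object",
                "properties": {"tool": {"type": "string"},
                               "arguments": {"type": "object"}},
                "required": items_required,
            }}},
            "required": ["ops"],
        },
    }])


def ops_items_required(tools_list_json):
    """Parse the advertised tool list and return ops.items.required."""
    tools = json.loads(tools_list_json)
    bundle = next(t for t in tools if t["name"] == "codedb_bundle")
    return bundle["inputSchema"]["properties"]["ops"]["items"]["required"]


before = ops_items_required(bundle_tools_list(["tool"]))
after = ops_items_required(bundle_tools_list(["tool", "arguments"]))
assert "arguments" not in before  # the check fails against the old schema
assert "arguments" in after       # and passes once the field is required
```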

🤖 Generated with Claude Code

The bundle inputSchema advertises ops items with required: ["tool"] and arguments as a bare {type: "object"}, so function-calling LLMs emit {tool, arguments: {}} as the minimum-valid payload. This test asserts "arguments" is in items.required so models are forced to populate it.

Also exposes tools_list as pub for the test to introspect. Fails on main; fix follows in next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stage 1 of the bundle-schema fix. Prior schema had ops items with required: ["tool"] and arguments as a bare {type: "object"}, so function-calling LLMs read {} as a valid arguments payload and emitted {tool, arguments: {}} as the minimum-valid call. The empty object then triggered the #424 inline-args fallback, which used the op object itself as the args bag and surfaced as "received keys: [tool, arguments]" from each sub-tool.

Adding "arguments" to items.required forces the model to populate it. The runtime inline-args fallback at mcp.zig:1948 stays as a backstop for non-conformant clients.

Stage 2 (discriminated oneOf over tool, binding arguments to each sub-tool's inputSchema) is a follow-up: it requires turning the hand-rolled tools_list literal into a builder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7fb1e87a2a


chatgpt-codex-connector · 2026-05-06T13:59:06Z

    \\{"name":"codedb_status","description":"Current indexed-file count, sequence number, and scan phase.","inputSchema":{"type":"object","properties":{"project":{"type":"string","description":"Optional absolute path to a different project (must have codedb.snapshot)"}},"required":[]}},
    \\{"name":"codedb_snapshot","description":"Pre-rendered JSON snapshot of the entire index — tree, outlines, symbols, deps. For caching or shipping to edge workers.","inputSchema":{"type":"object","properties":{"project":{"type":"string","description":"Optional absolute path to a different project (must have codedb.snapshot)"}},"required":[]}},
-    \\{"name":"codedb_bundle","description":"Run up to 20 codedb_* calls in one round-trip. Each op is either MCP-style {\"tool\":\"codedb_search\",\"arguments\":{\"query\":\"Agent\"}} or inline {\"tool\":\"codedb_search\",\"query\":\"Agent\"} — both are accepted. Example: {\"ops\":[{\"tool\":\"codedb_search\",\"arguments\":{\"query\":\"Agent\"}},{\"tool\":\"codedb_outline\",\"arguments\":{\"path\":\"src/main.zig\"}}]}. Best for parallel outline/symbol/search; avoid bundling large codedb_read calls — responses are not size-capped. If a sub-op reports `received keys: []`, the wrapper field is misnamed: use `arguments` (MCP spec), not `args`.","inputSchema":{"type":"object","properties":{"ops":{"type":"array","description":"Sub-tool calls to dispatch (max 20). Each item must have `tool`; pass per-op args either nested under `arguments` (MCP shape) or inline alongside `tool`.","items":{"type":"object","properties":{"tool":{"type":"string","description":"codedb_* tool name to invoke (e.g. codedb_outline, codedb_symbol, codedb_search, codedb_word, codedb_callers, codedb_read, codedb_deps, codedb_tree, codedb_hot, codedb_status, codedb_changes). Required."},"arguments":{"type":"object","description":"Per-call args matching that tool's inputSchema. The field MUST be named `arguments` (MCP `tools/call` convention) — `args` is silently ignored. May be omitted if you supply args inline at the op level instead."}},"required":["tool"]}},"project":{"type":"string","description":"Optional absolute path to a different project (must have codedb.snapshot)"}},"required":["ops"]}},
+    \\{"name":"codedb_bundle","description":"Run up to 20 codedb_* calls in one round-trip. Each op is either MCP-style {\"tool\":\"codedb_search\",\"arguments\":{\"query\":\"Agent\"}} or inline {\"tool\":\"codedb_search\",\"query\":\"Agent\"} — both are accepted. Example: {\"ops\":[{\"tool\":\"codedb_search\",\"arguments\":{\"query\":\"Agent\"}},{\"tool\":\"codedb_outline\",\"arguments\":{\"path\":\"src/main.zig\"}}]}. Best for parallel outline/symbol/search; avoid bundling large codedb_read calls — responses are not size-capped. If a sub-op reports `received keys: []`, the wrapper field is misnamed: use `arguments` (MCP spec), not `args`.","inputSchema":{"type":"object","properties":{"ops":{"type":"array","description":"Sub-tool calls to dispatch (max 20). Each item must have `tool` AND `arguments` (pass `{}` if the sub-tool takes none). Inline args alongside `tool` are still accepted as a fallback.","items":{"type":"object","properties":{"tool":{"type":"string","description":"codedb_* tool name to invoke (e.g. codedb_outline, codedb_symbol, codedb_search, codedb_word, codedb_callers, codedb_read, codedb_deps, codedb_tree, codedb_hot, codedb_status, codedb_changes). Required."},"arguments":{"type":"object","description":"Per-call args matching that tool's inputSchema. Field MUST be named `arguments` (MCP `tools/call` convention) — `args` is silently ignored. Pass `{}` only if the sub-tool takes no arguments. Required."}},"required":["tool","arguments"]}},"project":{"type":"string","description":"Optional absolute path to a different project (must have codedb.snapshot)"}},"required":["ops"]}},


Require the actual per-op arguments

For schema-driven clients that emit the minimum valid object, this still permits the failing shape {"tool":"codedb_search","arguments":{}}: adding arguments to required only requires the property to exist, while the nested schema is just {"type":"object"} with no per-tool required fields or minProperties. When that empty object reaches handleBundle, the existing empty-arguments fallback treats the whole op as inline args, so tools like codedb_search/codedb_outline still receive only tool and arguments and fail with the same missing-argument diagnostics. The schema needs to constrain arguments for each sub-tool (or otherwise reject empty arguments for tools that require args) to actually fix #434.
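
The point about `required` can be checked with a deliberately tiny validator. The Python sketch below covers only the JSON Schema keywords used here (`type`, `required`, `properties`, `minProperties`) and is not a substitute for a real validator; `validates` and the schema constants are illustrative names.

```python
def validates(instance, schema) -> bool:
    """Check an instance against a small subset of JSON Schema."""
    if schema.get("type") == "string":
        return isinstance(instance, str)
    if schema.get("type") == "object":
        if not isinstance(instance, dict):
            return False
        # `required` only checks that the key is present.
        if any(key not in instance for key in schema.get("required", [])):
            return False
        if len(instance) < schema.get("minProperties", 0):
            return False
        for key, sub in schema.get("properties", {}).items():
            if key in instance and not validates(instance[key], sub):
                return False
    return True


# Stage 1 shape of ops.items, trimmed to the relevant keywords.
STAGE1_ITEMS = {
    "type": "object",
    "properties": {"tool": {"type": "string"}, "arguments": {"type": "object"}},
    "required": ["tool", "arguments"],
}
# Same shape with a non-empty constraint on `arguments` (one possible tightening).
NON_EMPTY_ITEMS = {
    "type": "object",
    "properties": {"tool": {"type": "string"},
                   "arguments": {"type": "object", "minProperties": 1}},
    "required": ["tool", "arguments"],
}

empty_op = {"tool": "codedb_search", "arguments": {}}
print(validates(empty_op, STAGE1_ITEMS))     # True: the empty object is still valid
print(validates(empty_op, NON_EMPTY_ITEMS))  # False
```

A blanket `minProperties: 1` would also reject legitimate calls to sub-tools that take no arguments (codedb_status advertises `required: []`), which is why the review asks for per-tool constraints instead.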


…434, #436, #437) (#439)

* feat(explore): opt-in rerank-trace logger for offline tuning experiments

Adds a v0 trace logger that appends one JSON line per searchContent invocation when enabled via .codedbrc (rerank_trace = true). Captures {ts, query, results:[{path,line,score}]} so we can inspect the data and decide whether online learning-to-rank from agent traces is worth building.

Pure observation: does not change ranking behavior. Disabled by default. Caps query at 256 bytes, results at 50 entries, and rotates the file by truncate-clobber once it crosses 10 MB. All I/O errors are swallowed; logging never breaks a search.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(explore): always score in rerankAndFinalize, not just when len > 1

Pre-fix, the multi-signal scoring loop only ran when result_list had more than one item, a micro-optimization that skipped sorting a single result. With the rerank-trace logger added in 54b6b72, this made single-result entries log score=0.0, making them indistinguishable from genuinely zero-confidence matches in offline analysis.

The fix runs scoring unconditionally and keeps the sort guarded behind len > 1. Cost: a few µs per single-result search (negligible). Caught by end-to-end binary verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(mcp): issue-434 failing test for codedb_bundle ops schema

The bundle inputSchema advertises ops items with required: ["tool"] and arguments as a bare {type: "object"}, so function-calling LLMs emit {tool, arguments: {}} as the minimum-valid payload. This test asserts "arguments" is in items.required so models are forced to populate it.

Also exposes tools_list as pub for the test to introspect. Fails on main; fix follows in next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mcp): require arguments in codedb_bundle ops items schema (#434)

Stage 1 of the bundle-schema fix. Prior schema had ops items with required: ["tool"] and arguments as a bare {type: "object"}, so function-calling LLMs read {} as a valid arguments payload and emitted {tool, arguments: {}} as the minimum-valid call. The empty object then triggered the #424 inline-args fallback, which used the op object itself as the args bag and surfaced as "received keys: [tool, arguments]" from each sub-tool.

Adding "arguments" to items.required forces the model to populate it. The runtime inline-args fallback at mcp.zig:1948 stays as a backstop for non-conformant clients.

Stage 2 (discriminated oneOf over tool, binding arguments to each sub-tool's inputSchema) is a follow-up: it requires turning the hand-rolled tools_list literal into a builder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(mcp): issue-437 failing test for bundle items oneOf

Stage 2 of the bundle-schema fix. Stage 1 (#434) made `arguments` required at the items level, but the field is still a bare {type: "object"} so a schema-greedy model can satisfy `required` by emitting `arguments: {}`. This test asserts the items schema contains a discriminated `oneOf` with one branch per dispatchable codedb_* sub-tool, each pinning `tool` to a const and `arguments` to that sub-tool's actual inputSchema.

Adds a stub `buildAugmentedToolsList` that returns the unaugmented schema so the test fails at runtime instead of as a compile error. The real builder lands in the fix commit. Fails on this branch; fix follows in next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mcp): discriminated oneOf on codedb_bundle ops items (#437)

Stage 2 of the bundle-schema fix. Stage 1 (#434) made `arguments` required at the items level but left it as a bare {type: "object"}; a schema-greedy model could still satisfy the required check by emitting `arguments: {}`.

Stage 2 binds the *contents* of arguments to each sub-tool's actual inputSchema via a discriminated oneOf on `tool` (const) → `arguments` (sub-tool inputSchema). Once a model picks `tool: "codedb_outline"`, the only matching branch requires arguments.path:string; there is no schema-minimal escape.

`buildAugmentedToolsList` parses the existing tools_list literal once at server startup, mutates the bundle items to add the oneOf, and serializes back. No hand-maintained duplication: branches are generated from the per-sub-tool schemas already advertised. Falls back to the raw tools_list if augmentation fails (parse error / OOM) so clients still get a valid schema.

codedb_bundle (recursive) and codedb_edit (write op) are excluded from the oneOf since handleBundle rejects them at runtime anyway.

Schema payload roughly doubles (~12KB → ~24KB after augmentation, 19 branches across the dispatchable codedb_* tools).

Test: tests.zig now asserts the augmented schema contains oneOf with branches that pin tool to a const and preserve each sub-tool's required args (codedb_outline branch must require `path`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* release: v0.2.5808 - codedb_bundle schema fix + rerank-trace logger (#434, #436, #437)

Bundles three PRs:
- #435 (Stage 1, #434): require `arguments` on bundle ops items
- #438 (Stage 2, #437): discriminated oneOf per sub-tool
- #436: opt-in rerank-trace logger for offline tuning + score-on-len-1 fix

End-to-end Sonnet 4.6 verifies the schema constraint flows through to model output: bundle calls now arrive with populated, correctly-named `arguments` for each sub-op. 513/513 tests pass on the merged branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
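
The Stage 2 builder described in this commit message is Zig code that is not shown in this thread. The Python sketch below illustrates the same idea under stated assumptions: `build_augmented_tools_list` is a hypothetical stand-in for `buildAugmentedToolsList`, the tool list is already parsed JSON, and the exception-based fallback only approximates the "parse error / OOM" fallback mentioned above.

```python
import copy

# Sub-tools the commit message says are left out of the oneOf.
EXCLUDED = {"codedb_bundle", "codedb_edit"}


def build_augmented_tools_list(tools):
    """Return a copy of `tools` whose bundle ops items carry a oneOf per sub-tool."""
    try:
        augmented = copy.deepcopy(tools)
        branches = [
            {
                "type": "object",
                # Pin `tool` to one name and bind `arguments` to that tool's schema.
                "properties": {"tool": {"const": t["name"]},
                               "arguments": t["inputSchema"]},
                "required": ["tool", "arguments"],
            }
            for t in augmented
            if t["name"].startswith("codedb_") and t["name"] not in EXCLUDED
        ]
        for t in augmented:
            if t["name"] == "codedb_bundle":
                t["inputSchema"]["properties"]["ops"]["items"]["oneOf"] = branches
        return augmented
    except (KeyError, TypeError):
        # Fall back to the unaugmented list so clients still get a valid schema.
        return tools


tools = [
    {"name": "codedb_outline", "inputSchema": {
        "type": "object", "properties": {"path": {"type": "string"}},
        "required": ["path"]}},
    {"name": "codedb_edit", "inputSchema": {"type": "object"}},
    {"name": "codedb_bundle", "inputSchema": {
        "type": "object",
        "properties": {"ops": {"type": "array", "items": {"type": "object"}}},
        "required": ["ops"]}},
]
items = build_augmented_tools_list(tools)[2]["inputSchema"]["properties"]["ops"]["items"]
print([b["properties"]["tool"]["const"] for b in items["oneOf"]])
```

With this input only codedb_outline yields a branch, and that branch carries `required: ["path"]` inside `arguments`, so an empty `arguments` object no longer matches.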

justrach · 2026-05-06T15:50:37Z

Superseded by #439 (release v0.2.5808). All commits from this branch landed in 907ac96 on main.

justrach and others added 2 commits May 6, 2026 21:43

chatgpt-codex-connector Bot reviewed May 6, 2026


justrach closed this May 6, 2026

justrach wants to merge 2 commits into main from fix/issue-434-bundle-schema
