test(scripts): cross-platform MCP handshake tests for first-install (#125) #129
Merged
Demonstrates the bug in #125: invoking `run.sh stdio` without a binary (the Claude Code first-install flow) fast-exits instead of downloading. Because Claude Code does not retry failed MCP servers, this leaves the `lumen` tools unavailable for the entire session.
The existing `stdio early-exit guard tests` block never executes run.sh; it reimplements the guard condition inline and asserts "the guard fires for stdio" as correct behaviour, codifying the bug.
This new integration test stubs `curl` in PATH, creates a temporary plugin root with a manifest but no binary, runs `bash run.sh stdio`, and asserts exit 0 and that the expected binary file is produced. Red now; turns green once the stdio fast-fail is removed (see #127).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The existing stdio first-install integration test (added earlier in this
PR) stubbed curl to write `#!/bin/sh; exit 0` as the "downloaded binary"
and only checked the launcher exited 0 with a file on disk. That passes
on any launcher that reaches the download step — it can't distinguish a
launcher that actually execs a working MCP server from one that swallows
stdin/stdout, closes pipes early, or silently drops to the wrong binary
path. It also tested only run.sh; run.bat (with the same fast-fail bug)
was untested end-to-end.
Replaces the stub with a real mock MCP server (scripts/testdata/
mock_mcp_server/main.go — pure Go, no CGO, cross-compiles on all three
OSes). The test now:
- Builds the mock for the current OS/arch
- Creates a clean plugin root with a manifest but no bin/
- Stubs curl (POSIX) / curl.bat (Windows) to copy the mock into the
path the launcher will exec
- Pipes a real JSON-RPC `initialize` request on stdin
- Asserts the stdout response contains `jsonrpc: "2.0"` AND
`serverInfo.name == "mock-lumen"`
A passing test transitively proves the launcher did not fast-exit in
stdio mode, reached the download code path, wrote the artefact where it
would exec it, set it executable, and exec'd it with stdin/stdout
inherited correctly — the actual MCP-server-alive contract.
Adds scripts/test_run_windows.ps1 (same shape, against run.bat) and
wires it into the CI `scripts` matrix as a new pwsh step. Also expands
the matrix to ubuntu-latest + macos-latest + windows-latest and adds
actions/setup-go so the mock can be built.
Both the POSIX test and the Windows test are RED against the current
launchers because run.sh AND run.bat both still contain the stdio
fast-fail guard — merging #127 plus an equivalent run.bat change will
flip them green.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous version used `return` inside a try/finally block in a dot-sourced script, which exits the script without running the summary or `exit 1` line, making CI mark a real failure as success. Also relied on Start-Process -RedirectStandardInput, which doesn't reliably propagate the child's exit code back through PowerShell.
Rewrites both: uses System.Diagnostics.Process directly for deterministic stdin/stdout/stderr wiring and exit-code reporting, and restructures into an if/elseif chain so all paths flow through the final summary + exit logic. The test will now correctly fail the Windows job when run.bat fast-exits in stdio mode.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When run.bat hits the stdio fast-fail and `exit /b 1`s, its stdin pipe is already closed by the time PowerShell tries to write the initialize request, producing a `pipe is being closed` exception that masks the real diagnostic. Catch it and let the exit-code check produce the proper "#125 — MCP server would be dead for the session" failure message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Adds a deliberately failing end-to-end test suite in `scripts/test_run.sh` and a new `scripts/test_run_windows.ps1` that reproduce the first-install MCP failure from #125 on both POSIX and Windows launchers. Expect CI to go red on ubuntu, macos, and windows; that is the point of this PR.
Why the existing CI suite missed the bug
Two gaps, stacked:
- The `stdio early-exit guard tests` block reimplemented the guard condition inline as a local helper and asserted "the guard fires for the `stdio` arg" as correct behaviour, codifying the bug as the spec. Nothing actually ran `bash run.sh stdio` (let alone `run.bat stdio`) with a missing binary.
- Nothing covered `run.bat`'s first-install code path. `run.bat` has the identical fast-fail guard at lines 28–32. #127 (fix(scripts): download binary synchronously in stdio mode on first install) only fixes POSIX; a Windows-only fix still needs to land, and nothing in CI would have told us that.
What this PR adds
- `scripts/testdata/mock_mcp_server/main.go`: a tiny pure-Go mock MCP server. Reads one line from stdin, and if it is a JSON-RPC `initialize` request it writes a valid MCP `initialize` response and exits 0. Cross-compiles on all three runners, no CGO.
- `test_run.sh`: replaces the old stub-curl-plus-`exit 0` test. The test now builds the mock, stubs `curl` in PATH to drop the mock binary into place, pipes a real `initialize` request into `bash run.sh stdio`, and asserts the stdout response contains `"jsonrpc":"2.0"` and `"name":"mock-lumen"`. A pass transitively proves the launcher didn't fast-exit, reached the download path, wrote the artefact where it would exec it, chmod'd it executable, and exec'd it with stdin/stdout inherited correctly.
- `scripts/test_run_windows.ps1`: same shape, against `run.bat`, via PowerShell. Uses a `curl.bat` stub on a PATH-prepended dir (cmd.exe's PATHEXT lookup finds the `.bat` first).
- `scripts` job expanded to `ubuntu-latest` + `macos-latest` + `windows-latest`, `fail-fast: false`, `actions/setup-go` added. New pwsh step runs the Windows launcher test.
Bugs this kind of test catches that the old one couldn't
Beyond the #125 fast-fail, a stubbed-curl / stubbed-binary test can pass even when the launcher has any of these real bugs:
- [...] `chmod +x`
- [...] `run.bat` that prevent the process from ever starting
A real MCP handshake against a real process is the only way to catch these.
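The handshake assertion itself is small. The tests in this PR do it with substring checks in bash and PowerShell; the Go sketch below shows the same check with the JSON parsed properly, purely as an illustration. `checkHandshake` and `initializeResponse` are hypothetical names and do not appear in the PR.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// initializeResponse holds only the fields the assertion needs.
type initializeResponse struct {
	JSONRPC string `json:"jsonrpc"`
	Result  struct {
		ServerInfo struct {
			Name string `json:"name"`
		} `json:"serverInfo"`
	} `json:"result"`
}

// checkHandshake (hypothetical helper) scans the launcher's captured stdout
// line by line and succeeds if any line is a JSON-RPC 2.0 response whose
// result.serverInfo.name equals wantName.
func checkHandshake(stdout, wantName string) error {
	sc := bufio.NewScanner(strings.NewReader(stdout))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var resp initializeResponse
		if err := json.Unmarshal([]byte(line), &resp); err != nil {
			continue // not a JSON object; skip and keep looking
		}
		if resp.JSONRPC == "2.0" && resp.Result.ServerInfo.Name == wantName {
			return nil
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	return fmt.Errorf("no initialize response from %q found on stdout", wantName)
}

func main() {
	good := `{"jsonrpc":"2.0","id":1,"result":{"serverInfo":{"name":"mock-lumen"}}}`
	fmt.Println(checkHandshake(good, "mock-lumen"))
	// An empty stdout is what a fast-exiting launcher would leave behind.
	fmt.Println(checkHandshake("", "mock-lumen"))
}
```

An empty or unrelated stdout fails the check, which is the signal that the launcher exited before a server process ever answered. Skipping non-JSON lines mirrors the tolerance of a substring match; a stricter variant could reject any stray output on stdout.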
Red / green
- `main` today → RED on POSIX and Windows (both launchers still contain the stdio fast-fail)
- #127 plus the matching `run.bat` fix → GREEN on all three OSes
Next steps
Merge #127 and add the matching `run.bat` change to flip both tests green, or fold these commits into that PR as its TDD red-phase.
🤖 Generated with Claude Code