fix: exit cleanly on stdin EOF in StdioServerTransport #182

Open
geremyturcotte wants to merge 1 commit into QuantGeekDev:main from
geremyturcotte:fix/stdio-exit-on-stdin-eof

Summary

Fixes #181.

StdioServerTransport.start() now installs end/close listeners on process.stdin so the server exits cleanly when its parent disconnects, instead of busy-looping at ~99% CPU as an orphaned process (PPID=1) until killed manually.

The bug is observed daily on macOS Darwin 25.x whenever the parent (Claude CLI / Claude Desktop) is hard-killed. Source-code analysis is in the linked issue. Quick recap: mcp-framework's StdioServerTransport delegates everything to the StdioServerTransport in @modelcontextprotocol/sdk, which uses readline under the hood. When the parent dies on Darwin, the EOF doesn't propagate to a clean stream-end event, the readline poll spins on null reads, and nothing in either layer calls process.exit(). This PR adds the missing exit guard at the consumer layer, where the process lifecycle is owned.

Changes

  • src/transports/stdio/server.ts (+17 LOC) — install end/close listeners on process.stdin in start(), remove them in close() (symmetric, leak-free), call process.exit(0) after best-effort transport.close() when either fires.
  • tests/transports/stdio/server.test.ts (+65 LOC, new file) — 3 tests:
    1. close event → process.exit(0) called.
    2. end event → process.exit(0) called.
    3. listeners removed on close() (re-instantiation leak-free).
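
The install/remove pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the names StdinExitGuard and ClosableTransport are hypothetical, and the injectable stdin and exit parameters are assumptions added so the behaviour can be exercised without terminating the process.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical stand-in for the SDK transport that the wrapper delegates to.
interface ClosableTransport {
  close(): Promise<void>;
}

// Hypothetical helper; the real change lives in src/transports/stdio/server.ts.
class StdinExitGuard {
  private readonly inner: ClosableTransport;
  private readonly stdin: EventEmitter;
  private readonly exit: (code: number) => void;
  private installed = false;
  private fired = false;

  constructor(
    inner: ClosableTransport,
    stdin: EventEmitter = process.stdin,
    exit: (code: number) => void = (code) => process.exit(code),
  ) {
    this.inner = inner;
    this.stdin = stdin;
    this.exit = exit;
  }

  // Arrow property: the same function reference is passed to on() and off().
  private readonly onStdinGone = (): void => {
    if (this.fired) return; // "end" and "close" can both fire; act only once
    this.fired = true;
    void this.inner
      .close()
      .catch(() => undefined) // best-effort cleanup: ignore close errors
      .finally(() => this.exit(0));
  };

  // Install both listeners (idempotent).
  start(): void {
    if (this.installed) return;
    this.stdin.on("end", this.onStdinGone);
    this.stdin.on("close", this.onStdinGone);
    this.installed = true;
  }

  // Remove both listeners so repeated instantiation does not leak handlers.
  close(): void {
    this.stdin.off("end", this.onStdinGone);
    this.stdin.off("close", this.onStdinGone);
    this.installed = false;
  }
}
```

Passing a plain EventEmitter and a recording exit function in place of the defaults is one way to cover the three cases listed above without touching the real process.stdin.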

Why this layer (not the SDK)

The SDK can't safely call process.exit() because some consumers might want to intercept stdin EOF and do graceful shutdown. But a stdio MCP server has no other input source — once stdin is gone, it has nothing to do. The right place for the exit guard is in mcp-framework's wrapper, where the process lifecycle is owned. (Cross-filing at modelcontextprotocol/typescript-sdk is also possible but seems unnecessary if mcp-framework handles it.)

Test plan

npm install
npm test -- tests/transports/stdio/server.test.ts   # 3/3 pass
npm test                                              # 852 pass / 1 baseline flake; 60/66 suites pass (6 pre-existing failures unrelated to transports)
npm run build                                         # tsc clean

Real-world repro verified: with the patch installed, killing the parent process of a server generated by a default mcp create test-server no longer leaves an orphaned PPID=1 process. Without the patch, the same operation leaves a process spinning at 99% CPU indefinitely.

Open questions / alternatives

Happy to redo in a different shape if you'd prefer:

  • Opt-out option — expose exitOnStdinClose: boolean (default true) on the constructor for consumers who want different shutdown semantics. Adds one constructor arg and one boolean check; minor surface increase.
  • Route through onclose chain — instead of process.exit(0), fire the existing onclose handler if set, then exit. Slightly more invasive; lets consumers run cleanup before exit.
  • Move to MCPServer.start() — install the guard at the higher level so non-stdio transports don't pay for it (they don't anyway, since they don't read stdin, but it's slightly more explicit).
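
To make the first two alternatives concrete, here is one possible shape combining them. It is a sketch under assumptions: only the option name exitOnStdinClose comes from the list above, while the StdioExitOptions interface, the handleStdinGone function, and the choice to run onclose even when exit is disabled are hypothetical.

```typescript
// Hypothetical options shape; only exitOnStdinClose is named in the PR text.
interface StdioExitOptions {
  exitOnStdinClose?: boolean; // assumed default: true
  onclose?: () => void | Promise<void>;
}

// Hypothetical handler for stdin "end"/"close".
// Returns the action taken so callers and tests can observe the decision.
async function handleStdinGone(
  options: StdioExitOptions,
  exit: (code: number) => void = (code) => process.exit(code),
): Promise<"exited" | "left-running"> {
  try {
    // Let consumer cleanup run before any exit.
    await options.onclose?.();
  } catch {
    // Cleanup is best-effort; a failing handler must not block the exit.
  }
  if (options.exitOnStdinClose ?? true) {
    exit(0);
    return "exited";
  }
  // Opted out: the consumer owns shutdown from here.
  return "left-running";
}
```

With this shape the default behaviour matches the PR (exit after cleanup), and a consumer passing exitOnStdinClose: false keeps full control of shutdown.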

Default chosen: minimal patch at the transport layer, hard process.exit(0) after best-effort SDK cleanup. Happy to iterate.

Workaround we're using until this lands

We ship a launchd watchdog (~50 LOC reaper script) on dev machines that detects the orphan signature (PPID=1 + MCP allowlist match + %CPU > 50 + etime < 10 min) and SIGKILLs on a 120s probe. It has been effective with zero false positives, but it is obviously a band-aid. Shipping this PR upstream lets us delete that mitigation.
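
For readers unfamiliar with this kind of reaper, the orphan signature can be expressed as a simple predicate. This is an illustrative sketch only, not the watchdog script itself: the ProcessSample shape, the field names, and the substring-based allowlist are all assumptions; only the four conditions come from the description above.

```typescript
// Hypothetical snapshot of one process, e.g. parsed from ps output.
interface ProcessSample {
  pid: number;
  ppid: number;
  command: string;
  cpuPercent: number;
  elapsedSeconds: number;
}

// Thresholds taken from the signature described in the text.
const CPU_THRESHOLD_PERCENT = 50;
const MAX_ELAPSED_SECONDS = 10 * 60;

// True when a process matches all four parts of the orphan signature.
// The allowlist is assumed to be a list of substrings of the command line.
function matchesOrphanSignature(
  sample: ProcessSample,
  allowlist: readonly string[],
): boolean {
  const reparentedToInit = sample.ppid === 1;
  const isKnownMcpServer = allowlist.some((name) => sample.command.includes(name));
  const isSpinning = sample.cpuPercent > CPU_THRESHOLD_PERCENT;
  const isRecent = sample.elapsedSeconds < MAX_ELAPSED_SECONDS;
  return reparentedToInit && isKnownMcpServer && isSpinning && isRecent;
}
```

Requiring all four conditions together is what keeps false positives low: a long-running or idle process with PPID=1 is never matched.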

Thanks for maintaining mcp-framework!

Without explicit listeners on process.stdin's "end" / "close" events, the
underlying SDK readline poll spins on null reads after the parent process
disconnects. The server is reparented to init at ~99% CPU until killed
manually. Observed daily on macOS Darwin 25.x when the parent (Claude CLI
or Claude Desktop) is hard-killed.

Install both listeners in start() and remove them symmetrically in close().
On either event, best-effort close the SDK transport then process.exit(0).
The stdio transport has no other input source, so exiting is correct.

Adds 3 regression tests in tests/transports/stdio/server.test.ts:
  - "close" event triggers process.exit(0)
  - "end" event triggers process.exit(0)
  - listeners removed on close() (re-instantiation leak-free)

Fixes QuantGeekDev#181.

Development

Successfully merging this pull request may close these issues.

bug: stdio transport does not exit on stdin EOF — server reparented to init at ~99% CPU when parent disconnects
