bug: stdio transport does not exit on stdin EOF — server reparented to init at ~99% CPU when parent disconnects #181

@geremyturcotte

Environment

  • mcp-framework: 0.2.16 (also confirmed on 0.2.22)
  • Node.js: 22.x (Apple Silicon)
  • OS: macOS Darwin 25.x
  • Parent process tested: Claude CLI / Claude Desktop on macOS
  • Affected MCP server: @matpb/mysql-mcp-server@2.0.0 (uses mcp-framework as a direct dep). Likely affects every MCP server built on mcp-framework@0.2.x in stdio mode.

Steps to reproduce

  1. Scaffold a default stdio MCP server: mcp create test-server && cd test-server && npm run build.
  2. Launch it from a parent process: node dist/index.js.
  3. Hard-kill the parent (SIGKILL) without sending SIGTERM to the child first. (Same shape as: closing the terminal window where the parent runs, or losing the parent to OOM.)
  4. Observe the child process in ps -eo pid,ppid,%cpu,etime,command.

Expected behavior

The server exits cleanly when its stdin pipe closes (EOF on fd 0).
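A minimal sketch of how this expectation could be checked automatically. It spawns a command with a piped stdin, closes the pipe so the child sees EOF on fd 0, and reports whether the child exits before a timeout. The helper name exitsOnStdinEof and the 2-second default are assumptions for illustration, not part of mcp-framework. Closing the pipe politely only approximates a parent being SIGKILLed, although both end with EOF on the child's stdin.

```typescript
import { spawn } from "node:child_process";

// Hypothetical test helper, not part of mcp-framework.
// Resolves true if the child exits on its own after stdin EOF,
// false if it had to be killed once the timeout elapsed.
export function exitsOnStdinEof(
  command: string,
  args: string[],
  timeoutMs = 2000, // assumed budget; tune for slow machines
): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, {
      stdio: ["pipe", "ignore", "ignore"],
    });

    const timer = setTimeout(() => {
      child.kill("SIGKILL"); // do not leave a spinning orphan behind
      resolve(false);
    }, timeoutMs);

    child.once("error", (err) => {
      clearTimeout(timer);
      reject(err); // e.g. command not found
    });
    child.once("exit", () => {
      clearTimeout(timer);
      resolve(true); // no-op if the timeout already resolved false
    });

    child.stdin.end(); // closes the pipe: the child reads EOF on fd 0
  });
}
```

Under these assumptions, awaiting exitsOnStdinEof("node", ["dist/index.js"]) would be expected to resolve to true for a server that honors stdin EOF.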

Actual behavior

The server is reparented to launchd (PPID=1), spins at 96–99% CPU indefinitely, and slowly leaks memory until it is killed manually (kill -9). SIGTERM has no visible effect, apparently because the process is stuck in a busy-loop on process.stdin.read() returning null. We observe about 10 occurrences per day in production, one each time Claude CLI is killed mid-session; on one bad day, roughly 9.5 GB of stuck node processes accumulated before manual cleanup.

Sample live process snapshot:

PID    PPID  %CPU  ETIME    COMMAND
61830  1     95.8  00:01:29 node /…/node_modules/@matpb/mysql-mcp-server/…
61911  1     99.7  00:01:33 node /…/perplexity-mcp/server.js
63017  1     97.2  00:00:55 node /…/@matpb/mysql-mcp-server/…
63585  1     95.9  00:00:47 node /…/perplexity-mcp/server.js

Root cause hypothesis

Reading src/transports/stdio/server.ts in 0.2.22:

async start(): Promise<void> {
  await this.transport.start();
  this.running = true;
}

StdioServerTransport is a thin wrapper that delegates everything to @modelcontextprotocol/sdk's StdioServerTransport. Neither mcp-framework's wrapper nor MCPServer itself attaches a process.stdin.on('end' | 'close', …) handler. When the parent is hard-killed on Darwin, the readline poll inside the SDK transport keeps spinning on null reads — the EOF event isn't routed to a clean process.exit(0) path anywhere.

The bug also exists in @modelcontextprotocol/sdk's stdio transport, but the missing exit guard belongs at the consumer layer (this repo), where the lifecycle of the process is owned. The SDK can't safely call process.exit() because consumers might not want it, but a stdio MCP server has no other input source — once stdin is gone, it has nothing to do.

Suggested fix

A small guard in StdioServerTransport.start(), plus symmetric removal in close(). Diff:

   async start(): Promise<void> {
     await this.transport.start();
     this.running = true;
+
+    process.stdin.on("end", this.handleStdinClose);
+    process.stdin.on("close", this.handleStdinClose);
   }
+
+  private handleStdinClose = (): void => {
+    this.transport.close().finally(() => process.exit(0));
+  };

   async close(): Promise<void> {
+    process.stdin.off("end", this.handleStdinClose);
+    process.stdin.off("close", this.handleStdinClose);
     await this.transport.close();
     this.running = false;
   }

Verified the fix locally against 0.2.22 head: the busy-loop pattern no longer reproduces; the server exits within milliseconds of its parent dying. I'm happy to open a PR with this patch + tests if you'd like — just let me know if there's a different shape you'd prefer (e.g., expose as opt-out option, route through onclose callback chain instead of process.exit).
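One possible shape for the callback-routed variant, as a sketch only. installStdinGuard and StdinLike are hypothetical names, not existing mcp-framework or SDK APIs. The helper leaves the exit decision to the caller, fires at most once even though "end" and "close" normally both arrive, and returns a disposer that close() could call.

```typescript
// Minimal surface the guard needs; process.stdin satisfies it.
interface StdinLike {
  on(event: "end" | "close", listener: () => void): unknown;
  off(event: "end" | "close", listener: () => void): unknown;
}

// Hypothetical helper: calls onGone once when stdin ends or closes.
// Returns a function that removes the listeners again.
export function installStdinGuard(
  stdin: StdinLike,
  onGone: () => void,
): () => void {
  let fired = false;

  function dispose(): void {
    stdin.off("end", handler);
    stdin.off("close", handler);
  }

  function handler(): void {
    if (fired) return; // "end" and "close" usually both fire
    fired = true;
    dispose();
    onGone();
  }

  stdin.on("end", handler);
  stdin.on("close", handler);
  return dispose;
}

// Assumed usage inside a transport wrapper (illustrative only):
//   const dispose = installStdinGuard(process.stdin, () => {
//     void transport.close().finally(() => process.exit(0));
//   });
//   // later, in close(): dispose();
```

With this shape, an opt-out option would simply skip the installStdinGuard call, and a consumer that prefers an onclose chain can pass its own callback instead of one that calls process.exit.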

Workaround we're using until upstream lands

We ship a launchd watchdog (a reaper script of about 50 LOC) on dev machines. Every 120 s it probes for the orphan signature (PPID=1 + MCP allowlist match + %CPU > 50 + etime < 10 min) and SIGKILLs any match. It has been effective, with zero false positives, but it is obviously a band-aid. Shipping the upstream fix would let us delete that mitigation.
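The orphan signature can be expressed as a small pure function over ps -eo pid,ppid,%cpu,etime,command output. This is a sketch of the matching logic only, not the actual reaper script; the names PsRow, parseEtime, parsePsLine and isOrphanSignature are hypothetical, the allowlist patterns are supplied by the caller, and a dot is assumed as the decimal separator in %CPU.

```typescript
interface PsRow {
  pid: number;
  ppid: number;
  cpu: number; // %CPU as reported by ps
  etimeSeconds: number;
  command: string;
}

// ps prints elapsed time as [[dd-]hh:]mm:ss; convert it to seconds.
export function parseEtime(etime: string): number {
  const dash = etime.indexOf("-");
  const days = dash >= 0 ? Number(etime.slice(0, dash)) : 0;
  const clock = dash >= 0 ? etime.slice(dash + 1) : etime;
  const parts = clock.split(":").map(Number);
  while (parts.length < 3) parts.unshift(0); // pad missing hours
  const [hours = 0, minutes = 0, seconds = 0] = parts;
  return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}

// Returns null for the header row or any line that does not parse.
export function parsePsLine(line: string): PsRow | null {
  const m = line.trim().match(/^(\d+)\s+(\d+)\s+([\d.]+)\s+(\S+)\s+(.*)$/);
  if (!m) return null;
  return {
    pid: Number(m[1]),
    ppid: Number(m[2]),
    cpu: Number(m[3]),
    etimeSeconds: parseEtime(m[4]),
    command: m[5],
  };
}

// Signature from the issue: PPID=1, allowlisted command,
// %CPU > 50, and elapsed time under 10 minutes.
export function isOrphanSignature(row: PsRow, allowlist: RegExp[]): boolean {
  return (
    row.ppid === 1 &&
    row.cpu > 50 &&
    row.etimeSeconds < 10 * 60 &&
    allowlist.some((re) => re.test(row.command))
  );
}
```

Keeping the predicate separate from the code that actually sends signals makes the matching rules easy to unit-test before anything gets killed.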

Happy to provide more diagnostics — strace/dtruss output, longer-running ps snapshots, or test against any specific version. Thanks for maintaining mcp-framework!
