fix: prevent stack overflow on concurrent transport close#1700
fix: prevent stack overflow on concurrent transport close#1700ctonneslan wants to merge 1 commit intomodelcontextprotocol:mainfrom
Conversation
When multiple MCP transport sessions close simultaneously (server restart, network partition, mass disconnect), the onclose callbacks can trigger recursive close operations that overflow the call stack. The process stays alive but becomes completely unresponsive. Added a _closing guard flag that prevents re-entrant close() calls. Once close() is entered, subsequent calls return immediately. Fixes modelcontextprotocol#1699
🦋 Changeset detectedLatest commit: e750b28 The changes in this PR will be included in the next version bump. This PR includes changesets to release 4 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
travisbreaks
left a comment
There was a problem hiding this comment.
The re-entrancy guard is the right pattern for this bug. Simple and effective.
However, note that PR #1705 by the same author targets the same method with a more comprehensive fix: it snapshots the stream mapping before clearing, wraps individual cleanup calls in try/catch, and moves onclose into a finally-adjacent position.
Specifically, #1705 also adds onerror callback invocations across multiple transport send() methods (stdio, websocket, streamableHttp), which makes it a superset of this PR plus #1704.
Recommendation for maintainers: These three PRs (#1700, #1704, #1705) overlap significantly. It may be worth consolidating. If only one lands, #1705 covers the most ground, though it also has the largest blast radius.
One note on #1705's close(): onclose?.() is called after the try/finally block. If _requestResponseMap.clear() throws (unlikely but possible), onclose would never fire. Moving it into the finally block would be safer:
} finally {
this.onclose?.();
}|
Closing in favor of #1705 which covers the same re-entrancy guard plus snapshot-before-clear and individual try/catch on cleanup calls. |
Problem
When multiple MCP transport sessions close simultaneously (server restart, network partition, mass client disconnect), a recursive promise rejection cascade in
close()causes a stack overflow that crashes the Node.js process and makes it completely unresponsive.The process doesn't exit — it stays alive but hangs. All new connections time out. The only recovery is a full restart. This happens 2-3 times per day in production with 100+ concurrent sessions.
Root Cause
close()callsthis.onclose?.()which can trigger the Protocol layer to callclose()again on the same transport. With multiple transports closing simultaneously, the callbacks cascade recursively until the stack overflows.Fix
Added a
_closingguard flag toclose():This is the standard pattern for preventing re-entrant async operations.
Fixes #1699