feat: add destroy() to MdkNode to fix 402 zombie node race condition #30
Merged
martinsaposnic merged 2 commits into main on Feb 20, 2026
On serverless platforms (Vercel/Lambda), MdkNode's inner Rust Node and its tokio runtime survive after getInvoice() returns because V8 GC is non-deterministic. When the agent pays a 402 invoice instantly (<1s), the webhook handler creates a second MdkNode for the same wallet while the first is still alive. The zombie's reconnection loop steals the LSP peer connection from the new node, which prevents the JIT channel from being established and causes "retries exhausted" payment failures.
In normal checkout this doesn't happen because the human delay (5-30s) gives GC time to collect the old node before the webhook fires.
This change wraps the inner Node in Option<Node> and adds destroy(), which lets JS callers explicitly drop the Rust Node and its tokio runtime immediately after invoice creation, eliminating the race condition.
f3r10 approved these changes on Feb 20, 2026
martinsaposnic added a commit that referenced this pull request on Feb 20, 2026
…30)
* feat: add destroy() method to MdkNode for explicit cleanup
* style: fix cargo fmt
Summary
Adds a destroy() method to MdkNode that explicitly drops the inner Rust Node and its tokio runtime, preventing zombie node race conditions on serverless platforms.

Problem
402 (machine-to-machine) payments fail intermittently with "retries exhausted" on serverless platforms (Vercel/Lambda). The root cause is a race condition between two MdkNode instances for the same wallet:
1. create402Response() builds MdkNode #1, calls getInvoice() (which internally starts/stops the node), and returns the 402 response.
2. V8 GC is non-deterministic, so the inner Rust Node of MdkNode #1 and its tokio runtime (with peer reconnection loop) survive.
3. The agent pays instantly (<1s), and the webhook handler builds MdkNode #2 for the same wallet and calls startReceiving().
4. The zombie's reconnection loop steals the LSP peer connection, so the LSP sends open_channel to MdkNode #1 (stopped, can't process it).
In normal checkout this doesn't happen because the human delay (5-30s between invoice creation and payment) gives V8 GC time to collect the old node.
Why stop() isn't enough
Node::stop() disconnects peers and signals background tasks, but:
- the tokio runtime stays alive (it is held in an Arc<Runtime> and only dies on Node::drop())

Solution
- Wrap the inner Node in Option<Node>
- destroy() calls node.take() + stop() + drop(), killing the tokio runtime and all background tasks immediately
- Add a node() helper that panics with a clear message if called after destroy

Usage in the 402 flow
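A minimal sketch of the intended call pattern, not the code from this PR. FakeMdkNode is a hypothetical stand-in for the native MdkNode binding, written only so the example runs on its own; its nullable inner field loosely mirrors Option<Node>, and createInvoiceFor402() and the invoice string are likewise assumptions. The point it illustrates is calling destroy() in a finally block as soon as getInvoice() has returned.

```typescript
// Hypothetical stand-in for the native MdkNode binding (the real class is Rust).
// Only the lifecycle is modelled; no Lightning logic happens here.
class FakeMdkNode {
  // Loosely mirrors Option<Node>: null once destroy() has run.
  private inner: { running: boolean } | null = { running: false };

  // Mirrors the node() helper: fail loudly on use-after-destroy.
  private node(): { running: boolean } {
    if (this.inner === null) {
      throw new Error("MdkNode used after destroy()");
    }
    return this.inner;
  }

  // Synchronous, like the getInvoice() described in this PR (start, work, stop).
  getInvoice(amountSats: number): string {
    const n = this.node();
    n.running = true;
    const invoice = `fake-invoice-${amountSats}`; // placeholder, not a real invoice
    n.running = false;
    return invoice;
  }

  // Mirrors take() + stop() + drop(): release the inner node now, not at GC time.
  destroy(): void {
    const n = this.inner;
    this.inner = null;
    if (n !== null) {
      n.running = false;
    }
  }

  get destroyed(): boolean {
    return this.inner === null;
  }
}

// Hypothetical handler body: create the invoice, then always destroy the node
// so nothing of it is left alive when the webhook builds its own instance.
function createInvoiceFor402(node: FakeMdkNode, amountSats: number): string {
  try {
    return node.getInvoice(amountSats);
  } finally {
    node.destroy();
  }
}
```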
Calling destroy() right after getInvoice() is safe because getInvoice() is fully synchronous via block_on(): by the time it returns, the LSPS4 negotiation is complete and the invoice string is in JS land. The client never persists a local SCID mapping (only the LSP does).

Test plan
- cargo check passes