meta-corruption in attachable-subscription.test.ts#1106
Meta-Corruption Investigation Summary
Root Cause Identified: CAR File Management Race Condition
Issue: During the database attachment process, CAR files become "stale" and are removed from carLog before blocks can be properly validated, causing
Key Findings:
Evidence Pattern:
The Sequence:
Next Steps:
Files Modified (with enhanced logging):
The meta-corruption issue is definitively a CAR file lifecycle management race condition during database attachment, not the originally suspected DbMeta structure corruption.
Deadlock Investigation - Attempted Synchronization Fixes

Attempted Fix #1: Promise.all() on Commit Queue
// In writeCar() - DEADLOCK 🔴
const remoteWrites = this.attached.remotes().map((r) =>
  this.commitQueue.enqueue(async () => {
    await r.active.car.save(block);
    return [];
  })
);
await Promise.all(remoteWrites); // ← DEADLOCK
Result: Test hung indefinitely, no logs produced after initial setup

Attempted Fix #2: commitQueue.waitIdle() in writeCar()
// In writeCar() - DEADLOCK 🔴
this.attached.remotes().forEach((r) => {
  this.commitQueue.enqueue(async () => {
    await r.active.car.save(block);
    return [];
  });
});
await this.commitQueue.waitIdle(); // ← DEADLOCK
Result: Test hung during execution, timeout after 10+ seconds

Attempted Fix #3: commitQueue.waitIdle() in writeMeta()
// In writeMeta() - DEADLOCK 🔴
async writeMeta(cids: AnyLink[]): Promise<void> {
  const meta = { cars: cids };
  await this.commitQueue.waitIdle(); // ← DEADLOCK
  await this.attached.local().active.meta.save(meta);
}
Result: Test hung in

Root Cause of Deadlocks
The commit queue system is designed for eventual consistency, not strict synchronization:
Confirmed Race Condition
Enhanced network logging definitively showed:
Next Steps Required
The commit queue synchronization approach is not viable. Alternative solutions needed:
The race condition is definitively identified but requires an architectural solution rather than commit queue synchronization.
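One way hangs like the three attempts above can arise, offered as a sketch rather than a diagnosis: if the code that awaits the queue is itself running as a task on a serial queue, the queue cannot drain until that task returns, and the task cannot return until the queue drains. The `SerialQueue` class below is a hypothetical stand-in, not Fireproof's commit queue, and whether writeCar() or writeMeta() actually runs inside a queued task is an assumption.

```typescript
// Hypothetical minimal serial queue; NOT Fireproof's commitQueue implementation.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();
  private pending = 0;
  private idleWaiters: Array<() => void> = [];

  // Run tasks strictly one after another.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    this.pending++;
    const run = this.tail.then(task);
    this.tail = run
      .catch(() => undefined)
      .then(() => {
        this.pending--;
        if (this.pending === 0) {
          this.idleWaiters.splice(0).forEach((resolve) => resolve());
        }
      });
    return run;
  }

  // Resolves once no task is queued or running.
  waitIdle(): Promise<void> {
    if (this.pending === 0) return Promise.resolve();
    return new Promise<void>((resolve) => this.idleWaiters.push(resolve));
  }
}

// Resolves to "timeout" if `p` has not settled within `ms` milliseconds.
function settlesWithin(p: Promise<string>, ms: number): Promise<string> {
  const timer = new Promise<string>((resolve) =>
    setTimeout(() => resolve("timeout"), ms),
  );
  return Promise.race([p, timer]);
}

async function demo(): Promise<{ outside: string; inside: string }> {
  // Waiting from outside the queue completes once the task finishes.
  const q1 = new SerialQueue();
  q1.enqueue(async () => "saved");
  const outside = await settlesWithin(q1.waitIdle().then(() => "idle"), 50);

  // Waiting from inside a queued task never completes: the task is still
  // counted as pending while it waits for pending to reach zero.
  const q2 = new SerialQueue();
  const inside = await settlesWithin(
    q2.enqueue(async () => {
      await q2.waitIdle();
      return "idle";
    }),
    50,
  );
  return { outside, inside };
}
```

In this toy model, `demo()` reports "idle" for the outside wait and "timeout" for the inside wait. Awaiting `Promise.all()` over freshly enqueued tasks from inside a running task stalls for the same reason, since those tasks sit behind the one that is waiting for them.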
- Add CRDT head tracking in crdt.ts and crdt-clock.ts
- Add network request logging in loader.ts for CAR file loads
- Add carLog state logging before validation
- Add post-compact carLog logging in test

Investigation findings:
- Race condition between CAR file writes (async via commit queue) and DbMeta sends
- Network requests fail when CAR files not yet written to remote stores
- "missing car file" errors cause CAR files to be marked as stale
- Validation fails because blocks become inaccessible

Attempted fixes (both cause deadlocks):
- Promise.all() on commit queue operations
- commitQueue.waitIdle() before DbMeta send

Next: Need alternative architectural solution for CAR write coordination

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
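The write-before-notify ordering described in these findings can be modeled with a toy in-memory store. This is a sketch under assumptions, not Fireproof code: `RemoteStore`, `deferredWrites`, and `simulateAttach` are hypothetical names, and the deferred write list only stands in for the asynchronous commit queue.

```typescript
// Toy model of the race; every name here is hypothetical, not a Fireproof API.
type Cid = string;

class RemoteStore {
  private cars: Record<string, string> = {};

  save(cid: Cid, bytes: string): void {
    this.cars[cid] = bytes;
  }

  load(cid: Cid): string {
    if (!Object.prototype.hasOwnProperty.call(this.cars, cid)) {
      throw new Error(`missing car file: ${cid}`);
    }
    return this.cars[cid];
  }
}

interface AttachResult {
  carLog: Cid[]; // CARs the reader still tracks
  stale: Cid[]; // CARs dropped after a failed load
}

// The writer queues CAR writes, then announces them via meta.
// `flushBeforeMeta` decides whether the queued writes land before the
// reader acts on the meta.
function simulateAttach(cids: Cid[], flushBeforeMeta: boolean): AttachResult {
  const remote = new RemoteStore();
  const deferredWrites: Array<() => void> = cids.map(
    (cid) => () => remote.save(cid, `car-bytes-for-${cid}`),
  );
  const flush = () => deferredWrites.splice(0).forEach((write) => write());

  if (flushBeforeMeta) flush();
  const meta = { cars: cids };

  // The reader reacts to meta immediately and tries to load every announced CAR.
  const result: AttachResult = { carLog: [], stale: [] };
  for (const cid of meta.cars) {
    try {
      remote.load(cid);
      result.carLog.push(cid);
    } catch {
      result.stale.push(cid); // load failed, so the CAR is treated as stale
    }
  }

  flush(); // late writes arrive after the reader has already given up on them
  return result;
}
```

In this model, `simulateAttach(cids, false)` leaves the reader with an empty carLog and every CAR marked stale, while `simulateAttach(cids, true)` leaves nothing stale. The only difference between the two runs is whether the writes complete before the meta is acted on.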
How to Reproduce the Race Condition with Enhanced Logging

Running the Test with Full Logging
# From fireproof repo root:
cd /Users/jchris/code/fp/fireproof
FP_DEBUG=CRDTClock,Loader,EncryptedBlockstore pnpm run test --project memory attachable-subscription.test.ts -t 'should trigger subscriptions on inbound syncing' 2>&1 | tee meta-corruption-debug.log

What to Look For in the Logs
1. Network Request Failures 🔍
2. CarLog State Before Validation 🔍
3. Validation Failures 🔍
4. Empty CarLog After Test 🔍

Key Pattern to Confirm
Current Branch
The enhanced logging is on branch
This logging definitively proves the write-before-notify race condition in the CAR file synchronization during database attachment.
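When working through a large debug log, it can help to pull out only the lines of interest. A minimal sketch, assuming the log was captured as text by the tee command above and that failed loads contain the phrase "missing car file" quoted in the commit message; `findLogLines` is a hypothetical helper, not part of Fireproof, and the real log format may differ.

```typescript
// Hypothetical helper; not part of Fireproof.
interface LogHit {
  lineNumber: number; // 1-based position in the log
  text: string;
}

// Return every line of `logText` that contains `phrase`, ignoring case.
function findLogLines(logText: string, phrase: string): LogHit[] {
  const needle = phrase.toLowerCase();
  const hits: LogHit[] = [];
  logText.split(/\r?\n/).forEach((text, index) => {
    if (text.toLowerCase().indexOf(needle) !== -1) {
      hits.push({ lineNumber: index + 1, text });
    }
  });
  return hits;
}
```

Passing the contents of meta-corruption-debug.log and the phrase "missing car file" returns the matching lines with their positions, which makes it easier to see whether the failed loads occur before the validation step.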
Potential URL Routing Issue in CAR File Storage

Memory Gateway Logging Investigation
Running with

CAR Files Stored Successfully
CAR files appear to be written to main database storage:

CAR Files Retrieved Successfully by Main DB
Same files can be read using similar URL format:

CAR Files Not Found by Remote DBs
Remote databases seem to be looking in different locations:

Test Results
The assertion shows remote databases have completely empty carLogs:

Possible URL Format Difference
It appears there may be a difference between:
This could explain why remote databases can't find CAR files that were successfully stored, leading to empty carLogs and validation failures. Worth investigating if this is a URL generation issue in the attachment process or memory gateway routing.
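To check whether two stores really address different locations, the logged URLs can be compared component by component. The sketch below uses the standard WHATWG `URL` class; `diffUrls` is a hypothetical helper, not a Fireproof API, and it assumes both inputs are absolute URLs that the parser accepts.

```typescript
// Hypothetical helper built on the standard URL class; not a Fireproof API.
function diffUrls(a: string, b: string): string[] {
  const ua = new URL(a);
  const ub = new URL(b);
  const diffs: string[] = [];

  if (ua.protocol !== ub.protocol) diffs.push(`protocol: ${ua.protocol} vs ${ub.protocol}`);
  if (ua.host !== ub.host) diffs.push(`host: ${ua.host} vs ${ub.host}`);
  if (ua.pathname !== ub.pathname) diffs.push(`pathname: ${ua.pathname} vs ${ub.pathname}`);

  // Collect every query parameter name that appears in either URL.
  const keys: string[] = [];
  const collect = (_value: string, key: string) => {
    if (keys.indexOf(key) === -1) keys.push(key);
  };
  ua.searchParams.forEach(collect);
  ub.searchParams.forEach(collect);

  // Report parameters whose values differ; a missing parameter shows as null.
  for (const key of keys.sort()) {
    const va = ua.searchParams.get(key);
    const vb = ub.searchParams.get(key);
    if (va !== vb) diffs.push(`${key}: ${va} vs ${vb}`);
  }
  return diffs;
}
```

Feeding it the URL logged for a successful main-database write and the URL a remote database requested for the same CAR would list exactly which components differ, which narrows the question to URL generation or gateway routing.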
@codex rebase to latest main
For now, I can only help with PRs you've created.