🤖 fix: add trace dumps for System 1 memory writer failures#1969
@codex review Please re-review after rebase and hash alignment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d50f25c4ac
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review Addressed latest failing checks after rebase (System1 settings tests + flake hash). Please re-review.
@codex review Addressed the memory_write nullish schema comment and resolved the thread. Please re-review.
@codex review Applied follow-up fix for failing integration test (duplicate General button match). Please re-review.
@codex review Follow-up integration fix: updated thinkingPolicy test to match current Internal section heading in Agents settings.
Codex Review: Didn't find any major issues. You're on a roll.
@codex review
Codex Review: Didn't find any major issues. Chef's kiss.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d41bc5e667
Change-Id: I3dc50fbc2228b22b8e072e7f63feb8deff060e1c Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d67868d951
Change-Id: I56a62b2ab511e62c296e7f380ce180c5f09cae16 Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 29aae563c5
Change-Id: I5b1d1046e5ccada7addb5df3b6f318307a519d09 Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97655d60a4
Change-Id: Idb80e1b748340b5721158591e2fed11c47cf436a Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
Change-Id: I437f5d0f4b17c220826a52f1f66da79d337948b3 Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: acc5663cdb
Change-Id: Ie20cfe370afcc8c7a0a4d6fca0be1bb2498a7aaf Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 65eed73d9c
Change-Id: I0db4047426804c076705d8ca81e94f072af36a7c Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7d101dcba3
Change-Id: I414ac38e1ae03bd8d29b337889361d03a4ae944c Signed-off-by: Thomas Kosiewski <tk@coder.com>
@codex review
Codex Review: Didn't find any major issues. Another round soon, please!
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86b606cb2b
Ignore no_new_memories outcomes when memory_write failed in the same attempt so the policy retries with a reminder instead of treating the run as successful.

Adds a regression test for mixed tool-call runs that include a failed memory_write followed by no_new_memories.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

---
_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$59.60`_
<!-- mux-attribution: model=openai:gpt-5.3-codex thinking=xhigh costs=59.60 -->

Change-Id: I35c8b058fa6e1185292857007cd58c832cbd4a01
@codex review Addressed the no_new_memories mixed-tool-call edge case:
Codex Review: Didn't find any major issues. More of your lovely PRs please.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca39a2fa31
    while (startIndex < events.length) {
      const candidate = events.slice(startIndex);
      const serialized = JSON.stringify(candidate);
      if (serialized.length <= maxChars) {
Replace quadratic event trimming in memory writer
trimToCharBudget repeatedly does events.slice(startIndex) + JSON.stringify(...) inside a loop, so each run is O(n²) in history length; with long chats (especially tool-heavy transcripts) this can take many seconds before the model call even starts, causing frequent memory-writer timeouts or skipped updates in production workflows. Please switch to an incremental/bounded approach (e.g., walk from newest to oldest while tracking size once) so preprocessing stays linear.
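A minimal sketch of the incremental approach suggested above, for illustration only. The signature is an assumption (the real `trimToCharBudget` may take different parameters), as is returning an empty array when not even the newest event fits. It keeps the same meaning as the loop under review, the longest suffix whose `JSON.stringify` length is within `maxChars`, but serializes each event once.

```typescript
// Hypothetical linear-time variant; the real trimToCharBudget signature may differ.
function trimToCharBudget<T>(events: readonly T[], maxChars: number): T[] {
  let total = 2; // the enclosing "[" and "]"
  let startIndex = events.length; // start of the suffix kept so far (empty)

  // Walk from newest to oldest, adding each event's serialized size once.
  for (let i = events.length - 1; i >= 0; i--) {
    // Inside an array, values JSON cannot represent serialize as "null".
    const size = (JSON.stringify(events[i]) ?? "null").length;
    // One comma separates this event from the suffix already kept.
    const separator = startIndex < events.length ? 1 : 0;
    if (total + size + separator > maxChars) break;
    total += size + separator;
    startIndex = i;
  }
  return events.slice(startIndex);
}
```

Because the serialized length of an array is the sum of its element lengths plus brackets and commas, the running total matches `JSON.stringify(events.slice(startIndex)).length` at every step, so the result should agree with the original loop while doing work proportional to the transcript size.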
Summary
Adds richer debug visibility for the System 1 memory writer:
- When a run fails to update memory (no `memory_write` tool call) or times out, writes a full execution trace (prompt/messages, step results, tool executions) to `debug_obj/` for offline inspection.
Background
We saw cases where the memory writer appeared to “not run” (no `[system1][memory]` logs) or would exit with `timedOut: true` and no `memory_write` call. Most lifecycle logs were debug-only, and failures lacked enough detail to understand what the model did.
Implementation
- `MemoryWriterPolicy`: adds explicit debug logs for early-return gates; threads `triggerMessageId` into the runner.
- `system1MemoryWriter`: captures per-attempt messages, `onStepFinish` results, and tool execution records; dumps a JSON trace to `~/.mux*/debug_obj/<workspaceId>/system1_memory_writer/` in debug mode when the run fails to update memory.
Validation
- `make static-check`
- … `triggerMessageId` parameter
Risks
Low. Changes are scoped to debug logging and failure-path diagnostics; the writer’s normal success path is unchanged.
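For illustration, the failure-path dump described under Implementation might look roughly like the sketch below. Only the directory layout (`<muxHome>/debug_obj/<workspaceId>/system1_memory_writer/`) comes from the description; the trace shape, the function name `dumpMemoryWriterTrace`, and the file-naming scheme are assumptions, not the code in this PR.

```typescript
import * as fs from "node:fs/promises";
import * as path from "node:path";

// Hypothetical trace shape; field names are illustrative, not the PR's actual types.
interface MemoryWriterTrace {
  workspaceId: string;
  triggerMessageId: string;
  timedOut: boolean;
  attempts: Array<{
    messages: unknown[];
    stepResults: unknown[];
    toolExecutions: unknown[];
  }>;
}

// Writes the trace under <muxHome>/debug_obj/<workspaceId>/system1_memory_writer/
// and returns the path of the file it wrote. The caller is expected to invoke it
// only in debug mode and only when the run did not update memory.
async function dumpMemoryWriterTrace(
  muxHome: string,
  trace: MemoryWriterTrace,
  now: Date = new Date(),
): Promise<string> {
  const dir = path.join(muxHome, "debug_obj", trace.workspaceId, "system1_memory_writer");
  await fs.mkdir(dir, { recursive: true });

  // Assumed naming scheme: timestamp plus sanitized message id, so that
  // successive failures do not overwrite each other.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  const safeId = trace.triggerMessageId.replace(/[^A-Za-z0-9_-]/g, "_");
  const file = path.join(dir, `${stamp}_${safeId}.json`);

  await fs.writeFile(file, JSON.stringify(trace, null, 2), "utf8");
  return file;
}
```

Keeping the dump on the failure path only, as the PR does, means the normal success path pays no extra I/O.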
📋 Implementation Plan
Debug: System 1 memory writer not running / missing `[system1][memory]` logs
Context / Why
You set Settings → System 1 → “Write Interval (messages)” to `1`, expecting the background System 1 memory writer to run after each assistant turn and update the project memory file. You’re not seeing any `[system1][memory]` log lines and it looks like the writer never runs.
From the current code:
- The writer requires the System 1 experiment (`experiments.system1 === true`) and only runs for root workspaces (it skips child/subtask workspaces).
- The only `info` logs are emitted by the `memory_write` tool when a memory file is actually written.
Evidence (repo)
- `src/node/services/aiService.ts` stores a context at stream start and calls `memoryWriterPolicy.onAssistantStreamEnd(ctx)` on `stream-end`.
- `src/node/services/system1/memoryWriterPolicy.ts` returns early unless:
  - `ctx.system1Enabled === true`
  - `!ctx.parentWorkspaceId`
  - Interval: `interval = config.taskSettings.memoryWriterIntervalMessages` (defaults to 2)
  - State file: `sessions/<workspaceId>/system1-memory-writer-state.json`.
- `src/browser/utils/messages/sendOptions.ts` passes `experiments.system1` based on `isExperimentEnabled(EXPERIMENT_IDS.SYSTEM_1)`.
- `src/common/orpc/schemas/stream.ts` defines `experiments.system1` in the RPC schema.
- `src/node/services/log.ts` supports `MUX_LOG_LEVEL` / `MUX_DEBUG`.
- `src/node/services/tools/memory_write.ts` logs `info` on successful writes.
- `src/node/services/tools/memoryCommon.ts` writes to `<muxHome>/memories/<projectId>.md`.
- `src/common/constants/paths.ts` defines `<muxHome>`: `~/.mux`, `~/.mux-dev` (when `NODE_ENV=development`), or `MUX_ROOT`.
Approach A (recommended): add explicit “no changes” debug logs (~5–20 LoC)
- In `src/node/services/system1/memoryWriterPolicy.ts`, change the existing debug line `"[system1][memory] Memory writer produced no output"` to something explicit like `"[system1][memory] Memory writer exited without updating memory (no memory_write call)"`.
  - Keep it at `debug` (so it only appears when `MUX_LOG_LEVEL=debug` / `MUX_DEBUG=1`).
  - Include context fields (`timedOut`, `system1Model`).
- Add `debug` logs on the early-return gates in `onAssistantStreamEnd`:
  - System 1 disabled (`ctx.system1Enabled !== true`)
  - Child workspace (`ctx.parentWorkspaceId`)
- Run with `MUX_LOG_LEVEL=debug` (or `MUX_DEBUG=1`), then send a message.
Approach B: user-side diagnosis (0 LoC)
1) Confirm the hard gates
- The writer does not run if `experiments.system1 !== true`.
- It skips child/subtask workspaces (`parentWorkspaceId` is set).
- It schedules only on `stream-end`. If you interrupt/abort streams, it won’t schedule.
2) Confirm the interval persisted to disk
- Locate `<muxHome>`:
  - `~/.mux/`
  - `~/.mux-dev/` (when `NODE_ENV=development`)
  - `$MUX_ROOT`
- Check that `<muxHome>/config.json` contains:
  - `taskSettings.memoryWriterIntervalMessages: 1`
3) Verify scheduling without logs (state file)
Even if you can’t see stdout/stderr, the scheduler persists a state file per workspace.
- Find the `workspaceId`:
  - `ls -lt <muxHome>/sessions | head`
  - Check each `metadata.json` until you find the workspace you’re testing.
- Inspect `<muxHome>/sessions/<workspaceId>/system1-memory-writer-state.json`
Expected behavior with `interval=1`:
- `lastRunStartedAt` updates (timestamp)
- `lastRunMessageId` updates
- `turnsSinceLastRun` returns to `0`
Interpretation:
- `onAssistantStreamEnd` isn’t running (most often: System 1 experiment still OFF, or streams are aborting).
- `lastRunStartedAt` never set → interval not being read as `1`, or the run is permanently “in flight”.
- `lastRunStartedAt` changes but no memory file changes → writer ran but didn’t call `memory_write` (model/tool support or credentials issue).
4) Verify memory output
- Memory files are written to `<muxHome>/memories/*.md`.
- `ls -lt <muxHome>/memories | head`
5) Enable the right logs (debug)
To see the scheduler/runner messages (which include skip reasons), you need debug logging:
- `MUX_LOG_LEVEL=debug` or
- `MUX_DEBUG=1`
Important: for the desktop app, you must start mux from a shell that has those env vars (GUI launches won’t inherit shell env). Once enabled, look for:
- `[system1][memory] Skipping memory writer ...`
- `[system1][memory] Memory writer completed`
- `[system1][memory] Memory writer failed`
Debug dumps (only in debug mode) land in:
- `<muxHome>/debug_obj/`
6) Ensure the writer’s model supports tool calling
The memory writer uses:
- `agentAiDefaults.system1_memory_writer.modelString` (if set)
- otherwise `ctx.modelString`
If that model/provider can’t do tool calling (or lacks credentials), the writer may never call `memory_write`.
Actions:
- Set a model that supports tool calling for `system1_memory_writer` (via Settings → Agents or by editing `config.json`).
Why you might see “nothing” even when it’s running
Most memory-writer logs are `debug`. The only `info` line is from the `memory_write` tool after the model calls it. If the model never calls `memory_write` (tool calling unsupported / credentials missing / etc.), you’ll see no `[system1][memory]` output without enabling debug logging.
Generated with `mux` • Model: `openai:gpt-5.2` • Thinking: `high` • Cost: `$54.20`
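The state-file check from step 3 of the plan can also be scripted. The following is a rough sketch, not part of the PR: the state-file path and field names come from the plan, while the helper names, the `Diagnosis` labels, the field types, and the reading of a missing state file as "`onAssistantStreamEnd` never ran" are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Fields named in the plan; their exact types are an assumption.
interface WriterState {
  lastRunStartedAt?: number | string | null;
  lastRunMessageId?: string | null;
  turnsSinceLastRun?: number;
}

// Hypothetical labels for the three interpretations in the plan, plus success.
type Diagnosis = "policy-not-running" | "never-scheduled" | "ran-without-write" | "ok";

// Reads <muxHome>/sessions/<workspaceId>/system1-memory-writer-state.json.
// Returns null when the file is missing or cannot be parsed.
function readWriterState(muxHome: string, workspaceId: string): WriterState | null {
  const file = path.join(muxHome, "sessions", workspaceId, "system1-memory-writer-state.json");
  try {
    return JSON.parse(fs.readFileSync(file, "utf8")) as WriterState;
  } catch {
    return null;
  }
}

// Maps the observed state to one of the interpretations from step 3.
function diagnose(state: WriterState | null, memoryFileChanged: boolean): Diagnosis {
  if (state === null) return "policy-not-running"; // assumed: no state file at all
  if (state.lastRunStartedAt == null) return "never-scheduled"; // interval or in-flight issue
  return memoryFileChanged ? "ok" : "ran-without-write"; // ran, but no memory_write call
}
```

For example, `diagnose(readWriterState(muxHome, workspaceId), false)` returning `"ran-without-write"` would point at step 6 (model tool-calling support or credentials) rather than at the experiment gate.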