Conversation
When the dev CLI exits (e.g. ctrl+c via pnpm), runs that were mid-execution previously stayed stuck in EXECUTING status for up to 5 minutes until the heartbeat timeout fired. Now they are cancelled within seconds. The dev CLI spawns a lightweight detached watchdog process at startup. The watchdog monitors the CLI process ID and, when it detects the CLI has exited, calls a new POST /engine/v1/dev/disconnect endpoint to cancel all in-flight runs immediately (skipping PENDING_CANCEL since the worker is known to be dead). Watchdog design: - Fully detached (detached: true, stdio: ignore, unref()) so it survives even when pnpm sends SIGKILL to the process tree - Active run IDs maintained via atomic file write (.trigger/active-runs.json) - Single-instance guarantee via PID file (.trigger/watchdog.pid) - Safety timeout: exits after 24 hours to prevent zombie processes - On clean shutdown, the watchdog is killed (no disconnect needed) Disconnect endpoint: - Rate-limited: 5 calls/min per environment - Capped at 500 runs per call - Small counts (<= 25): cancelled inline with pMap concurrency 10 - Large counts: delegated to the bulk action system - Uses finalizeRun: true to skip PENDING_CANCEL and go straight to FINISHED Run engine change: - cancelRun() now respects finalizeRun when the run is in EXECUTING status, skipping the PENDING_CANCEL waiting state and going directly to FINISHED
🦋 Changeset detectedLatest commit: 245c6ad The changes in this PR will be included in the next version bump. This PR includes changesets to release 28 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds auto-cancellation of in-flight dev runs when the CLI exits. Implements POST /engine/v1/dev/disconnect (accepts runFriendlyIds, caps at 500; ≤25 handled inline, >25 enqueued as a BulkActionGroup) and request/response schemas (DevDisconnectRequestBody, DevDisconnectResponseBody). CLI client exposes dev.disconnect. CLI runtime spawns a detached watchdog (devWatchdog) that maintains an active-runs JSON and PID file, detects parent exit, and calls the disconnect endpoint. Adds a finalizeRun flag propagated through cancellation logic and forwarded in bulk cancellation handling. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Reject requests exceeding 500 runs with 400 instead of silently truncating - Propagate finalizeRun through bulk action params so large batches also skip PENDING_CANCEL - Inherit parent process.env when spawning watchdog (TLS/proxy settings) - Add PID file prefix (trigger-watchdog:) to prevent killing unrelated processes on PID reuse
There was a problem hiding this comment.
♻️ Duplicate comments (2)
packages/cli-v3/src/dev/devSupervisor.ts (1)
245-263:⚠️ Potential issue | 🔴 CriticalDon't signal a watchdog PID unless you can prove it still belongs to this instance.
Both shutdown kill paths treat a bare PID as identity. If the detached watchdog died earlier and the OS reused that PID, clean CLI shutdown can
SIGTERMan unrelated local process. Persist an instance token alongside the PID and verify it before signaling here, and apply the same ownership check topackages/cli-v3/src/dev/devWatchdog.ts:48-60.In Node.js, does `process.kill(pid, signal)` verify that the PID still belongs to the original spawned child process, or does it signal whatever process currently owns that PID?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cli-v3/src/dev/devSupervisor.ts` around lines 245 - 263, The shutdown code (`#killWatchdog`) signals a raw PID which can be reused by the OS; change the PID file scheme to persist a short instance token alongside the PID (e.g. "trigger-watchdog:<pid>:<token>") when spawning the watchdog and store the same token in the supervisor instance. Before calling process.kill in `#killWatchdog` or the equivalent kill path in devWatchdog (the code that reads watchdogPidPath and calls process.kill), read and validate the token from the PID file against the supervisor's stored token and only send SIGTERM if the tokens match; also update watchdog spawn logic to write the token and ensure both places handle missing/invalid token gracefully without killing a PID that may belong to another process.packages/cli-v3/src/dev/devWatchdog.ts (1)
96-120:⚠️ Potential issue | 🟠 MajorDon't treat every resolved
fetch()as a successful disconnect.
fetch()only rejects on transport/abort failures, so 400/429/500 responses currently fall through,onParentDied()deletes the active-runs file, and the session loses immediate cancellation. This path also sends the fullrunFriendlyIdsarray at once, but/engine/v1/dev/disconnectrejects requests over 500 IDs. Checkresponse.ok, split into<=500batches, and retry transient failures before cleanup.In the Fetch API, does `fetch()` reject on HTTP 4xx/5xx responses or only on network/abort failures?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cli-v3/src/dev/devWatchdog.ts` around lines 96 - 120, The callDisconnect function currently assumes any resolved fetch equals success; modify callDisconnect to check the fetch Response.ok and throw on non-ok so callers can detect failure, and implement batching of the runFriendlyIds into chunks of <=500 (split the array and POST each batch separately to /engine/v1/dev/disconnect). Add a small retry loop for transient failures (e.g., 5xx or 429 with backoff) per batch before giving up. In onParentDied, use readActiveRuns() to get IDs, call the updated callDisconnect and only proceed to cleanup() and process.exit(0) if all batches succeeded (or after exhausting retries), ensuring failed disconnects are retried instead of being treated as success.
🧹 Nitpick comments (1)
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts (1)
141-147: Parse persisted bulk params with zod instead ofas any.
group.paramsis untyped JSON from the database, and this branch now uses a loose object check plus ananycast to controlfinalizeRun. A small zod parse here would keep malformed stored params from silently changing cancellation behavior and stays consistent with the repo's validation pattern.As per coding guidelines, "Use zod for validation in packages/core and apps/webapp".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts` around lines 141 - 147, Replace the loose object check and the `as any` cast by parsing `group.params` with a zod schema to validate the persisted bulk params before using them: create or reuse a Zod schema (e.g., RunListInputOptionsSchema or a small schema that includes `finalizeRun?: boolean`) to parse `group.params`, assign the parsed result to `rawParams`, and derive `finalizeRun` from the strongly-typed parsed object instead of `(rawParams as any).finalizeRun`; keep the call to `parseRunListInputOptions` using the validated `rawParams` so malformed stored params are rejected or defaulted consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/cli-v3/src/dev/devSupervisor.ts`:
- Around line 245-263: The shutdown code (`#killWatchdog`) signals a raw PID which
can be reused by the OS; change the PID file scheme to persist a short instance
token alongside the PID (e.g. "trigger-watchdog:<pid>:<token>") when spawning
the watchdog and store the same token in the supervisor instance. Before calling
process.kill in `#killWatchdog` or the equivalent kill path in devWatchdog (the
code that reads watchdogPidPath and calls process.kill), read and validate the
token from the PID file against the supervisor's stored token and only send
SIGTERM if the tokens match; also update watchdog spawn logic to write the token
and ensure both places handle missing/invalid token gracefully without killing a
PID that may belong to another process.
In `@packages/cli-v3/src/dev/devWatchdog.ts`:
- Around line 96-120: The callDisconnect function currently assumes any resolved
fetch equals success; modify callDisconnect to check the fetch Response.ok and
throw on non-ok so callers can detect failure, and implement batching of the
runFriendlyIds into chunks of <=500 (split the array and POST each batch
separately to /engine/v1/dev/disconnect). Add a small retry loop for transient
failures (e.g., 5xx or 429 with backoff) per batch before giving up. In
onParentDied, use readActiveRuns() to get IDs, call the updated callDisconnect
and only proceed to cleanup() and process.exit(0) if all batches succeeded (or
after exhausting retries), ensuring failed disconnects are retried instead of
being treated as success.
---
Nitpick comments:
In `@apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts`:
- Around line 141-147: Replace the loose object check and the `as any` cast by
parsing `group.params` with a zod schema to validate the persisted bulk params
before using them: create or reuse a Zod schema (e.g., RunListInputOptionsSchema
or a small schema that includes `finalizeRun?: boolean`) to parse
`group.params`, assign the parsed result to `rawParams`, and derive
`finalizeRun` from the strongly-typed parsed object instead of `(rawParams as
any).finalizeRun`; keep the call to `parseRunListInputOptions` using the
validated `rawParams` so malformed stored params are rejected or defaulted
consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e1428899-bda8-4065-a299-788c1fe8eecd
📒 Files selected for processing (4)
apps/webapp/app/routes/engine.v1.dev.disconnect.tsapps/webapp/app/v3/services/bulk/BulkActionV2.server.tspackages/cli-v3/src/dev/devSupervisor.tspackages/cli-v3/src/dev/devWatchdog.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/webapp/app/routes/engine.v1.dev.disconnect.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (26)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Deno Runtime
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: sdk-compat / Bun Runtime
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
🧰 Additional context used
📓 Path-based instructions (13)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
**/*.{ts,tsx}: In TypeScript SDK usage, always import from@trigger.dev/sdk, never from@trigger.dev/sdk/v3or use deprecated client.defineJob
Import from@trigger.dev/coresubpaths only, never from the root
Use the Run Engine 2.0 (@internal/run-engine) and redis-worker for all new work, not legacy V1 MarQS queue or deprecated V1 functions
Files:
packages/cli-v3/src/dev/devWatchdog.tsapps/webapp/app/v3/services/bulk/BulkActionV2.server.tspackages/cli-v3/src/dev/devSupervisor.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
packages/cli-v3/src/dev/devWatchdog.tsapps/webapp/app/v3/services/bulk/BulkActionV2.server.tspackages/cli-v3/src/dev/devSupervisor.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
packages/cli-v3/src/dev/devWatchdog.tsapps/webapp/app/v3/services/bulk/BulkActionV2.server.tspackages/cli-v3/src/dev/devSupervisor.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
packages/cli-v3/src/dev/devWatchdog.tsapps/webapp/app/v3/services/bulk/BulkActionV2.server.tspackages/cli-v3/src/dev/devSupervisor.ts
packages/cli-v3/src/dev/**/*
📄 CodeRabbit inference engine (packages/cli-v3/CLAUDE.md)
Dev mode code should be located in
src/dev/and runs tasks locally in the user's Node.js process without containers
Files:
packages/cli-v3/src/dev/devWatchdog.tspackages/cli-v3/src/dev/devSupervisor.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/app/v3/services/**/*.server.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Organize services in the webapp following the pattern
app/v3/services/*/*.server.ts
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
When modifying only server components (apps/webapp/, apps/supervisor/, etc.) with no package changes, add a .server-changes/ file instead of a changeset
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/app/v3/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
In apps/webapp/app/v3/ directory, only modify V2 code paths - V1/V2 branching should only have V2 modified, as V1 code is legacy
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
Access environment variables via the
envexport fromapp/env.server.ts, never useprocess.envdirectly
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/app/v3/services/**/*.server.ts
📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)
When editing services that branch on
RunEngineVersionto support both V1 and V2 (e.g.,cancelTaskRun.server.ts,batchTriggerV3.server.ts), only modify V2 code paths
Files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: All run lifecycle operations (triggering, completing, cancelling, etc.) should be performed through the singleton run engine instance in `app/v3/runEngine.server.ts` via service calls
📚 Learning: 2026-03-02T12:43:34.140Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/cli-v3/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:34.140Z
Learning: Applies to packages/cli-v3/src/commands/dev.ts : Implement `dev.ts` command in `src/commands/` for local development mode
Applied to files:
packages/cli-v3/src/dev/devWatchdog.tspackages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.
Applied to files:
packages/cli-v3/src/dev/devWatchdog.ts
📚 Learning: 2026-03-02T12:42:56.114Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/app/v3/services/**/*.server.ts : When editing services that branch on `RunEngineVersion` to support both V1 and V2 (e.g., `cancelTaskRun.server.ts`, `batchTriggerV3.server.ts`), only modify V2 code paths
Applied to files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.
Applied to files:
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2026-03-02T12:42:41.110Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:41.110Z
Learning: Applies to **/*.{ts,tsx} : Use the Run Engine 2.0 (internal/run-engine) and redis-worker for all new work, not legacy V1 MarQS queue or deprecated V1 functions
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-08-19T09:49:07.011Z
Learnt from: julienvanbeveren
Repo: triggerdotdev/trigger.dev PR: 2417
File: apps/webapp/app/routes/api.v1.projects.$projectRef.envvars.$slug.import.ts:56-61
Timestamp: 2025-08-19T09:49:07.011Z
Learning: In the Trigger.dev codebase, environment variables should default to `isSecret: false` when not explicitly marked as secrets in the syncEnvVars functionality. This is the intended behavior for both regular variables and parent variables.
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2026-03-02T12:43:34.140Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/cli-v3/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:34.140Z
Learning: Applies to packages/cli-v3/src/build/**/* : Bundle worker code using the build system in `src/build/` based on configuration from `trigger.config.ts`
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2026-03-02T12:42:56.114Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/app/v3/*Worker.server.ts : New background job workers should use `trigger.dev/redis-worker` (e.g., `commonWorker.server.ts`, `alertsWorker.server.ts`, `batchTriggerWorker.server.ts`), not zodworker or graphile-worker
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
|
Some places could use the trycatch helper function, and use early returns |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
packages/cli-v3/src/dev/devSupervisor.ts (1)
261-270:⚠️ Potential issue | 🟠 MajorThe PID-file fallback still trusts stale PIDs when
watchdogProcessis missing.Line 269 allows
!knownPid, so a leftovertrigger-watchdog:<pid>file can stillSIGTERMan unrelated process after PID reuse. Only use the PID file once you can validate a per-spawn identity, or skip the fallback when there is no in-memory PID.Safer interim guard
- if (pid && (!knownPid || pid === knownPid)) { + if (knownPid && pid === knownPid) { process.kill(pid, "SIGTERM"); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cli-v3/src/dev/devSupervisor.ts` around lines 261 - 270, The PID-file fallback in devSupervisor.ts currently allows killing a PID when knownPid is falsy, which can SIGTERM an unrelated reused PID; update the logic in the block that reads this.watchdogPidPath so you only attempt process.kill when you can validate the spawn identity—e.g., require knownPid to be present (remove the `!knownPid` branch) or compare a stored per-spawn token before calling process.kill; adjust the condition around parseInt(...) and the call to process.kill(pid, "SIGTERM") accordingly in the watchdogPidPath handling code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/cli-v3/src/dev/devSupervisor.ts`:
- Around line 261-270: The PID-file fallback in devSupervisor.ts currently
allows killing a PID when knownPid is falsy, which can SIGTERM an unrelated
reused PID; update the logic in the block that reads this.watchdogPidPath so you
only attempt process.kill when you can validate the spawn identity—e.g., require
knownPid to be present (remove the `!knownPid` branch) or compare a stored
per-spawn token before calling process.kill; adjust the condition around
parseInt(...) and the call to process.kill(pid, "SIGTERM") accordingly in the
watchdogPidPath handling code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 3337bba1-d3fd-4554-b342-f0faa3fadba6
📒 Files selected for processing (1)
packages/cli-v3/src/dev/devSupervisor.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
- GitHub Check: sdk-compat / Cloudflare Workers
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: sdk-compat / Deno Runtime
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
**/*.{ts,tsx}: In TypeScript SDK usage, always import from@trigger.dev/sdk, never from@trigger.dev/sdk/v3or use deprecated client.defineJob
Import from@trigger.dev/coresubpaths only, never from the root
Use the Run Engine 2.0 (@internal/run-engine) and redis-worker for all new work, not legacy V1 MarQS queue or deprecated V1 functions
Files:
packages/cli-v3/src/dev/devSupervisor.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
packages/cli-v3/src/dev/devSupervisor.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
packages/cli-v3/src/dev/devSupervisor.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier before committing
Files:
packages/cli-v3/src/dev/devSupervisor.ts
packages/cli-v3/src/dev/**/*
📄 CodeRabbit inference engine (packages/cli-v3/CLAUDE.md)
Dev mode code should be located in
src/dev/and runs tasks locally in the user's Node.js process without containers
Files:
packages/cli-v3/src/dev/devSupervisor.ts
🧠 Learnings (8)
📚 Learning: 2026-03-02T12:43:34.140Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/cli-v3/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:34.140Z
Learning: Applies to packages/cli-v3/src/commands/dev.ts : Implement `dev.ts` command in `src/commands/` for local development mode
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-08-19T09:49:07.011Z
Learnt from: julienvanbeveren
Repo: triggerdotdev/trigger.dev PR: 2417
File: apps/webapp/app/routes/api.v1.projects.$projectRef.envvars.$slug.import.ts:56-61
Timestamp: 2025-08-19T09:49:07.011Z
Learning: In the Trigger.dev codebase, environment variables should default to `isSecret: false` when not explicitly marked as secrets in the syncEnvVars functionality. This is the intended behavior for both regular variables and parent variables.
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-11-27T16:26:58.661Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.661Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2024-10-18T15:41:52.352Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to packages/trigger-sdk/**/*.{ts,tsx} : In the Trigger.dev SDK (packages/trigger-sdk), prefer isomorphic code like fetch and ReadableStream instead of Node.js-specific code
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2026-03-02T12:43:34.140Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: packages/cli-v3/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:34.140Z
Learning: Applies to packages/cli-v3/src/build/**/* : Bundle worker code using the build system in `src/build/` based on configuration from `trigger.config.ts`
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
📚 Learning: 2026-03-02T12:42:56.114Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:42:56.114Z
Learning: Applies to apps/webapp/app/v3/*Worker.server.ts : New background job workers should use `trigger.dev/redis-worker` (e.g., `commonWorker.server.ts`, `alertsWorker.server.ts`, `batchTriggerWorker.server.ts`), not zodworker or graphile-worker
Applied to files:
packages/cli-v3/src/dev/devSupervisor.ts
🔇 Additional comments (1)
packages/cli-v3/src/dev/devSupervisor.ts (1)
175-182: Persistactive-runs.jsonexplicitly before killing the watchdog.Lines 175-182 assume
controller.stop()leaves the file current, butshutdown()never calls#updateActiveRunsFile()itself. IfDevRunController.stop()resolves without triggeringonFinished, the watchdog can still read stale run IDs during shutdown.Run this to confirm the
stop()contract:#!/bin/bash set -euo pipefail fd '^dev-run-controller\.ts$' packages/cli-v3/src/entryPoints -x sh -c ' file="$1" echo "== $file ==" rg -n -C4 "class DevRunController|async stop\\(|\\bstop\\(|onFinished" "$file" ' sh {}Expected result:
stop()should synchronously invoke theonFinishedpath, or otherwise guaranteeDevSupervisor.runControllersis drained before shutdown continues.
When the dev CLI exits (e.g. ctrl+c via pnpm), runs that were mid-execution
previously stayed stuck in EXECUTING status for up to 5 minutes until the
heartbeat timeout fired. Now they are cancelled within seconds.
The dev CLI spawns a lightweight detached watchdog process at startup. The
watchdog monitors the CLI process ID and, when it detects the CLI has exited,
calls a new POST /engine/v1/dev/disconnect endpoint to cancel all in-flight
runs immediately (skipping PENDING_CANCEL since the worker is known to be dead).
Watchdog design:
even when pnpm sends SIGKILL to the process tree
Disconnect endpoint:
Run engine change:
skipping the PENDING_CANCEL waiting state and going directly to FINISHED