@giordano giordano commented Jan 19, 2026

julia> using Malt, ParallelTestRunner

julia> testsuite = Dict(
           "abort" => quote
               @info("I'm about to crash")
               abort() = ccall(:abort, Nothing, ())
               abort()
           end,
           "works" => quote
               println("This will work")
               @test true
           end,
           "silent" => quote
               @test true
           end,
       );

julia> runtests(ParallelTestRunner, ["--jobs=1"]; testsuite);
Running 1 tests in parallel. If this is too many, specify the `--jobs=N` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.
               │          │ ──────────────── CPU ──────────────── │
Test  (Worker) │ Time (s) │ GC (s) │ GC % │ Alloc (MB) │ RSS (MB) │
abort     (13) |         crashed at 2026-01-19T15:58:30.948
works     (14) │     0.07 │   0.00 │  0.0 │       5.40 │  1017.21 │
silent    (14) │     0.00 │   0.00 │  0.0 │       0.00 │  1017.21 │

Output generated during execution of 'abort':
┌ [ Info: I'm about to crash
│ 
│ [61478] signal 6 (-6): Aborted
│ in expression starting at none:1
│ unknown function (ip: 0x7f70552973dc) at /usr/lib/x86_64-linux-gnu/libc.so.6
│ gsignal at /usr/lib/x86_64-linux-gnu/libc.so.6 (unknown line)
│ abort at /usr/lib/x86_64-linux-gnu/libc.so.6 (unknown line)
│ abort at ./REPL[15]:4
│ unknown function (ip: 0x7f704df313cf) at (unknown file)
│ macro expansion at ./REPL[15]:5 [inlined]
│ macro expansion at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:280 [inlined]
│ macro expansion at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
│ macro expansion at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:280 [inlined]
│ macro expansion at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/Test/src/Test.jl:1776 [inlined]
│ macro expansion at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:279 [inlined]
│ macro expansion at ./timing.jl:689 [inlined]
│ top-level scope at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:278
│ jl_toplevel_eval_flex at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/toplevel.c:1024
│ ijl_toplevel_eval at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/toplevel.c:1047
│ ijl_toplevel_eval_in at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/toplevel.c:1092
│ eval at ./boot.jl:489
│ inner at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:271
│ runtest at /home/mose/.julia/dev/ParallelTestRunner/src/ParallelTestRunner.jl:302
│ unknown function (ip: 0x7f704df23460) at (unknown file)
│ jl_apply at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
│ jl_f_invokelatest at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/builtins.c:881
│ #handle##0 at /home/mose/.julia/dev/Malt/src/worker.jl:120
│ unknown function (ip: 0x7f704df2323f) at (unknown file)
│ jl_apply at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/julia.h:2391 [inlined]
│ start_task at /cache/build/tester-amdci4-14/julialang/julia-release-1-dot-12/src/task.c:1249
└ Allocations: 2975744 (Pool: 2975695; Big: 49); GC: 4

Output generated during execution of 'works':
[ This will work

Test Summary: | Pass  Error  Total  Time
  Overall     |    2      1      3  7.0s
    abort     |           1      1  4.6s
    works     |    1             1  0.0s
    silent    |    1             1  0.0s
    FAILURE

Error in testset abort:
Error During Test at none:1
  Got exception outside of a @test
  Malt.TerminatedWorkerException()
ERROR: Test run finished with errors

Update: tests have been added (the earlier note that this PR still needed tests no longer applies).

Requires JuliaPluto/Malt.jl#103. Fix #83.

@giordano
Collaborator Author

So, the remaining failures now are only due to buffering: the tests at

"color" => quote
    printstyled("Roses Are Red"; color=:red)
end
"no color" => quote
    print("Violets are ")
    printstyled("blue"; color=:blue)
end
rely on the fact that the previous IO capturing mechanism could also capture output from print-like functions that do not emit a newline (a newline would flush the stream).

Unless someone knows a magic way to run julia unbuffered, like python -u, without external tools such as unbuffer/stdbuf (and decade-old issues like JuliaLang/julia#13050 don't inspire much hope), the only thing I can think of is to append a println to those tests and accept that lone prints without a newline or flush won't be captured anymore. That is hopefully an uncommon situation, and I'd argue that being able to get the output of crashed workers outweighs this limitation.
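
As a rough sketch of the workaround described above, the two affected tests could end with a newline so that their output is flushed and captured. The entries mirror the ones quoted in this comment; only the `println()` calls are added, and the variable name `testsuite` follows the example in the PR description. Whether an explicit `flush(stdout)` would serve equally well is an assumption based on the "newlines/flushing" wording, not something confirmed in this thread.

```julia
# Sketch only: same test bodies as quoted above, each ending with a newline.
testsuite = Dict(
    "color" => quote
        printstyled("Roses Are Red"; color=:red)
        println()   # trailing newline so the line is flushed and captured
    end,
    "no color" => quote
        print("Violets are ")
        printstyled("blue"; color=:blue)
        println()   # same workaround for the second test
    end,
)

# The suite would then be run as in the PR description, for example:
# runtests(ParallelTestRunner, ["--jobs=1"]; testsuite)
```

The cost is that a test which deliberately prints without a trailing newline has to be modified, which is the limitation accepted above.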

giordano commented Jan 19, 2026

@vchuravy @maleadt with the caveat that we need to wait for the upstream PR JuliaPluto/Malt.jl#103, I'd appreciate some feedback about the design of this PR and its tradeoffs.

@giordano giordano requested review from maleadt and vchuravy January 26, 2026 09:56
@giordano
Collaborator Author

This is now ready for review!


# Adapted from `Malt._stdio_loop`
function stdio_loop(worker::PTRWorker)
    @async while !eof(worker.w.stdout) && Malt.isrunning(worker)

Member

Obligatory: why @async and not @spawn?

Collaborator Author


No specific reason; I was following the original Malt._stdio_loop. I pushed a change replacing @async with Threads.@spawn.
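
For readers unfamiliar with the distinction raised here, a minimal sketch using only Base Julia: both macros return a `Task`, but a task created with `@async` stays on the thread that created it, while one created with `Threads.@spawn` may be scheduled on any available thread. The task bodies are placeholders and are not the PR's reader loops.

```julia
using Base.Threads: @spawn

# `@async` creates a task that stays on the thread that created it.
pinned = @async begin
    sleep(0.01)            # yields, much like waiting on worker output would
    Threads.threadid()
end

# `Threads.@spawn` creates a task the scheduler may run on any available thread.
movable = @spawn begin
    sleep(0.01)
    Threads.threadid()
end

# Both are ordinary `Task`s, so waiting on them and fetching results is identical.
println("@async task ran on thread ", fetch(pinned))
println("@spawn task ran on thread ", fetch(movable))
```

With a single thread the two behave the same; the difference only shows up when Julia is started with several threads.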

wrkr = Malt.Worker(; exename, exeflags, env)
WORKER_IDS[wrkr.proc_pid] = length(WORKER_IDS) + 1
io = IOBuffer()
wrkr = PTRWorker(Malt.Worker(; exename, exeflags, env, monitor_stdout=false, monitor_stderr=false), io)

Member


Can we fold this into the constructor?

Collaborator Author


Done! Also included the worker id as a field of the worker (with a global atomic counter), no need to keep a dictionary around for doing the mapping.
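
A minimal sketch of what "worker id as a field, with a global atomic counter" could look like. The names `SketchWorker` and `WORKER_COUNTER` are hypothetical and chosen for illustration; the real `PTRWorker` wraps a `Malt.Worker`, which is left out here so the snippet runs without Malt.

```julia
# Hypothetical names; this is not the PR's actual definition of PTRWorker.
const WORKER_COUNTER = Threads.Atomic{Int}(0)

struct SketchWorker
    io::IOBuffer   # buffer collecting the worker's output
    id::Int        # sequential id assigned at construction
end

# `Threads.atomic_add!` returns the old value, so add 1 to get the new id.
# Both the buffer and the id are created inside the constructor.
SketchWorker() = SketchWorker(IOBuffer(), Threads.atomic_add!(WORKER_COUNTER, 1) + 1)

a = SketchWorker()
b = SketchWorker()
println("worker ids: ", a.id, ", ", b.id)
```

Because the id lives on the worker itself, no dictionary from process id to worker number has to be kept in sync, and the atomic increment keeps ids unique even if workers are created from several tasks.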

break
end
end
@async while !eof(worker.w.stderr) && Malt.isrunning(worker)

Member


Slight order weirdness... What happens if Malt.isrunning is true before we have consumed all the text?

Collaborator Author


Uhm, to this one I don't have a good answer
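
To make the ordering concern concrete, here is a small simulation, assuming the worry is that the worker stops running while unread text is still in the stream. It uses an in-memory `IOBuffer` and a stand-in `stopped` function instead of a real `Malt.Worker`; the helper `drain` and its `require_running` keyword are hypothetical. It only illustrates how the combined loop condition can drop pending output and is not a description of what the merged code does.

```julia
# Hypothetical helper: read lines until EOF, optionally also requiring
# that the worker still reports itself as running (the condition under review).
function drain(io::IO, isrunning::Function; require_running::Bool)
    lines = String[]
    while !eof(io) && (!require_running || isrunning())
        push!(lines, readline(io))
    end
    return lines
end

pending = "I'm about to crash\nsignal 6: Aborted\n"   # text still in the stream
stopped() = false                                     # the worker has already exited

lost = drain(IOBuffer(pending), stopped; require_running=true)
kept = drain(IOBuffer(pending), stopped; require_running=false)

println("with running check:    ", length(lost), " lines read")
println("without running check: ", length(kept), " lines read")
```

With a real pipe, `eof` blocks until data arrives or the stream is closed, so reading until EOF alone depends on the worker's streams actually being closed when it dies; that is an assumption this sketch does not verify.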

@giordano giordano merged commit 39c1c38 into main Jan 26, 2026
42 of 43 checks passed
@giordano giordano deleted the mg/crash-output branch January 26, 2026 15:56
@giordano giordano restored the mg/crash-output branch January 26, 2026 15:57
@giordano
Collaborator Author

It's so nice to be able to see the output of a segfault 🥳

@giordano giordano deleted the mg/crash-output branch January 27, 2026 01:50
Development

Successfully merging this pull request may close these issues.

Print worker crash output

4 participants