Tag intermediate initializationError retries with test.final_status=skip in JUnit XML #11010
cbeauchesne wants to merge 14 commits into master
Force-pushed from eb73143 to fc75f5f
    System.err.println("File not found: " + xmlFile);
    System.exit(1);
  }
  var doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile);
❔ question: Should we set some flags here, for example to disable entity resolution, to prevent security issues?
Which security issue do you have in mind? The entire workflow and its data are derived from the public content of this repo, and the script itself can be modified in a PR.
I think @PerfectSlayer refers to XML external entity, or other tricks: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html#java.
If it's those, everything that runs here is produced by the PR content, including the script that executes the command. So I don't think there is any increase in the attack surface.
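For reference, the kind of flags discussed in this thread could look like the sketch below. It is not taken from the PR: the class and method names (`HardenedXmlParsing`, `newHardenedBuilder`) are hypothetical, and the feature settings follow the general advice of the OWASP cheat sheet linked above. Whether they are needed for this script is exactly the open question in the thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the PR.
class HardenedXmlParsing {

  // Builds a DocumentBuilder that refuses DOCTYPE declarations,
  // which rules out external-entity (XXE) payloads.
  static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Reject any document that contains a DOCTYPE declaration.
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    factory.setXIncludeAware(false);
    factory.setExpandEntityReferences(false);
    return factory.newDocumentBuilder();
  }

  public static void main(String[] args) throws Exception {
    String xml = "<testsuite name=\"demo\"><testcase name=\"initializationError\"/></testsuite>";
    Document doc =
        newHardenedBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    System.out.println(doc.getDocumentElement().getTagName());
  }
}
```

JUnit reports produced by Gradle do not normally carry a DOCTYPE, so rejecting it outright should not affect legitimate input.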
.gitlab/TagInitializationErrors.java
 *
 * <p>Gradle generates synthetic "initializationError" testcases in JUnit reports for setup methods.
 * When a setup is retried and eventually succeeds, multiple testcases are created, with only the
 * last one passing. All intermediate attempts are marked skip so Test Optimization is not misled.
🎯 suggestion: It would help if you describe the expected changes here. You can re-use stuff from the PR description 😉
I wonder if you could provide in the Javadoc a sample of the unmodified JUnit report file and the expected output.
Also worth knowing: since this code runs on Java 25, it's possible to use Markdown Javadoc: https://blog.jetbrains.com/idea/2025/04/markdown-in-java-docs-shut-up-and-take-my-comments/
Benchmarks
Comparison: candidate=1.61.0-SNAPSHOT~1b5a33e89e, baseline=1.61.0-SNAPSHOT~2365c1251f
Startup
Summary: Found 0 performance improvements and 0 performance regressions! Performance is the same for 59 metrics, 12 unstable metrics.
[Gantt charts: global startup overhead and per-module breakdown for petclinic (tracing, appsec, iast, profiling) and insecure-bank (tracing, iast)]
Load
Summary: Found 0 performance improvements and 4 performance regressions! Performance is the same for 15 metrics, 17 unstable metrics.
[Gantt charts: request duration [CI 0.99] for petclinic and insecure-bank, baseline vs. candidate]
Dacapo
Summary: Found 1 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 1 unstable metrics.
[Gantt charts: execution time [CI 0.99] for tomcat and biojava, baseline vs. candidate]
  if (!modified) return;
  var transformer = TransformerFactory.newInstance().newTransformer();
  transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
  transformer.transform(new DOMSource(doc), new StreamResult(xmlFile));
note: This modifies the file in-place. What happens if the app fails? Does it leave invalid documents?
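One common way to avoid leaving a half-written report is to write to a temporary file in the same directory and then move it over the original. The sketch below illustrates that idea only; `AtomicXmlWrite` and `writeAtomically` are hypothetical names, and the PR does not state that it takes this approach.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper, not part of the PR.
class AtomicXmlWrite {

  // Serializes the document to a sibling temp file, then moves it over the target.
  // If serialization fails, the original report is left untouched.
  static void writeAtomically(Document doc, Path target)
      throws IOException, TransformerException {
    Path dir = target.toAbsolutePath().getParent();
    Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
    try {
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
      try (OutputStream out = Files.newOutputStream(tmp)) {
        transformer.transform(new DOMSource(doc), new StreamResult(out));
      }
      // Same directory, hence same file system, so an atomic move is normally possible.
      // Whether an existing target is replaced is implementation-specific per the
      // Files.move Javadoc; on typical POSIX file systems it is.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } finally {
      // No-op after a successful move; cleans up the temp file on failure.
      Files.deleteIfExists(tmp);
    }
  }

  public static void main(String[] args) throws Exception {
    Path report = Files.createTempFile("TEST-demo", ".xml");
    Files.writeString(report, "<testsuite name=\"demo\"/>");
    Document doc =
        DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(report.toFile());
    doc.getDocumentElement().setAttribute("name", "rewritten");
    writeAtomically(doc, report);
    System.out.println(Files.readString(report));
  }
}
```

One trade-off to be aware of: on POSIX systems `Files.createTempFile` creates the file with owner-only permissions, and the moved file keeps them.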
Co-authored-by: Brice Dutheil <brice.dutheil@gmail.com>
Motivation
When a JUnit setup method (e.g. @BeforeAll) fails and is retried via Gradle's retry plugin, Gradle generates a synthetic <testcase name="initializationError"> for each attempt. If the final retry succeeds, the build passes, but Test Optimization receives all intermediate failure entries with no indication that they were retried, making them appear as genuine failures in the dashboard.
What Does This Do
Add a doLast post-processor to every Test task that rewrites the JUnit XML reports after execution. For any suite with multiple initializationError testcases (i.e. retries occurred), all entries except the last one are tagged with test.final_status=skip.
The last entry is left unmodified, allowing Test Optimization to apply its default status inference based on the actual outcome. Files with only one (or zero) initializationError testcases are not modified.
The post-processor runs as a doLast action directly on the test task, keeping it within the task's up-to-date and caching scope so it doesn't interfere with downstream consumers of the JUnit reports.
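The tagging rule described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the class name `TagIntermediateRetries` is hypothetical, and the exact XML used to carry the tag is not shown in this description, so the `<properties>`/`<property>` shape and the `dd_tags[test.final_status]` name are assumed here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch, not the PR's implementation.
class TagIntermediateRetries {

  // ASSUMPTION: property name and placement are illustrative only.
  static final String TAG_NAME = "dd_tags[test.final_status]";

  // Tags every initializationError testcase except the last one in each suite.
  // Returns true if the document was modified.
  static boolean tagIntermediateAttempts(Document doc) {
    // Group initializationError testcases by their parent suite, in document order.
    Map<Node, List<Element>> bySuite = new LinkedHashMap<>();
    NodeList all = doc.getElementsByTagName("testcase");
    for (int i = 0; i < all.getLength(); i++) {
      Element testcase = (Element) all.item(i);
      if ("initializationError".equals(testcase.getAttribute("name"))) {
        bySuite.computeIfAbsent(testcase.getParentNode(), k -> new ArrayList<>()).add(testcase);
      }
    }
    boolean modified = false;
    for (List<Element> attempts : bySuite.values()) {
      if (attempts.size() < 2) continue; // no retry happened, leave untouched
      for (Element testcase : attempts.subList(0, attempts.size() - 1)) {
        Element properties = doc.createElement("properties");
        Element property = doc.createElement("property");
        property.setAttribute("name", TAG_NAME);
        property.setAttribute("value", "skip");
        properties.appendChild(property);
        testcase.insertBefore(properties, testcase.getFirstChild());
        modified = true;
      }
    }
    return modified;
  }

  public static void main(String[] args) throws Exception {
    String xml =
        "<testsuite name=\"demo\">"
            + "<testcase name=\"initializationError\"><failure/></testcase>"
            + "<testcase name=\"initializationError\"/>"
            + "</testsuite>";
    Document doc =
        DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    System.out.println("modified: " + tagIntermediateAttempts(doc));
  }
}
```

Returning a boolean mirrors the `if (!modified) return;` guard visible in the reviewed snippet, so a report with zero or one initializationError entries is never rewritten.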
Additional Notes
Contributor Checklist
type: and (comp: or inst:) labels in addition to any other useful labels
close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.