Turing WPI support #1335

Merged
sbryngelson merged 5 commits into MFlowCode:master from dgvacarevelo:master
Apr 3, 2026

Conversation

@dgvacarevelo
Contributor

@dgvacarevelo commented Mar 24, 2026

Description

This PR adds support for the Turing supercomputer, an HPC cluster hosted by Worcester Polytechnic Institute (WPI).
The changes include:

  • Added Turing to the available system list in toolchain/bootstrap/modules.sh, and included the corresponding environment modules in toolchain/modules.
  • Introduced a new template, toolchain/templates/turing.mako, with the appropriate settings to ensure CPU-based execution
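For orientation, a minimal sketch of what a CPU-only Slurm Mako template along these lines typically contains (illustrative only, not the actual file contents; the `name` and `partition` variables and directive values are assumptions, while `nodes`, `tasks_per_node`, and the `mfc.sh load` invocation appear elsewhere in this PR):

```mako
#!/usr/bin/env bash
## Slurm directives filled in from toolchain template variables
#SBATCH --job-name="${name}"
#SBATCH --nodes=${nodes}
#SBATCH --ntasks-per-node=${tasks_per_node}
% if partition:
#SBATCH --partition=${partition}
% endif

ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
## 'c' selects the CPU module set defined in toolchain/modules
. ./mfc.sh load -c t -m c
```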

Type of change

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • Other: describe

Testing

Ran examples/3D_performance_test, examples/3D_lagrange_bubblescreen, and examples/3D_lagrange_shbubcollapse with the new template and different numbers of processors. Confirmed no errors during module loading or job submission.

Checklist

  • I added or updated tests for new behavior
  • I updated documentation if user-facing behavior changed

See the developer guide for full coding standards.

GPU changes (expand if you modified src/simulation/)
  • GPU results match CPU results
  • Tested on NVIDIA GPU or AMD GPU

AI code reviews

Reviews are not triggered automatically. To request a review, comment on the PR:

  • @coderabbitai review — incremental review (new changes only)
  • @coderabbitai full review — full review from scratch
  • /review — Qodo review
  • /improve — Qodo code suggestions
  • @claude full review — Claude full review (also triggers on PR open/reopen/ready)
  • Add label claude-full-review — Claude full review via label

@coderabbitai
Contributor

coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

This pull request adds WPI Turing support to the toolchain infrastructure. A new interactive menu option is exposed in the bootstrap modules selection prompt. The toolchain modules configuration introduces a new cluster group with CPU-variant modules pinning GCC/12.1.0, OpenMPI/4.1.3, and Python/3.13.5, and GPU-variant modules pinning NVHPC/24.3, Python/3.13.5, with compiler environment variables and MPI configuration. A new Mako template generates HPC batch job scripts with conditional Slurm directives, module loading, and per-target execution supporting both MPI and non-MPI modes with optional GPU resource allocation.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Title check — ✅ Passed: The title 'Turing WPI support' clearly and concisely describes the main change: adding support for the Turing supercomputer at Worcester Polytechnic Institute.
Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check — ✅ Passed: The PR description follows the required template structure with all major sections completed, including a clear description of changes, type of change selection, testing information, and the checklist.




@coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
toolchain/templates/turing.mako (1)

17-21: Hardcoded single GPU and memory constraints may limit flexibility.

The GPU configuration hardcodes:

  • --gres=gpu:1 - Only 1 GPU regardless of job requirements
  • --mem=208G - Fixed memory allocation
  • -C "A30|A100" - Specific GPU constraints

This may be appropriate for Turing's node configuration, but consider whether multi-GPU jobs are needed.

Please confirm with WPI cluster administrators that:

  1. Single GPU per job is the intended use pattern
  2. 208G is the appropriate memory allocation for GPU nodes
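One way to address the nitpick, sketched under the assumption that the toolchain passes (or could pass) a per-job GPU count to the template (`gpus_per_node` is a hypothetical variable, not one that exists in this PR):

```mako
% if gpu_enabled:
## Parameterize the GPU count instead of hardcoding gpu:1
#SBATCH --gres=gpu:${gpus_per_node}
#SBATCH --mem=208G
#SBATCH -C "A30|A100"
% endif
```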

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2454b169-22c8-4900-b9d8-ae3d855e67c8

📥 Commits

Reviewing files that changed from the base of the PR and between 90d4b6e and faa0693.

📒 Files selected for processing (3)
  • toolchain/bootstrap/modules.sh
  • toolchain/modules
  • toolchain/templates/turing.mako

@codecov

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.67%. Comparing base (2d15ba9) to head (91ae402).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1335   +/-   ##
=======================================
  Coverage   64.67%   64.67%           
=======================================
  Files          70       70           
  Lines       18251    18251           
  Branches     1504     1504           
=======================================
  Hits        11804    11804           
  Misses       5492     5492           
  Partials      955      955           


@sbryngelson marked this pull request as draft March 24, 2026 19:12
@dgvacarevelo marked this pull request as ready for review April 3, 2026 12:36
@qodo-code-review
Contributor

Review Summary by Qodo

Add Turing WPI supercomputer support

✨ Enhancement


Walkthroughs

Description
• Add Turing supercomputer support for WPI HPC cluster
• Create CPU-only template with SLURM batch configuration
• Register Turing system in bootstrap module selection menu
• Configure environment modules for gcc, OpenMPI, and Python
Diagram
flowchart LR
  A["Bootstrap System Selection"] -->|"Add Turing option"| B["modules.sh"]
  C["New Turing Template"] -->|"CPU-only SLURM config"| D["turing.mako"]
  E["Environment Modules"] -->|"gcc, OpenMPI, Python"| F["modules file"]
  B --> G["WPI Turing System"]
  D --> G
  F --> G


File Changes

1. toolchain/bootstrap/modules.sh ✨ Enhancement +2/-1

Add Turing to system selection menu

• Added Turing (t) to the system selection menu with WPI label
• Updated log output to display Turing as an available option
• Modified input prompt to include 't' as a valid system choice

toolchain/bootstrap/modules.sh


2. toolchain/templates/turing.mako ✨ Enhancement +51/-0

Create Turing CPU-only SLURM template

• Created new SLURM batch template for Turing supercomputer
• Configured CPU-only execution mode (no GPU support)
• Set up job submission with nodes, tasks, and resource allocation
• Included module loading, profiler support, and MPI execution via srun

toolchain/templates/turing.mako


3. toolchain/modules ⚙️ Configuration changes +4/-0

Configure Turing environment modules

• Added Turing system identifier and WPI label
• Registered slurm as dependency for all Turing configurations
• Configured CPU modules with gcc/12.1.0, openmpi/4.1.3, and python/3.13.5

toolchain/modules



@qodo-code-review
Contributor

qodo-code-review bot commented Apr 3, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX Issues (0)



Action required

1. GPU mode missing module set 🐞 Bug ≡ Correctness
Description
toolchain/templates/turing.mako uses gpu_enabled to select -m g, but toolchain/modules defines no
t-gpu entry, so mfc.sh load will omit compiler/MPI modules and MPI runs can fail due to missing
runtime libraries. This is triggered whenever a user runs with --gpu on Turing (gpu_enabled=true).
Code

toolchain/templates/turing.mako[30]

+. ./mfc.sh load -c t -m ${'g' if gpu_enabled else 'c'}
Evidence
The Turing template chooses mode g when gpu_enabled is true. gpu_enabled is derived from the
CLI --gpu mode, so it can be true on any system. The module loader composes modules from
<slug>-all and <slug>-<mode>; because t-gpu is not defined in toolchain/modules, only t-all
will be applied for GPU mode, omitting the t-cpu toolchain modules (gcc/openmpi/python) needed for
compilation/runtime linking.

toolchain/templates/turing.mako[28-33]
toolchain/modules[106-108]
toolchain/bootstrap/modules.sh[79-115]
toolchain/mfc/run/run.py[93-117]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`turing.mako` selects `-m g` when `gpu_enabled` is true, but there is no `t-gpu` module set in `toolchain/modules`. As a result, GPU mode on Turing will only load `t-all` (currently just `slurm`) and will miss required compiler/MPI modules.

### Issue Context
`gpu_enabled` is driven by the `--gpu` CLI flag, so users can unintentionally hit this even on a CPU-only cluster.

### Fix Focus Areas
- toolchain/templates/turing.mako[28-33]
- toolchain/modules[106-108]
- toolchain/bootstrap/modules.sh[105-114]

### Suggested fix options
1) If Turing is CPU-only: change the template to always load CPU modules (`-m c`) and optionally error if `gpu_enabled` is requested.
2) If GPU runs are intended: add `t-gpu ...` in `toolchain/modules` (either alias to `t-cpu` or add the real GPU toolchain modules) so `mfc.sh load -c t -m g` loads a complete environment.
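If option (2) were chosen, the entry might look like the line below. The module names follow the GPU-variant modules the CodeRabbit walkthrough mentions (NVHPC/24.3, Python/3.13.5); the exact syntax of the `toolchain/modules` file is an assumption:

```
t-gpu  nvhpc/24.3 python/3.13.5
```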

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Help text omits Turing 🐞 Bug ⚙ Maintainability
Description
toolchain/bootstrap/modules.sh adds Turing to the interactive system selection prompt, but
show_help()’s documented -c/--computer options do not include Turing, making `./mfc.sh load --help` inaccurate for the new feature.
Code

toolchain/bootstrap/modules.sh[R49-50]

+    log   "$C""WPI $W:   Turing   (t)"
+    log_n "($G""a$W/$G""f$W/$G""s$W/$G""w$W/$B""tuo$W/$C""b$W/$C""e$CR/$C""d/$C""dai$CR/$Y""p$CR/$R""r$CR/$B""cc$CR/$B""c$CR/$B""n$CR/$BR""o"$CR"/$OR""h"$CR/$C""t""$CR"): "
Evidence
The interactive selection list includes the new WPI/Turing entry, but the help text printed by
show_help() still lists the older set of systems and does not mention Turing, so users relying on
--help won’t discover the new supported cluster slug.

toolchain/bootstrap/modules.sh[4-17]
toolchain/bootstrap/modules.sh[38-52]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The interactive system prompt includes Turing (t), but `show_help()` in `toolchain/bootstrap/modules.sh` does not list it under `-c/--computer` options.

### Issue Context
This PR adds a new supported cluster; help output should be updated in the same change to keep CLI documentation correct.

### Fix Focus Areas
- toolchain/bootstrap/modules.sh[4-17]

### Suggested fix
Add `Turing (t)` (and optionally the org label) to the `show_help()` option list for `--computer`.



3. Hardcoded Slurm MPI type 🐞 Bug ☼ Reliability
Description
toolchain/templates/turing.mako hardcodes srun --mpi=pmi2, which is not used by any other template
and can reduce portability across Slurm configurations. If Turing’s Slurm is configured with a
different MPI plugin, MPI launches will fail.
Code

toolchain/templates/turing.mako[R41-43]

+        (set -x; ${profiler}                                    \
+            srun  --mpi=pmi2   --ntasks=${nodes*tasks_per_node} \
+            "${target.get_install_binpath(case)}")
Evidence
Only the new Turing template specifies an explicit --mpi=... flag; other templates rely on Slurm
defaults (plain srun ...). This makes Turing uniquely sensitive to the availability/name of the
PMI2 plugin.

toolchain/templates/turing.mako[38-44]
toolchain/templates/default.mako[54-58]
toolchain/templates/frontier.mako[62-75]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`turing.mako` forces `srun --mpi=pmi2`, which is inconsistent with other templates and can break on Slurm installs where PMI2 is not available/enabled.

### Issue Context
Other baked-in templates typically omit `--mpi=` and let Slurm choose the correct MPI plugin.

### Fix Focus Areas
- toolchain/templates/turing.mako[38-44]

### Suggested fix options
1) Remove `--mpi=pmi2` and rely on Slurm defaults.
2) Make the MPI type configurable via an argument (e.g., template variable) with a safe default.
3) If PMI2 is required specifically on Turing, add an in-script check or comment explaining why it’s required.
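Option (2) could be sketched in Mako as follows, where `slurm_mpi_type` is a hypothetical template variable (empty by default, so Slurm falls back to its configured plugin):

```mako
(set -x; ${profiler}                                                 \
    srun ${'--mpi=' + slurm_mpi_type if slurm_mpi_type else ''}      \
         --ntasks=${nodes*tasks_per_node}                            \
         "${target.get_install_binpath(case)}")
```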




ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
. ./mfc.sh load -c t -m ${'g' if gpu_enabled else 'c'}

Action required

1. Gpu mode missing module set 🐞 Bug ≡ Correctness

toolchain/templates/turing.mako uses gpu_enabled to select -m g, but toolchain/modules defines no
t-gpu entry, so mfc.sh load will omit compiler/MPI modules and MPI runs can fail due to missing
runtime libraries. This is triggered whenever a user runs with --gpu on Turing (gpu_enabled=true).

Contributor Author


The template and modules target CPU-only simulations, as GPU support on Turing remains experimental and restricted to single-processor runs.
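Given that stance, a guard along the lines of the reviewer's fix option (1) could be added to the template so a stray `--gpu` fails loudly rather than loading an incomplete environment (a sketch only; the template's actual error-reporting mechanism may differ):

```mako
% if gpu_enabled:
    ## Fail fast: no t-gpu module set exists, so GPU runs would miss compiler/MPI modules
    echo "Error: GPU execution on Turing is experimental and not supported by this template." >&2
    exit 1
% endif
. ./mfc.sh load -c t -m c
```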

@sbryngelson merged commit 0f83387 into MFlowCode:master Apr 3, 2026
141 of 148 checks passed