Support Nvidia-Cuda execution provider for wasi-nn onnx backend #12044
Conversation
Force-pushed 482de2b to 75caf05
+1 to the addition of GPU support for the ONNX backend \o/ @abrown, you might be interested in this one.
@abrown could you please review the PR and add the cargo vet entries?
@abrown is in the process of handing off wasi-nn maintenance/work to @jlb6740 and @rahulchaphalkar, so as a heads-up, @zhen9910, it may take them a moment to allocate time to work on this.

I see, thanks for the update.
Force-pushed 75caf05 to c919981
@alexcrichton could you take a look at this PR? It has been rebased against main with the cargo vet and ort changes.
I unfortunately am not equipped to review/maintain wasi-nn myself. Historically that was Andrew, who has now handed off to @jlb6740 and @rahulchaphalkar, but I suspect they have other priorities to balance too. @zhen9910, if you'd like to reach out to them directly, I'm sure they'd be happy to help work out a path forward.
Thanks @zhen9910 for the contribution and for waiting for someone to take a look at it, and @alexcrichton for summarizing some of the changes happening here. I'm reviewing this.
rahulchaphalkar left a comment:
Looks good, a couple of comments/questions about the included example.
crates/wasi-nn/src/backend/onnx.rs (Outdated)
```rust
#[cfg(feature = "onnx-cuda")]
{
    // Use CUDA execution provider for GPU acceleration
    tracing::debug!("Configuring ONNX Nvidia CUDA execution provider for GPU target");
```
NIT: the debug messages for CPU and GPU can be of a similar form, e.g. "Using CPU/Nvidia GPU/CUDA execution provider" or similar, or more verbose if you want.
Updated.
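For context, a minimal sketch of what such cfg-gated configuration can look like with the ort crate (2.0 API; exact module paths vary between release candidates, and this is not the PR's verbatim code):

```rust
use ort::execution_providers::CUDAExecutionProvider;
use ort::session::Session;

fn build_session(model: &[u8]) -> ort::Result<Session> {
    let mut builder = Session::builder()?;
    #[cfg(feature = "onnx-cuda")]
    {
        // Register the CUDA execution provider; ONNX Runtime falls back to
        // CPU at session creation if no usable CUDA device is found.
        tracing::debug!("Configuring ONNX Nvidia CUDA execution provider for GPU target");
        builder = builder
            .with_execution_providers([CUDAExecutionProvider::default().build()])?;
    }
    builder.commit_from_memory(model)
}
```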
```sh
./target/debug/wasmtime run \
  -Snn \
  --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
  ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm \
```
This should have read wasm32-wasip2 instead of wasm32-wasip1 in the original Readme, as it fails with the error below:

```text
Error: failed to run main module `./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm`

Caused by:
    0: failed to instantiate "./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm"
    1: unknown import: `wasi:nn/[email protected]::[resource-drop]tensor` has not been defined
```

So this needs to be p2 for both CPU and GPU. (Build with `cargo build --target wasm32-wasip2`.)
The original Readme used `cargo component build`, which builds a wasm component for wasm32-wasip1 by default. The Readme will be updated to use `cargo component build --target wasm32-wasip2`; the full corrected sequence is sketched below.
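For reference, here is the corrected build target combined with the run command already shown in this thread (the build step is assumed to run from the example's directory):

```sh
# Build the example component for wasm32-wasip2 (the fix discussed above).
cargo component build --target wasm32-wasip2

# Run it with wasi-nn enabled; the trailing "gpu" argument selects the GPU target.
./target/debug/wasmtime run \
  -Snn \
  --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
  ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip2/debug/classification-component-onnx.wasm \
  gpu
```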
```sh
  -Snn \
  --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
  ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip1/debug/classification-component-onnx.wasm \
  gpu
```
What is the expected behavior of this example when run on a system without a GPU/CUDA? I ran it on my system, which has no Nvidia GPU, and it seemed to run without complaining, without explicitly falling back to CPU, and without failing.

```sh
./target/debug/wasmtime run \
  -Snn \
  --dir ./crates/wasi-nn/examples/classification-component-onnx/fixture/::fixture \
  ./crates/wasi-nn/examples/classification-component-onnx/target/wasm32-wasip2/debug/classification-component-onnx.wasm \
  gpu
```

```text
Read ONNX model, size in bytes: 4956208
Using GPU (CUDA) execution target from argument
Loaded graph into wasi-nn with ExecutionTarget::Gpu target
Created wasi-nn execution context.
Read ONNX Labels, # of labels: 1000
Executed graph inference
Retrieved output data with length: 4000
Index: n02099601 golden retriever - Probability: 0.9948673
Index: n02088094 Afghan hound, Afghan - Probability: 0.002528982
Index: n02102318 cocker spaniel, English cocker spaniel, cocker - Probability: 0.001098644
```
From ort: if the GPU execution provider is requested but the device does not have a GPU or the necessary CUDA drivers are missing, ONNX Runtime falls back to the CPU execution provider. The application continues to run, but inference happens on the CPU. When ort logging is enabled, we can see a warning like: `No execution providers from session options registered successfully; may fall back to CPU.`

But this fallback log is not propagated from ort to wasi_nn, which causes confusion. I have therefore added a feature, `ort-tracing`, that enables ort logging for wasi_nn; users can enable it to verify the fallback behavior. Please see if this is fine.

`configure_execution_providers()` is also updated to fall back to CPU if GPU/TPU is not enabled/supported, so that fallback behavior is consistent.
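A minimal sketch of the fallback shape described above (the function name comes from the discussion; the ort calls follow the 2.0 API and are assumptions, not the PR's exact code):

```rust
use ort::execution_providers::{
    CPUExecutionProvider, CUDAExecutionProvider, ExecutionProviderDispatch,
};
use ort::session::builder::SessionBuilder;

// Hypothetical signature: registers CUDA when the feature is enabled and a
// GPU target was requested, then always registers CPU as the fallback.
fn configure_execution_providers(
    builder: SessionBuilder,
    gpu_requested: bool,
) -> ort::Result<SessionBuilder> {
    let mut providers: Vec<ExecutionProviderDispatch> = Vec::new();
    #[cfg(feature = "onnx-cuda")]
    if gpu_requested {
        providers.push(CUDAExecutionProvider::default().build());
    }
    // Register the CPU provider last so inference still works when the GPU
    // provider is unavailable or the feature is disabled, giving the
    // consistent fallback behavior described above.
    providers.push(CPUExecutionProvider::default().build());
    builder.with_execution_providers(providers)
}
```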
Thanks @rahulchaphalkar @alexcrichton for the update! I will take a look and address the comments.
Force-pushed 163c6a8 to 4ec4eb2
As discussed in #8547, the existing wasi-nn ONNX backend only uses the default CPU execution provider. This PR adds an onnx-cuda based GPU execution target to the wasmtime-wasi-nn ONNX backend.
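For illustration, a minimal guest-side sketch of selecting the new GPU target through the wasi:nn interface. The binding paths follow wit-bindgen's usual mapping and the model path is a placeholder; see the classification-component-onnx example for the real code.

```rust
// Hypothetical wit-bindgen-style bindings for wasi:nn; module paths and the
// error type are assumptions, not copied from the example.
use crate::wasi::nn::errors::Error;
use crate::wasi::nn::graph::{load, ExecutionTarget, Graph, GraphEncoding};

fn load_gpu_graph() -> Result<Graph, Error> {
    // "fixture/model.onnx" is a placeholder path, not the example's actual file.
    let model = std::fs::read("fixture/model.onnx").expect("model file");
    // ExecutionTarget::Gpu requests the CUDA execution provider; with the
    // fallback discussed above, inference runs on CPU if no GPU is available.
    load(&[model], GraphEncoding::Onnx, ExecutionTarget::Gpu)
}
```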