Conversation

@ilopezluna ilopezluna commented Jan 15, 2026

Based on #544
I started a new branch because the previous PR relied on go-containerregistry. Given that, it was easier to start from scratch, using the previous PR as a reference, than to update the old one.

In this PR, we also add support for packaging Diffusers models. For now, we only support the DDUF format (similar to GGUF: a single file containing everything), as it was the easiest to implement. However, support for other formats can be added later without much difficulty.
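Since DDUF is described here as "a single file containing everything" (it is a zip-based archive bundling a diffusers model directory), a packaging tool can sanity-check candidate files before packaging them. A minimal sketch, assuming a DDUF archive is an uncompressed zip containing the model's `model_index.json` (the function name and the `model_index.json` check are illustrative assumptions, not the PR's actual detection logic):

```python
import zipfile


def looks_like_dduf(path: str) -> bool:
    """Heuristic DDUF check (assumption: DDUF is a zip archive that
    bundles a diffusers model directory, including model_index.json)."""
    if not path.lower().endswith(".dduf"):
        return False
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        # A diffusers model directory is indexed by model_index.json.
        return "model_index.json" in zf.namelist()
```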

The backend is currently enabled only on Linux, but it also works on Darwin. That said, it requires a Python runtime, so we should first design a solution that works well in Docker Desktop.

I have packaged and pushed ignaciolopezluna020/stable-diffusion:latest and ignaciolopezluna020/stable-diffusion:Q4.

You can try it by:

Building the image:

```
docker build --target final-diffusers -t model-runner:diffusers .
```

Then running it:

```
docker run -it --rm --gpus all -p 13434:13434 -e MODEL_RUNNER_PORT=13434 model-runner:diffusers
```

Pulling the model:

```
MODEL_RUNNER_HOST=http://localhost:13434 docker model pull ignaciolopezluna020/stable-diffusion:Q4
```

And generating an image:

```
curl -s -X POST http://localhost:13434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ignaciolopezluna020/stable-diffusion:Q4",
    "prompt": "A cat sitting on a couch",
    "size": "512x512"
  }' | jq -r '.data[0].b64_json' | base64 -d > cat.png
```
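The same flow as the curl | jq | base64 pipeline, as a small stdlib-only Python client. This is a sketch: the endpoint path and response shape are taken from the curl example above, and the function names are ours:

```python
import base64
import json
import urllib.request


def generate_image(host: str, model: str, prompt: str, size: str = "512x512") -> bytes:
    """POST an OpenAI Images API-style request to the diffusers engine
    and return the first generated image as raw PNG bytes."""
    payload = json.dumps({"model": model, "prompt": prompt, "size": size}).encode()
    req = urllib.request.Request(
        f"{host}/engines/diffusers/v1/images/generations",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return decode_first_image(body)


def decode_first_image(body: dict) -> bytes:
    # Responses look like {"data": [{"b64_json": "<base64-encoded PNG>"}]}.
    return base64.b64decode(body["data"][0]["b64_json"])
```

Writing the returned bytes to `cat.png` mirrors the `base64 -d > cat.png` step.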

Note: the image is built from the llama.cpp one; it does not include vLLM.

@gemini-code-assist

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the system's capabilities by integrating a new backend specifically designed for Stable Diffusion models. It introduces support for the DDUF format, allowing for streamlined packaging and deployment of Diffusers-based image generation models. The new backend exposes an OpenAI Images API-compatible interface, making it easier to interact with image generation functionalities. This change lays the groundwork for future enhancements in multimodal model support, particularly in the realm of generative AI for images.

Highlights

  • New Stable Diffusion Backend: Introduces a new backend for Stable Diffusion models, enabling image generation capabilities within the system.
  • DDUF Format Support: Adds support for packaging and running Diffusers models using the DDUF (Diffusers Unified Format), a single-file archive similar to GGUF.
  • CLI package Command Extension: The package command in the CLI now includes a --dduf <path> option, allowing users to package DDUF files into Docker model OCI artifacts.
  • OpenAI Images API Compatibility: A new FastAPI-based Python server is implemented for Diffusers, providing endpoints compatible with the OpenAI Images API for image generation requests.
  • Docker Image Integration: The Dockerfile has been updated to include a dedicated build stage for Diffusers, installing necessary Python dependencies like diffusers, torch, transformers, and fastapi into a virtual environment.
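The "OpenAI Images API Compatibility" highlight implies the Python server must produce the standard response envelope that the curl example's `jq -r '.data[0].b64_json'` consumes. A minimal, dependency-free sketch of that envelope (the helper name is ours; the PR's actual server is FastAPI-based and is not reproduced here):

```python
import base64
import time


def images_response(images: list) -> dict:
    """Build an OpenAI Images API-compatible response body:
    a Unix "created" timestamp plus one base64-encoded entry
    per generated image, under the "data" key."""
    return {
        "created": int(time.time()),
        "data": [
            {"b64_json": base64.b64encode(img).decode("ascii")}
            for img in images
        ],
    }
```

A FastAPI route handler would run the diffusers pipeline, encode each output image to PNG bytes, and return this structure as JSON.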



@gemini-code-assist bot left a comment

Code Review

This pull request introduces a new backend for Stable Diffusion models, leveraging the diffusers library. It adds support for packaging models in the DDUF (Diffusers Unified Format) via a new --dduf flag in the package command. The changes span the Dockerfile (which builds the new environment), the CLI (packaging), and the core model distribution and inference scheduling components, integrating the new format and backend. The implementation includes a new Python-based server for handling image generation requests, compatible with the OpenAI Images API.

My review focuses on improving the Docker build process for better reproducibility and efficiency, and points out a key inconsistency in the backend implementation regarding model naming that should be addressed. The changes are otherwise well-structured and comprehensive.

@ilopezluna ilopezluna changed the title Add stable diffusion backend [WIP] Add stable diffusion backend Jan 15, 2026
@ilopezluna ilopezluna changed the title [WIP] Add stable diffusion backend Add stable diffusion backend Jan 15, 2026
@ilopezluna ilopezluna marked this pull request as ready for review January 15, 2026 12:51
@sourcery-ai bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In the Dockerfile you changed LD_LIBRARY_PATH from /app/lib:$LD_LIBRARY_PATH to just /app/lib; this will discard any LD_LIBRARY_PATH set by the runtime (including GPU driver paths), so consider appending instead of overwriting to avoid subtle runtime linking issues.
  • The default error message in bundle.Unpack still says "neither GGUF nor safetensors" even though DDUF/diffusers is now supported; updating this message to include the new format will make debugging unsupported bundles less confusing.
## Individual Comments

### Comment 1
<location> `pkg/distribution/internal/bundle/unpack.go:45-47` </location>
<code_context>
+		if err := unpackDDUF(bundle, model); err != nil {
+			return nil, fmt.Errorf("unpack DDUF file: %w", err)
+		}
 	default:
 		return nil, fmt.Errorf("no supported model weights found (neither GGUF nor safetensors)")
 	}
</code_context>

<issue_to_address>
**suggestion:** Error message in Unpack is now stale and doesn’t mention diffusers/DDUF format.

Given that `detectModelFormat` can now return `types.FormatDiffusers` when DDUF files are present, this fallback message is outdated. If we hit the default branch for a diffusers model (e.g., due to misconfiguration), the error will incorrectly claim only GGUF/safetensors are supported. Please update the message to also mention diffusers/DDUF support.

```suggestion
	default:
		return nil, fmt.Errorf("no supported model weights found (expected GGUF, safetensors, or diffusers/DDUF)")
	}
```
</issue_to_address>

### Comment 2
<location> `pkg/inference/platform/platform.go:21-24` </location>
<code_context>
 	return runtime.GOOS == "linux"
 }
+
+// SupportsDiffusers returns true if diffusers is supported on the current platform.
+// Diffusers is supported on Linux (for Docker/CUDA) and macOS (for MPS/Apple Silicon).
+func SupportsDiffusers() bool {
+	// return runtime.GOOS == "linux" || runtime.GOOS == "darwin"
+	return runtime.GOOS == "linux" // Support for macOS disabled for now until we design a solution to distribute it via Docker Desktop.
+}
</code_context>

<issue_to_address>
**nitpick:** SupportsDiffusers comment and implementation diverge regarding macOS support.

The comment says diffusers is supported on Linux and macOS, but the function only returns true on Linux (with the darwin branch commented out). This mismatch can confuse callers and maintainers. Please either update the comment to Linux-only for now, or note that macOS support is intentionally disabled until the distribution approach is finalized.
</issue_to_address>


In the Dockerfile:

```
ENV MODEL_RUNNER_PORT=12434
ENV LLAMA_SERVER_PATH=/app/bin
ENV HOME=/home/modelrunner
ENV MODELS_PATH=/models
```

@ilopezluna (author) replied:

I've removed $LD_LIBRARY_PATH because I didn't find any usage. Please let me know if I'm missing anything here 🙏

@ilopezluna ilopezluna requested a review from a team January 15, 2026 13:03
```go
ddufFilename := filepath.Base(ddufPaths[0])
// Ensure the filename has the .dduf extension for proper detection by the diffusers server
if !strings.HasSuffix(strings.ToLower(ddufFilename), ".dduf") {
	ddufFilename = ddufFilename + ".dduf"
}
```
A reviewer asked:
Is this really needed? Why don't we error out if it doesn't have .dduf?

@ilopezluna (author) replied on Jan 15, 2026:

This is during unpacking into a bundle. We use the SHA of the blob as the filename, so there is no .dduf extension. The diffusers loader fails if the provided file has no .dduf extension :(

@doringeman left a comment:

We need this so we don't display that warning on pull:

```diff
diff --git a/pkg/distribution/distribution/client.go b/pkg/distribution/distribution/client.go
index d153e96d..e1815d15 100644
--- a/pkg/distribution/distribution/client.go
+++ b/pkg/distribution/distribution/client.go
@@ -638,9 +638,9 @@ func (c *Client) GetBundle(ref string) (types.ModelBundle, error) {
 
 func GetSupportedFormats() []types.Format {
 	if platform.SupportsVLLM() {
-		return []types.Format{types.FormatGGUF, types.FormatSafetensors}
+		return []types.Format{types.FormatGGUF, types.FormatSafetensors, types.FormatDiffusers}
 	}
-	return []types.Format{types.FormatGGUF}
+	return []types.Format{types.FormatGGUF, types.FormatDiffusers}
 }
 
 func checkCompat(image types.ModelArtifact, log *logrus.Entry, reference string, progressWriter io.Writer) error {
```

```
MODEL_RUNNER_HOST=http://localhost:13434 docker model pull ignaciolopezluna020/stable-diffusion:Q4
Warning: vLLM backend currently only implemented for x86_64 NVIDIA platforms
a1670237613e: Pull complete [==================================================>]  2.828GB/2.828GB
Model pulled successfully
```


@doringeman

You should add the new image-generation mode to the following files as well (search for completion/embedding/reranking):

  • ./cmd/cli/commands/configure_flags.go
  • ./cmd/cli/commands/compose_test.go
  • ./pkg/inference/scheduling/loader.go
