Add stable diffusion backend #572
Conversation
Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the system's capabilities by integrating a new backend specifically designed for Stable Diffusion models. It introduces support for the DDUF format, allowing for streamlined packaging and deployment of Diffusers-based image generation models. The new backend exposes an OpenAI Images API-compatible interface, making it easier to interact with image generation functionalities. This change lays the groundwork for future enhancements in multimodal model support, particularly in the realm of generative AI for images.
Code Review
This pull request introduces a new backend for Stable Diffusion models, leveraging the diffusers library. It adds support for packaging models in the DDUF (Diffusers Unified Format) via a new --dduf flag in the package command. The changes span across the Dockerfile to build the new environment, the CLI for packaging, and the core model distribution and inference scheduling components to integrate the new format and backend. The implementation includes a new Python-based server for handling image generation requests, compatible with the OpenAI Images API.
My review focuses on improving the Docker build process for better reproducibility and efficiency, and points out a key inconsistency in the backend implementation regarding model naming that should be addressed. The changes are otherwise well-structured and comprehensive.
Hey - I've found 2 issues, and left some high level feedback:
- In the Dockerfile you changed `LD_LIBRARY_PATH` from `/app/lib:$LD_LIBRARY_PATH` to just `/app/lib`; this will discard any `LD_LIBRARY_PATH` set by the runtime (including GPU driver paths), so consider appending instead of overwriting to avoid subtle runtime linking issues.
- The default error message in `bundle.Unpack` still says "neither GGUF nor safetensors" even though DDUF/diffusers is now supported; updating this message to include the new format will make debugging unsupported bundles less confusing.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the Dockerfile you changed `LD_LIBRARY_PATH` from `/app/lib:$LD_LIBRARY_PATH` to just `/app/lib`; this will discard any LD_LIBRARY_PATH set by the runtime (including GPU driver paths), so consider appending instead of overwriting to avoid subtle runtime linking issues.
- The default error message in `bundle.Unpack` still says "neither GGUF nor safetensors" even though DDUF/diffusers is now supported; updating this message to include the new format will make debugging unsupported bundles less confusing.
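The append-style fix suggested for the first point could look like the following Dockerfile fragment; the `/app/lib` path comes from this PR's discussion, while the surrounding instruction placement is an assumption:

```dockerfile
# Append /app/lib rather than overwrite, so paths injected by the container
# runtime (e.g. NVIDIA driver libraries) remain visible to the dynamic linker.
ENV LD_LIBRARY_PATH=/app/lib:$LD_LIBRARY_PATH
```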
## Individual Comments
### Comment 1
<location> `pkg/distribution/internal/bundle/unpack.go:45-47` </location>
<code_context>
+ if err := unpackDDUF(bundle, model); err != nil {
+ return nil, fmt.Errorf("unpack DDUF file: %w", err)
+ }
default:
return nil, fmt.Errorf("no supported model weights found (neither GGUF nor safetensors)")
}
</code_context>
<issue_to_address>
**suggestion:** Error message in Unpack is now stale and doesn’t mention diffusers/DDUF format.
Given that `detectModelFormat` can now return `types.FormatDiffusers` when DDUF files are present, this fallback message is outdated. If we hit the default branch for a diffusers model (e.g., due to misconfiguration), the error will incorrectly claim only GGUF/safetensors are supported. Please update the message to also mention diffusers/DDUF support.
```suggestion
default:
return nil, fmt.Errorf("no supported model weights found (expected GGUF, safetensors, or diffusers/DDUF)")
}
```
</issue_to_address>
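To illustrate the suggested fallback, here is a hedged, self-contained sketch of a format-detection switch with the updated default message. `detectModelFormat` and the format constants are simplified stand-ins for the real ones in `pkg/distribution`, keyed on file extension only:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Simplified stand-ins for the types.Format values touched by this PR.
const (
	formatGGUF        = "gguf"
	formatSafetensors = "safetensors"
	formatDiffusers   = "diffusers"
	formatUnknown     = "unknown"
)

// detectModelFormat is a toy version that classifies by file extension.
func detectModelFormat(path string) string {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".gguf":
		return formatGGUF
	case ".safetensors":
		return formatSafetensors
	case ".dduf":
		return formatDiffusers
	default:
		return formatUnknown
	}
}

// unpack mirrors the switch shape in bundle.Unpack, with the updated
// error text that mentions all three supported formats.
func unpack(path string) error {
	switch detectModelFormat(path) {
	case formatGGUF, formatSafetensors, formatDiffusers:
		return nil // real code dispatches to the per-format unpacker here
	default:
		return fmt.Errorf("no supported model weights found (expected GGUF, safetensors, or diffusers/DDUF)")
	}
}
```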
### Comment 2
<location> `pkg/inference/platform/platform.go:21-24` </location>
<code_context>
return runtime.GOOS == "linux"
}
+
+// SupportsDiffusers returns true if diffusers is supported on the current platform.
+// Diffusers is supported on Linux (for Docker/CUDA) and macOS (for MPS/Apple Silicon).
+func SupportsDiffusers() bool {
+ // return runtime.GOOS == "linux" || runtime.GOOS == "darwin"
+ return runtime.GOOS == "linux" // Support for macOS disabled for now until we design a solution to distribute it via Docker Desktop.
+}
</code_context>
<issue_to_address>
**nitpick:** SupportsDiffusers comment and implementation diverge regarding macOS support.
The comment says diffusers is supported on Linux and macOS, but the function only returns true on Linux (with the darwin branch commented out). This mismatch can confuse callers and maintainers. Please either update the comment to Linux-only for now, or note that macOS support is intentionally disabled until the distribution approach is finalized.
</issue_to_address>
```dockerfile
ENV MODEL_RUNNER_PORT=12434
ENV LLAMA_SERVER_PATH=/app/bin
ENV HOME=/home/modelrunner
ENV MODELS_PATH=/models
```
I've removed $LD_LIBRARY_PATH because I didn't find any usage. Please let me know if I'm missing anything here 🙏
```go
ddufFilename := filepath.Base(ddufPaths[0])
// Ensure the filename has the .dduf extension for proper detection by diffusers server
if !strings.HasSuffix(strings.ToLower(ddufFilename), ".dduf") {
	ddufFilename = ddufFilename + ".dduf"
}
```
Is this really needed? Why don't we error out if it doesn't have `.dduf`?
This happens during unpacking into a bundle. We use the SHA of the blob as the filename, so there is no `.dduf` extension. diffusers fails if the provided file has no `.dduf` extension :(
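The normalization described above can be sketched as a small helper; the name `ensureDDUFExtension` is hypothetical, as the real code inlines this in the unpack path:

```go
package main

import "strings"

// ensureDDUFExtension appends ".dduf" when the suffix is missing, since blobs
// are stored under their SHA with no extension, and the diffusers server
// requires the .dduf suffix to detect the format.
func ensureDDUFExtension(name string) string {
	if !strings.HasSuffix(strings.ToLower(name), ".dduf") {
		return name + ".dduf"
	}
	return name
}
```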
doringeman left a comment
We need this so we don't display that warning on pull.
```diff
diff --git a/pkg/distribution/distribution/client.go b/pkg/distribution/distribution/client.go
index d153e96d..e1815d15 100644
--- a/pkg/distribution/distribution/client.go
+++ b/pkg/distribution/distribution/client.go
@@ -638,9 +638,9 @@ func (c *Client) GetBundle(ref string) (types.ModelBundle, error) {
 func GetSupportedFormats() []types.Format {
 	if platform.SupportsVLLM() {
-		return []types.Format{types.FormatGGUF, types.FormatSafetensors}
+		return []types.Format{types.FormatGGUF, types.FormatSafetensors, types.FormatDiffusers}
 	}
-	return []types.Format{types.FormatGGUF}
+	return []types.Format{types.FormatGGUF, types.FormatDiffusers}
 }
 func checkCompat(image types.ModelArtifact, log *logrus.Entry, reference string, progressWriter io.Writer) error {
```
```
MODEL_RUNNER_HOST=http://localhost:13434 docker model pull ignaciolopezluna020/stable-diffusion:Q4
Warning: vLLM backend currently only implemented for x86_64 NVIDIA platforms
a1670237613e: Pull complete [==================================================>] 2.828GB/2.828GB
Model pulled successfully
```
You should add the new
Based on #544
I started a new branch because the previous PR relied on `go-containerregistry`. Given that, it was easier to start from scratch and use the previous PR as a reference rather than updating the old one.

In this PR, we also add support for packaging Diffusers models. For now, we only support the DDUF format (similar to GGUF: a single file containing everything), as it was the easiest to implement. However, support for other formats can be added later without much difficulty.

The backend is currently enabled only on Linux, but it also works on Darwin. That said, it requires a Python runtime, so we should first design a solution that works well in Docker Desktop.
I have packaged and pushed `ignaciolopezluna020/stable-diffusion:latest` and `ignaciolopezluna020/stable-diffusion:Q4`. You can try it by:

Building the image:

```
docker build --target final-diffusers -t model-runner:diffusers .
```

Then running it:

```
docker run -it --rm --gpus all -p 13434:13434 -e MODEL_RUNNER_PORT=13434 model-runner:diffusers
```

Pulling the model:

```
MODEL_RUNNER_HOST=http://localhost:13434 docker model pull ignaciolopezluna020/stable-diffusion:q4
```

And running it.

Note: the image is built from the llama.cpp one; it does not include vLLM.