[API] Adding support for serving of DiffUsers models. by shenoyvvarun · Pull Request #489 · sgl-project/ome

shenoyvvarun · 2026-01-06T11:48:21Z

What this PR does

Add diffusion pipeline support to BaseModelSpec and to the ServingRuntime supported formats (including deep-copy and frontend types).
Extend the runtime selector matcher to check diffusion pipelines, add test coverage, and document the new compatibility dimension.
Add a Qwen-Image ClusterBaseModel, an SRT runtime (with diffusers pipeline metadata), and a sample InferenceService wired to both.
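
To make the new compatibility dimension concrete, here is a minimal sketch of how a diffusion pipeline check could work. It is not the code in pkg/runtimeselector/matcher.go (which is Go); the `DiffusionPipelineSpec` type, the `pipeline_matches` function, the matching rules, and the sample class-name strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DiffusionPipelineSpec:
    """Hypothetical stand-in for the diffusion pipeline metadata on a model or runtime."""
    class_name: Optional[str] = None
    # component name -> class name, e.g. "vae" -> "AutoencoderKLQwenImage" (sample value)
    components: Dict[str, str] = field(default_factory=dict)


def pipeline_matches(model: Optional[DiffusionPipelineSpec],
                     supported: Optional[DiffusionPipelineSpec]) -> bool:
    """Return True if the runtime's supported pipeline is compatible with the model's.

    Assumed rules (not taken from the PR):
      * if neither side declares a pipeline, this dimension does not apply;
      * if only one side declares a pipeline, they are incompatible;
      * a class name declared by the runtime must equal the model's;
      * every component the runtime declares must match the model's component.
    """
    if model is None and supported is None:
        return True
    if model is None or supported is None:
        return False
    if supported.class_name and supported.class_name != model.class_name:
        return False
    for name, cls in supported.components.items():
        if model.components.get(name) != cls:
            return False
    return True


# Sample values only; the class-name strings are illustrative.
model = DiffusionPipelineSpec(
    class_name="QwenImagePipeline",
    components={"vae": "AutoencoderKLQwenImage", "scheduler": "FlowMatchEulerDiscreteScheduler"},
)
runtime = DiffusionPipelineSpec(
    class_name="QwenImagePipeline",
    components={"vae": "AutoencoderKLQwenImage"},
)
print(pipeline_matches(model, runtime))
```

Under these assumed rules the runtime only constrains the components it lists, so a runtime can stay generic across schedulers while still pinning the pipeline class.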

Key Design choice

There is no standard convention for model_index.json (unlike config.json for transformers-based models). All models seem to have the following fields:

{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.27.0"
}

Other fields depend on the model, but these are the most common. (I can remove this based on feedback.)

{
  "scheduler": [],
  "text_encoder": [],
  "tokenizer": [],
  "transformer": [],
  "vae": []
}
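
As a rough illustration of the two groups of fields above, the sketch below reads a model_index.json document and separates the underscore-prefixed metadata from the per-component entries. In diffusers, a component entry is normally a two-element list of library name and class name; the `PipelineInfo` type, the `parse_model_index` helper, and the sample component values are assumptions for illustration and are not part of this PR.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PipelineInfo:
    """Hypothetical container for what a model_index.json describes."""
    class_name: str
    diffusers_version: str
    # component name -> (library, class name)
    components: Dict[str, Tuple[str, str]] = field(default_factory=dict)


def parse_model_index(text: str) -> PipelineInfo:
    raw = json.loads(text)
    components = {}
    for key, value in raw.items():
        # Keys starting with "_" are pipeline-level metadata, not components.
        if key.startswith("_"):
            continue
        # Keep only [library, class] pairs; optional components stored as
        # [null, null] and any non-list settings are skipped.
        if (isinstance(value, list) and len(value) == 2
                and all(isinstance(v, str) for v in value)):
            components[key] = (value[0], value[1])
    return PipelineInfo(
        class_name=raw.get("_class_name", ""),
        diffusers_version=raw.get("_diffusers_version", ""),
        components=components,
    )


# Sample document; the component values are illustrative.
EXAMPLE = """
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.27.0",
  "scheduler": ["diffusers", "PNDMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "tokenizer": ["transformers", "CLIPTokenizer"],
  "vae": ["diffusers", "AutoencoderKL"],
  "safety_checker": [null, null]
}
"""

info = parse_model_index(EXAMPLE)
print(info.class_name, sorted(info.components))
```

Because only `_class_name` and `_diffusers_version` appear consistently, the sketch treats them as the stable part and everything else as optional, model-dependent detail.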

Why we need it

To support serving of Diffusion models in OME.

Fixes #

How to test

Created the runtime, basemodel and inferenceservice (tested with QwenImage, QwenImageEdit)

e.g.

kubectl get clusterbasemodel
NAME         DISABLED   VERSION   VENDOR   FRAMEWORK   FRAMEWORKVERSION   MODELFORMAT   ARCHITECTURE   CAPABILITIES    SIZE   COMPARTMENTID   READY   AGE
qwen-image   false      1.0.0     Qwen     diffusers   0.34.0.dev0        diffusers                    TEXT_TO_IMAGE                          Ready   5h15m

Checklist

Tests added/updated (if applicable)
Docs updated (if applicable)
make test passes locally

Followups

UI changes (if needed)

gemini-code-assist · 2026-01-06T11:48:45Z

Summary of Changes

Hello @shenoyvvarun, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the platform's capabilities by integrating native support for DiffUsers models. It introduces new API fields to precisely define diffusion pipeline components within model and runtime specifications, and enhances the runtime selection mechanism to intelligently match models with compatible serving environments. This foundational work enables seamless deployment and management of advanced generative AI models, exemplified by the inclusion of a Qwen-Image model configuration.

Highlights

Diffusers Model Support: Introduced comprehensive API support for serving DiffUsers models, enabling the platform to manage and deploy diffusion-based generative AI models.
API Extension for Diffusion Pipelines: Extended BaseModelSpec and ServingRuntime Custom Resource Definitions (CRDs) to include a new diffusionPipeline field, allowing detailed metadata about diffusion model components (e.g., scheduler, text encoder, tokenizer, transformer, VAE) to be specified.
Enhanced Runtime Selector Matching: Updated the runtime selector matcher logic to incorporate diffusionPipeline compatibility checks, ensuring that models are matched with serving runtimes that support their specific diffusion pipeline configurations.
Example Deployment for Qwen-Image: Added sample ClusterBaseModel, ClusterServingRuntime, and InferenceService configurations for the Qwen-Image model, demonstrating how to deploy a DiffUsers model using the new API extensions.
Makefile Refactoring: Refactored the Makefile to streamline Go test execution by introducing a reusable macro and centralizing test environment configuration, improving maintainability and readability.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces support for serving DiffUsers models, a significant enhancement. The changes span across API definitions, CRDs, runtime selection logic, and include new sample configurations. The overall implementation is solid, but I've identified a potential logic issue in the new runtime matching code that could lead to incorrect compatibility checks. I've also pointed out a couple of minor inconsistencies for improvement. My feedback aims to improve the correctness and maintainability of the new functionality.

pkg/runtimeselector/matcher.go

config/runtimes/srt/Qwen/Qwen-Image-rt.yaml

pkg/apis/ome/v1beta1/model.go

shenoyvvarun requested review from CatherineSue, XinyueZhang369, beiguo218, pallasathena92 and slin1237 as code owners January 6, 2026 11:48

gemini-code-assist bot reviewed Jan 6, 2026

[API] Adding support for serving of DiffUsers models.

2bcde70

shenoyvvarun force-pushed the vasheno/diffusion-support branch from 6c85ee9 to 2bcde70 Compare January 7, 2026 01:07

pallasathena92 approved these changes Jan 7, 2026

slin1237 approved these changes Jan 8, 2026

slin1237 merged commit 9f23ca5 into sgl-project:main Jan 8, 2026
25 checks passed

shenoyvvarun mentioned this pull request Feb 5, 2026

[Feature] Model Agent support diffusers models. #520

Open

3 tasks
