
[Feature] Model Agent support diffusers models. #520

Open
shenoyvvarun wants to merge 2 commits into sgl-project:main from shenoyvvarun:vasheno/diffUsersModelAgentSupport

@shenoyvvarun shenoyvvarun commented Feb 5, 2026

What this PR does

PR #489 added API support for diffusion models, plus matcher support that automatically picks the right runtime. This PR adds diffusers support to the OME model-agent, which is required for custom model import.

Changes

  • Added model agent support for diffusers by parsing model_index.json, mapping pipeline components into DiffusionPipelineSpec, and updating BaseModel/ClusterBaseModel spec with the diffusion pipeline metadata.
  • Made diffusion parsing take precedence over config.json, set modelFramework/modelFormat to diffusers, and added helpers for component/class parsing plus tests for diffusion parsing and model_index preference.
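
As a rough illustration of the parsing step described above, here is a minimal sketch, assuming the usual model_index.json layout in which `_class_name` and `_diffusers_version` are strings and each component is a `[library, class]` pair. The `pipelineSpec` and `component` types and the `parseModelIndex` function are hypothetical stand-ins, not the PR's actual DiffusionPipelineSpec or helpers.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// component is a hypothetical stand-in for one pipeline component entry.
type component struct {
	Library string
	Type    string
}

// pipelineSpec is a hypothetical, simplified stand-in for DiffusionPipelineSpec.
type pipelineSpec struct {
	ClassName  string
	Version    string
	Components map[string]component
}

// parseModelIndex decodes model_index.json content. Keys starting with "_"
// carry metadata; any other key holding a two-element string array is
// treated as a [library, class] component. Everything else is skipped.
func parseModelIndex(data []byte) (pipelineSpec, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return pipelineSpec{}, fmt.Errorf("decode model_index.json: %w", err)
	}
	spec := pipelineSpec{Components: map[string]component{}}
	for key, value := range raw {
		if strings.HasPrefix(key, "_") {
			var s string
			if json.Unmarshal(value, &s) != nil {
				continue // non-string metadata is ignored in this sketch
			}
			switch key {
			case "_class_name":
				spec.ClassName = s
			case "_diffusers_version":
				spec.Version = s
			}
			continue
		}
		var pair []string
		if err := json.Unmarshal(value, &pair); err != nil || len(pair) != 2 {
			continue // not a [library, class] pair, e.g. a boolean flag
		}
		if pair[0] == "" || pair[1] == "" {
			continue // [null, null] marks a component that is not present
		}
		spec.Components[key] = component{Library: pair[0], Type: pair[1]}
	}
	if spec.ClassName == "" {
		return pipelineSpec{}, errors.New("model_index.json has no _class_name")
	}
	return spec, nil
}

func main() {
	sample := []byte(`{
		"_class_name": "QwenImagePipeline",
		"_diffusers_version": "0.34.0.dev0",
		"scheduler": ["diffusers", "FlowMatchEulerDiscreteScheduler"],
		"vae": ["diffusers", "AutoencoderKLQwenImage"]
	}`)
	spec, err := parseModelIndex(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(spec.ClassName, spec.Version, spec.Components["vae"].Type)
}
```

Decoding into `json.RawMessage` first keeps the sketch tolerant of extra keys, which matters because pipelines differ in which components they declare.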

Revision 2

  • Added GenericDiffusionModelConfig (interface HuggingFaceDiffusionModel) and methods to return parameterSize for diffusion model.
  • Added placeholder logic for model capabilities.

Minor bug fix

  • Ignore .iml files.
  • Adjust Hugging Face repo file listing URL construction. The current logic path-escapes the repoId, which turns

meta-llama/Llama-3.3-70B-Instruct

into

meta-llama%2FLlama-3.3-70B-Instruct

and results in 404s.
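
The escaping behaviour behind this bug can be reproduced with the standard library alone. The sketch below shows what `url.PathEscape` does to a repo ID; `escapeRepoID` is a hypothetical alternative that escapes each segment separately. The PR itself simply passes the repo ID through unescaped.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// escapeRepoID escapes each path segment of a repo ID on its own, so the
// "/" between owner and name survives. Hypothetical helper, not PR code.
func escapeRepoID(repoID string) string {
	parts := strings.Split(repoID, "/")
	for i, p := range parts {
		parts[i] = url.PathEscape(p)
	}
	return strings.Join(parts, "/")
}

func main() {
	id := "meta-llama/Llama-3.3-70B-Instruct"
	fmt.Println(url.PathEscape(id)) // meta-llama%2FLlama-3.3-70B-Instruct
	fmt.Println(escapeRepoID(id))   // meta-llama/Llama-3.3-70B-Instruct
}
```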

Why we need it

  • Needed for custom model import.

How to test

  • Applied config/models/Qwen/Qwen-Image.yaml.
  • The model agent processed it and produced the following result:
kubectl get clusterbasemodel qwen-image -o yaml
apiVersion: ome.io/v1beta1
kind: ClusterBaseModel
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"ome.io/v1beta1","kind":"ClusterBaseModel","metadata":{"annotations":{},"name":"qwen-image"},"spec":{"disabled":false,"displayName":"qwen.qwen-image","storage":{"key":"hf-token","path":"/raid/models/Qwen/Qwen-Image","storageUri":"hf://Qwen/Qwen-Image"},"vendor":"Qwen","version":"1.0.0"}}
  creationTimestamp: "2026-02-19T10:08:21Z"
  finalizers:
  - clusterbasemodels.ome.io/finalizer
  generation: 4
  name: qwen-image
  resourceVersion: "34964"
  uid: d0e7e113-94b1-492b-a4cc-a528705ffa47
spec:
  diffusionPipeline:
    className: QwenImagePipeline
    scheduler:
      library: diffusers
      type: FlowMatchEulerDiscreteScheduler
    textEncoder:
      library: transformers
      type: Qwen2_5_VLForConditionalGeneration
    tokenizer:
      library: transformers
      type: Qwen2Tokenizer
    transformer:
      library: diffusers
      type: QwenImageTransformer2DModel
    vae:
      library: diffusers
      type: AutoencoderKLQwenImage
  disabled: false
  displayName: qwen.qwen-image
  maxTokens: 128000
  modelArchitecture: QwenImagePipeline
  modelCapabilities:
  - TEXT_TO_IMAGE
  - IMAGE_TEXT_TO_IMAGE
  modelConfiguration:
    architecture: QwenImagePipeline
    context_length: 128000
    has_vision: true
    model_size_bytes: 115397841100
    model_type: diffusers
    parameter_count: 28.85B
    torch_dtype: ""
    transformers_version: ""
  modelFormat:
    name: diffusers
    operator: Equal
    version: 0.34.0.dev0
    weight: 1
  modelFramework:
    name: diffusers
    operator: Equal
    version: 0.34.0.dev0
    weight: 1
  modelParameterSize: 28.85B
  modelType: diffusers
  storage:
    key: hf-token
    path: /raid/models/Qwen/Qwen-Image
    storageUri: hf://Qwen/Qwen-Image
  vendor: Qwen
  version: 1.0.0
status:
  nodesReady:
  - test-worker
  state: Ready

  • Also tested a transformers model to ensure nothing is broken.

Checklist

  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • make test passes locally

@github-actions github-actions bot added the models, model-agent, tests, and config labels Feb 5, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @shenoyvvarun, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the model agent's capabilities by introducing native support for diffusers models. It enables automatic detection and configuration of these models by parsing their model_index.json files, which now take precedence over traditional config.json files for diffusion models. This change is crucial for facilitating custom model imports and streamlining the integration of a wider range of AI models. Additionally, it includes minor bug fixes related to .gitignore and Hugging Face repository URL construction.

Highlights

  • Diffusers Model Support: The model agent now supports diffusers models by parsing model_index.json, automatically extracting pipeline components and metadata.
  • Metadata Precedence: model_index.json parsing takes precedence over config.json for model metadata extraction, ensuring correct handling of diffusers specific configurations.
  • Hugging Face URL Fix: Corrected an issue where url.PathEscape was incorrectly applied to Hugging Face repository IDs, resolving 404 errors during file listing.
  • Ignored Files: Added .iml files to the .gitignore to prevent unnecessary tracking of IDE-specific files.

Changelog
  • .gitignore
    • Added *.iml to the list of ignored files.
  • config/models/Qwen/Qwen-Image.yaml
    • Removed explicit modelFormat, modelFramework, and diffusionPipeline specifications, allowing the model agent to dynamically infer these properties.
  • pkg/hfutil/hub/repo.go
    • Modified the ListRepoFiles function to remove url.PathEscape from the config.RepoID when constructing the API URL for RepoTypeModel, resolving a 404 error.
  • pkg/modelagent/config_parser.go
    • Introduced DefaultModelIndexFileName constant for model_index.json.
    • Updated ParseModelConfig to first search for and parse model_index.json for diffusion models, falling back to config.json if not found.
    • Added findModelIndexFile function to locate model_index.json within a model directory.
    • Implemented loadDiffusionPipelineSpec, parseDiffusionPipelineSpec, parseDiffusersComponent, and parseJSONStringField to extract and structure diffusion pipeline metadata from model_index.json.
    • Modified updateModelSpec to correctly handle and apply the newly parsed DiffusionPipeline specification.
  • pkg/modelagent/config_parser_test.go
    • Added TestParseDiffusionPipelineSpec to verify the correct parsing of model_index.json content into a DiffusionPipelineSpec.
    • Added TestParseModelConfig_PrefersModelIndex to confirm that model_index.json is prioritized over config.json during model configuration parsing.
  • pkg/modelagent/model_data.go
    • Added a DiffusionPipeline field to the ModelMetadata struct to store parsed diffusion model pipeline specifications.
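
The precedence rule in the changelog above (model_index.json first, config.json as fallback) can be sketched as follows. This is only an illustration: `pickMetadataFile` and its `exists` callback are hypothetical, and the real ParseModelConfig also searches subdirectories.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

const (
	modelIndexFileName = "model_index.json" // diffusers pipelines
	configFileName     = "config.json"      // transformers-style models
)

// pickMetadataFile returns the metadata file to parse, preferring
// model_index.json over config.json. The exists callback is a hypothetical
// seam that keeps the precedence rule independent of the file system.
func pickMetadataFile(exists func(name string) bool) (string, bool) {
	for _, name := range []string{modelIndexFileName, configFileName} {
		if exists(name) {
			return name, true
		}
	}
	return "", false
}

func main() {
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	name, ok := pickMetadataFile(func(n string) bool {
		info, err := os.Stat(filepath.Join(dir, n))
		return err == nil && !info.IsDir()
	})
	if !ok {
		fmt.Println("no metadata file found in", dir)
		return
	}
	fmt.Println("would parse", filepath.Join(dir, name))
}
```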
Activity
  • Enhanced the model agent to support diffusers models, building upon existing API support for diffusion models.
  • Implemented logic to parse model_index.json files, extract pipeline components, and update model specifications accordingly.
  • Ensured that model_index.json takes precedence over config.json for diffusion model metadata.
  • Addressed a minor bug by ignoring .iml files in the .gitignore.
  • Fixed an issue in Hugging Face repository URL construction to prevent incorrect path escaping.
  • Validated the changes by applying a Qwen/Qwen-Image.yaml configuration and verifying the resulting ClusterBaseModel output.
  • Confirmed that existing transformers model functionality remains unbroken after these changes.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for diffusers models in the model agent by parsing model_index.json. The changes include prioritizing model_index.json over config.json, updating the model spec with diffusion pipeline metadata, and adding corresponding parsing logic and tests. A bug fix for Hugging Face repository URL construction is also included. The changes are well-implemented and tested. I have two suggestions: one to address a potential bug by ensuring consistency in URL escaping for different repository types, and another to refactor duplicated code for better maintainability.

 switch repoType {
 case RepoTypeModel:
-	apiURL = fmt.Sprintf("%s/api/models/%s/tree/%s?recursive=true", endpoint, url.PathEscape(config.RepoID), url.QueryEscape(revision))
+	apiURL = fmt.Sprintf("%s/api/models/%s/tree/%s?recursive=true", endpoint, config.RepoID, url.QueryEscape(revision))
Contributor

high

While removing url.PathEscape for RepoTypeModel is correct, this introduces an inconsistency. RepoTypeDataset and RepoTypeSpace still use url.PathEscape, but their repository IDs can also contain slashes (e.g., mozilla-foundation/common_voice_11_0) which would lead to the same 404 error. It is recommended to apply this fix consistently across all repository types that can have slashes in their IDs to prevent future bugs.

Contributor Author

The comment is fair; this applies to datasets and spaces too. I can make similar changes for the other repository types based on reviewer feedback.
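
One way to keep the three repository types consistent is to build the URL in a single place. The sketch below is an assumption-laden illustration: `treeURL`, the `apiSegment` map, and the string repo-type keys are hypothetical, and the `datasets` and `spaces` path segments are assumed to mirror the `models` segment shown in the diff.

```go
package main

import (
	"fmt"
	"net/url"
)

// apiSegment maps a repo type to its API path segment. The "datasets" and
// "spaces" values are assumed to mirror the "models" segment from the diff.
var apiSegment = map[string]string{
	"model":   "models",
	"dataset": "datasets",
	"space":   "spaces",
}

// treeURL builds the recursive file-listing URL without escaping the "/"
// inside repoID, for every supported repo type. Hypothetical helper.
func treeURL(endpoint, repoType, repoID, revision string) (string, error) {
	seg, ok := apiSegment[repoType]
	if !ok {
		return "", fmt.Errorf("unsupported repo type %q", repoType)
	}
	return fmt.Sprintf("%s/api/%s/%s/tree/%s?recursive=true",
		endpoint, seg, repoID, url.QueryEscape(revision)), nil
}

func main() {
	// hub.example is a placeholder endpoint for illustration only.
	u, err := treeURL("https://hub.example", "dataset", "mozilla-foundation/common_voice_11_0", "main")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```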

Comment on lines +175 to +204
// findModelIndexFile searches for the model_index.json file in the model directory
// It checks the root directory and common subdirectories
func (p *ModelConfigParser) findModelIndexFile(modelDir string) (string, error) {
	// Check the root directory first
	rootIndexPath := filepath.Join(modelDir, DefaultModelIndexFileName)
	if _, err := os.Stat(rootIndexPath); err == nil {
		return rootIndexPath, nil
	}

	// If not found in rootdir, do a recursive search (limited to avoid deep searching)
	var indexPath string
	err := filepath.Walk(modelDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return nil // Skip error files
		}
		if !info.IsDir() && info.Name() == DefaultModelIndexFileName {
			indexPath = path
			return filepath.SkipDir // Found it, stop searching
		}
		return nil
	})
	if err != nil {
		return "", err
	}

	if indexPath == "" {
		return "", fmt.Errorf("model_index.json not found in %s", modelDir)
	}
	return indexPath, nil
}
Contributor

medium

The new findModelIndexFile function has very similar logic to the existing findConfigFile function. To improve maintainability and avoid code duplication, consider refactoring these two functions into a single, generic findFile helper function. This new function could take the filename and a list of common subdirectories to check as arguments.

Contributor Author

This is similar, but there is a slight difference: the only shared logic is the sub-directory lookup. I feel keeping the two functions separate keeps the code clean, but I am more than willing to change it based on feedback.
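
For reference, the shared helper suggested in this thread could look roughly like the sketch below. `findFile` and `demo` are hypothetical names. Unlike the snippet under review, it uses `filepath.WalkDir` with `fs.SkipAll` (Go 1.20 or newer) to end the walk at the first match, because returning `filepath.SkipDir` for a file only skips the remaining entries in that file's directory.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findFile looks for fileName directly under root, then walks the tree and
// stops at the first match. Hypothetical helper, not code from the PR.
func findFile(root, fileName string) (string, error) {
	direct := filepath.Join(root, fileName)
	if info, err := os.Stat(direct); err == nil && !info.IsDir() {
		return direct, nil
	}
	var found string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // skip entries that cannot be read
		}
		if !d.IsDir() && d.Name() == fileName {
			found = path
			return fs.SkipAll // ends the whole walk without an error
		}
		return nil
	})
	if err != nil {
		return "", err
	}
	if found == "" {
		return "", fmt.Errorf("%s not found in %s: %w", fileName, root, fs.ErrNotExist)
	}
	return found, nil
}

// demo builds a throwaway directory tree and returns the match relative to it.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "findfile")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	nested := filepath.Join(root, "snapshots", "abc")
	if err := os.MkdirAll(nested, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(nested, "model_index.json"), []byte("{}"), 0o644); err != nil {
		return "", err
	}
	path, err := findFile(root, "model_index.json")
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	rel, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rel)
}
```

Both findConfigFile and findModelIndexFile could then delegate to the same helper with their own file name, keeping any lookup-specific differences in the callers.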

@github-actions github-actions bot added the auth and runtime labels Feb 19, 2026
@shenoyvvarun shenoyvvarun force-pushed the vasheno/diffUsersModelAgentSupport branch from ae20c4e to 5b7ec88 on February 19, 2026 10:18