Conversation

@keturn (Contributor) commented Jul 29, 2023

InvokeAI 3.0.1 tends to download full-width tensor files even when it's configured to run them in float16: #4127

One step toward diagnosing and correcting this is to make the data types of the tensor files more visible. The presence of .fp16 in the file name is a good hint, but I think we should verify it against the file contents.
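
As an illustration (not part of this PR), here's a minimal sketch of how the stored dtypes in a .safetensors file can be read straight from its JSON header, without loading any tensors, assuming the standard safetensors layout:

import json
import struct
from pathlib import Path

def safetensors_dtypes(path: Path) -> set:
    """Collect the dtypes recorded in a .safetensors header, without loading tensors."""
    with path.open("rb") as f:
        # The first 8 bytes are a little-endian uint64 giving the JSON header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # Every entry except the optional "__metadata__" block describes one tensor.
    return {entry["dtype"] for key, entry in header.items() if key != "__metadata__"}

# e.g. {"F16"} for an fp16 checkpoint, {"F32"} for a full-width one
print(safetensors_dtypes(Path("diffusion_pytorch_model.fp16.safetensors")))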

Ultimately, there should be UI for this in the web interface. For now, I've hacked it into invokeai-model-install --list-models.

[code example]

If you want, you can see a per-file breakdown (for submodels) instead of a single item for the whole diffusers multi-model:

import itertools
from pathlib import Path

from invokeai.app.services.config import InvokeAIAppConfig
from invokeai.backend.model_management import ModelManager
from invokeai.backend.model_management.models.base import calc_file_format_and_dtype

config = InvokeAIAppConfig.get_config()
mm = ModelManager(config.model_conf_path)

# TODO: How do we figure out which tensor files will actually be loaded?

def print_tensor_types_in_directory(model_path: Path):
    extensions = ['safetensors', 'ckpt', 'bin']
    if model_path.is_dir():
        # diffusers-style models keep their submodels in subdirectories
        tensor_paths = list(itertools.chain.from_iterable(model_path.glob(f"**/*.{ext}") for ext in extensions))
    else:
        tensor_paths = [model_path]

    for path in tensor_paths:
        file_format, dtype = calc_file_format_and_dtype(path)

        if model_path.is_dir():
            relative_path = path.relative_to(model_path)
        else:
            relative_path = path.name

        # e.g. "S float16" for a safetensors file, "P float32" for a pickle-based one
        type_str = str(dtype).rsplit('.', 1)[-1]
        print(f"  {file_format[0].upper()} {type_str}: {relative_path}")

# each entry from model_names() is a (name, base_model, model_type) tuple
for name in mm.model_names():
    print(f"{name[0]} [{name[1]}/{name[2]}]:")
    model = mm._instantiate(*name)
    print_tensor_types_in_directory(model.model_path)
    print()

Output:

stable-diffusion-xl-refiner-1-0 [BaseModelType.StableDiffusionXLRefiner/ModelType.Main]:
  S float32: text_encoder_2/model.safetensors
  S float32: vae/diffusion_pytorch_model.safetensors
  S float32: unet/diffusion_pytorch_model.safetensors

normal_bae [BaseModelType.StableDiffusion1/ModelType.ControlNet]:
  S float16: diffusion_pytorch_model.fp16.safetensors

tile [BaseModelType.StableDiffusion1/ModelType.ControlNet]:
  P float32: diffusion_pytorch_model.bin

This PR is a proof of concept and is based on #4059. I expect it will have to change somewhat after the other model-manager fixes for 3.0.1 land.

TODO

  • handle non-safetensor files? (are pickles banned yet?) See the probing sketch below.
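
For reference, a rough sketch (not this PR's implementation) of one way to probe the dtypes stored in a pickle-based checkpoint (.bin/.ckpt). Unlike a safetensors header read, it loads the whole state dict onto the CPU, and unpickling can execute arbitrary code, which is why the "banned" question matters:

import torch
from pathlib import Path

def pickle_checkpoint_dtypes(path: Path) -> set:
    # Loads the full state dict into CPU memory; much heavier than a header-only probe.
    checkpoint = torch.load(path, map_location="cpu")
    # .ckpt files usually nest the weights under "state_dict"; diffusers .bin files are flat.
    state_dict = checkpoint.get("state_dict", checkpoint)
    return {t.dtype for t in state_dict.values() if isinstance(t, torch.Tensor)}

print(pickle_checkpoint_dtypes(Path("diffusion_pytorch_model.bin")))  # e.g. {torch.float32}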

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Community Node Submission

Have you discussed this change with the InvokeAI team?

  • Hello

Have you updated all relevant documentation?

  • Yes
  • No

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Added/updated tests?

  • Yes
  • No : please replace this line with details on why tests have not been included

@keturn (Contributor, Author) commented Jul 29, 2023

Some questions about the UI & API for this, too. For example, what do we do about submodels? Is there ever a time when, say, a text encoder is at float32 while the unet is at float16?

If so, how should the supermodel represent its type?

@keturn added the "enhancement (New feature or request)" and "model manager" labels on Jul 29, 2023
@lstein (Collaborator) commented Jul 30, 2023

> Some questions about the UI & API for this, too. For example, what do we do about submodels? Is there ever a time when, say, a text encoder is at float32 while the unet is at float16?
>
> If so, how should the supermodel represent its type?

I haven't seen mixed-precision models, but in theory there's no reason why they couldn't exist. I've created them by accident and they were fully functional. I guess if you had to, you'd create a type called "mixed" for the supermodel.
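
To illustrate the "mixed" idea (hypothetical helper, not an existing InvokeAI API), a supermodel's reported dtype could be rolled up from its submodels and fall back to "mixed" when they disagree:

import torch

def summarize_dtype(submodel_dtypes: dict) -> str:
    # Collapse the submodels' dtypes into a single label for the parent model.
    unique = set(submodel_dtypes.values())
    if len(unique) == 1:
        return str(unique.pop()).rsplit(".", 1)[-1]  # e.g. "float16"
    return "mixed"

print(summarize_dtype({"unet": torch.float16, "text_encoder": torch.float32}))  # -> mixed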

@keturn (Contributor, Author) commented Aug 1, 2023

Updated to support pickles and single-file models.

@keturn (Contributor, Author) commented Aug 22, 2023

API Questions

  • I ended up including serialization format (pickle vs safetensors) as well. What's the desired interface for this?
    • Different methods for each?
    • Combine serialization format + data type into one struct? (see the rough sketch after this list)
    • Combine also with things that return a “model format”?
  • Should we stick with returning a torch.dtype, or do we need something more runtime-agnostic in anticipation of having non-PyTorch models?
  • Does needing to describe a type as "mixed" point toward using a new type?
  • How do we avoid running dtype-detection code when we don't want to? I put it in ModelManager.list_models so it would show up in the CLI --list-models output, and that method already returned a loose dict of fields. But other things use that method too. The dtype-detection code is pretty fast, but it does require reading at least a bit from each file, which is slower than operations that only need a stat or directory listing.
  • How should this be exposed in the web API?
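
On the "combine into one struct" option, here is a rough sketch of what such a record could look like (illustrative names only, not an existing InvokeAI type); using plain string enums also sidesteps the torch.dtype question and leaves room for a "mixed" value:

from dataclasses import dataclass
from enum import Enum

class SerializationFormat(str, Enum):
    SAFETENSORS = "safetensors"
    PICKLE = "pickle"

class StorageDType(str, Enum):
    # runtime-agnostic labels rather than torch.dtype values
    FLOAT32 = "float32"
    FLOAT16 = "float16"
    BFLOAT16 = "bfloat16"
    MIXED = "mixed"

@dataclass(frozen=True)
class TensorFileInfo:
    serialization: SerializationFormat
    dtype: StorageDType

info = TensorFileInfo(SerializationFormat.SAFETENSORS, StorageDType.FLOAT16)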

@bghira commented Sep 1, 2023

I can state with certainty that I've never bothered to label fp16 versions of my models, and I've only ever created full-width versions, as they're intended to be fine-tuned from.

@Millu (Contributor) commented Nov 3, 2023

@keturn this would be reliant on the MM refactor, right?

I have concerns about the slowdown, but if it's only slow during model download, that wouldn't be an issue.
