Update outdated pytorch_lightning import to prevent boot crash #12391

Closed

olivierlacan
Contributor

@olivierlacan olivierlacan commented Aug 7, 2023

Given this environment:

  • Python 3.10.11 (main, Apr 20 2023, 19:02:41) [GCC 11.2.0]
  • stable-diffusion-webui: v1.5.1

The imports prior to the changes in this PR result in this error:

ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

Full stack trace:

Traceback (most recent call last):
  File "/workspace/sd/launch.py", line 39, in <module>
    main()
  File "/workspace/sd/launch.py", line 35, in main
    start()
  File "/workspace/sd/modules/launch_utils.py", line 390, in start
    import webui
  File "/workspace/sd/webui.py", line 54, in <module>
    from modules.call_queue import wrap_gradio_gpu_call, wrap_queued_call, queue_lock  # noqa: F401
  File "/workspace/sd/modules/call_queue.py", line 6, in <module>
    from modules import shared, progress, errors
  File "/workspace/sd/modules/shared.py", line 21, in <module>
    from ldm.models.diffusion.ddpm import LatentDiffusion
  File "/workspace/sd/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 20, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

This addresses #11458, which documents the bug, but no fix has been made to the repo itself (yet).

I'm not sure it's a safe change but it never hurts to ask. For instance I don't know if this is a backward-compatible change.
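
On the backward-compatibility question, one option is an import that tries the newer location first and falls back to the old one. This is a minimal sketch, assuming `rank_zero_only` is available from `pytorch_lightning.utilities.rank_zero` in newer releases. `import_first` is a hypothetical helper written for illustration, not part of pytorch_lightning or this repo:

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of: {', '.join(module_paths)}")


# Intended use (needs pytorch_lightning installed, so left commented out here):
# rank_zero_only = import_first(
#     "rank_zero_only",
#     [
#         "pytorch_lightning.utilities.rank_zero",    # assumed newer location
#         "pytorch_lightning.utilities.distributed",  # location the traceback looks for
#     ],
# )
```

With this shape the same file should load whichever of the two module layouts is installed, and it still fails with a clear ImportError when neither is present.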

Importantly this does not resolve the issue fully

Even after this change, a crash occurs due to:

Traceback (most recent call last):
  File "/workspace/sd/launch.py", line 39, in <module>
    main()
  File "/workspace/sd/launch.py", line 35, in main
    start()
  File "/workspace/sd/modules/launch_utils.py", line 390, in start
    import webui
  File "/workspace/sd/webui.py", line 54, in <module>
    from modules.call_queue import wrap_gradio_gpu_call, wrap_queued_call, queue_lock  # noqa: F401
  File "/workspace/sd/modules/call_queue.py", line 6, in <module>
    from modules import shared, progress, errors
  File "/workspace/sd/modules/shared.py", line 21, in <module>
    from ldm.models.diffusion.ddpm import LatentDiffusion
  File "/workspace/sd/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 20, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

AFAIK that's a downloaded file, so perhaps the version of stable-diffusion-stability-ai needs to be updated? Not sure.

It looks like the current (as of today) culprit code in the stablediffusion repository still uses this outdated namespace as well.

Manually editing this code to also use rank_zero in the import namespace gets the boot process to the next stage:

Downloading: "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" to /workspace/sd/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors

 22%|████████████████████

It's likely this PR will need to wait for a while, and a better temporary solution would be to pin the pytorch-lightning dependency to an earlier version that still offers distributed in the import namespace. Since pytorch-lightning 2.0.0 (which gets installed with the existing requirements.txt) is a major version bump, I'm guessing 1.9.x could be safe?
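
To make the pinning idea concrete, here is a rough sketch of a check that flags an installed package whose major version is above an allowed ceiling. `major_version` and `needs_pin` are hypothetical names, and the parsing is a simplification that only reads the leading integer of the version string:

```python
from importlib import metadata


def major_version(version):
    """Return the leading integer of a version string such as '1.9.4'."""
    return int(version.split(".")[0])


def needs_pin(package, max_major=1):
    """True if `package` is installed with a major version above `max_major`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False  # not installed, so there is nothing to downgrade
    return major_version(installed) > max_major
```

If `needs_pin("pytorch-lightning")` returned True, a requirement line along the lines of `pytorch-lightning<2` would be one way to express the pin. Whether 1.9.x is actually safe is still a guess on my part.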

The existing imports result in this error:
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

This is documented in AUTOMATIC1111#11458 but no fix was
made to the repo itself.

I'm not sure it's a safe change but it never hurts to ask.
@olivierlacan olivierlacan changed the base branch from master to dev August 7, 2023 19:34
@olivierlacan
Contributor Author

Going to close this for now since it doesn't resolve the underlying issue.

Also crucially, I run into another crash which is much harder to decipher:

Python 3.10.11 (main, Apr 20 2023, 19:02:41) [GCC 11.2.0]
Version: v1.5.1
Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a
Launching Web UI with arguments:
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Loading weights [6ce0161689] from /workspace/sd/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Traceback (most recent call last):
  File "/workspace/sd/launch.py", line 39, in <module>
    main()
  File "/workspace/sd/launch.py", line 35, in main
    start()
  File "/workspace/sd/modules/launch_utils.py", line 394, in start
    webui.webui()
  File "/workspace/sd/webui.py", line 393, in webui
    shared.demo = modules.ui.create_ui()
  File "/workspace/sd/modules/ui.py", line 504, in create_ui
    modules.scripts.scripts_txt2img.setup_ui_for_section(category)
  File "/workspace/sd/modules/scripts.py", line 433, in setup_ui_for_section
    self.create_script_ui(script)
  File "/workspace/sd/modules/scripts.py", line 383, in create_script_ui
    import modules.api.models as api_models
  File "/workspace/sd/modules/api/models.py", line 112, in <module>
    ).generate_model()
  File "/workspace/sd/modules/api/models.py", line 97, in generate_model
    DynamicModel.__config__.allow_population_by_field_name = True
  File "/root/env/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py", line 205, in __getattr__
    raise AttributeError(item)
AttributeError: __config__
Creating model from config: /workspace/sd/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying attention optimization: Doggettx... done.
Model loaded in 2.6s (load weights from disk: 0.3s, create model: 0.6s, apply weights to model: 0.6s, apply half(): 0.7s, move model to device: 0.3s).

@AUTOMATIC1111
Owner

the repo uses pytorch_lightning 1.9.4 which does not have this error

@olivierlacan
Contributor Author

olivierlacan commented Aug 7, 2023

the repo uses pytorch_lightning 1.9.4 which does not have this error

There are no version specifications for pytorch-lightning in the UI's requirements.txt, so nothing prevents version 2.0.0 from being downloaded, since requirements_versions.txt doesn't seem to be used when running ./webui.sh, AFAIK.

Should it be?

Edit: since not having run pip install -r requirements_versions.txt also means you're missing open-clip-torch 2.20.0, would it make sense to run it on boot?
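
As a sketch of what a boot-time check could look like, the following compares installed versions against `name==version` pins. It assumes the pins file only contains simple `==` lines, and `parse_pins` / `find_mismatches` are hypothetical helpers, not functions from this repo:

```python
from importlib import metadata


def parse_pins(lines):
    """Map package name -> pinned version for lines of the form 'name==version'."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def _installed_version(name):
    """Installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, installed_version=_installed_version):
    """Return {name: (pinned, installed)} for every package that differs from its pin."""
    mismatches = {}
    for name, pinned in pins.items():
        found = installed_version(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches
```

Something like `find_mismatches(parse_pins(open("requirements_versions.txt")))` at launch would surface both a wrong pytorch-lightning and a missing open-clip-torch in one report. The lookup is passed in as a parameter so the comparison can be exercised without touching the real environment.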

@dhwz
Contributor

dhwz commented Aug 8, 2023

There are no version specifications for pytorch-lightning in the UI's requirements.txt, so nothing prevents version 2.0.0 from being downloaded, since requirements_versions.txt doesn't seem to be used when running ./webui.sh, AFAIK.

Something is wrong on your side then. Of course packages are installed from requirements_versions.txt; requirements.txt is only used to check whether a package is already installed. So something else must have installed a different version.

@AUTOMATIC1111
Owner

Checking whether a package is already installed uses requirements_versions.txt too. requirements.txt is only for people who want to experiment with running on different versions of Python.

@olivierlacan
Contributor Author

Closing as unnecessary then. Sorry for the trouble. I most likely need better conda isolation for my packages.
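
For anyone hitting the same isolation problem, a small diagnostic along these lines can show which environment the interpreter is really using. This is a sketch only: `describe_environment` is a hypothetical helper, and it relies on `CONDA_PREFIX` being set by `conda activate`:

```python
import os
import sys


def describe_environment():
    """Collect facts that show which Python environment is actually running."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # None when no conda environment is active.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # site-packages directories currently on the import path
        "site_packages": [p for p in sys.path if p.endswith("site-packages")],
    }
```

Comparing the `site_packages` entries with the package paths in a traceback (for example /root/env/lib/python3.10/site-packages above) should show whether packages are being picked up from an environment other than the one the web UI installed into.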

@olivierlacan olivierlacan deleted the fix/pytorch-lightning branch August 14, 2023 07:17
@Vendetta-S

Sorry for necroing this post, but the issue is still present in 1.6, even when pytorch_lightning is 1.9.4
image_2023-09-17_121946627

@timomohr

timomohr commented Oct 3, 2023

pytorch-lightning==1.6.5 works for me
