
Segmentation fault during prompt expansion on AMD GPU #1019

Closed
sourajit02 opened this issue Nov 24, 2023 · 1 comment

Comments

sourajit02 commented Nov 24, 2023

Describe the problem
Trying to generate an image leads to a segmentation fault.
GPU: AMD 7900XTX
Installation method: using python venv
Running a debugger shows that the segmentation fault occurs in modules/expansion.py when trying to execute model.generate() on the AutoModelForCausalLM model:
(screenshot: debugger backtrace showing the crash inside model.generate())

Full Console Log
(fooocus_env) ~ > python .apps/Fooocus/entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['.apps/Fooocus/entry_with_update.py']
Python 3.11.6 (main, Nov 14 2023, 09:36:21) [GCC 13.2.1 20230801]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
Total VRAM 24560 MB, total RAM 63484 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 AMD Radeon RX 7900 XTX : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
13:56:59 INFO: Opening in existing instance
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/s/.apps/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/s/.apps/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/s/.apps/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/s/.apps/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.47 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 752634906692628473
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
zsh: segmentation fault (core dumped) python .apps/Fooocus/entry_with_update.py

sourajit02 commented Nov 24, 2023

EDIT: I got it to work. Steps taken:

1. Install the latest torch, torchvision and torchaudio nightly builds from the ROCm 5.7 index:
   pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.7

2. Run with the environment variable HSA_OVERRIDE_GFX_VERSION=11.0.0.

Image generation will most likely still work with ROCm 5.6; the environment variable should be the actual fix.
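Instead of exporting the variable in the shell every time, it can also be set from a small launcher script, as long as it happens before torch loads the ROCm/HIP runtime. A minimal sketch using only the standard library (the value 11.0.0 corresponds to the gfx1100 target of RDNA3 cards like the RX 7900 XTX; the Fooocus entry point path is taken from the log above):

```python
import os

# HSA_OVERRIDE_GFX_VERSION must be in the environment before torch
# initializes ROCm, so set it before any torch import happens.
# setdefault() keeps a value already exported in the shell.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "11.0.0")

# Only after this point should torch (or the Fooocus entry point,
# e.g. entry_with_update.py) be imported or executed.
print(os.environ["HSA_OVERRIDE_GFX_VERSION"])
```

This is just a convenience wrapper; exporting the variable in the shell before launching, as in the steps above, is equivalent.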
