
runpod version #80

Open
misiek75 opened this issue Oct 5, 2022 · 10 comments

@misiek75

misiek75 commented Oct 5, 2022

Will there be a version to run on RunPod?

@TheLastBen
Owner

I'll look into it

@0xdevalias

0xdevalias commented Nov 7, 2022

I'm also very interested in this, and am currently exploring it myself. Will try and remember to get back here with any notes/tips/etc.


The first issue I've run into is xformers on an RTX 3090.

Related refs:

I needed to install g++:

apt-get update
apt-get install -y g++

Then I could run the following code to compile a wheel:

!pip install --upgrade setuptools

!git clone https://github.com/facebookresearch/xformers
%cd xformers
!git submodule update --init --recursive

!pip install -r requirements.txt

!python setup.py sdist bdist_wheel --universal

This output the build artifacts into xformers/dist/, and weirdly seemed to run super quickly, not the 40+ minutes that most sources seem to imply it will take:

⇒ ls -la xformers/dist/
total 624
drwxr-xr-x  2 root root     98 Nov  7 02:38 .
drwxr-xr-x 16 root root   4096 Nov  7 02:37 ..
-rw-r--r--  1 root root 317106 Nov  7 02:38 xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl
-rw-r--r--  1 root root 314751 Nov  7 02:37 xformers-0.0.14.dev0.tar.gz

This was on a RunPod machine with an RTX 3090.

This could maybe be included in the repo, like what was done in #144 / #149 (assuming I did it right)?
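As a side note on that wheel filename: the `cp310-cp310-linux_x86_64` part encodes which interpreter and platform the wheel was built for, so a mismatched wheel will fail at install time. A small sketch of a sanity check (the helper name is mine, not anything from the notebook):

```python
def wheel_matches_tag(wheel_name: str, python_tag: str) -> bool:
    """Check a wheel filename's Python tag (e.g. 'cp310') against an
    expected interpreter tag.  Wheel names follow the pattern:
    {dist}-{version}-{python tag}-{abi tag}-{platform}.whl
    """
    parts = wheel_name[: -len(".whl")].split("-")
    return parts[-3] == python_tag

print(wheel_matches_tag(
    "xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl", "cp310"))
# prints True
```

So the wheel above should only be reused on environments running Python 3.10 on linux x86_64.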

The output from nvidia-smi is:

Mon Nov  7 02:46:48 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:81:00.0 Off |                  N/A |
| 30%   34C    P8    21W / 350W |      1MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Though unfortunately that truncates the name of the card.

Seemingly a simpler/better command to run is nvidia-smi --list-gpus, which outputs:

GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-465df265-51b2-3536-e533-9c0fb6baf9b6)

We can tweak the code in the notebook to make use of this as follows:

- s = getoutput('nvidia-smi')
+ s = getoutput('nvidia-smi --list-gpus')
if 'T4' in s:
  gpu = 'T4'
elif 'P100' in s:
  gpu = 'P100'
elif 'V100' in s:
  gpu = 'V100'
elif 'A100' in s:
  gpu = 'A100'
+ elif 'RTX 3090' in s:
+   gpu = 'RTX 3090'

while True:
    try: 
-        gpu=='T4'or gpu=='P100'or gpu=='V100'or gpu=='A100'
+        gpu=='T4'or gpu=='P100'or gpu=='V100'or gpu=='A100' or gpu=='RTX 3090'
        break
    except:
        pass
    print('\033[1;31mit seems that your GPU is not supported at the moment')
    time.sleep(5)

And then a little further down we can install the wheel we just made (which would be changed to the URL in this repo if it's uploaded/included):

+ elif (gpu=='RTX 3090'):
+   %pip install -q xformers/dist/xformers-0.0.14.dev0-cp310-cp310-linux_x86_64.whl

I'm also not really sure why this section is in a while True, since that will just spin forever. Maybe it's meant to force the notebook to stop if people try to run all the cells at once?
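If the goal is just to fail fast on an unsupported GPU, that whole cell could be a plain function instead of a loop. A rough sketch of what I mean (names are mine, and I've made it take the `nvidia-smi --list-gpus` output as an argument rather than shelling out inside the function):

```python
SUPPORTED_GPUS = ("T4", "P100", "V100", "A100", "RTX 3090")

def detect_gpu(smi_output: str) -> str:
    """Return the first supported GPU name found in the output of
    `nvidia-smi --list-gpus`, raising immediately instead of spinning."""
    for name in SUPPORTED_GPUS:
        if name in smi_output:
            return name
    raise RuntimeError("it seems that your GPU is not supported at the moment")

# In the notebook, something like:
#   gpu = detect_gpu(getoutput('nvidia-smi --list-gpus'))
```

That way an unsupported GPU stops the cell with a clear error rather than printing the same message every five seconds.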


When it comes to downloading the model, I see you're basically installing git LFS etc., but you can actually just make an authenticated wget call to download it:

OK, just played around with this a bit more, and it looks like we can add the auth to wget downloads as follows:

# Get YOURTOKEN from https://huggingface.co/settings/tokens
wget 'https://Bearer:YOURTOKEN@huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt'

Originally posted by @0xdevalias in huggingface/huggingface_hub#1105 (comment)
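One caveat with embedding the token in the URL like that is that it can end up in shell history or logs. If that's a concern, the same authenticated download can be done from Python with the token in an Authorization header instead; a rough stdlib-only sketch (the helper names here are mine, not an existing API):

```python
import urllib.request

def build_auth_request(url: str, token: str) -> urllib.request.Request:
    """Wrap a Hugging Face download URL with a Bearer-token header,
    keeping the token out of the URL itself."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})

def download(url: str, token: str, dest: str) -> None:
    """Stream the response to disk in 1 MiB chunks."""
    with urllib.request.urlopen(build_auth_request(url, token)) as resp, \
            open(dest, "wb") as f:
        while chunk := resp.read(1 << 20):
            f.write(chunk)
```

Functionally it's the same request wget makes; the header just mirrors the `Bearer:YOURTOKEN` basic-auth trick above.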


There are a few cells in the notebook (eg. image upload) that import from google.colab that obviously aren't going to work outside of there:

ModuleNotFoundError: No module named 'google.colab'

There are also some paths configured that make more sense in colab/similar than they do for RunPod, such as:

  • --Session_dir=/content/gdrive/MyDrive/Fast-Dreambooth/Sessions/mysessionname
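For the google.colab imports, one common pattern is to guard them behind a runtime check so the same notebook can run on both Colab and RunPod; something like this (the helper name is mine):

```python
def in_colab() -> bool:
    """True when running inside Google Colab, where the google.colab
    module exists; False elsewhere (e.g. on a RunPod machine)."""
    try:
        import google.colab  # noqa: F401 -- Colab-only module
        return True
    except ModuleNotFoundError:
        return False

# Colab-only cells can then be guarded:
# if in_colab():
#     from google.colab import files
#     uploaded = files.upload()
```

The Colab-specific paths could be handled the same way, switching the session directory based on the result.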

@magickaito
(quotes @0xdevalias's comment above in full)

Do you have a working Jupyter notebook for RunPod? Looking forward to trying this faster method on RunPod.

@0xdevalias

Do you have a working Jupyter notebook for RunPod? Looking forward to trying this faster method on RunPod.

Not at this stage. I spent a couple of days last week playing around with things to understand the code better, ran into some issues, but am fairly sure I figured out enough to be able to pull together a simple notebook for it. I just need to find the spare time to do so now. Hoping to look at it again tomorrow, all going to plan.

@0xdevalias

Got to spend some time looking at more of the bits and pieces for this today, including a rather lengthy deep dive into getting some xformers compilation issues figured out, which I somewhat summarised in this follow-up TL;DR:

So after that giant deep dive, it seems the TL;DR for resolving the following error:

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.27=0
  - feature:|@/linux-64::__glibc==2.27=0

is to ensure that you have all of the required -c conda channels enabled when doing your conda install. I suspect any of these should work correctly, depending on your needs:

PyTorch 1.13, Cuda 11.7.1, xformers:

conda install -n "$CONDA_ENV_NAME" -c xformers/label/dev -c pytorch -c nvidia/label/cuda-11.7.1 xformers=*=py310_cu11.7_pyt1.13
conda install -n "$CONDA_ENV_NAME" -c xformers/label/dev -c pytorch -c nvidia/label/cuda-11.7.1 xformers
conda install -n "$CONDA_ENV_NAME" -c xformers/label/dev -c pytorch -c nvidia/label/cuda-11.7.1 pytorch=1.13 pytorch-cuda=11.7 xformers

..snip..

Originally posted by @0xdevalias in facebookresearch/xformers#390 (comment)

After that I was able to correctly compile xformers, run the little test script to confirm it looks like it worked, and generate a pre-compiled wheel from an RTX 3090 server that was ~80 MB, as expected.

Next time I get a chance to work on this I want to take what I learned there and fold it back into my main streamlined Colab, hopefully getting much closer to a fully working solution.

@roperi

roperi commented Jan 23, 2023

@0xdevalias

Had any luck installing the correct wheel on a runpod?

@0xdevalias

0xdevalias commented Jan 24, 2023

@roperi It seems like I got it working in my last post (Ref), including instructions.

Though I never actually got back to integrating that into a larger workflow; haven't had time/motivation to look at SD things in ages unfortunately.

@roperi

roperi commented Jan 24, 2023

@0xdevalias
Oh, thanks. I didn't realise you were referring to RunPod in that post.

@0xdevalias

No worries. Hope it helps! :)
