Launch requirements #18

Open
maxbaluev opened this issue Nov 17, 2022 · 11 comments

Comments

@maxbaluev

Can anyone share the hardware specifications needed for each of the model sizes?

@ohmygoobness

bump, would like to know the VRAM requirements of each model

@lewiswatson55

For interest, I tested using my 3090 Ti with 24 GB of dedicated VRAM:

Running the standard model for inference at full precision immediately hit a CUDA out-of-memory error. Loading it in fp16 works, and usage seems to hover around 15 GB (standard, fp16). Not a scientific test, though haha.
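That out-of-memory behavior matches a back-of-envelope estimate. Going by the parameter counts in the Galactica paper (the standard model is 6.7B parameters), the weights alone need roughly 25 GiB in fp32 but only about 12.5 GiB in fp16; observed usage is higher because of activations and CUDA overhead. A minimal sketch of that arithmetic:

```python
# Back-of-envelope GPU memory needed just for the model weights.
# The parameter count for "standard" (6.7B) is from the Galactica paper;
# real usage adds activations, the KV cache, and CUDA overhead.
STANDARD_PARAMS = 6.7e9

def weight_gb(params: float, bytes_per_param: int) -> float:
    """GiB occupied by the weights alone at the given precision."""
    return params * bytes_per_param / 1024**3

fp32_gb = weight_gb(STANDARD_PARAMS, 4)  # ~25 GiB -> OOM on a 24 GB 3090 Ti
fp16_gb = weight_gb(STANDARD_PARAMS, 2)  # ~12.5 GiB -> fits; ~15 GB observed
print(f"standard fp32: {fp32_gb:.1f} GiB, fp16: {fp16_gb:.1f} GiB")
```

This is only the weight footprint, so treat it as a lower bound on required VRAM.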

@nealmcb

nealmcb commented Nov 24, 2022

Here is one more anecdotal data point:

Getting started with the Galactica language model - Prog.World says:

The “basic” version consumes about 11 GB of memory.
.... in the “standard” version, our laptop simply ran out of memory

FWIW, they also say:

Galactica currently works with Python versions 3.8 and 3.9. Model installation is not possible with version 3.10 and above. This limitation is currently due to a requirement of the promptsource library.

@lewiswatson55

lewiswatson55 commented Nov 24, 2022

I haven't tested without fp16, or had time to do more tests, but standard in fp16 is giving some fantastic results. I'd imagine base will be similarly good, though.

Edit: the article looks like a good setup guide; it mentions a couple of issues I also had to work out.

@nealmcb

nealmcb commented Nov 25, 2022

Note that the "base" model works in a free Colab notebook, after selecting Runtime / Change runtime type and picking "GPU".

@vladislavivanistsev

Note that the "base" model works in a free Colab notebook, after selecting Runtime / Change runtime type and picking "GPU".

Would you be so kind and share an example of a colab notebook?

@nealmcb

nealmcb commented Nov 27, 2022

@vladislavivanistsev Here is an example of a colab notebook that you should be able to run for free with a GPU runtime: galactica on Colab

I also note, from the paper:

For training the largest 120B model, we use 128 NVIDIA A100 80GB nodes. For inference Galactica 120B
requires a single A100 node.

@agisga

agisga commented Jan 11, 2023

Ran the model in fp16 and it runs! Seems to hover around 15 GB (standard, fp16). Not a scientific test, though haha.

My experience is similar: the (standard, fp16) model runs for me with no issues across two GPUs (one RTX 2080 Ti + one GTX 1080 Ti), using about 15 GB total (8 GB + 7 GB). Not a scientific test either :) but wanted to mention it in case it helps someone.

@hwasiti

hwasiti commented Jan 12, 2023

Where do you specify that the model should be loaded in fp16 rather than fp32?

@hwasiti

hwasiti commented Jan 12, 2023

Ah, it seems to be something like this:

model = gal.load_model("huge", num_gpus=4, dtype='float16')

@kno10

kno10 commented Mar 22, 2023

Please add a column listing the inference memory requirements for the models, so people can more easily judge how much GPU RAM they need for each version.
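Until such a column exists, a rough version can be derived from the parameter counts in the Galactica paper (mini 125M, base 1.3B, standard 6.7B, large 30B, huge 120B): fp16 needs about two bytes per parameter for the weights, before any activation or framework overhead. A hedged sketch that prints such a table:

```python
# Approximate fp16 weight memory per Galactica model.  Parameter counts
# are from the paper; real inference needs extra room for activations,
# the KV cache, and framework overhead, so treat these as lower bounds.
SIZES = {
    "mini": 125e6,
    "base": 1.3e9,
    "standard": 6.7e9,
    "large": 30e9,
    "huge": 120e9,
}

def fp16_weight_gb(params: float) -> float:
    """GiB for the weights alone at 2 bytes per parameter."""
    return params * 2 / 1024**3

for name, params in SIZES.items():
    print(f"{name:9s} ~{fp16_weight_gb(params):6.1f} GiB")
```

The "huge" row (~224 GiB) is consistent with the paper's note that 120B inference needs a full A100 node rather than a single GPU.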
