Questions about Hardware requirement #12
Hi, thanks for your interest! I am currently investigating this quantization problem.
Same here. Is there an approximate estimate of VRAM usage (<20 GB, or ~24 GB)?
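As a rough sanity check (my own back-of-the-envelope arithmetic, not an official figure from the maintainers), the VRAM needed for the weights alone of a 17B-parameter model at a given bit width can be estimated like this:

```python
def weight_vram_gib(n_params: float, bits: int) -> float:
    """Approximate VRAM for model weights alone, in GiB.

    Ignores activations, KV cache, the vision encoder, and CUDA
    context overhead, which together add several more GiB.
    """
    return n_params * bits / 8 / 1024**3

for bits in (16, 8, 4):
    gib = weight_vram_gib(17e9, bits)
    print(f"{bits:>2}-bit weights: ~{gib:.1f} GiB")
```

This gives roughly 32 GiB at fp16, 16 GiB at int8, and 8 GiB at 4-bit for the weights, so a 24 GB card is plausible for quantized inference only if runtime overhead (and any temporary fp16 copies made during quantization at load time) stays within the remaining headroom.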
Excuse me, but when running inference on a single RTX 4090 with
python cli_demo_sat.py --from_pretrained cogcom-base-17b --local_tokenizer tokenizer --english --quant 4
the output is CUDA out of memory. Does it need more GPUs, or do I need to add some arguments? Thank you!