Fix torch multiprocessing error in gptneox conversion script #587
base: main
Conversation
Can you provide some buggy cases that the current script cannot deal with?
@AkiyamaYummy The command I am using is: It fails at line #L141 with an error because I don't have that directory. The gptneox_guide suggests cloning the model.
To get around the error, I added the check and ran the same command.
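The check mentioned above is not shown in the thread; a minimal sketch of what such a guard might look like, assuming it simply creates the missing checkpoint directory before the script uses it (the function name `ensure_model_dir` is hypothetical, not from the PR):

```python
import os


def ensure_model_dir(model_dir: str) -> None:
    # Hypothetical guard: create the directory if it does not exist,
    # instead of letting the conversion script fail on a missing path.
    if not os.path.isdir(model_dir):
        os.makedirs(model_dir, exist_ok=True)
```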
It runs into an error similar to #443.
Therefore, I introduced the change in this PR to fix the errors.
@byshiue @AkiyamaYummy
By using … Of course, you could also make this script more convenient by making it compatible with this scenario. PS: Some of us, including myself, prefer to maintain Huggingface's model files ourselves, because Huggingface defaults to downloading models to its cache folder, and I don't want large files taking up hard disk space without being visible in my workspace.
@AkiyamaYummy That makes sense; my change will not break that behavior.
@byshiue @AkiyamaYummy reminder for review. |
Fixes:
- `Context has already been set` error from torch multiprocessing, similar to [BUG FIX] place multi-processing init to main method #443
- `device_count` is incorrectly set in the example script.
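The first fix follows the same pattern as #443: `set_start_method` must be called exactly once, inside the `__main__` guard, rather than at module import time. A minimal sketch of the pattern, using the stdlib `multiprocessing` module (whose `set_start_method` API `torch.multiprocessing` mirrors); the `main` function here is illustrative, not the actual conversion entry point:

```python
import multiprocessing  # torch.multiprocessing exposes the same API


def main() -> None:
    # Set the start method once, inside the __main__ guard. Calling
    # set_start_method at import time raises
    # "RuntimeError: context has already been set" when spawned
    # workers re-import the script.
    multiprocessing.set_start_method("spawn", force=True)
    # ... launch conversion worker processes here ...


if __name__ == "__main__":
    main()
```

A second unguarded call without `force=True` is exactly what produces the `Context has already been set` error this PR fixes.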