Swiglu Issue #17
Hi, what is the error?
Hey, I think the swiglu issue shows up when you try to use GPT2LMHead to load the model. I got around it by loading the model with this snippet instead: tokenizer = AutoTokenizer.from_pretrained("gpt2"). My question here is: can I use these weights directly to test this model's accuracy, or say its perplexity, on the Pile test set? Just inference and testing in eval mode.
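On the perplexity question: the metric itself is independent of how the checkpoint gets loaded; it is just the exponential of the mean per-token negative log-likelihood. A minimal sketch of that computation (the perplexity helper and the 50-token vocabulary below are illustrative, not from this repo):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token.

    token_logprobs: one log P(token_t | tokens_<t) per predicted token.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Sanity check: a model that is uniform over a 50-token vocabulary
# should score a perplexity of exactly 50.
uniform = [math.log(1 / 50)] * 10
print(round(perplexity(uniform), 6))  # 50.0
```

In practice, with transformers, passing labels=input_ids to a causal LM returns the mean shifted-token NLL as outputs.loss, so perplexity over a Pile shard is math.exp of the token-weighted average of those losses across your evaluation chunks (run under model.eval() and torch.no_grad()).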
@simran-arora @seyuboglu Really sorry for pinging you here, but could you guide me a little on this?
Not sure how you got this, but I saw this error because the script was trying to load the HuggingFace … The fix was explicitly installing … and making sure you use …
There were a few other errors I got, but they were all fixed by following the installs listed in this other issue: #3 (comment)
Hey there, I appreciate what you guys are doing; it's great work.
I'm trying to access the model weights from HF using the transformers library, but I'm stuck on a swiglu error; any help with that would be really appreciated. Secondly, where can I find a direct implementation of the attn-360 or 1.4b variant? I have a 1-billion-token dataset extracted from the Pile that I want to use for off-the-shelf training of the attn-360 models!