
Swiglu Issue #17

Open
AsadMir10 opened this issue Aug 2, 2024 · 5 comments

Comments

AsadMir10 commented Aug 2, 2024

Hey there, I appreciate what you guys are doing; it's great work.
I'm trying to access the model weights from HF using the transformers library but am stuck on a SwiGLU error; any help with that would be really great. Secondly, where can I find a direct implementation of the attn-360 or 1.4b variant? I have a 1-billion-token dataset extracted from the Pile that I want to use for off-the-shelf training on the attn-360 models!

simran-arora (Collaborator) commented:

Hi, what is the error?
The implementation configs are provided in train/configs/experiments/reference/


AsadMir10 commented Oct 7, 2024

Hey, I think the SwiGLU issue appears when you try to use GPT2LMHead to load the model. Nonetheless, I switched to loading the model with this snippet:

```python
import torch
from transformers import AutoTokenizer
from based.models.transformer.gpt import GPTLMHeadModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = GPTLMHeadModel.from_pretrained_hf("hazyresearch/attn-360m").to("cuda")
```

My question here is: can I use these weights directly to test the model's accuracy, or let's say perplexity, on the Pile test set? Just inference and testing in eval mode?
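For reference, perplexity on a held-out set only needs eval-mode inference: it is the exponential of the mean per-token negative log-likelihood. A minimal sketch of just the arithmetic (plain Python; the model forward pass that would produce the per-token log-probabilities is omitted, so this is not tied to any particular loader):

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token log-probabilities (natural log)."""
    # Mean negative log-likelihood over the evaluated tokens
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Example: three tokens each assigned probability 0.25 -> perplexity 4.0
lps = [math.log(0.25)] * 3
print(perplexity(lps))  # 4.0
```

In practice you would run the model under `torch.no_grad()` in `.eval()` mode over the Pile test set, collect the log-probability of each reference token from the output logits, and feed those values into a function like the one above.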

AsadMir10 (Author) commented:

@simran-arora @seyuboglu Really sorry for pinging you here, but could you guide me a bit on this?

Miking98 commented Nov 19, 2024

I'm not sure how you got this, but I saw this error because the script was trying to load the Hugging Face transformers attention file rather than the based version.

The fix was to explicitly install based per the README:

```shell
# clone the repository
git clone git@github.com:HazyResearch/based.git
cd based

# install torch
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118  # due to an observed causal-conv1d dependency

# install the based package
pip install -e .
```

and making sure you use `from based.models.gpt import GPTLMHeadModel` instead of a generic `transformers.AutoModel`.
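One way to confirm the import resolved to the right package is to check which file actually defines the class you imported. A hedged sketch of that check, demonstrated on a stdlib class so it runs anywhere (the commented-out lines show how you would apply the same idea to `GPTLMHeadModel`, which assumes based is installed):

```python
import inspect
import json  # stand-in: json.JSONDecoder is a pure-Python class we can inspect

# inspect.getfile returns the source file where the class is defined,
# so it reveals which package an ambiguous import actually resolved to.
print(inspect.getfile(json.JSONDecoder))  # .../json/decoder.py

# The analogous check for this issue (requires `pip install -e .` of based):
# from based.models.gpt import GPTLMHeadModel
# print(inspect.getfile(GPTLMHeadModel))  # should point into the based checkout,
#                                         # not into site-packages/transformers
```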

Miking98 commented:
There were a few other errors I got, but they were all fixed by following the installs listed on this other Issue: #3 (comment)
