Add LLAMA #1446
Conversation
I get the following error when running the model locally with run.py:
Can you help try to reproduce it?
The limitations look fine; I prefer not downloading the checkpoint.
Ok this should work
@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM!
@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Co-authored-by: Xu Zhao <xzhao9@fb.com>
if device == "cuda":
    torch.set_default_device("cuda")
I am wondering why torch.set_default_device('cuda') is needed?
It's convenient in general to make sure everything runs on the same device; that way you don't need to add .to(device) calls for inputs and models. It works for any device, so I've made that clearer now.
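For context, a minimal sketch of that behavior (assuming a PyTorch version, 2.0 or later, that provides torch.set_default_device):

import torch
import torch.nn as nn

# With a default device set, tensor factory calls and freshly constructed
# modules allocate on that device without explicit .to(device) calls.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch.set_default_device(device)

model = nn.Linear(4, 2)   # parameters are created on `device`
x = torch.ones(3, 4)      # tensor is created on `device`
print(model.weight.device, x.device)
print(model(x).device)    # forward also runs on `device`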
Thanks!
@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
self.model_args = ModelArgs(vocab_size=32)
self.model = Transformer(self.model_args)

torch.set_default_device(device)
I am curious why we don't need to explicitly move self.model and self.example_inputs to the device here? For example:

self.model = Transformer(self.model_args).to(self.device)
self.example_inputs = (torch.tensor([[1, 1], [1, 1]], dtype=torch.int).to(self.device), 1)
Fixed
@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This weekend the internet was buzzing with news of this C++ llama implementation: ggerganov/llama.cpp#33 (comment). It seems like it might not be that tricky to also validate correctness for this model by loading in real checkpoints and the tokenizer, despite minor differences in the model. Might need to do a round 2 pass on this PR soon.
Fixes #1443
Notable things I had to do - @ezyang, hopefully these are OK:
Some other things I can improve in another PR
I can run the code now