
LLaMA #21796

Closed · 2 tasks done
michaelroyzen opened this issue Feb 24, 2023 · 16 comments

@michaelroyzen

Model description

New model series from Meta AI (7B, 13B, 33B, and 65B parameters) that the paper reports as broadly competitive with Chinchilla-70B and PaLM-540B.

https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

No response

@sayantan1410

Hello @michaelroyzen, I'd like to work on this issue. Could you please clarify:

  1. The objective of this issue is to add the LLaMA model to the 🤗 models section, right? The inference code for the LLaMA models is open sourced, and the weights and tokenizers are available, as you mentioned.

Please let me know whether this issue is open to work on and whether I should proceed.

@Eric-Wallace-WebHost

Hello @sayantan1410. At the moment the inference code is available, but to get the weights you need to fill out the request form on their GitHub. It would be great for you to work on this, but it would mean working with a hypothetical set of weights, since they have not yet started releasing them to the people who requested access.

@sayantan1410

Hello @Eric-Wallace-WebHost, I have actually filled out the form for the weights and the tokenizers, but since I don't have any related publications I probably won't get access. For now, I will work with hypothetical weights until the real ones are released!
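
Below is a minimal sketch of how prototyping against hypothetical weights could look, assuming `LlamaConfig` and `LlamaForCausalLM` classes that follow the usual transformers config/model pattern (those class names and all sizes are placeholders, not an existing API at this point):

```python
# Sketch only: LlamaConfig / LlamaForCausalLM are assumed names following the usual
# transformers pattern; they do not exist in the library at the time of this thread.
import torch
from transformers import LlamaConfig, LlamaForCausalLM  # hypothetical imports

# Deliberately tiny config so a randomly initialized model fits in memory for smoke tests.
config = LlamaConfig(
    vocab_size=32000,
    hidden_size=256,
    intermediate_size=688,
    num_hidden_layers=4,
    num_attention_heads=4,
)
model = LlamaForCausalLM(config)  # random weights, no checkpoint required

# Check that the forward pass produces logits of the expected shape.
dummy_ids = torch.randint(0, config.vocab_size, (1, 16))
with torch.no_grad():
    logits = model(dummy_ids).logits
assert logits.shape == (1, 16, config.vocab_size)
```

Keeping the config tiny makes it cheap to exercise the forward pass, generation, and save/load round-trips before any real checkpoint is available.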

@Sea-Snell

Also, will there be a JAX implementation? It would be super helpful. I can help contribute to it as well.

@young-geng

I can contribute to the JAX implementation as well! Also, I'm not sure we can just reuse their PyTorch code, since it is released under GPLv3 rather than the Apache License that transformers uses.

@honglu2875

honglu2875 commented Mar 2, 2023

I have the weights. I haven't checked the rules yet and will assume I can't share them, but if you have an implementation I would love to help by testing it out.

@sgugger
Collaborator

sgugger commented Mar 2, 2023

At this stage we don't know if there is going to be an implementation in Transformers due to:

  • inaccessibility of weights (no one who got them is allowed to share them on the Hub)
  • different license of the code

We are looking into whether the Meta folks would be happy to release the weights in a gated repo on the Hub, and whether the code will live in Transformers or only on the Hub as custom code because of the license. @thomasw21 is working on a PyTorch port that our research team will use in any case.

So stay tuned!
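
If the weights do land in a gated repo, the likely loading pattern would be authenticating with a user access token and passing it to `from_pretrained`. A rough sketch, assuming a placeholder repo id (no such repo exists at this point):

```python
# Sketch: "meta-llama/llama-7b" is a placeholder repo id, not a real gated repo yet.
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

login()  # prompts for a user access token whose account has been granted access to the gated repo

repo_id = "meta-llama/llama-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_auth_token=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, use_auth_token=True)
```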

@henk717

henk717 commented Mar 2, 2023

At this stage we don't know if there is going to be an implementation in Transformers due to:

  • inaccessibility of weights (no one who got them is allowed to share them on the Hub)

Even if there is no permission to host the weights on the Hub, transformers models are usually released together with their conversion scripts. An implementation combined with the necessary conversion script would still be useful: researchers could convert the model to the HF format themselves and keep using it in their HF-based projects without reinventing the wheel.

@hughbzhang

+1 to henk717. It would be super useful even if there were just a way to plug in your own weights and use the existing transformers library!
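
A rough sketch of the workflow henk717 and hughbzhang describe, assuming a hypothetical conversion script that writes a standard transformers checkpoint to a local directory; once converted, the weights load from local disk and never need to be uploaded to the Hub:

```python
# Sketch: the conversion script name and all paths below are hypothetical.
# Step 1 (shell): convert the original weights into the HF format, e.g.
#
#   python convert_llama_original_to_hf.py \
#       --input_dir /path/to/original/llama/7B \
#       --output_dir /path/to/llama-7b-hf
#
# Step 2: load the converted checkpoint from local disk like any other model.
from transformers import AutoModelForCausalLM, AutoTokenizer

local_dir = "/path/to/llama-7b-hf"  # directory produced by the conversion step
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForCausalLM.from_pretrained(local_dir)
```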

@AmericanPresidentJimmyCarter

It looks like the weights are right here.

https://huggingface.co/nyanko7/LLaMA-7B
https://huggingface.co/ricecake/LLaMA/tree/main
https://huggingface.co/datasets/nyanko7/LLaMA-65B

License is here:

https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform

@zphang
Contributor

zphang commented Mar 4, 2023

Working on this today!

zphang mentioned this issue Mar 5, 2023

@elephantpanda

elephantpanda commented Mar 6, 2023

Are weights actually copyrightable? Technically, they are just a list of numbers generated by a machine and hence don't fall under US copyright laws.

I say, just upload the weights and call Meta's bluff.

@dustydecapod

Are weights actually copyrightable? Technically, they are just a list of numbers generated by a machine and hence don't fall under US copyright laws.

I say, just upload the weights and call Meta's bluff.

Lots of people are way ahead of you on this.

@elephantpanda

elephantpanda commented Mar 6, 2023

Can someone make an ONNX version? I tried to convert it but ran out of RAM.

I would quite like to try it with ONNX Runtime, even though I think it uses far more VRAM than torch, and ONNX Runtime has a memory leak with external weight files. But still...
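
For reference, a minimal sketch of one way such an export could be attempted with `torch.onnx.export`; the checkpoint path is a placeholder, and a 7B model needs a very large amount of host RAM during export, so a smaller model is a safer first test:

```python
# Sketch: "/path/to/llama-7b-hf" is a placeholder for a locally converted checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "/path/to/llama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model.eval()

# Wrap the model so tracing sees a plain tensor output instead of a ModelOutput dict.
class LogitsOnly(torch.nn.Module):
    def __init__(self, inner):
        super().__init__()
        self.inner = inner

    def forward(self, input_ids):
        return self.inner(input_ids).logits

example = tokenizer("Hello world", return_tensors="pt")["input_ids"]
torch.onnx.export(
    LogitsOnly(model),
    (example,),
    "llama.onnx",
    input_names=["input_ids"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"}},
    opset_version=14,
)
```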

@simhallq

simhallq commented Mar 7, 2023

I'm interested in fine-tuning LLaMA to create text embeddings. Does anyone have tips on how to do that with the LLaMA architecture? Can I just add a pooling layer at the end?

By the way, here's code for RLHF training:

https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama
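
On the pooling question: rather than adding a trained pooling layer, a common starting point is to mean-pool the final hidden states over the non-padding tokens. A rough sketch, with a placeholder checkpoint path:

```python
# Sketch: mean-pool the last hidden states to get one embedding per input text.
# "/path/to/llama-7b-hf" is a placeholder for a converted checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "/path/to/llama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers may not define a pad token
model = AutoModel.from_pretrained(checkpoint)
model.eval()

texts = ["A sentence to embed.", "Another, slightly longer sentence to embed."]
batch = tokenizer(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_size)

# Zero out the padding positions before averaging over the sequence dimension.
mask = batch["attention_mask"].unsqueeze(-1).type_as(hidden)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (batch, hidden_size)
```

Taking the hidden state of the final token is another common choice for decoder-only models; either way, contrastive fine-tuning usually improves the resulting embeddings.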

@Sea-Snell

I have a working JAX implementation here.
