
Assume tied weights if lm_head/output weights is missing. #5824

Merged 1 commit into ggerganov:master on Mar 8, 2024

Conversation

@dmahurin (Contributor) commented on Mar 1, 2024:

This supports model configurations with "tie_word_embeddings" by using the embd_tokens weights when the output/lm_head weights are missing (as they will be when the weights are tied).

With this change, a tied model such as the following can be converted to GGUF:
https://huggingface.co/BEE-spoke-data/smol_llama-81M-tied
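For context: with "tie_word_embeddings" set to true, the model reuses its token-embedding matrix as the output (lm_head) projection, so the checkpoint ships no separate lm_head weight. A minimal, self-contained sketch of the idea with toy dimensions; this is illustrative only, not llama.cpp code:

```cpp
#include <cstdio>
#include <vector>

// Toy illustration of tied word embeddings: one matrix W (n_vocab x n_embd)
// serves both as the embedding lookup and as the output (lm_head) projection,
// which is why a tied checkpoint contains no separate lm_head.weight tensor.
int main() {
    const int n_vocab = 4, n_embd = 3;
    std::vector<float> W = { // n_vocab x n_embd, one row per token
        0.1f, 0.2f, 0.3f,
        0.4f, 0.5f, 0.6f,
        0.7f, 0.8f, 0.9f,
        1.0f, 1.1f, 1.2f,
    };

    int token = 2;
    // Embedding lookup: hidden state h = W[token]
    std::vector<float> h(W.begin() + token * n_embd, W.begin() + (token + 1) * n_embd);

    // Output projection with the *same* matrix: logits = W * h
    for (int v = 0; v < n_vocab; ++v) {
        float logit = 0.0f;
        for (int e = 0; e < n_embd; ++e) logit += W[v * n_embd + e] * h[e];
        printf("logit[%d] = %.3f\n", v, logit);
    }
    return 0;
}
```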

@cebtenzzre (Collaborator) commented:

This change conflicts with the move toward duplicating the tensors in memory only at GGUF load time. See #4978, #5631, #5650, and #5670. I would prefer that we do something similar for Llama.
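In that scheme the converter writes no output tensor at all, and the loader falls back to the token-embedding tensor when the output tensor is absent, duplicating data in memory rather than on disk. A standalone sketch of that fallback, assuming the GGUF tensor names "output.weight" and "token_embd.weight"; the struct and function here are illustrative stand-ins, not the llama_model_loader API:

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Stand-in for a model tensor; the real loader deals in ggml tensors.
struct Tensor { std::vector<float> data; };

// Illustrative fallback: if the file has no "output.weight" (tied weights),
// reuse "token_embd.weight" instead of storing a duplicate copy on disk.
const Tensor * find_output_tensor(const std::map<std::string, Tensor> & tensors) {
    auto it = tensors.find("output.weight");
    if (it == tensors.end()) {
        // tied weights: fall back to the token-embedding matrix
        it = tensors.find("token_embd.weight");
    }
    return it == tensors.end() ? nullptr : &it->second;
}

int main() {
    std::map<std::string, Tensor> tensors;
    tensors["token_embd.weight"] = Tensor{{0.1f, 0.2f, 0.3f}};
    // note: no "output.weight" entry, as with tie_word_embeddings=true

    const Tensor * out = find_output_tensor(tensors);
    printf("output tensor %s (%zu values)\n",
           out ? "resolved via fallback" : "missing",
           out ? out->data.size() : 0);
    return 0;
}
```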

@dmahurin (Contributor, Author) commented on Mar 2, 2024:

@cebtenzzre: Great, I had not seen those changes. This change was intended as a quick workaround until tied weights are supported more properly, which it sounds like is happening. I will look at those changes.

Commit pushed with the message: This is to support model configurations with "tie_word_embeddings" set to true.
@dmahurin (Contributor, Author) commented on Mar 2, 2024:

@cebtenzzre, the change has been updated in llama.cpp; LLAMA tied weights are now handled like the other tied-weight architectures.

@ggerganov merged commit e457fb3 into ggerganov:master on Mar 8, 2024 (60 checks passed).
@dmahurin deleted the tied-weights branch on Mar 8, 2024.
Commits referencing this pull request were later pushed downstream, each carrying the message "This is to support model configurations with "tie_word_embeddings" set to true." and "Co-authored-by: Don Mahurin <2797413+dmahurin@users.noreply.github.com>":

- hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp on Mar 10, 2024
- NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp on Mar 12, 2024
- jordankanter pushed a commit to jordankanter/llama.cpp on Mar 13, 2024
- hodlen pushed a commit to hodlen/llama.cpp on Apr 1, 2024