Switch embed to llama_get_embeddings_seq #1263

Merged: 2 commits into abetlen:main on Mar 9, 2024
Conversation

@iamlemec (Contributor) commented on Mar 8, 2024

Due to updates in ggml-org/llama.cpp#5796, sequence level embeddings are now output through a separate channel from token level embeddings, and they are accessed with llama_get_embeddings_seq.
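For context, a minimal sketch of what a ctypes binding for this call looks like, assuming the llama.h signature `float * llama_get_embeddings_seq(struct llama_context * ctx, llama_seq_id seq_id)`; the library loading and helper name here are illustrative, not the exact code added in this PR:

```python
import ctypes

# Illustrative only: llama_cpp/llama_cpp.py handles library loading itself.
lib = ctypes.CDLL("libllama.so")

# float * llama_get_embeddings_seq(struct llama_context * ctx, llama_seq_id seq_id);
lib.llama_get_embeddings_seq.argtypes = [ctypes.c_void_p, ctypes.c_int32]  # llama_seq_id is int32_t
lib.llama_get_embeddings_seq.restype = ctypes.POINTER(ctypes.c_float)

def seq_embedding(ctx, seq_id: int, n_embd: int):
    """Return the pooled embedding for one sequence, or None if pooling is disabled."""
    ptr = lib.llama_get_embeddings_seq(ctx, seq_id)
    if not ptr:  # NULL when the context was not configured for pooled embeddings
        return None
    return ptr[:n_embd]  # copy n_embd floats into a Python list
```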

@abetlen (Owner) commented on Mar 9, 2024

@iamlemec thank you! Just so I understand, the sequence level embeddings are the ones that are pooled up to the end of the last processed batch?

Also, I think the new function in llama_cpp.py is duplicated by accident.

Review thread on llama_cpp/llama_cpp.py (outdated, resolved)
@iamlemec (Contributor, Author) commented on Mar 9, 2024

Oh yeah, I didn't see you had added it already! Yup, it's the pooled embeddings by sequence for the last batch. It works for both mean pooling and cls (first-token) pooling; with no pooling it returns null.
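To illustrate the dispatch this implies, here is a hedged sketch against the low-level `llama_cpp` bindings; the pooling-type constants, argument names, and the surrounding batch/decode setup are assumptions for illustration and not part of this PR's diff:

```python
import llama_cpp  # low-level ctypes bindings from llama_cpp/llama_cpp.py

def read_embeddings(ctx, model, pooling_type, seq_ids, n_outputs):
    """Sketch: read embeddings back after llama_decode(), branching on pooling mode."""
    n_embd = llama_cpp.llama_n_embd(model)
    if pooling_type == llama_cpp.LLAMA_POOLING_TYPE_NONE:
        # No pooling: one embedding per output token, read with llama_get_embeddings_ith.
        return [
            llama_cpp.llama_get_embeddings_ith(ctx, i)[:n_embd]
            for i in range(n_outputs)
        ]
    # Mean or CLS pooling: one embedding per sequence, read with llama_get_embeddings_seq.
    embeddings = []
    for seq_id in seq_ids:
        ptr = llama_cpp.llama_get_embeddings_seq(ctx, seq_id)
        if not ptr:  # NULL if the context was not set up for pooled embeddings
            raise RuntimeError(f"no pooled embedding for sequence {seq_id}")
        embeddings.append(ptr[:n_embd])
    return embeddings
```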

Co-authored-by: Andrei <abetlen@gmail.com>
@abetlen merged commit 2811014 into abetlen:main on Mar 9, 2024
16 checks passed