This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Hi, are there any plans for word embeddings? #56

Closed
hlhr202 opened this issue Mar 22, 2023 · 9 comments
Labels
issue:enhancement New feature or request

Comments

@hlhr202
Contributor

hlhr202 commented Mar 22, 2023

Noob here, so excuse me if this is a naive feature request.
I noticed that someone in llama.cpp is working on extracting word embeddings from the hidden layers. Is there any possibility of implementing an embedding mode for llama-rs? Thanks!

What I found is this commit

@philpax philpax added the issue:enhancement New feature or request label Mar 24, 2023
@hlhr202
Contributor Author

hlhr202 commented Mar 24, 2023

Hi, I would like to add the llama.cpp PR here for reference. I just noticed they merged the embedding function:
https://github.com/ggerganov/llama.cpp/pull/282/files

@setzer22
Collaborator

setzer22 commented Mar 24, 2023

Hi @hlhr202! 👋

Thanks for bringing this to our attention. The code here doesn't look hard at all to port! We will add it to the repo since it makes sense to have a way for people to extract embeddings.

But I'd like to understand (just to satisfy my curiosity). Why are the LLaMA embeddings useful? Is this the same thing as regular word embeddings from any other model? That is, capture the semantics of a word as a vector to allow computing similarity metrics? Do you have a use case for extracting the embeddings that would help us understand the possibilities better? 😄

Not saying this is a requirement for the PR, I just want to learn if there are different use cases for this that I'm not aware of.

@setzer22
Collaborator

Please check out #72. I implemented some code to extract embeddings, but we still need to validate that the results are correct and figure out how best to expose this at our different API levels.

@hlhr202
Contributor Author

hlhr202 commented Mar 25, 2023

> Hi @hlhr202! 👋
>
> Thanks for bringing this to our attention. The code here doesn't look hard at all to port! We will add it to the repo since it makes sense to have a way for people to extract embeddings.
>
> But I'd like to understand (just to satisfy my curiosity). Why are the LLaMA embeddings useful? Is this the same thing as regular word embeddings from any other model? That is, capture the semantics of a word as a vector to allow computing similarity metrics? Do you have a use case for extracting the embeddings that would help us understand the possibilities better? 😄
>
> Not saying this is a requirement for the PR, I just want to learn if there are different use cases for this that I'm not aware of.

Yes, computing semantic similarity is quite useful in many cases. It allows us to search for semantically similar sentences using a natural-language query.
By the way, I will do a quick verification of the PR and merge it into my llama-node.
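To make the use case concrete, here is a minimal sketch of the similarity computation I have in mind (plain Rust; the vectors below are placeholders, while real sentence embeddings from the model would be much longer, e.g. 4096 floats for LLaMA 7B):

```rust
/// Cosine similarity between two embedding vectors of equal length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Placeholder vectors; in practice these would be the sentence embeddings
    // returned by the model for each input text.
    let dog1 = [0.10_f32, 0.30, 0.50];
    let dog2 = [0.12_f32, 0.28, 0.55];
    let cat1 = [0.90_f32, 0.05, 0.10];

    println!("dog1 vs dog2 -> {}", cosine_similarity(&dog1, &dog2));
    println!("dog1 vs cat1 -> {}", cosine_similarity(&dog1, &cat1));
}
```

A semantic search then boils down to embedding the query once and ranking the stored sentences by this score.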

@hlhr202
Contributor Author

hlhr202 commented Mar 25, 2023

> Please check out #72. I implemented some code to extract embeddings, but we still need to validate that the results are correct and figure out how best to expose this at our different API levels.

@setzer22 thanks for your great work!
I just did a simple test computing cosine similarity, comparing llama-rs and OpenAI's embedding functions. Not sure if it is accurate...

dog1: My favourite animal is the dog
dog2: I have just adopted a cute dog
cat1: My favourite animal is the cat

llama-rs model: ggml-alpaca-7b-int4

llama-rs cosine similarity:
dog1 vs dog2  ->  0.6884680986404419
dog1 vs cat1  ->  0.9326339960098267

openai model: text-embedding-ada-002

openai cosine similarity:
dog1 vs dog2  ->  0.8523955345153809
dog1 vs cat1  ->  0.9551568031311035

It looks like everything works, but the resulting similarities are quite different from OpenAI's text-embedding-ada-002.
I will probably run the same tests in llama.cpp as an additional check.

@hlhr202
Contributor Author

hlhr202 commented Mar 25, 2023

It seems llama.cpp has not fully implemented embeddings yet. I tried to print the embedding vectors, but got size 0.
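For what it is worth, this is the kind of sanity check I am adding on my side before computing any similarities; `check_embedding` is just a hypothetical helper name, not part of the llama.cpp or llama-rs API:

```rust
/// Hypothetical guard: reject an empty or wrongly sized embedding before it is
/// used for any similarity math.
fn check_embedding(embedding: &[f32], expected_dim: usize) -> Result<(), String> {
    if embedding.is_empty() {
        return Err("model returned an empty embedding (size 0)".to_string());
    }
    if embedding.len() != expected_dim {
        return Err(format!(
            "unexpected embedding length {} (expected {})",
            embedding.len(),
            expected_dim
        ));
    }
    Ok(())
}

fn main() {
    // This is exactly the failure mode I hit: a zero-length vector.
    let empty: Vec<f32> = Vec::new();
    println!("{:?}", check_embedding(&empty, 4096)); // Err(...)
}
```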

@hlhr202 hlhr202 closed this as completed Mar 26, 2023
@hlhr202
Contributor Author

hlhr202 commented Apr 5, 2023

@setzer22 sorry, I reopened this ticket because I noticed some changes in llama.cpp. I have also tested a few examples on 7B Alpaca, but the results are not very accurate (not sure if this is caused by the small model size). What I noticed from llama.cpp is that they are not using any end token as the representation of the sentence embedding; they put all the prompt tokens into the eval function, but always get a fixed-length vector.
[image]

@hlhr202 hlhr202 reopened this Apr 5, 2023
@hlhr202
Contributor Author

hlhr202 commented Apr 5, 2023

@setzer22 I think our llama-rs implementation of embeddings may not be correct. What I noticed from llama.cpp is that they are not using any end token as the representation of the sentence embedding; they put all the prompt tokens into the eval function, but always get a fixed-length vector. [image]

Another trick I found, though I am not sure their implementation makes sense... I guess they just drop the extra vector items, and I do not even know whether they drop the correct part; quite weird. I will continue to follow the issue over the next few weeks. I am also going to test a 30B model to see if the semantic accuracy is better than with 7B Alpaca.
[image]
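If I am reading the llama.cpp code correctly, the fixed length simply comes from keeping only the last token's row of the hidden-state buffer (n_embd floats per token, 4096 for 7B) and dropping everything before it. A rough sketch of that slicing in Rust (my reading of the idea, not a copy of their code):

```rust
/// Evaluate `n_tokens` prompt tokens, get a flat buffer of
/// `n_tokens * n_embd` hidden-state values (one row per token), then keep
/// only the last token's row as the sentence embedding.
fn last_token_embedding(hidden: &[f32], n_tokens: usize, n_embd: usize) -> &[f32] {
    assert_eq!(hidden.len(), n_tokens * n_embd, "buffer must be n_tokens * n_embd");
    &hidden[(n_tokens - 1) * n_embd..]
}

fn main() {
    // Toy example: 3 prompt tokens, embedding dimension 4 (4096 for LLaMA 7B).
    let hidden: Vec<f32> = (0..12).map(|i| i as f32).collect();
    let embedding = last_token_embedding(&hidden, 3, 4);
    assert_eq!(embedding, &[8.0_f32, 9.0, 10.0, 11.0][..]);
    println!("embedding length is always n_embd: {}", embedding.len());
}
```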

@philpax
Collaborator

philpax commented May 24, 2023

This should now be sorted / understandable with #273. Let me know if there's anything else.

@philpax philpax closed this as completed May 24, 2023
3 participants