When using the new text-embedding-3 models the scores are a lot lower #53

Open
ReneReiterer opened this issue Jun 25, 2024 · 2 comments

Comments

@ReneReiterer
Contributor

Hey,
when I use the new text-embedding-3 models both to create the embeddings and to query, I get much lower scores for the same query.

With ada-2, a query could get a result score of 0.8, but with text-embedding-3 the score drops below 0.5 while returning the same content.
Is there a reason for this?

@Stevenic
Owner

That's a function of the embeddings model and nothing I have control over. It implies that the new models generate a more diverse range of embeddings... Can you share some examples (query + the text it's being compared to)?
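
To illustrate what a "more diverse range of embeddings" can do to scores, here is a minimal sketch. It assumes the reported score is a plain cosine similarity. The 3-d toy vectors and the shared offset of 3.0 are hypothetical and are not output from any real model. The `narrow` space packs every vector into a tight cone by adding the same offset to each component, and the `spread` space does not.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, chosen only for illustration.
spread = {
    "green": [1.0, 0.2, 0.0],
    "blue": [0.8, 0.6, 0.0],
    "apple": [0.1, 0.3, 1.0],
}

# Same vectors with a shared offset added to every component, so all of
# them point in roughly the same direction (a "narrow" embedding space).
narrow = {word: [x + 3.0 for x in vec] for word, vec in spread.items()}

def rank(space, query):
    """Return (score, word) pairs for every other word, best match first."""
    scored = [
        (cosine_similarity(space[query], vec), word)
        for word, vec in space.items()
        if word != query
    ]
    return sorted(scored, reverse=True)

for name, space in (("narrow", narrow), ("spread", spread)):
    print(name, [(word, round(score, 3)) for score, word in rank(space, "green")])
```

In this toy case both spaces rank "blue" ahead of "apple" for the query "green", but every score in the narrow space sits close to 1. Absolute scores from two different models are therefore not directly comparable.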

@ReneReiterer
Contributor Author

Here are the results of running the example from the vectra README with each model:

with "text-embedding-ada-002":

Querying green...
[0.9027890493383421] blue
[0.8750171543194056] red
[0.8316836924030466] apple

Querying banana...
[0.9025824326098169] apple
[0.8489727589250824] oranges
[0.840552337334082] blue

with "text-embedding-3-small":

Querying green...
[0.5587630540517711] blue
[0.4586459570036867] red
[0.3330212746409029] oranges

Querying banana...
[0.463723740085403] apple
[0.36792568686955635] oranges
[0.3011467689281706] blue

with "text-embedding-3-large":

Querying green...
[0.5854194924173858] red
[0.5425629350657741] blue
[0.3589804053636035] oranges

Querying banana...
[0.4618476040380141] apple
[0.39727599664880175] oranges
[0.37006686089236474] blue
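
One way to compare these outputs across models is to look at the ranking and at each score relative to the top hit, rather than at the raw values. The sketch below does that for the "Querying banana..." results, with the scores copied from the output above. The helper names `ranking` and `relative_to_top` are made up for this illustration and are not part of vectra.

```python
# Scores copied from the "Querying banana..." output for each model.
banana = {
    "text-embedding-ada-002": [
        ("apple", 0.9025824326098169),
        ("oranges", 0.8489727589250824),
        ("blue", 0.840552337334082),
    ],
    "text-embedding-3-small": [
        ("apple", 0.463723740085403),
        ("oranges", 0.36792568686955635),
        ("blue", 0.3011467689281706),
    ],
    "text-embedding-3-large": [
        ("apple", 0.4618476040380141),
        ("oranges", 0.39727599664880175),
        ("blue", 0.37006686089236474),
    ],
}

def ranking(hits):
    """Texts ordered from highest to lowest score."""
    return [text for text, _ in sorted(hits, key=lambda h: h[1], reverse=True)]

def relative_to_top(hits):
    """Each score divided by the best score in the same result list."""
    top = max(score for _, score in hits)
    return {text: score / top for text, score in hits}

for model, hits in banana.items():
    rel = {text: round(value, 3) for text, value in relative_to_top(hits).items()}
    print(model, ranking(hits), rel)
```

For this query all three models return the same order (apple, oranges, blue). The text-embedding-3 scores are lower in absolute terms but separate the top hit from the rest more clearly, so a fixed cutoff tuned for ada-002 would need to be re-tuned for each model.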
