Fewer than expected results were retrieved during querying the index #38

Open
sametdumankaya opened this issue Oct 27, 2023 · 3 comments
Labels: bug, documentation, question

@sametdumankaya

Hi,

I'm trying to use the Voyager library instead of Annoy, but I ran into the following problem. Even though the Voyager index contains 25130 elements (see the num_elements attribute of the index below), I'm unable to query it, because the query somehow can't find all of the elements.

image
@markkohdev
Contributor

markkohdev commented Dec 18, 2023

@sametdumankaya sorry about the delayed response here! Can you provide some more information on your use case? Are you attempting to query for N neighbors where N is the number of elements in the index?

Also can you check to ensure that there are no NaN's in your item set?
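
The NaN check asked for above can be sketched as follows. This is a minimal, standard-library-only example; the helper name `rows_with_non_finite` and the sample `items` list are hypothetical and not part of Voyager. It also flags infinities, on the assumption that any non-finite value could distort distance computations.

```python
import math


def rows_with_non_finite(vectors):
    """Return the indices of rows that contain NaN or +/-inf values."""
    bad = []
    for i, row in enumerate(vectors):
        if any(not math.isfinite(x) for x in row):
            bad.append(i)
    return bad


# Hypothetical item set: row 1 contains a NaN, row 2 contains an infinity.
items = [
    [0.1, 0.2, 0.3],
    [float("nan"), 0.5, 0.6],
    [0.7, float("inf"), 0.9],
]
bad_rows = rows_with_non_finite(items)
print(bad_rows)
```

An empty result means every vector is finite; any listed row is worth inspecting before it is added to the index.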

@markkohdev markkohdev added bug Something isn't working question Further information is requested documentation Improvements or additions to documentation labels Jan 17, 2024
@cvillela

Hello @markkohdev!
I am facing the exact same issue. Some calls that query for N neighbors in an index of N elements result in this error.
My goal is to find the element in an index that is furthest from a specific vector.
There are no NaNs in the set.

print(f"Len Index {len(cluster_index)}")
neighbors, _ = cluster_index.query(
    vectors=any_vector,
    k=len(cluster_index),
)

outputs

Len Index 828
RuntimeError: Fewer than expected results were retrieved; only found 825 of 828 requested neighbors.

Is this a parameter tuning problem, for example with one of the "ef" parameters?

Please note that no elements in this index have been removed with mark_deleted().
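
For the furthest-neighbor goal described above, an exact linear scan avoids the approximate index entirely. The sketch below assumes the raw vectors are still available outside the index and that Euclidean distance is the metric of interest. The names `furthest_neighbor` and `cluster_vectors` are hypothetical, and this is not a Voyager API. For an index of around 828 elements, a scan like this should be cheap.

```python
import math


def furthest_neighbor(vectors, query):
    """Exact linear scan: return (index, distance) of the vector furthest from query."""
    best_index, best_distance = -1, -1.0
    for i, v in enumerate(vectors):
        d = math.dist(v, query)  # Euclidean distance (Python 3.8+)
        if d > best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance


# Hypothetical data standing in for the vectors stored in the index.
cluster_vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
any_vector = [0.0, 0.0]
idx, dist = furthest_neighbor(cluster_vectors, any_vector)
print(idx, dist)
```

Because every vector is visited, the result does not depend on graph connectivity or on any "ef" setting.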

@sametdumankaya
Author

> @sametdumankaya sorry about the delayed response here! Can you provide some more information on your use case? Are you attempting to query for N neighbors where N is the number of elements in the index?
>
> Also can you check to ensure that there are no NaN's in your item set?

Hey again,

There are 25310 elements in the set, and I'm trying to get similarity scores for all of them using a random embedding. I can confirm that there are no NaNs in the item set. Somehow, 4 of the items were not included in the index, so there are 25306 items in the index instead of 25310.
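
Until the root cause is known, one possible stopgap is to retry with the number of neighbors the error message says were found. The sketch below assumes the message always has the form reported earlier in this thread ("only found X of Y requested neighbors"). The helper `query_with_fallback` and the stand-in `fake_query` are hypothetical, and the retry with the smaller k is not guaranteed to succeed.

```python
import re

# Pattern taken from the error text reported in this thread (an assumption
# about its stability across library versions).
_FOUND_RE = re.compile(r"only found (\d+) of (\d+) requested neighbors")


def query_with_fallback(query_fn, k):
    """Call query_fn(k); on the 'fewer than expected' error, retry once with the found count."""
    try:
        return query_fn(k)
    except RuntimeError as err:
        match = _FOUND_RE.search(str(err))
        if match is None:
            raise  # unrelated error: do not swallow it
        found = int(match.group(1))
        if found == 0:
            raise
        return query_fn(found)


def fake_query(k, reachable=825):
    """Stand-in for an index query that can only reach `reachable` elements."""
    if k > reachable:
        raise RuntimeError(
            "Fewer than expected results were retrieved; "
            f"only found {reachable} of {k} requested neighbors."
        )
    return list(range(k))


results = query_with_fallback(fake_query, 828)
print(len(results))
```

With the index from this thread, `query_fn` could be something like `lambda k: cluster_index.query(vectors=any_vector, k=k)`. The elements that were not reached would still be missing from the result, so this only hides the symptom.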
