'\n[ Document scores are NaN; please look into the built index. ]\n'
'[ If using a compressed index, try building an exact index: ]\n'
'[ $ python index_dense_embeddings --indexer-type exact... ]'
)
Should the error message also mention that the scores are being replaced with dummy values?
don't think it's necessary
'[ If using a compressed index, try building an exact index: ]\n'
'[ $ python index_dense_embeddings --indexer-type exact... ]'
)
scores.fill_(1)
I'm curious why 1 is the error value?
i picked an arbitrary number so that the scores play nicely with the rest of the system
the scores are literally a tensor of nans:
ipdb> scores
tensor([[nan, nan, nan, nan, nan]], device='cuda:0', dtype=torch.float16)
so needed to pick something that worked
I'm pretty sure the failing tests are unrelated to this PR; they are currently being hammered out elsewhere, so will merge as is
Patch description
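Handles a bug caught in #3806: if index building goes wrong, the context/document scores can come back as NaNs. This seems to occur when there is an issue with the clustering of the vectors (see the linked FAISS issue in #3806). When that happens, the patch prints a warning pointing at the index and fills the scores with a placeholder value of 1 so that downstream code keeps working.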
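For readers who want to see the shape of the guard without pulling in torch, here is a minimal, dependency-free sketch on plain nested lists. The names `sanitize_scores` and `PLACEHOLDER_SCORE` are hypothetical and do not appear in the PR, and the exact NaN check used in the patch is not shown in the diff excerpt, so the condition below is an assumption.

```python
import math
import warnings

# Arbitrary placeholder, mirroring scores.fill_(1) in the diff.
PLACEHOLDER_SCORE = 1.0


def sanitize_scores(scores):
    """Hypothetical helper: return `scores` untouched when every entry is a
    real number; if any entry is NaN, warn and return a same-shaped list
    filled entirely with the placeholder."""
    has_nan = any(math.isnan(s) for row in scores for s in row)
    if not has_nan:
        return scores
    # Warning text copied from the diff excerpt; the trailing "..." is as
    # it appears there.
    warnings.warn(
        '\n[ Document scores are NaN; please look into the built index. ]\n'
        '[ If using a compressed index, try building an exact index: ]\n'
        '[ $ python index_dense_embeddings --indexer-type exact... ]'
    )
    # Overwrite every score, not only the NaN entries, as fill_ does.
    return [[PLACEHOLDER_SCORE] * len(row) for row in scores]
```

On a tensor, the same idea would presumably be expressed with `torch.isnan(scores).any()` for the check and the in-place `scores.fill_(1)` from the diff for the replacement. Note that this discards any ranking information, so the warning is the important part: the retrieved documents are effectively unordered until the index is rebuilt.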
Testing steps
Reproduced the error with steps in #3806; verified that, with this fix, the error is no longer an issue.