-
Hi, I'm trying to take the results I get from a retriever and filter them by score, that is, keep only the results whose score is above a certain threshold. However, I'm not sure what the right threshold should be, since the scores seem to vary depending on the query and the corpus. Does anyone have experience doing this?

EDIT: Just to add on, I'm considering normalizing the scores, but there are many different normalization techniques, and I don't know which would be best in this case.
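To make the normalization question concrete, here is a minimal sketch of two common options, min-max scaling and z-scores, applied to the scores of a single query. The function names (`min_max`, `z_scores`, `keep_above`) and the threshold in the usage lines are hypothetical and chosen only for illustration; they do not come from any particular retriever library.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Rescale scores to [0, 1] within one query's result list."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: nothing to separate, return zeros.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_scores(scores):
    """Express each score as standard deviations from the list mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def keep_above(normalized, threshold):
    """Return the indices whose normalized score meets the threshold."""
    return [i for i, n in enumerate(normalized) if n >= threshold]


# Illustrative usage with made-up scores and an arbitrary threshold.
example = [2.0, 1.0, 0.0]
print(min_max(example))                   # [1.0, 0.5, 0.0]
print(keep_above(min_max(example), 0.6))  # [0]
```

One caveat with per-query min-max scaling is that the top result always maps to 1.0, so it cannot tell you that none of the results are relevant. Whether that matters depends on the use case.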
Replies: 3 comments
-
Don't have personal experience, but curious to hear what others think!
-
To add to what I said before: right now I'm getting the following scores for one query and corpus: [1.9009349, 0.9240056, 0., 0., 0.]. However, only the document that corresponds to the first score is semantically relevant to my query, and I'd like to filter out the second document if possible. Here is the code snippet that I'm referring to
-
I managed to figure out this specific case by adding to the stopwords that are used. Not sure if this will work for other test cases, but I'll close this for now.
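For test cases where adjusting the stopwords does not help, one possible alternative is a cutoff relative to the top score instead of an absolute threshold. The sketch below applies that idea to the scores quoted earlier in this thread. The helper name `filter_relative_to_top` and the default `min_ratio=0.5` are assumptions made for illustration, not part of any retriever's API, and the ratio would need tuning per corpus.

```python
def filter_relative_to_top(scores, min_ratio=0.5):
    """Keep indices whose score is at least `min_ratio` of the best score.

    Hypothetical helper: `min_ratio` is an arbitrary illustrative default.
    """
    top = max(scores, default=0.0)
    if top <= 0:
        # No positive score at all: treat every result as irrelevant.
        return []
    return [i for i, s in enumerate(scores) if s / top >= min_ratio]


# Scores quoted earlier in the thread.
scores = [1.9009349, 0.9240056, 0.0, 0.0, 0.0]
print(filter_relative_to_top(scores))  # [0]
```

With these numbers the second score is about 0.49 of the top score, so a ratio of 0.5 drops it, but only narrowly. A different query could easily land on the other side of the cutoff.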