Unnecessary float[](BM25Scorer) allocations for non-scoring queries #12297
Comments
I also came across this discussion, but could not understand why this is not considered an issue. It seems like a regression to me, at least for a specific customer use case. I also tried a quick hack patch (always returning the non-scoring similarity) just to confirm, and the float[] did disappear from the histogram / heap dump.
@jpountz - Can you check and confirm?
I agree with Robert that 1kB per segment doesn't sound like a crazy amount of allocations, which suggests that you are searching many segments. Does the memory allocation profile look much better after your change? If yes, I wouldn't be against a small isolated change that avoids these allocations. It looks like one way to do it would be to update the
I agree that per segment it isn't huge, but those allocations add up to a couple of GB (they dominated the heap in the image above). I also noticed that the float[] did not show up in the allocation profile after applying the above patch.
I was thinking along similar lines. @sgup432 will be raising a PR to make this change.
@jpountz I have opened a PR for this, can you check? As discussed, I have added a dummy scorer in TermsWeight for the case where scores are not needed.
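To make the "small per segment, large in total" point concrete, here is a rough back-of-the-envelope sketch. It assumes each allocation is a 256-entry float[] (256 × 4 bytes, which matches the 1kB figure mentioned above) and that one such array is allocated per query per segment. The class name, the query count and the segment count are hypothetical and chosen only for illustration; they are not measurements from this issue.

```java
public class AllocationEstimate {
    // Payload bytes of one float[] of the given length (array header ignored).
    static long floatArrayBytes(int length) {
        return (long) length * Float.BYTES;
    }

    // Total bytes allocated if every (query, segment) pair allocates one array.
    // Assumption: exactly one array per query per segment.
    static long totalBytes(long queries, long segmentsPerQuery, int arrayLength) {
        return queries * segmentsPerQuery * floatArrayBytes(arrayLength);
    }

    public static void main(String[] args) {
        int arrayLength = 256;        // assumed cache size, 1 kB of floats
        long queries = 10_000;        // hypothetical number of queries
        long segmentsPerQuery = 200;  // hypothetical number of segments searched

        long total = totalBytes(queries, segmentsPerQuery, arrayLength);
        System.out.println("Bytes per array: " + floatArrayBytes(arrayLength));
        System.out.println("Total bytes allocated: " + total);
    }
}
```

With these made-up numbers the total comes to roughly 2 GB of short-lived garbage, which is the kind of volume that shows up as GC pressure even though each individual array is tiny.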
+1 to not change the signature of
This has been addressed via #12383
Description
While looking into a customer issue, I noticed an increase in GC time from Lucene 7.x to 8.x. In the JVM histograms, one of the primary differences was float[] allocation. I took a heap dump to check the dominator, and the allocations were coming from BM25Scorer.
The change seems to have come in with 8fd7ead, which removed some of the special-case logic around the "non-scoring similarity" embedded in IndexSearcher (returned in the false case from the old IndexSearcher#scorer(boolean needsScores)).
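The idea behind the removed special case can be sketched in a few lines. The following is a self-contained illustration only, not Lucene code: `Mode`, `Scorer`, `CachingScorer`, `NON_SCORING` and `scorerFor` are hypothetical stand-ins, and the cache formula is a simplified BM25-style length normalization. It shows how picking a shared no-op scorer when scores are not needed avoids building a per-scorer float[] cache at all.

```java
public class NonScoringSketch {
    // Hypothetical stand-in for a score mode; not Lucene's ScoreMode.
    enum Mode {
        COMPLETE(true),
        COMPLETE_NO_SCORES(false);

        final boolean needsScores;

        Mode(boolean needsScores) {
            this.needsScores = needsScores;
        }
    }

    // Hypothetical minimal scorer interface.
    interface Scorer {
        float score(float freq, int normByte);
    }

    // Scorer that precomputes a 256-entry cache (1 kB of floats) on construction.
    static final class CachingScorer implements Scorer {
        final float[] cache = new float[256];

        CachingScorer(float k1, float b, float avgLength) {
            for (int i = 0; i < cache.length; i++) {
                float length = i + 1; // simplification: norm byte treated as a length
                cache[i] = k1 * (1 - b + b * length / avgLength);
            }
        }

        @Override
        public float score(float freq, int normByte) {
            return freq / (freq + cache[normByte & 0xFF]);
        }
    }

    // Shared scorer for queries that never read scores; allocates nothing per query.
    static final Scorer NON_SCORING = (freq, normByte) -> 0f;

    // Only build the caching scorer when scores are actually needed.
    static Scorer scorerFor(Mode mode) {
        return mode.needsScores ? new CachingScorer(1.2f, 0.75f, 100f) : NON_SCORING;
    }

    public static void main(String[] args) {
        Scorer scoring = scorerFor(Mode.COMPLETE);
        Scorer nonScoring = scorerFor(Mode.COMPLETE_NO_SCORES);
        System.out.println("Scoring query allocates a cache: " + (scoring instanceof CachingScorer));
        System.out.println("Non-scoring query reuses the shared scorer: " + (nonScoring == NON_SCORING));
    }
}
```

The design choice is simply to make the decision where the scorer is created, so callers that only need matching documents never pay for the cache.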
I also validated that the scoring mode for these queries is COMPLETE_NO_SCORES, which has needsScores set to false.
Version and environment details
Using Lucene 8.10.1, though the issue is present starting with 8.x and continues into 9.x as well.
Screenshot