Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression on LatLonPoint#newPolygonQuery #11824

Closed
iverase opened this issue Sep 27, 2022 · 5 comments
Closed

Performance regression on LatLonPoint#newPolygonQuery #11824

iverase opened this issue Sep 27, 2022 · 5 comments
Labels
Milestone

Comments

@iverase
Copy link
Contributor

iverase commented Sep 27, 2022

Description

I just notice a big performance regression on polygon queries using LatLonPoint field in lucene geo benchmarks:

image

I checked and the regression was introduced by this change: #1017.

My suspicion is that before this change, SpatialQuery was calling the method #getSpatialVisitor() once for the whole index but in the new version is calling it once per segment. This method might be expensive for LatLonPoint queries, threfore the regression.

@nknize FYI

Version and environment details

No response

@iverase
Copy link
Contributor Author

iverase commented Sep 27, 2022

close in #11825

@iverase iverase closed this as completed Sep 27, 2022
@iverase iverase added this to the 9.4.0 milestone Sep 27, 2022
@nknize
Copy link
Member

nknize commented Sep 27, 2022

Just seeing this. That's exactly what it would be! Snuck in one of those last commits on the long running PR. Thanks for refactoring and merging @iverase!

@iverase
Copy link
Contributor Author

iverase commented Sep 28, 2022

Fix seems to bring performance back to previous levels:

image

@mikemccand
Copy link
Member

Thanks for catching this @iverase and the quick fix, and the follow-on issue to better detect such regressions before release: #11827

@nknize
Copy link
Member

nknize commented Sep 28, 2022

What's annoying is how incredibly trappy this override logic is. That a method call literally moving from createWeight to getScorerSupplier results in a 72.2% regression even slipped by me before merge doesn't bode well for new committers. And then it sat in regression until an entire company was interested in releasing.

I wonder if we can do better? Like maybe figure out better guardrails in these methods? Perhaps by something as simple as a rename (e.g., getScorerSupplierPerSegment) to signal one happens per segment? This isn't the first and will certainly not be the last time an expensive operation accidentally slips to a critical path. Any other ideas how to lower the bar here for new committers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants