Optimize plain "not" queries #124
Coverage tests are red on master too: https://coveralls.io/github/zopefoundation/Products.ZCatalog
See also plone/plone.restapi#1252
I trust your changes are OK, but don't have any ZCatalog Fu.
* Further optimize excluding results in not queries

  Usually the number of parameters that have to be excluded in the not query is much lower than the number of values in the index, so it makes sense to actually try to pop them out of the list.
* duration1 appears to be lower after the optimizations
This branch now includes #125.
LGTM
We see that it is only possible to squash and merge the commits.
@ale-rt Yes, this is intended, to keep the commit history straight.
In Plone 6 we run queries that return every result except one item (the current context). A `not` query on the UID catalog is used to do this. In larger projects with several tens or hundreds of thousands of cataloged objects, this slowed queries down to several seconds. Without the `not`, everything was fast.

After some profiling we found the time is wasted in this loop:
Products.ZCatalog/src/Products/PluginIndexes/unindex.py
Lines 575 to 664 in 7f614a6
What happens?

The `not` query is written to return all values except the ones matching. Thus, for a single UID, this creates a result containing all index keys except one. This is very expensive.

What I have done:

If the code detects a simple `not` query (without any operators) that only excludes one or more keys, I shortcut the whole `index_query` and return the previous result without the current catalog id.

Does it help?
Yes. We benchmarked our query on a customer database with ~367000 catalogued objects. It cut the whole request time from ~1200 ms to ~80 ms (with 40 ms of it being non-query time in both cases).
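
To illustrate why the shortcut helps, here is a minimal sketch of the two strategies described above. It is not the code from `unindex.py`: it uses plain Python dicts and sets instead of BTrees, and the names `not_query_naive` and `not_query_shortcut` are hypothetical. It assumes every document has exactly one key in the index, as in a UID index.

```python
def not_query_naive(index, excluded_keys):
    """Union the doc ids of every index key that is not excluded.

    Work grows with the number of keys in the index, even when only
    a single key is excluded.
    """
    result = set()
    for key, doc_ids in index.items():
        if key not in excluded_keys:
            result |= doc_ids
    return result


def not_query_shortcut(index, excluded_keys, previous_result):
    """Drop the doc ids of the excluded keys from the previous result.

    Only the excluded keys are looked up, so the lookup work grows with
    the number of excluded keys rather than with the index size.
    """
    to_remove = set()
    for key in excluded_keys:
        to_remove |= index.get(key, set())
    return previous_result - to_remove


# Toy UID-like index (assumption: one doc id per key).
index = {f"uid-{i}": {i} for i in range(5)}
previous_result = set(range(5))

print(not_query_naive(index, {"uid-2"}))
print(not_query_shortcut(index, {"uid-2"}, previous_result))
```

Both calls exclude document 2 and keep the rest. For indexes where one document can appear under several keys, the two functions are not guaranteed to agree, which is one reason to restrict such a shortcut to simple `not` queries.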
cc @tisto @mauritsvanrees @cekk