Improve reindex performance by skipping resources with no search parameter changes #2155
Comments
Could be combined with a similar schema change needed for #2417.
Need to make sure we consider the impact of global search parameters as well. Also need to consider search parameter disambiguation, because two parameters can have the same code.
Should be done in conjunction with #1751 to ensure good e2e test coverage.
After looking over the code involved, here's an overall design: DB changes:
Class changes:
Logic changes after calling extractSearchParameters(...):
Signed-off-by: Troy Biesterfeld <tbieste@us.ibm.com>
Issue #2155 - Use hash to determine if search parameters need update
When I first ran this I tried with 20 reindex-max-requests (lookup param name) and I hit this:
I then turned it down to 10 and it worked just fine.
Verified that when the search parameters have not changed, the delete/insert of the search parameter rows in the tables is skipped. Closing issue.
The $reindex operation reads every resource, extracts the search parameters, deletes the current search parameters from the database and inserts the new set. This is expensive and unnecessary if the new search parameter set is identical to the search parameters currently stored. However, reading the existing search parameters for comparison would also be expensive and non-trivial. A possible solution is to store a fingerprint (hash) of the parameter set and use this as a comparison. If the stored hash matches the hash computed from the newly extracted parameter set then the delete/insert step can be skipped.
This optimization is also applicable to new versions of a resource.
Care is needed to make the hash deterministic: the parameters must be sorted before the hash is computed, so that the same parameter set always produces the same hash regardless of extraction order.
A cryptographic hash such as SHA-256 is reliable enough for this purpose: the chance of two different parameter sets producing the same hash is negligible.
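As a rough illustration of the sorting requirement, here is a minimal sketch of a deterministic SHA-256 fingerprint. It assumes each extracted parameter has already been rendered to a canonical string; the class name `ParameterHashSketch`, the method `hashOf`, and the `code|type|value` rendering are hypothetical and not taken from the project code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ParameterHashSketch {

    // Assumption: every extracted parameter is pre-rendered to a canonical
    // string such as "code|type|value" before it reaches this method.
    static String hashOf(List<String> canonicalParams) {
        // Sort a copy so the result does not depend on extraction order.
        List<String> sorted = new ArrayList<>(canonicalParams);
        Collections.sort(sorted);
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String param : sorted) {
                byte[] bytes = param.getBytes(StandardCharsets.UTF_8);
                // Length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            // Base64 gives a compact value that fits a character column.
            return Base64.getEncoder().encodeToString(digest.digest());
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }
}
```

Sorting a copy leaves the caller's list untouched, and the length prefix is one simple way to avoid ambiguity when values are concatenated; a real implementation would also need to decide how global parameters and parameters sharing a code are rendered.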
A new column is required on the xx_logical_resources table. No data migration is needed because a default value of null is suitable. The next time a reindex is performed, the hash will be computed and stored. Future reindex operations will then benefit. New resources will be stored with the hash and therefore immediately benefit.
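The skip decision, including the null default for rows that have never been hashed, might look roughly like the sketch below. `ReindexSkipSketch`, `LogicalResourceRow`, and the `paramWrites` counter are hypothetical stand-ins for the real persistence code, with the counter replacing the actual delete/insert of parameter rows.

```java
// Hypothetical model of the skip logic, for illustration only.
final class ReindexSkipSketch {

    // Stand-in for one row of an xx_logical_resources table.
    static final class LogicalResourceRow {
        String parameterHash;   // null until a hash is first stored
        int paramWrites = 0;    // counts simulated delete/insert cycles
    }

    // Returns true if the parameters were rewritten, false if skipped.
    static boolean reindex(LogicalResourceRow row, String newHash) {
        // A null stored hash never equals newHash, so rows that have not
        // been hashed yet always take the update path once.
        if (newHash.equals(row.parameterHash)) {
            return false; // parameter set unchanged: skip delete/insert
        }
        row.paramWrites++;            // stands in for delete + insert
        row.parameterHash = newHash;  // store the hash for the next run
        return true;
    }
}
```

In this sketch the first reindex of an existing row always writes and stores the hash, and later runs with an unchanged parameter set return early, which mirrors why no data migration is needed.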