GPKG status performance regression fix #398
Merged
Description
Workaround for a GPKG performance regression introduced in b20eec4.
For a 29K-feature dataset with 21K changes in the working copy, sno status went from ~7s to ~3m20s. The key driver is this query (~200ms):
Becoming ~200s:
I think the speedup is only possible via this method because SQLite is basically untyped. As expected, removing the cast altogether fails tests on MSSQL and PostGIS, but I suspect they're impacted by the same slowdown too. We really need a better solution, so that we can at least always make use of the dataset PK index. The minimum might be casting the other way (i.e. tracking table → dataset PK type), on the logic that there are likely to be fewer changes in the tracking table than there are rows in the dataset table.
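To illustrate why the direction of the cast matters, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query from this PR: the `dataset` and `tracking` tables, their columns, and the row counts are hypothetical stand-ins. It compares the query plan when the cast wraps the dataset PK against the plan when the cast wraps the tracking-table value instead.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Hypothetical schema: an integer-keyed dataset table and a tracking
# table that stores the changed PKs as text.
conn.executescript(
    """
    CREATE TABLE dataset (fid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tracking (table_name TEXT NOT NULL, pk TEXT);
    """
)
conn.executemany(
    "INSERT INTO dataset (fid, name) VALUES (?, ?)",
    [(i, f"feature {i}") for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO tracking (table_name, pk) VALUES ('dataset', ?)",
    [(str(i),) for i in range(1, 1001, 10)],
)

# Cast on the dataset side: the PK is wrapped in an expression, so the
# PK index cannot be used to look up the matching dataset row.
CAST_DATASET_SIDE = """
    SELECT t.pk, d.fid FROM tracking AS t
    LEFT JOIN dataset AS d ON CAST(d.fid AS TEXT) = t.pk
"""

# Cast on the tracking side: the dataset PK is compared as a bare
# column, so each tracking row can be resolved by a PK lookup.
CAST_TRACKING_SIDE = """
    SELECT t.pk, d.fid FROM tracking AS t
    LEFT JOIN dataset AS d ON d.fid = CAST(t.pk AS INTEGER)
"""

plan_dataset_cast = plan_details(conn, CAST_DATASET_SIDE)
plan_tracking_cast = plan_details(conn, CAST_TRACKING_SIDE)
rows_dataset_cast = sorted(conn.execute(CAST_DATASET_SIDE).fetchall())
rows_tracking_cast = sorted(conn.execute(CAST_TRACKING_SIDE).fetchall())

print("cast on dataset PK: ", plan_dataset_cast)
print("cast on tracking pk:", plan_tracking_cast)
```

Both queries return the same rows here, but only the second plan should mention the dataset's primary key. The exact plan wording varies between SQLite versions, and whether MSSQL and PostGIS behave the same way would need to be checked separately.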
While I was profiling, I also disabled an extra hash verification that was being done on each object lookup.
Related links:
Checklist:
Have you included test(s)? Covered by existing tests.