-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster diffs: Don't always run pygit2.Tree.diff_tree #962
Conversation
Don't run it when the user requests the diff of a particular feature - we can look up the feature for both revs quicker than pygit2 can find all diffs in the entire revision.
5c2a536
to
6bd4dc5
Compare
kart/dataset_mixins.py
Outdated
@@ -160,6 +166,66 @@ def get_decoder(dataset, method_name): | |||
_INSERT_UPDATE = _INSERT_TYPES + _UPDATE_TYPES | |||
_UPDATE_DELETE = _UPDATE_TYPES + _DELETE_TYPES | |||
|
|||
def get_raw_deltas_for_keys( | |||
self, other, key_encoder_method, key_filter, reverse=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type hints on args wouldn't go amiss, the names don't make them very obvious
(we should probably be moving towards type hints everywhere)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - added type hints to this file
tests/test_diff.py
Outdated
): | ||
def _get_raw_diff_for_subtree(self, *args, **kwargs): | ||
# When only nz_pa_points_topo_150k:1182 is requested, a different more efficient code-path should be used. | ||
raise AssertionError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise AssertionError( | |
pytest.fail( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Don't run it when the user requests the diff of a particular feature
There will be an extremely great speedup in the extreme case where the user wants to see a single feature that has changed, across a revision-range where a million features have changed.
If the user requests enough features, eventually there will be a crossover where this code path is slower. Eg if the user requests to see more features than have actually changed. However, mostly the size of the list of features requested will be much smaller than the size of the layer / size of the changes.
Checklist: