-
Notifications
You must be signed in to change notification settings - Fork 637
fix: Use map for efficient array reconiliation #2280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: Use map for efficient array reconiliation #2280
Conversation
|
Thanks @yossivainshtein - this is great. I will take a look once you mark it as ready for review. |
thegedge
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like @coolsoftwaretyler mentioned, we can give a full review when you're ready, @yossivainshtein.
Thanks for your contribution! I've dropped in some thoughts that may help improve your PR :)
| // Creates a map of node by identifier value. | ||
| // In theory, several nodes can have the same identifier, if the array contains different types, so every identifier is mapped to an array of nodes with the same values. | ||
| // In practice this in probably a rare case, so we can live with the performance hit. | ||
| // | ||
| // If not all nodes have identifier, we can't use the map for lookups, so we return null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love that you've thought through this!
I'd highly recommend capturing various edge cases in test (if we haven't already done so). Some cases that come to mind would be tests for arrays of:
- all scalars,
modeltypes without identifiers,modeltypes with identifiers,- union of two distinct
modeltypes with and without identifiers - union of scalar and
modeltype (I believe MST supports this, but double check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, of course these should be tested. I'll try to add those soon
|
|
||
| function buildObjectByIdMap(nodes: AnyNode[]): [Set<string>, Map<string, Array<AnyNode>> | null] { | ||
| // Creates a map of node by identifier value. | ||
| // In theory, several nodes can have the same identifier, if the array contains different types, so every identifier is mapped to an array of nodes with the same values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line of your comment is an interesting one. Basically, it states that we still have the same worst case complexity.
That being said, in those scenarios we're likely to fall back to the old code path. In the case where we have an array with models of the same type containing identifiers, we get a nice perf boost 🚀
(nothing to action here, I just like talking things through)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I'm trying to understand when it's OK to use the map and when it isn't...
|
Thanks for the comments @thegedge , @coolsoftwaretyler ! I obviously should add some tests for various edge cases, I hope to get to it soon. I made it draft because I wanted to make sure first that this direction which I took makes sense. Perhaps I should have opened a discussion (not sure how to do that). Another question - My test is more of a performance test, not really a UT. I saw that there's a separate file for performance tests but i couldn't get it to work... do they work? |
|
Hey @yossivainshtein - you can always open discussions at https://github.com/mobxjs/mobx-state-tree/discussions, but I like collaborating in a PR anyways. I'm not sure if the perf tests work. They've been in the codebase a long time. I wouldn't use them as a bar to clear for a PR at the moment, although for this change it makes sense to:
|
What does this PR do and why?
This PR attempts to fix this performance issue when replacing large array.
When replacing an array which contains object with an identifier attribute, it's possible to lookup the object for reconciliation by using a map lookup instead of linear scan of the entire array.
Steps to validate locally
I added a new test case named
it should reconciliate large array instances efficiently.This test takes a long time before the fix and its quick after.