HashMap performance fix. #394
… into hashmap-improvements
Is this backward-compatible with the stable vars that are already externalised? @matthewhammer
No stable vars will use this type. Now that you mention it, I think this PR could be considered "improvements only" and should not break anything running today. (Since the class API is not changing, any code using the class remains fine too.)
I'd just use a more efficient type for the internal key type definition.
@crusso PTAL.
LGTM!
@matthewhammer @crusso The interface is compatible, but what about the data underneath the change? I'm thinking specifically of users who use the system pre/post upgrade functions to serialize/deserialize their HashMap to and from stable memory during upgrades. Without a specific My naive understanding.
Usually for breaking changes, libraries provide migration notes, so it might be nice for the documentation to include how one can upgrade from the old version of HashMap to the new version (i.e. include the recommended code for the
Avoids rehashing when the hash table grows. Fixes one known performance issue.
Background
There are several performance issues with the `HashMap` that Motoko devs experience when they insert many keys and the table's underlying array re-grows.
Like a usual hash table, each growth step adds an `O(n)` overhead to what would otherwise be an `O(1)` insertion operation. But what's worse here, the current `HashMap` (before this PR) actually re-runs the hash function on every key. For `Text` keys with even moderate sizes, this has been shown to be prohibitive even at moderate map sizes.
The issue is critical: it manifests as a failure to insert more keys, because regrowth exceeds the current message cycle limit. Even with this limit relaxed or removed in the future, it would be ideal to avoid this rehashing step. This PR does that.
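For readers who want to see the idea in isolation, here is a minimal sketch of one common way to avoid rehashing on growth: store each key's hash next to the key, and reuse that stored hash when entries move to the larger array. It is written in Python purely for illustration, not in Motoko, and it is not the code in this PR. The class `CachedHashMap`, its methods, and the grow-at-load-factor-one policy are all hypothetical, and whether the PR caches hashes in exactly this way is an assumption.

```python
class CachedHashMap:
    """Chained hash table that keeps each key's hash beside the key.

    Illustrative only: names and growth policy are assumptions,
    not the Motoko base library's implementation.
    """

    def __init__(self, hash_fn, capacity=4):
        self._hash_fn = hash_fn  # caller-supplied hash function
        self._buckets = [[] for _ in range(capacity)]
        self._count = 0

    def put(self, key, value):
        h = self._hash_fn(key)  # the hash function runs once per put
        bucket = self._buckets[h % len(self._buckets)]
        for i, (entry_hash, entry_key, _) in enumerate(bucket):
            if entry_hash == h and entry_key == key:
                bucket[i] = (h, key, value)  # overwrite an existing key
                return
        if self._count >= len(self._buckets):
            self._grow()
            bucket = self._buckets[h % len(self._buckets)]
        bucket.append((h, key, value))
        self._count += 1

    def get(self, key):
        h = self._hash_fn(key)
        for entry_hash, entry_key, entry_value in self._buckets[h % len(self._buckets)]:
            if entry_hash == h and entry_key == key:
                return entry_value
        return None

    def _grow(self):
        # Growth still moves all n entries, but it reuses the stored
        # hashes instead of calling hash_fn on every key again.
        new_buckets = [[] for _ in range(2 * len(self._buckets))]
        for bucket in self._buckets:
            for entry in bucket:
                new_buckets[entry[0] % len(new_buckets)].append(entry)
        self._buckets = new_buckets
```

With this layout, growth remains an `O(n)` step, but its cost no longer depends on key length, which is the part that hurts for long text keys. Wrapping the hash function in a call counter and inserting many keys is a simple way to check that growth performs no extra hash calls.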