Optimized HashMap for size. Added DefaultResizePolicy #14526
deallocate(self.hashes as *mut u8, size, align);
// Remember how everything was allocated out of one buffer
// during initialization? We only need one call to free here.
if self.hashes.is_not_null() {
This should zero self.hashes after deallocation etc. just to be safe.
This is a fairly large change to a core type in Rust, due to the addition of an extra type parameter everywhere, so I would imagine that it likely warrants an RFC. If the resizing policy were an implementation detail of the hash map, I would be more comfortable with it.
What is |
deallocate(self.hashes as *mut u8, size, align);
// Remember how everything was allocated out of one buffer
// during initialization? We only need one call to free here.
self.hashes = RawPtr::null();
This isn't necessary, dropping will always zero the destination.
@eddyb drop zeroing is not to be relied upon; it's going away.
@cmr and when it does, destructors will be run precisely once. (Meaning this is ok.)
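For readers following the thread above, here is a minimal sketch of the pattern under discussion: free the single buffer once, then null the pointer so a second call is harmless. It is written in present-day Rust rather than the 2014 dialect in the diff, and `RawBuf`, `release`, and `is_allocated` are hypothetical names, not anything from the PR.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

/// Hypothetical stand-in for the table's single allocation (not the PR's type).
pub struct RawBuf {
    hashes: *mut u64,
    capacity: usize,
}

impl RawBuf {
    /// Allocates room for `capacity` hashes; a zero capacity allocates nothing.
    pub fn new(capacity: usize) -> RawBuf {
        if capacity == 0 {
            return RawBuf { hashes: ptr::null_mut(), capacity: 0 };
        }
        let layout = Layout::array::<u64>(capacity).expect("capacity overflow");
        // SAFETY: the layout has a non-zero size because capacity > 0.
        let p = unsafe { alloc(layout) } as *mut u64;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        RawBuf { hashes: p, capacity }
    }

    pub fn is_allocated(&self) -> bool {
        !self.hashes.is_null()
    }

    /// Frees the buffer and nulls the pointer, so repeated calls are no-ops.
    pub fn release(&mut self) {
        if self.hashes.is_null() {
            return;
        }
        let layout = Layout::array::<u64>(self.capacity).expect("capacity overflow");
        // SAFETY: `hashes` was allocated in `new` with this same layout.
        unsafe { dealloc(self.hashes as *mut u8, layout) };
        self.hashes = ptr::null_mut();
        self.capacity = 0;
    }
}

impl Drop for RawBuf {
    // Runs exactly once per value in current Rust; the null check in `release`
    // only matters if `release` was already called by hand.
    fn drop(&mut self) {
        self.release();
    }
}
```

With destructors guaranteed to run once, the explicit null assignment is only needed when something like `release` can be called before the value is dropped.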
We were wasting space storing constants and caching relatively-easy-to-perform computations, while |
There are some changes and optimizations unrelated to To begin with, An extra type parameter isn't added exactly "everywhere" thanks to default type parameters. The proposed implementation has flaws:
We can keep using what the author calls a "hackish fraction type". The disadvantages of the current approach are marginal for librustc, I suppose, since hashmaps are often kept on the stack. But fine-grained control should be desirable for Servo. According to @cgaebel, two load factors are particularly useful: approximately 92% and 84%. The latter would make the probing read a single cache line most of the time, at the expense of higher memory usage. Also, we can make it an internal mechanism without the use of type parameters for now.
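To make the two load factors concrete, here is a rough sketch of a fraction-based load factor in present-day Rust. The type `LoadFactor`, the constant names, and the exact fractions 23/25 and 21/25 are assumptions chosen to match the quoted 92% and 84%; the PR's actual representation may differ.

```rust
/// Hypothetical integer-fraction load factor (illustrative, not the PR's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct LoadFactor {
    num: usize,
    den: usize,
}

impl LoadFactor {
    /// Roughly 92%: denser table, less memory (assumed fraction 23/25).
    pub const DENSE: LoadFactor = LoadFactor { num: 23, den: 25 };
    /// Roughly 84%: shorter probe sequences, more memory (assumed fraction 21/25).
    pub const CACHE_FRIENDLY: LoadFactor = LoadFactor { num: 21, den: 25 };

    /// Number of elements a table with `capacity` buckets may hold before it grows.
    /// Note: `capacity * num` can overflow for very large capacities; ignored here.
    pub fn grow_at(self, capacity: usize) -> usize {
        capacity * self.num / self.den
    }

    /// Smallest power-of-two capacity that holds `len` elements under this load factor.
    pub fn capacity_for(self, len: usize) -> usize {
        let mut cap = 1usize;
        while self.grow_at(cap) < len {
            cap *= 2;
        }
        cap
    }
}
```

Under these assumed fractions, 900 elements fit in 1024 buckets at the denser setting but need 2048 buckets at the lower one, which is the memory cost mentioned above.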
Overall, I like this refactoring and think it cleans up a few subtle sharp corners in Oh, and I find it really cute that a hashmap is now exactly one cache line in size. :) |
Do you have any concrete numbers for where this improves existing applications? The reduction of size of a HashMap to just 64 bytes seems appealing, but it would be useful to see how beneficial this is to applications such as I'm not personally worried about consumers of the I'm also curious about @brson's questions, "why is it needed, and what are the downsides if we don't have it?". I'm not sure that the answer for why is this needed that 1% of use cases may use this would justify it. Extending this would add a resize policy type parameter to almost all collections: |
Force pushed minor changes. No, it makes no measurable difference in microbenchmarks*. And I believe that By the way, it's very difficult to squeeze out further two more pointers without a significant loss of performance. But it might be possible. @alexcrichton, this is concerning. The first time I had a closer look at Fortunately,
Would it be possible to split this PR into two parts? The only controversial part of it seems to be the |
For now, I'll make Also, I realized that all collections will get yet another default type parameter once the "Allocator trait" RFC gets accepted. Which one should come first? I'm convinced this warrants an RFC. Keyword arguments would make changes like these much less painful. Hopefully the use of type parameters will improve as well, for instance by bringing back impls on typedefs (#6087, #9767). |
This looks like it's just an internal improvement to |
Refactored the load factor and the minimum capacity out of HashMap. The size of HashMap<K, V> is now 64 bytes by default on a 64-bit platform (or 48 bytes, that is 2 words less, with FNV). Added documentation in a few places to clarify the behavior.
Feel free to ping the PR whenever you update it, sadly github doesn't send out any notifications about a force-push :( |
An interface that gives better control over the load factor and the minimum capacity for HashMap.

The size of `HashMap<K, V>` is now 64 bytes by default on a 64-bit platform (or at least 40 bytes, that is 3 words less, with FNV and without minimum capacity).

Unanswered questions about `ResizePolicy`:
* should it control the `INITIAL_CAPACITY`?
* should it fully control the resizing behavior? Even though the capacity always changes by a factor of 2.
* is caching `grow_at` desirable?

Special thanks to @eddyb and @pnkfelix
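As a rough illustration of the interface described in this commit message, the sketch below shows a policy trait plugged in through a default type parameter, written in present-day Rust. Only the names `ResizePolicy` and `DefaultResizePolicy` come from the PR; the method names, the `Table` type, the minimum capacity of 32, and the 10/11 load factor are assumptions made for the example.

```rust
/// Sketch of a resize policy; the method set here is assumed, not the PR's.
pub trait ResizePolicy {
    /// Smallest capacity the table is ever created with.
    fn min_capacity(&self) -> usize;
    /// Number of elements allowed at `capacity` before the table grows.
    fn usable_capacity(&self, capacity: usize) -> usize;
}

/// Zero-sized default policy, so it adds nothing to the table's size.
#[derive(Clone, Copy, Default)]
pub struct DefaultResizePolicy;

impl ResizePolicy for DefaultResizePolicy {
    fn min_capacity(&self) -> usize {
        32 // assumed value for illustration
    }
    fn usable_capacity(&self, capacity: usize) -> usize {
        capacity * 10 / 11 // assumed load factor for illustration
    }
}

/// Bookkeeping-only stand-in for a hash table (it stores no entries).
/// The default type parameter means callers can write plain `Table`.
pub struct Table<P: ResizePolicy = DefaultResizePolicy> {
    len: usize,
    capacity: usize,
    policy: P,
}

impl<P: ResizePolicy> Table<P> {
    pub fn with_policy(policy: P) -> Self {
        let capacity = policy.min_capacity().next_power_of_two();
        Table { len: 0, capacity, policy }
    }

    pub fn capacity(&self) -> usize {
        self.capacity
    }

    /// Records one insertion, doubling the capacity when the threshold is passed.
    /// The threshold is recomputed on demand here instead of cached in a field.
    pub fn note_insert(&mut self) {
        if self.len + 1 > self.policy.usable_capacity(self.capacity) {
            self.capacity *= 2;
        }
        self.len += 1;
    }
}
```

Recomputing the threshold costs a multiply and a divide per insertion but saves a word in the struct; caching it in a field makes the opposite trade, which is the open `grow_at` question.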