Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch usage of std HashMap/HashSet default hasher, to aHash algo #258

Merged
merged 16 commits into from
Aug 29, 2020
Merged

Switch usage of std HashMap/HashSet default hasher, to aHash algo #258

merged 16 commits into from
Aug 29, 2020

Conversation

RobDavenport
Copy link
Contributor

@RobDavenport RobDavenport commented Aug 20, 2020

First time submitting a PR to OSS!

For #157.

Using the benchmarks seen from https://github.com/tkaitchuck/aHash, I just went ahead and changed all instances of std HashMap and HashSet to use hashbrown's versions. This is a drop-in replacement, since hashbrown default's to ahash, we are able to get the performance benefits without any major refactoring. The aHash algos should be faster for all of the current uses in Bevy.

Some points of interest:

  1. The import of hashbrown for bevy_property crate needed to include feature 'serde' to allow macros to generate. This feature is not included in any of the other crates. Note that I followed the suggestion from @cart on discord to keep the rest of the features the same (ahash, inline-more).
  2. A single instance of std::collections::HashMap is necessary due to rectangle_pack crate requiring a std::HashMap as a function parameter. This can be seen in texture_atlas_builder.rs lines 90 to 103. I wasn't sure of a better way to fix this (other than modifying the dependent crate reactangle_pack to allow a hashbrown hashmap, or writing some kind of conversion function). If there's a better way let me know and I'll look into it!
  3. cargo fmt likes to re-arrange some positions of 'use' blocks, as well as move multi-line blocks into single lines.
  4. I also went ahead and changed the hashing algorithm used in the renderer pipeline. this can be seen from files bind_group.rs as well as bind_groups.rs. This previously was using the default hasher, but this PR changes it to use aHash. aHash dependency was added where necessary.
  5. I made a modification to the bundle macro in crates/bevy_ecs/hecs/macros/src/lib.rs to also use a hashbrown hashset instead of std::collections::HashSet. I'm not sure if this was a good idea or not. Honestly I'm not too familiar with macros in Rust so would appreciate some special attention there. Hopefully I used the #path thing correctly.

I ran a few examples (including the breakout game) and seems to run OK.

@karroffel karroffel added C-Feature A new feature, making something new possible C-Performance A change motivated by improving speed, memory usage or compile times C-Dependencies A change to the crates that Bevy depends on labels Aug 20, 2020
@@ -88,7 +88,7 @@ impl TextureAtlasBuilder {
rect_placements = None;
break;
}
let mut target_bins = HashMap::new();
let mut target_bins = std::collections::HashMap::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looked at this for fun, and the new 0.2 version of rectangle_pack actually uses BTreeMap here to behave more deterministically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find - perhaps a new issue for updating the version of rectangle_pack ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looked at, and using a BTree requires a bunch of Stuff to impl Ord, like Texture, which contains a Vec2 with f32s. Definitely doable, but a bit more involved.

@killercup
Copy link
Contributor

I'm curios if there is any runtime performance difference between using hashbrown and ahash directly with the std HashMap. Since the latter is actually using hashbrown under the hood I'd expect it to perform the same while being faster to compile.

@tbillington
Copy link
Contributor

Might be a good idea to include benchmarks of how this change impacts bevy compile time and bevy runtime. I understand in the linked repo's benchmarks it looks good but I personally think it's important to have benchmarks in context, otherwise we can't really be sure.

@cart
Copy link
Member

cart commented Aug 21, 2020

Love the conversation happening here. Its worth doing some due diligence here to make sure we made the right call.

I gave this a quick run and its already showing a nice performance improvement on the spawner example. Rendering is a bit "overly hash-ey" right now, so thats a good stress test.

Before this change, we get ~16.5 fps. After this change, we get about ~20.1 fps.

@cart
Copy link
Member

cart commented Aug 21, 2020

If someone could run some compile time benchmarks that would be great.

I really like @killercup's idea to try using aHash directly. We should then evaluate compile times and runtimes again.

@RobDavenport
Copy link
Contributor Author

RobDavenport commented Aug 21, 2020

I'm sure there's a better way to test this but here's what I did anyway. Did a full clean-build cycle three times on each of the following:

hashbrown full-build: 2m09s, 2m04s, 2m01s
std::collections::Hash full build (master branch, so a single import of hashbrown for ecs): 2m05s, 2m02s, 2m03s
std::collections /w aHash full build: Will update later...

So yes excluding hash-brown will definitely speed up compile times. I'll go back and redo it to try using the std::collection variants with the ahasher and post results.

Edit updates:
hashbrown full-build: 2m07s, 2m04s, 2m05s
std::collections::Hash full build (master branch, so a single import of hashbrown for ecs): 2m05s 2m04s, 2m03s
std::collections /w aHash full build: 2m04s, 2m03s, 2m07s

Interestingly there doesn't seem to be that much of a difference, we're talking 1-2 seconds between [Everything hashbrown], [everything std + hashbrown ecs] and [everything std with aHasher]. Although I am curious as to why builds randomly take 3-4 seconds longer than the same one. Perhaps someone with a better testing methodology could check it out?

Rough FPS for spawner example in release. I took the first 10 outputs took the average.
hashbrown all the things: 15.635199fps
old master: 14.123489fps
std /w ahash: 16.929153fps

@RobDavenport
Copy link
Contributor Author

RobDavenport commented Aug 21, 2020

From my comment above, I posted my findings for using std::collections hash map + hash set with the aHash hasher instead. Looks like compile times are almost identical. Interestingly, performance seems to be slightly more improved using the std+ahasher route compared to just hashbrown default. Either way, I don't think my testing methods were ideal so hope someone else could chime in! The current PR is now going with the std+ahash route.

@RobDavenport RobDavenport changed the title Switch usage of std HashMap/HashSet to hashbrown, using default aHash algo Switch usage of std HashMap/HashSet default hasher, to aHash algo Aug 21, 2020
@killercup
Copy link
Contributor

killercup commented Aug 21, 2020

tl;dr Sounds like going ahash has no downsides in this case! Nice!

Did a quick compile time check myself and cargo clean && time cargo b -q gave me ~35s in every of my 5 runs -- on both master and this feature branch.

Going the ahash route also makes it possible to interact with (well-written) libraries expecting std HashMaps, btw.

Interestingly, performance seems to be slightly more improved using the std+ahasher route compared to just hashbrown default.

That is surprising me as well! I don't think the std lib version has additional inlining or optimizations applied to it.

@cart
Copy link
Member

cart commented Aug 22, 2020

Thanks @RobDavenport and @killercup for the thoroughness. Sounds like std+aHash is the way to go. @RobDavenport, can you resolve the merge conflicts? Once thats done I'll take one more sanity pass over the changes and I think we'll be good to go.

@RobDavenport
Copy link
Contributor Author

@cart Fixed the merge conflicts, mostly just conflicting cargo.toml files with the new inclusion of parking_lot to many of the crates. I ran a full rebuild + the spawner example and seemed to work well. Should be good to go from now!

@cart
Copy link
Member

cart commented Aug 24, 2020

Alrighty one more thought to simplify this a little bit:

HashMaps are common enough that making them easy to consume / consistent is worth it. I think we should export the following type alias from bevy_core::collections:

pub type HashMap<K, V> = HashMap<K, V, RandomState>;

This removes the need to add aHash dependencies everywhere and simplifies type declarations.

@cart
Copy link
Member

cart commented Aug 24, 2020

(and lets do the same for HashSet)

@RobDavenport
Copy link
Contributor Author

RobDavenport commented Aug 25, 2020

@cart, I tried exposing the HashMap type as a part of bevy_core, but unfortunately I ran into some cyclical dependencies. I'm really not familiar with rust's build system yet so I wasn't able to find a good way to solve this.

I had it setup like so
bevy_core/src/collections.rs

use ahash::RandomState;

pub type HashMap<K, V> = std::collections::HashMap<K, V, RandomState>;
pub type HashSet<K> = std::collections::HashSet<K, RandomState>;

And then modified bevy_core/src/lib.rs to include the collections module, and expose it with pub use.

Caused by:
  The system cannot find the path specified. (os error 3)
[Finished running. Exit status: 0]
[Running 'cargo build']
error: cyclic package dependency: package `bevy_app v0.1.3 (D:\Development\bevy\crates\bevy_app)` depends on itself. Cycle:
package `bevy_app v0.1.3 (D:\Development\bevy\crates\bevy_app)`
    ... which is depended on by `bevy_core v0.1.3 (D:\Development\bevy\crates\bevy_core)`
    ... which is depended on by `bevy_hecs v0.1.3 (D:\Development\bevy\crates\bevy_ecs\hecs)`
    ... which is depended on by `bevy_ecs v0.1.3 (D:\Development\bevy\crates\bevy_ecs)`
    ... which is depended on by `bevy_app v0.1.3 (D:\Development\bevy\crates\bevy_app)`
    ... which is depended on by `bevy v0.1.3 (D:\Development\bevy)`
[Finished running. Exit status: 0]

This was triggered by adding the use bevy_core::collections::HashMap to bevy_ecs/hecs and of course the addition to cargo.toml for it.

@killercup
Copy link
Contributor

Hm, yeah, you'd have to add another crate (bevy_common or something like that) to resolve that cycle. I'm wondering, though: Why not use this type def from the ahash crate?

@cart
Copy link
Member

cart commented Aug 25, 2020

@RobDavenport Arg yeah thats my bad. Sorry!

I like @killercup's idea to use the typedef from the ahash crate. Lets just do that for now.

@RobDavenport
Copy link
Contributor Author

RobDavenport commented Aug 26, 2020

@cart @killercup are you suggesting to change the instances of std::collections::hashmap to use ahash::AHashMap instead? Wont this still cause the problem of having to include ahash dependencies everywhere?

Edit: I went ahead and tried changing it, but ran into another problem, this time it's due to the impl_property! macro which requires Serialize + Deserialize traits for the key + value for AHashMap. I found this issue on the aHash repo tkaitchuck/aHash#42.

@killercup
Copy link
Contributor

@RobDavenport I would try using use ahash::{AHashMap as HashMap, AHashSet as HashSet}; where the the std types are imported now.

@RobDavenport
Copy link
Contributor Author

@killercup did you see my edit? You replied around the same time I posted it

@killercup
Copy link
Contributor

You're right, I didn't see it yet :) Reading the issue, I guess it's a wrapper to be able to overwrite the constructor methods for the hash map. Can you use ahash::RandomState in impl_property! directly? Actually, I'll have a look at this right now :)

@killercup
Copy link
Contributor

Okay, just tried this for a bit and there are a lot of places where it's easy to switch to ahash, but some are more tricky; like macros referring to HashMaps and the whole properties thing.

I am not involved in this project and was mostly just curios. From my experience with Rust projects and I'd recommedn this:

  • First, use ahash in some places, where performance actually matters. This is a small and uncontroverstial PR
  • Then, introduce a bevy_utils (or similar) crate that covers dependencies like ahash, parking_lot, and probably some error handling and logging helpers.

@RobDavenport
Copy link
Contributor Author

@killercup thanks for looking into it. Personally I'd prefer to just put the ahashmaps everywhere just to keep things clean and identical everywhere, but looks like it wont be such an easy task for now.

@cart What do you recommend from here?

@cart
Copy link
Member

cart commented Aug 26, 2020

If its not too much trouble, lets do a new bevy_utils crate and export them as HashMap and HashSet. Short term that means we're basically replacing an ahash dependency with a bevy_utils dependency, but long term that scales better as we encounter more cases like this.

@RobDavenport
Copy link
Contributor Author

@cart I followed your advice and created the utils crate. Please take a look when you have time!

@cart
Copy link
Member

cart commented Aug 29, 2020

Looks good to me! Thanks for dealing with the churn as we landed on the best approach. This is a huge win!

@cart cart merged commit 4aabe98 into bevyengine:master Aug 29, 2020
mrk-its pushed a commit to mrk-its/bevy that referenced this pull request Oct 6, 2020
…vyengine#258)

switch to ahash for HashMaps and HashSets via a new bevy_utils crate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Dependencies A change to the crates that Bevy depends on C-Feature A new feature, making something new possible C-Performance A change motivated by improving speed, memory usage or compile times
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants