Optimize "duplicate removal" pass #3

Open
zvxryb opened this issue Feb 21, 2019 · 2 comments
zvxryb (Owner) commented Feb 21, 2019

The collision detection process often produces duplicate values; this is expected because each object can produce up to four (2D) or eight (3D) distinct indices, and each one must be tested independently. When two objects collide and at least one has multiple indices, the same potential collision may be emitted more than once.

To avoid unexpected results from the user's perspective (re-running collision handlers unnecessarily), duplicates are removed from the results. This is currently implemented by inserting results into a HashSet before returning them from detect_collisions.
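
A minimal sketch of that pass, assuming a hypothetical CollisionPair type and candidate iterator (not the crate's actual API):

```rust
use std::collections::HashSet;

// Hypothetical stand-in for whatever detect_collisions yields; it only needs
// Eq + Hash for the HashSet-based pass to work.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct CollisionPair(u32, u32);

// Deduplicate candidate pairs by collecting them into a HashSet; insertion
// order is lost and every insert pays a hashing cost.
fn dedup_via_hashset(candidates: impl IntoIterator<Item = CollisionPair>) -> Vec<CollisionPair> {
    let unique: HashSet<CollisionPair> = candidates.into_iter().collect();
    unique.into_iter().collect()
}
```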

This process is the current limiting factor for performance, taking significantly more time than either index calculation or actual collision detection for the example application.

Alternatives already tested:

  • Using a Vec and calling sort_unstable() followed by dedup(): this is the slowest option.
  • Using a Vec and calling rayon's par_sort_unstable() followed by dedup(): almost as fast as the HashSet, but it uses more total CPU time.
  • Different hashers: I've tested the default std hasher as well as hashers from fnv, rustc_hash, and murmurhash64. Of these, rustc_hash is the fastest (and the current default), followed by fnv, with std being the slowest. (A sketch of the hasher swap follows this list.)
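
Swapping the hasher doesn't change the structure of the pass, only the hashing algorithm. A minimal sketch, again assuming a hypothetical CollisionPair type, using the type aliases exported by rustc_hash and fnv:

```rust
use fnv::FnvHashSet;       // fnv = "1"
use rustc_hash::FxHashSet; // rustc_hash = "1"

// Hypothetical pair type, as in the earlier sketch.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct CollisionPair(u32, u32);

// Same HashSet-based pass, but using rustc_hash's FxHash (fastest in the tests above).
fn dedup_fx(candidates: impl IntoIterator<Item = CollisionPair>) -> FxHashSet<CollisionPair> {
    candidates.into_iter().collect()
}

// ...and using FNV, which came in second.
fn dedup_fnv(candidates: impl IntoIterator<Item = CollisionPair>) -> FnvHashSet<CollisionPair> {
    candidates.into_iter().collect()
}
```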

Either continue optimizing this pass, or provide an option for the user to obtain the potential-collision results with duplicates included (so they can implement an application-specific solution).

zvxryb (Owner, Author) commented Feb 21, 2019

Additional note: this may be less of an issue in real applications; the example is meant to be a sort of "stress test" and has 1500 dynamic entities all in constant collision.

In real applications, there are likely to be fewer collisions (per object) at any given time and the duplicate removal pass is likely to operate on a much smaller set of data.

zvxryb (Owner, Author) commented Feb 28, 2019

Dropped HashSet entirely:

  1. It doesn't parallelize well (multiple concurrent writers aren't allowed, and there's no efficient way to merge existing HashSets).
  2. Its performance advantage over sort_unstable()/dedup() in the single-threaded case has become insignificant.
  3. Sorting gives a more sensible output order, and potentially better cache usage, since IDs end up sorted in the output.

This is still the most expensive part of scan()/par_scan().
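
For reference, a minimal sketch of the sort-then-dedup path in both the single-threaded and the rayon-parallel form; the CollisionPair type and function names are hypothetical, not the crate's API:

```rust
use rayon::prelude::*; // rayon = "1", for par_sort_unstable()

// Hypothetical pair type; deriving Ord makes it sortable, which is what dedup() relies on.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct CollisionPair(u32, u32);

// Single-threaded path (scan()): sort, then drop adjacent duplicates.
fn dedup_sorted(mut pairs: Vec<CollisionPair>) -> Vec<CollisionPair> {
    pairs.sort_unstable();
    pairs.dedup();
    pairs
}

// Parallel path (par_scan()): rayon parallelizes the sort; dedup() itself is
// sequential, but it only scans data that is already sorted.
fn dedup_sorted_par(mut pairs: Vec<CollisionPair>) -> Vec<CollisionPair> {
    pairs.par_sort_unstable();
    pairs.dedup();
    pairs
}
```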
