
add ability to incrementally add/remove nodes from bvh #99

Merged (11 commits) Feb 17, 2024

Conversation

dbenson24 (Contributor):

Leaving an initial draft of this PR here. It looks like there might be a perf regression compared to when I originally wrote this, so I'm going to take a look at that. (Post-optimization intersection in the 120k benchmark is massively improved; intersection with Sponza isn't doing as well.)

@dbenson24 (Contributor, Author):

// PR
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p   ... bench:         853 ns/iter (+/- 15)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p   ... bench:         920 ns/iter (+/- 25)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p   ... bench:       2,607 ns/iter (+/- 488)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p   ... bench:       2,941 ns/iter (+/- 572)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p     ... bench:         853 ns/iter (+/- 21)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p     ... bench:         914 ns/iter (+/- 17)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p     ... bench:       1,745 ns/iter (+/- 243)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p     ... bench:       1,985 ns/iter (+/- 356)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p ... bench:       1,366 ns/iter (+/- 42)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p ... bench:       3,098 ns/iter (+/- 139)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p ... bench:      15,347 ns/iter (+/- 1,580)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p ... bench:       8,568 ns/iter (+/- 1,122)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p   ... bench:       1,359 ns/iter (+/- 33)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p   ... bench:       1,496 ns/iter (+/- 59)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p   ... bench:       1,906 ns/iter (+/- 105)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p   ... bench:       2,472 ns/iter (+/- 193)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p               ... bench:   1,064,110 ns/iter (+/- 6,733)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p               ... bench:   2,357,540 ns/iter (+/- 366,656)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p               ... bench:  17,270,170 ns/iter (+/- 2,828,226)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p               ... bench:  72,724,610 ns/iter (+/- 7,787,654)
test bvh::optimization::bench::bench_randomize_120k_50p                  ... bench:   4,599,345 ns/iter (+/- 1,906,880)
// Base
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p   ... bench:         855 ns/iter (+/- 24)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p   ... bench:     143,784 ns/iter (+/- 12,246)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p   ... bench:   1,297,513 ns/iter (+/- 622,827)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p   ... bench:   2,366,247 ns/iter (+/- 1,195,041)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p     ... bench:         851 ns/iter (+/- 15)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p     ... bench:         919 ns/iter (+/- 18)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p     ... bench:       1,772 ns/iter (+/- 251)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p     ... bench:       2,058 ns/iter (+/- 434)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p ... bench:       1,367 ns/iter (+/- 50)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p ... bench:       2,873 ns/iter (+/- 197)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p ... bench:       4,147 ns/iter (+/- 492)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p ... bench:       5,918 ns/iter (+/- 695)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p   ... bench:       1,363 ns/iter (+/- 51)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p   ... bench:       1,498 ns/iter (+/- 67)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p   ... bench:       1,898 ns/iter (+/- 133)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p   ... bench:       2,484 ns/iter (+/- 216)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p               ... bench:     984,960 ns/iter (+/- 83,377)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p               ... bench:   2,079,140 ns/iter (+/- 245,858)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p               ... bench:  12,014,960 ns/iter (+/- 1,541,641)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p               ... bench:  53,112,990 ns/iter (+/- 4,470,184)
test bvh::optimization::bench::bench_randomize_120k_50p                  ... bench:   4,634,985 ns/iter (+/- 2,040,404)

@dbenson24 (Contributor, Author):

Just gonna jot down my general arguments for this method over the current optimize.

The ability to dynamically add/remove individual nodes is a pretty major feature that you don't get from the other Rust BVH (Parry's QBvh). This approach also eliminates the need to store the depth of each node, so the nodes are smaller in memory (not massively smaller, but it's still something). Keeping depth around actually slows this approach down dramatically, because the depths would need to be updated during each operation.

All of this assumes that this approach also gets similar or better performance than the current one. Once I add the parallel rebuild, optimize/add/remove should be most relevant for changes to less than ~10% of the BVH, due to their lack of parallelism.

There are still some optimizations left on the table. Right now, add/remove doesn't leave the BVH in a contiguous state in memory; I want to test adding a relayout operation that puts all the nodes back in order. There are probably more operations worth exploring.
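To make the contiguity point concrete, here is a minimal, hypothetical sketch (the names `FlatBvh`, `Node::Tombstone`, `add`, `remove`, and `compact` are invented for illustration and are not this crate's API) of a flat node vector where incremental removal leaves holes and a relayout pass restores contiguity:

```rust
// Hypothetical sketch: a flat node array where removal leaves a hole
// ("tombstone") and a later compaction pass restores contiguity.
#[derive(Clone, Debug, PartialEq)]
enum Node {
    Leaf { shape: usize },
    Tombstone, // dead slot left behind by an incremental remove
}

struct FlatBvh {
    nodes: Vec<Node>,
}

impl FlatBvh {
    fn new() -> Self {
        FlatBvh { nodes: Vec::new() }
    }

    // Incremental add: reuse a tombstone slot if one exists, else push.
    fn add(&mut self, shape: usize) -> usize {
        if let Some(i) = self.nodes.iter().position(|n| *n == Node::Tombstone) {
            self.nodes[i] = Node::Leaf { shape };
            i
        } else {
            self.nodes.push(Node::Leaf { shape });
            self.nodes.len() - 1
        }
    }

    // Incremental remove: O(1), but leaves the vector non-contiguous.
    fn remove(&mut self, index: usize) {
        self.nodes[index] = Node::Tombstone;
    }

    // Relayout pass: drop tombstones so live nodes are contiguous again.
    fn compact(&mut self) {
        self.nodes.retain(|n| *n != Node::Tombstone);
    }

    fn live_count(&self) -> usize {
        self.nodes.iter().filter(|n| **n != Node::Tombstone).count()
    }
}

fn main() {
    let mut bvh = FlatBvh::new();
    for s in 0..4 {
        bvh.add(s);
    }
    bvh.remove(1); // leaves a hole
    assert_eq!(bvh.nodes.len(), 4); // still 4 slots...
    assert_eq!(bvh.live_count(), 3); // ...but only 3 live nodes
    bvh.compact();
    assert_eq!(bvh.nodes.len(), 3); // contiguous again
}
```

A real tree stores AABBs and child indices rather than bare leaves, but the memory-layout trade-off is the same: cheap removal at the cost of fragmentation, paid back later by a compaction pass.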

@svenstaro (Owner):

svenstaro commented May 7, 2023

These are some incredible numbers! I might not get to review this myself, but I hope @marstaik can take a look. Please write a CHANGELOG entry for this change.

@marstaik (Collaborator) left a review:

Really amazing job, I tried to be as thorough as I could.

A couple of notes

  1. The remove function leaves dead nodes in the vector; it would be better to tombstone them (we could introduce a new enum variant in BvhNode). This could be handled by an actual optimize pass in the future (more on that later).
  2. (This isn't just related to your PR.) optimize is a badly named function, IMO. We are not performing any optimization on the BVH, only removing shapes that have changed and re-adding them.

Reviewing your changes has given me some ideas for future performance increases (things you do not have to do in this PR):

  • An actual optimization step that removes tombstoned nodes, potentially re-balances the tree, and perhaps re-sorts the hierarchy into depth-first order. I believe a DFS-oriented layout could make traversals faster.
  • Making mutation of shapes happen via a handle of some kind, which can automatically optimize the tree when it is dropped, once all changes to the shapes have been made.
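The handle idea could be sketched as an RAII guard. This is purely illustrative (the types `Bvh` and `EditGuard` and the method `refit` are invented here, and the real refit logic is stubbed out): edits are recorded through the handle, and one batched optimize pass runs when the handle is dropped:

```rust
// Hypothetical sketch of the "mutation handle" idea: edits to shapes go
// through a guard object, and the tree is refitted automatically when the
// guard is dropped (RAII). `Bvh` here is a stand-in, not the crate's type.
struct Bvh {
    shapes: Vec<f32>, // stand-in for shapes/AABBs
    refit_count: usize,
}

impl Bvh {
    fn edit(&mut self) -> EditGuard<'_> {
        EditGuard { bvh: self, dirty: Vec::new() }
    }

    fn refit(&mut self, dirty: &[usize]) {
        // stand-in for removing and re-adding the changed leaves
        self.refit_count += dirty.len();
    }
}

struct EditGuard<'a> {
    bvh: &'a mut Bvh,
    dirty: Vec<usize>,
}

impl<'a> EditGuard<'a> {
    fn move_shape(&mut self, index: usize, new_pos: f32) {
        self.bvh.shapes[index] = new_pos;
        self.dirty.push(index); // record which leaves need fixing
    }
}

impl<'a> Drop for EditGuard<'a> {
    fn drop(&mut self) {
        // All edits are batched into one optimize pass on drop.
        let dirty = std::mem::take(&mut self.dirty);
        self.bvh.refit(&dirty);
    }
}

fn main() {
    let mut bvh = Bvh { shapes: vec![0.0; 4], refit_count: 0 };
    {
        let mut guard = bvh.edit();
        guard.move_shape(1, 5.0);
        guard.move_shape(3, -2.0);
    } // guard dropped here -> one batched refit
    assert_eq!(bvh.refit_count, 2);
    assert_eq!(bvh.shapes[1], 5.0);
}
```

Because Drop runs deterministically at the end of the guard's scope, callers can't forget to re-optimize after a batch of shape mutations.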

Review threads on src/bvh/bvh_impl.rs and src/bvh/optimization.rs (resolved).
self.nodes[parent_index].child_l()
};

// TODO: fix potential issue leaving empty spot in self.nodes
Collaborator:

Maybe we can make the Node enum have a Gravestone state, and then we can simply compress the node vector in the optimize function.

Collaborator:

bump

Review threads on src/bvh/optimization.rs (resolved).
@svenstaro (Owner):

@dbenson24 lil ping :)

@svenstaro (Owner):

@dbenson24 are you interested in continuing this?

@dbenson24 (Contributor, Author):

Hey, yes, sorry: we recently had a baby and I haven't had the time. I do want to get this cleaned up soon.

@svenstaro (Owner):

Oh shit, congrats! Take all the time you need. :)

@dbenson24 (Contributor, Author):

Ok, I think I've addressed everything except for the tombstoning. It's definitely nice to have, but in practice I think it would rarely be used. Because of how efficiently we can traverse the BVH in parallel to do a build, any analysis that walks the whole tree to do something like reordering would also have to be made similarly parallel. I also just realized I hadn't ported the parallel code over to this yet, so that'll be the next PR.

@svenstaro (Owner):

Could you also write a CHANGELOG entry for changes that people should be aware of?

@dbenson24 (Contributor, Author):

dbenson24 commented Oct 3, 2023

Can we do the changelog with my next PR and hold off on a release until after that? I'll have it up shortly after this one is merged. That will be the PR that builds the BVH in parallel.

@svenstaro (Owner):

Oh ok, great.

@svenstaro (Owner):

@marstaik feel like taking another look at this one? It looks to me like all your last comments were addressed.

@marstaik (Collaborator):

@svenstaro I'll take a look soon; I was on a very long vacation :) I am back now.

@svenstaro (Owner):

@marstaik Friendly ping :)

@marstaik (Collaborator) left a review:

Sorry, my life has also been hectic recently (out of the country, a new job in the last six months, etc.), but things should now be stable enough for me to work on code.

There are still some small code-quality issues, renames, and commented-out functions that should be removed, as they might cause confusion.

If you don't want to deal with the TODO related to empty nodes/tombstoning, maybe I can do it in a separate PR.

child_r_index: r_index,
parent_index,
};
//self.fix_depth(l_index, depth + 1);
Collaborator:

commented code still here

self.nodes[parent_index].child_l()
};

// TODO: fix potential issue leaving empty spot in self.nodes
Collaborator:

bump

@svenstaro mentioned this pull request on Nov 27, 2023
@svenstaro (Owner):

@marstaik Happy to have you back! Should we maybe just merge this and have you apply your changes on top? That might be the more efficient approach.

@dbenson24 (Contributor, Author):

No need; I think that's pretty much everything except for the tombstoning, which I definitely don't want to do in this PR. I want to get to the multithreading PR.

@svenstaro (Owner):

@dbenson24 Great to see you're still around as well! :) Would be great if we could get this PR in soon.

@dbenson24 (Contributor, Author):

I'm pretty sure this is ready to merge.

@svenstaro (Owner):

Some more clippy stuff to fix :)

@dbenson24 (Contributor, Author):

Lol, these aren't even my clippy errors; clippy has added new lints that affect existing code. I've been missing them because I haven't been updating my Rust nightly every time.

@svenstaro (Owner):

Weird, did we just spot a flaky test?

@svenstaro (Owner):

@marstaik What do you think?

@svenstaro (Owner):

@marstaik friendly ping :)

@svenstaro (Owner):

If I hear nothing for a week, I'll just merge this.

@svenstaro merged commit 4bb1c33 into svenstaro:master on Feb 17, 2024. 11 checks passed.
@svenstaro (Owner):

Finally merging this, thanks for your work @dbenson24!

@dbenson24 (Contributor, Author):

dbenson24 commented Feb 17, 2024 via email
