Reduced `TableRow` `as` Casting #10811
Conversation
Localizes all instances of `as` casting into a pair of properly checked functions with `debug_assertions`. By using `debug_assertions` any runtime costs are avoided.
@stepancheg, could I get your review here?
Much improved type safety, faithful translation to the new API, and even some nice new const fn annotations.
Totally content to defer to you on the exact names for the methods. Thanks!
This is kind of a problem. We run a debug build, don't trigger any assertions, but then run the code in production and something breaks, because production data sizes are usually larger. So for one user in a thousand, the program will crash in a non-reproducible way. For example, it is very easy to get an overflow in this fictional scenario:

```rust
let index: usize = ...; // some valid index, happened to be equal to u32::MAX
let next_index = index + 1;
let next_index = TableRow::from_usize(next_index);
if next_index.as_usize() == table_len { ... }
```

Here we quietly missed an integer overflow. I believe it won't be possible to observe the overhead of … If we have to use …
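For illustration, here is a minimal sketch (hypothetical code, not Bevy's actual implementation) of why a `debug_assert`-based conversion truncates silently when debug assertions are compiled out:

```rust
// Hypothetical stand-in for a debug_assert-based from_usize: the range
// check vanishes in release builds, so the cast can silently truncate.
fn from_usize_debug_checked(index: usize) -> u32 {
    debug_assert!(index <= u32::MAX as usize, "index out of u32 range");
    index as u32 // with debug_assertions off, out-of-range values wrap
}

fn main() {
    // In-range values convert fine in every build mode.
    let ok = from_usize_debug_checked(42);
    assert_eq!(ok, 42);

    if !cfg!(debug_assertions) {
        // Only reachable in release builds: no panic, just silent truncation.
        let wrapped = from_usize_debug_checked(u32::MAX as usize + 1);
        assert_eq!(wrapped, 0);
    }
    println!("{ok}");
}
```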
Another way to deal with it is to just switch all the indices to …
Another possibility is instead to switch …
Also …
As I see it, … Further, end users do have the option of compiling in release mode with …
I see this PR as a stepping stone to better safety, and maybe not the final goal. The changes I've made are definitely zero impact in standard release builds, while …
Personally, I don't like adding …
That's not practically possible because of how it's used in relation to container
It's implicitly stored in
Perhaps, but that's definitely some extra new behaviour that should also still be in a …
Using forward-reverse casting to ensure the value is invariant.
Developers using Bevy have this option. Users of apps developed with Bevy, not so much.
Different assertions have different overhead. Many debug assertions in bevy are expensive (some debug assertions even add fields to structs). This one is cheap. In particular, every time
This is not a very convincing argument.
I can imagine another scenario: slowly create 4 billion entries, accidentally forgetting to delete them. I expect Bevy to crash in this scenario rather than quietly overflow because of some undetected bug during the next refactoring.
I'm sorry, I don't understand why it is not possible to switch indices to
Do you have any evidence that an unconditional assert for an integer range is expensive? Anyway, just in case: I'm not a project maintainer, I just post my opinion. The decision on how to do it should be made by someone else.
This will create a single table storing 32GB of data at a minimum. And I don't want to get into an argument over this, because I do agree with you that … I personally believe this PR still provides added safety even with only a …
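As a back-of-the-envelope check on that 32GB figure (assuming at least an 8-byte `Entity` id per table row, which is a hypothetical sizing, not a quote from the Bevy source):

```rust
fn main() {
    // Rows needed before a u32 row index can overflow.
    let rows: u64 = 1 << 32;
    // Hypothetical minimum per-row cost: one 8-byte entity id.
    let bytes_per_row: u64 = 8;
    // 2^32 rows * 8 bytes = 2^35 bytes = 32 GiB.
    let total_gib = rows * bytes_per_row / (1u64 << 30);
    assert_eq!(total_gib, 32);
    println!("{total_gib} GiB");
}
```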
Not necessarily. The operating system will just start swapping memory at the beginning of the table, and will happily allocate much more before the process is killed.
3% is too much. It cannot be that bad. Can I try to replicate your test? Can you give me instructions for what you did, so I can try it myself?
As said, I'm not confident in my testing, but what I did was create a second branch which used … Then I ran:

```shell
# On the assert branch
cargo bench --bench ecs -- iter_simple/system --save-baseline assert
# On the debug_assert branch
cargo bench --bench ecs -- iter_simple/system --save-baseline debug_assert
# Finally
cargo bench --bench ecs -- iter_simple/system --load-baseline debug_assert --baseline assert
```

I generated the baselines in a loop over 10 minutes each, took the average value from each, and then generated again until I found that value. Another option would be to save every baseline with a unique name. The final result was always less than a 3% deviation, and Criterion claimed the result was within the noise threshold, indicating the difference was not statistically significant, as you and I would expect. I chose … Please feel free to bench this however you think is appropriate; I haven't switched to …
Right.
For benchmarking I use my …
OK, here is how I benchmarked. Build with:

```shell
cargo build --example bb --release --target aarch64-apple-darwin
```

(on an Apple M1 laptop; version A is with ….) Running the test as:

```shell
absh -im -a '~/tmp/bb-a' -b '~/tmp/bb-b'
```

Ran it for 10 minutes; with 95% confidence, it is not worse than a 0.7% regression:
To detect a 0.1% regression it would need to run for hours. But with a 0.1% perf difference such benchmarks cannot be trusted anyway, because compiler optimizations reorder instructions, inline functions differently, etc., which may shift performance in either direction.
With reliable benchmark results, let's swap over to full asserts here. Thanks for grabbing those.
Both benchmarks shown here are using … I also noticed that most usages of … look like:

```rust
for row in 0..table.entity_count() { // `table.entity_count()` is a `usize`
    let entity = entities.get_unchecked(row);
    let row = TableRow::from_usize(row); // For the compiler `row` may overflow a `u32`, so it can't elide the internal `assert`
}
```

But if instead it was:

```rust
let entity_count = table.entity_count();
assert!(entity_count <= 1 << 32);
for row in 0..entity_count {
    let entity = entities.get_unchecked(row);
    let row = TableRow::from_usize(row); // Now it's locally known that `row < entity_count <= 2^32`, so the compiler can elide the check
}
```

That is, we just added an …
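The hoisting pattern just described can be sketched as a self-contained example (the `TableRow` here is a hypothetical stand-in; names and method bodies are my assumptions, not Bevy's actual code):

```rust
// Hypothetical stand-in for Bevy's TableRow, using the assert-based
// conversion discussed in this thread.
#[derive(Copy, Clone, Debug, PartialEq)]
struct TableRow(u32);

impl TableRow {
    #[inline]
    fn from_usize(index: usize) -> Self {
        // Per-call range check; elidable when the bound is already known.
        assert!(index <= u32::MAX as usize, "TableRow index out of range");
        Self(index as u32)
    }

    #[inline]
    fn as_usize(self) -> usize {
        self.0 as usize
    }
}

fn sum_rows(entity_count: usize) -> usize {
    // Hoisted bound check: after this single assert, the compiler knows
    // every `row` below fits in a u32, so the assert inside
    // `from_usize` carries no per-iteration branch.
    assert!(entity_count <= u32::MAX as usize);
    let mut total = 0;
    for row in 0..entity_count {
        total += TableRow::from_usize(row).as_usize();
    }
    total
}

fn main() {
    assert_eq!(sum_rows(5), 10); // 0 + 1 + 2 + 3 + 4
    println!("{}", sum_rows(5));
}
```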
For the record, I ran the benchmark above for a night (with …). So full assertions are a 0.1% regression.
Just to be clear, I don't think the compiler is currently eliding the assert (it currently has no information to know that |
Right. Here is a Compiler Explorer example. For the code:

```rust
assert!(x <= 1 << 32);
for i in 0..x {
    assert!(i as u32 as usize == i);
    foo(i as u32);
}
```

there's no assertion in the loop; the loop machine code is:

(one conditional branch, which is the exit from the loop on the iteration variable).
Also removed `as_*` assertion since `usize` is guaranteed to be `u32` or larger in Bevy.
Thanks for the detailed testing @stepancheg! I've updated this PR to use |
@bushrat011899 can you please add assertions to the loops as proposed by @SkiFire13? Perhaps as a separate PR.
Added `assert` in `QueryIter::fold_over_table_range` out of the loop to hint compiler on optimisation.
Had to resolve a merge conflict anyway, so I've added one `assert` statement outside a for-loop where I believe it would still be appropriate.
# Objective

- Since #10811, Bevy uses `assert` in the hot path of iteration. The `for_each` method has an assert in the outer loop to help the compiler remove unnecessary branching in the inner loop.
- However, `for`-style iterations do not receive the same treatment: they still have a branch check in the inner loop, which could potentially hurt performance.

## Solution

- Use `TableRow::from_u32` instead of `TableRow::from_usize` to avoid the unnecessary branch.

Before

![image](https://github.com/bevyengine/bevy/assets/45868716/f6d2a1ac-2129-48ff-97bf-d86713ddeaaf)

After

![image](https://github.com/bevyengine/bevy/assets/45868716/bfe5a9ee-ba6c-4a80-85b0-1c6d43adfe8c)
# Objective

- Change `TableRow::new` to accept `u32` (#10806)

## Solution

Replaced the `new` and `index` methods for both `TableRow` and `TableId` with `from_*` and `as_*` methods. These remove the need to perform casting at call sites, reducing the total number of casts in the Bevy codebase. Within these methods, an appropriate `debug_assertion` ensures the cast will behave in an expected manner (no wrapping, etc.). I am using a `debug_assertion` instead of an `assert` to reduce any possible runtime overhead, however minimal. This choice is something I am open to changing (or leaving up to another PR) if anyone has any strong arguments for it.

## Changelog

- `ComponentSparseSet::sparse` stores a `TableRow` instead of a `u32` (private change)
- Replaced the `TableRow::new` and `TableRow::index` methods with `TableRow::from_*` and `TableRow::as_*`, with `debug_assertions` protecting any internal casting.
- Replaced the `TableId::new` and `TableId::index` methods with `TableId::from_*` and `TableId::as_*`, with `debug_assertions` protecting any internal casting.
- All `TableId` methods are now `const`.

## Migration Guide

- `TableRow::new` -> `TableRow::from_usize`
- `TableRow::index` -> `TableRow::as_usize`
- `TableId::new` -> `TableId::from_usize`
- `TableId::index` -> `TableId::as_usize`

## Notes

I have chosen to remove the `index` and `new` methods for the following chain of reasoning:

- `new` was called with a mixture of `u32` and `usize` values. Likewise for `index`.
- Choosing `new` to take either `usize` or `u32` would break half of these call sites, requiring `as` casting at the site.
- Naming the methods `new_u32` or `new_usize` avoids the above, but looks visually inconsistent.
- Therefore, `from_*` and `as_*` methods instead.

Worth noting is that by updating `ComponentSparseSet`, there are now zero instances of interacting with the inner value of `TableRow` as a `u32`; it is exclusively used as a `usize` value (due to interactions with methods like `len` and slice indexing). I have left the `as_u32` and `from_u32` methods as the "proper" constructors/getters.
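Taken together, the API described in this PR might look roughly like the following sketch (the method bodies are my assumptions for illustration, not the exact `bevy_ecs` source):

```rust
// Sketch of the TableRow shape described above: a u32 newtype with
// from_* constructors and as_* getters; casting is confined to these
// methods and guarded by a debug_assertion.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
#[repr(transparent)]
struct TableRow(u32);

impl TableRow {
    const fn from_u32(index: u32) -> Self {
        Self(index)
    }

    fn from_usize(index: usize) -> Self {
        // Forward-reverse cast: round-tripping detects any truncation.
        debug_assert!(index as u32 as usize == index, "usize index out of u32 range");
        Self(index as u32)
    }

    const fn as_u32(self) -> u32 {
        self.0
    }

    const fn as_usize(self) -> usize {
        self.0 as usize
    }
}

fn main() {
    let row = TableRow::from_usize(7);
    assert_eq!(row.as_u32(), 7);
    assert_eq!(row.as_usize(), 7);

    let row2 = TableRow::from_u32(3);
    assert_eq!(row2.as_usize(), 3);
    println!("{row:?} {row2:?}");
}
```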