Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise Entity with repr align & manual PartialOrd/Ord #10558

Merged
merged 5 commits into from
Nov 18, 2023
Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 44 additions & 3 deletions crates/bevy_ecs/src/entity/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -115,18 +115,55 @@ type IdCursor = isize;
/// [`EntityCommands`]: crate::system::EntityCommands
/// [`Query::get`]: crate::system::Query::get
/// [`World`]: crate::world::World
#[derive(Clone, Copy, Eq, Ord, PartialOrd)]
#[derive(Clone, Copy, Eq)]
// Alignment repr necessary to allow LLVM to better output
// optimised codegen for `to_bits`, `PartialEq` and `Ord`.
// See <https://github.com/rust-lang/rust/issues/106107>
Bluefinger marked this conversation as resolved.
Show resolved Hide resolved
#[repr(align(8))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YMMV: you could consider adding an align: [u64; 0] field instead of the repr, which would make it aligned the same as u64, and thus less-aligned on smaller targets that don't care about 8-byte alignment for u64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a field would complicate this PR, as then I'd have to change initialisation usages of Entity within the file. It would add either complication, or noise to change things to use new instead (which in my view, should be the case with constructors anyway). Also, the only 32-bit platform mainstream enough to worry about is wasm32, which should handle u64 just fine as it is a native type. All in all, I'd rather not complicate the structure of Entity too much. Maybe something for a different PR to explore if there's further potential there (or something to bake into #9797)

pub struct Entity {
generation: u32,
// The field order below is necessary for rust to
// optimise better how it handles `Entity`
index: u32,
generation: u32,
Bluefinger marked this conversation as resolved.
Show resolved Hide resolved
}

// By not short-circuiting in comparisons, we get better codegen.
// See <https://github.com/rust-lang/rust/issues/117800>
impl PartialEq for Entity {
#[inline]
fn eq(&self, other: &Entity) -> bool {
(self.generation == other.generation) & (self.index == other.index)
// By using `to_bits`, the codegen can be optimised out even
// further potentially. Relies on the correct alignment/order of
// `Entity`.
self.to_bits() == other.to_bits()
}
}

// The macro codegen output is not optimal and can't be optimised as well
// by the compiler. This impl resolves that like with `PartialEq`, but by
// instead relying on the better alignment of `Entity`, and then by
// comparing against the bit representation directly instead of relying on
// the normal field order.
impl PartialOrd for Entity {
#[inline]
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
// Make use of our `Ord` impl to ensure optimal codegen output
Some(self.cmp(other))
}
}

// The macro codegen output is not optimal and can't be optimised as well
// by the compiler. This impl resolves that like with `PartialEq`, but by
// instead relying on the better alignment of `Entity`, and then by
// comparing against the bit representation directly instead of relying on
// the normal field order.
impl Ord for Entity {
#[inline]
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
// This will result in better codegen for ordering comparisons, plus
// avoids pitfalls with regards to macro codegen relying on property
// position when we want to compare against the bit representation.
self.to_bits().cmp(&other.to_bits())
}
}

Expand All @@ -145,6 +182,7 @@ pub(crate) enum AllocAtWithoutReplacement {

impl Entity {
#[cfg(test)]
#[inline]
Bluefinger marked this conversation as resolved.
Show resolved Hide resolved
pub(crate) const fn new(index: u32, generation: u32) -> Entity {
Entity { index, generation }
}
Expand Down Expand Up @@ -197,6 +235,7 @@ impl Entity {
/// In general, one should not try to synchronize the ECS by attempting to ensure that
/// `Entity` lines up between instances, but instead insert a secondary identifier as
/// a component.
#[inline]
pub const fn from_raw(index: u32) -> Entity {
Entity {
index,
Expand All @@ -210,13 +249,15 @@ impl Entity {
/// for serialization between runs.
///
/// No particular structure is guaranteed for the returned bits.
#[inline]
pub const fn to_bits(self) -> u64 {
(self.generation as u64) << 32 | self.index as u64
}

/// Reconstruct an `Entity` previously destructured with [`Entity::to_bits`].
///
/// Only useful when applied to results from `to_bits` in the same instance of an application.
#[inline]
pub const fn from_bits(bits: u64) -> Self {
Self {
generation: (bits >> 32) as u32,
Expand Down