Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit labels again #5569

Closed
maniwani opened this issue Aug 3, 2022 · 7 comments · Fixed by #7762
Closed

Revisit labels again #5569

maniwani opened this issue Aug 3, 2022 · 7 comments · Fixed by #7762
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use

Comments

@maniwani
Copy link
Contributor

maniwani commented Aug 3, 2022

What problem does this solve or what need does it fill?

#4957 made labels much cheaper, but also made them much less capable. By nature, the design moves away from using Hash and PartialEq impls to using (TypeId, &'static str) tuples for comparisons. (#5377 recovers constant-time performance by adding a u64, so the &'static str or formatter can just be there for log/debug purposes).

Beyond simple fieldless enum variants, this change basically prevents using different values to mean different labels, e.g. MyLabel(T), which I think is a big ergonomics hit.

Under bevyengine/rfcs#45, the plan for state machines is to facilitate linking each state to a pair of on-enter and on-exit schedules, which would run from within an exclusive system handling state transitions. The implementation for this in #4391 was looking extremely simple.

pub struct CurrentState(pub S);
pub struct NextState(pub S);
pub struct PrevState(pub S);

#[derive(SystemLabel)]
pub struct OnEnter<S: State>(pub S);

#[derive(SystemLabel)]
pub struct OnExit<S: State>(pub S);

pub fn apply_state_transition<S: State>(world: &mut World) {
    world.resource_scope(|world, mut state: Mut<CurrentState<S>>| {
        if world.resource::<NextState<S>>().0.is_some() {
            let new_state = world.resource_mut::<NextState<S>>().0.take().unwrap();
            let old_state = std::mem::replace(&mut state.0, new_state.clone());
            world.resource_mut::<PrevState<S>>().0 = Some(old_state.clone());
            run_schedule(OnExit(old_state), world);
            run_schedule(OnEnter(new_state), world);
        }
    });
}
pub fn run_schedule(system_set: impl SystemLabel, world: &mut World) {
    let systems = world.resource_mut::<Systems>();
    let schedule = systems.take_schedule(&system_set));
    
    schedule.run(world);

    let systems = world.resource_mut::<Systems>();
    systems.return_schedule(&system_set, schedule);
}

The apply_state_transition system just reads the state resources, then loads and runs their appropriate schedules (if they exist). What was really nice was that it was just a convenient export of something users could do on their own.

However, this is no longer possible, and I'm not sure how to adapt it for the current 0.8 build. Since I don't know S, I don't know how to produce a unique &'static str / u64 for each label value. There's probably some way to do it, but having to do a special manual impl for a simple wrapper type feels ridiculous. As one of its main authors, I believe that what's coming with stageless will lead to apps having more labels, so increasing friction here by reducing what users can do with even POD types seems counterproductive.

It's true that schedule construction had too much boxing, but those costs were infrequent, basically once at startup, so I'm skeptical the tradeoff we made was worth it.

What solution would you like?

I haven't figured out an obvious solution yet, but the essence of the problem is that:

  1. bevy wants plain integers for efficient schedule construction
  2. user wants names for logging/debugging

All the boxing happened in the descriptor API methods, where we had to collect all SystemLabel trait objects into a struct before handing it to the SystemStage, so we ended up boxing each and every use of a label.

(edit) Removing that excessive boxing seems to have been the main reason people supported the new scheme.

Since add_system is a method on SystemStage, could we add a HashMap<SystemLabelId, Box<dyn SystemLabel>> field and somehow pass a &mut reference to it into the descriptor process? If we can do that, we'd only box each label once and different values could mean different labels again. The *LabelId types can just wrap (type_id, value_hash) tuples (currently same size as a u128).

What alternative(s) have you considered?

  1. Same solution as above, except initialize the map as a static variable, e.g. using a OnceCell, and have the descriptor methods access that. It's fine for a system label value to have the same ID, program-wide.
  2. Revert the changes. Slightly faster schedule construction isn't all that important if it only happens once or twice.
  3. Stay the course and add some magic macro attributes to enable some common patterns.
@maniwani maniwani added C-Feature A new feature, making something new possible S-Needs-Triage This issue needs to be labelled A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use and removed C-Feature A new feature, making something new possible labels Aug 3, 2022
@JoJoJet
Copy link
Member

JoJoJet commented Aug 3, 2022

  1. Just initialize the map as a static variable, e.g. using a OnceCell, and have our descriptor methods use that.

This is possible under #5377, and it could be made much easier (and derivable) with some design changes. Most label types are incredibly simple and don't need boxing, so we should only box the types that actually need it IMO.

@maniwani
Copy link
Contributor Author

maniwani commented Aug 3, 2022

This is possible under #5377.

#5377 continues #4957 though, doesn't it?

I want to return to a state where different values mean different labels, without macro attributes, etc. I think it's fine if label types have to be POD, but that's an irrelevant detail.

Alternative 1 is the same idea as the rest of the post, except we wouldn't pass around a reference to the map, it would just be a global variable.

Like, instead of trying to make the ID type handle everything awkwardly, I would prefer we intern one Box<dyn SystemLabel> for each label (but just one), so the ID can just be an ID, and we can lookup the boxed trait object if we need to print something.

// the `u64` here would be a hash of the value
pub struct SystemLabelId(TypeId, u64);`

@JoJoJet
Copy link
Member

JoJoJet commented Aug 3, 2022

#5377 continues #4957 though, doesn't it?

If it solves the problem, I don't see why that's a concern.

Alternative 1 is the same idea as the rest of the post, except we wouldn't pass around a reference to the map, it would just be a global variable.

We don't need anything complex like that.

Here's an example of a basic label that we should all be familiar with. Since it's so basic, heap allocations aren't necessary. I believe that the vast majority of labels will look like this.

#[derive(SystemLabel)]
pub struct MyLabel;

Here's an (approximate) example of a more-complex label, which must be stored on the heap. We can do that without having to change the design of the label traits.

pub struct OurLabel(Vec<i32>);

static MAP: RwLock<HashMap<u64, Vec<i32>>> = stuff();

impl SystemLabel for OurLabel {
    fn data(&self) -> u64 {
        let hash = calculate_hash(&self.0);
        MAP.write().entry(hash).or_insert_with(|| self.0.clone());
        hash
    }
    fn fmt(data: u64, f: &mut Formatter) -> fmt::Result {
        let val = MAP.read().get(&data)?;
        write!(f, "{val:?}")
    }
}
// I believe this can be made even simpler even, hashing ID's might not be necessary.
// I'd have to iterate.

We don't need to force every label to allocate, we can do it as an implementation detail for some types. This isn't currently supported by the derive macro, but it can be.

Does this solve your problem?

Edit: Downcasting is even possible, and ergonomic.

@Weibye Weibye removed the S-Needs-Triage This issue needs to be labelled label Aug 4, 2022
@JoJoJet
Copy link
Member

JoJoJet commented Jan 7, 2023

Gonna try and revive this thread -- I think we should consider making a decision on this issue. We need to resolve the current usability problems with labels, but should we improve upon #4957, or should we revert it?

Reverting it would make deriving labels really annoying again (#[derive(SystemLabel, Clone, PartialEq, Eq, Hash)]), but it would also remove all of the current limitations. If we want to keep #4957, we could try to make improvements to gradually lift limitations, but we could never lift them entirely.

I think either direction is acceptable.

@maniwani
Copy link
Contributor Author

maniwani commented Jan 29, 2023

I'm thinking we should do this for all label traits:

  • Make them require Debug.
  • Use the u64 returned by DynHash (plus TypeId if there's risk of collision) to distinguish between labels.
  • Leak and intern the Debug string as a Cow<'static, str> in a static map to avoid all the boxing.

Something like this:

/// `TypeId` + output of `DynHash` given the `DefaultHasher`.
/// Is different for every value.
struct RawValueHash(TypeId, u64);

/// Lightweight + printable identifier.
pub struct MyLabelTraitId {
    hash: RawValueHash,
    str: &'static str, // Can also use the hash to pull from the map instead of caching it here.
}

Basically a mix of stuff we've talked about before. We'd allocate one string for every label (upon first use), except in cases where we already have a &'static str (e.g. system type names). The hashing can be a blanket impl and the interning can be a default impl, so the derivable label traits just have to provide a method for sticking them in strongly-typed wrappers.

@maniwani
Copy link
Contributor Author

maniwani commented Jan 29, 2023

interning can be a default impl

Like this, I believe.

type Interner = RwLock<HashMap<RawValueHash, Cow<'static, str>>>;
static INTERNER: OnceCell<Interner> = OnceCell::new();

pub trait SystemSet: DynHash + Debug {
    fn id(&self) -> SystemSetId {       
        let key = self.hash();
        let mut map = INTERNER.get_or_init(|| default()).write().unwrap();
        let str = map
            .entry(key)
            .or_insert_with(|| {
                // Things like system-type sets can implement this differently to skip the heap allocation.
                let string = format!("{:?}", self);
                let str: &'static mut str = string.into_boxed_str().leak();
                str.into()
            })
            .deref();
            
        SystemSetId::new(key, str)
    }
}

Like I think minimizing heap allocations is a win even if there is still one per label. If done this way, manually implementing the trait is almost never required. All of our label traits can even share this one map. It sucks that the derive ends up being long, but I'd rather that than trying to workaround limitations.

Is it possible to have an attribute macro that fills in the missing derives?

@JoJoJet
Copy link
Member

JoJoJet commented Jan 30, 2023

Is it possible to have an attribute macro that fills in the missing derives?

We could just make #[derive(SystemSet)] generate a basic Hash impl in addition to the SystemSet impl. That derive should be very trivial to include, although we'd probably want a way of opting out from deriving any extra traits.

github-merge-queue bot pushed a commit that referenced this issue Oct 25, 2023
# Objective

First of all, this PR took heavy inspiration from #7760 and #5715. It
intends to also fix #5569, but with a slightly different approach.


This also fixes #9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
ameknite pushed a commit to ameknite/bevy that referenced this issue Nov 6, 2023
# Objective

First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It
intends to also fix bevyengine#5569, but with a slightly different approach.


This also fixes bevyengine#9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
bevyengine#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
rdrpenguin04 pushed a commit to rdrpenguin04/bevy that referenced this issue Jan 9, 2024
# Objective

First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It
intends to also fix bevyengine#5569, but with a slightly different approach.


This also fixes bevyengine#9335 by reexporting `DynEq`.

## Solution

The advantage of this API is that we can intern a value without
allocating for zero-sized-types and for enum variants that have no
fields. This PR does this automatically in the `SystemSet` and
`ScheduleLabel` derive macros for unit structs and fieldless enum
variants. So this should cover many internal and external use cases of
`SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory
will be allocated.

- The interning returns a `Interned<dyn SystemSet>`, which is just a
wrapper around a `&'static dyn SystemSet`.
- `Hash` and `Eq` are implemented in terms of the pointer value of the
reference, similar to my first approach of anonymous system sets in
bevyengine#7676.
- Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`.
- The debug output of `Interned<T>` is the same as the interned value.

Edit: 
- `AppLabel` is now also interned and the old
`derive_label`/`define_label` macros were replaced with the new
interning implementation.
- Anonymous set ids are reused for different `Schedule`s, reducing the
amount of leaked memory.

### Pros
- `InternedSystemSet` and `InternedScheduleLabel` behave very similar to
the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied
without an allocation.
- Many use cases don't allocate at all.
- Very fast lookups and comparisons when using `InternedSystemSet` and
`InternedScheduleLabel`.
- The `intern` module might be usable in other areas.
- `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement
`{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics.

### Cons
- Implementors of `SystemSet` and `ScheduleLabel` still need to
implement `Hash` and `Eq` (and `Clone`) for it to work.

## Changelog

### Added

- Added `intern` module to `bevy_utils`.
- Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`.

### Changed

- Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with
`InternedSystemSet` and `InternedScheduleLabel`.
- Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`.
- Replaced `AppLabelId` with `InternedAppLabel`.
- Changed `AppLabel` to use `Debug` for error messages.
- Changed `AppLabel` to use interning.
- Changed `define_label`/`derive_label` to use interning. 
- Replaced `define_boxed_label`/`derive_boxed_label` with
`define_label`/`derive_label`.
- Changed anonymous set ids to be only unique inside a schedule, not
globally.
- Made interned label types implement their label trait. 

### Removed

- Removed `define_boxed_label` and `derive_boxed_label`. 

## Migration guide

- Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with
`InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`.
- Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with
`InternedSystemSet` or `Interned<dyn SystemSet>`.
- Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn
AppLabel>`.
- Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet`
need to implement:
  - `dyn_hash` directly instead of implementing `DynHash`
  - `as_dyn_eq`
- Pass labels to `World::try_schedule_scope`, `World::schedule_scope`,
`World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`,
`Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and
`Schedules::get_mut` by value instead of by reference.

---------

Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
Co-authored-by: Carter Anderson <mcanders1@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use
Projects
None yet
3 participants