Revisit labels again #5569

maniwani · 2022-08-03T18:17:07Z

What problem does this solve or what need does it fill?

#4957 made labels much cheaper, but also made them much less capable. By nature, the design moves away from using Hash and PartialEq impls to using (TypeId, &'static str) tuples for comparisons. (#5377 recovers constant-time performance by adding a u64, so the &'static str or formatter can just be there for log/debug purposes).

Beyond simple fieldless enum variants, this change basically prevents using different values to mean different labels, e.g. MyLabel(T), which I think is a big ergonomics hit.

Under bevyengine/rfcs#45, the plan for state machines is to facilitate linking each state to a pair of on-enter and on-exit schedules, which would run from within an exclusive system handling state transitions. The implementation for this in #4391 was looking extremely simple.

pub struct CurrentState(pub S);
pub struct NextState(pub S);
pub struct PrevState(pub S);

#[derive(SystemLabel)]
pub struct OnEnter<S: State>(pub S);

#[derive(SystemLabel)]
pub struct OnExit<S: State>(pub S);

pub fn apply_state_transition<S: State>(world: &mut World) {
    world.resource_scope(|world, mut state: Mut<CurrentState<S>>| {
        if world.resource::<NextState<S>>().0.is_some() {
            let new_state = world.resource_mut::<NextState<S>>().0.take().unwrap();
            let old_state = std::mem::replace(&mut state.0, new_state.clone());
            world.resource_mut::<PrevState<S>>().0 = Some(old_state.clone());
            run_schedule(OnExit(old_state), world);
            run_schedule(OnEnter(new_state), world);
        }
    });
}

pub fn run_schedule(system_set: impl SystemLabel, world: &mut World) {
    let systems = world.resource_mut::<Systems>();
    let schedule = systems.take_schedule(&system_set));
    
    schedule.run(world);

    let systems = world.resource_mut::<Systems>();
    systems.return_schedule(&system_set, schedule);
}

The apply_state_transition system just reads the state resources, then loads and runs their appropriate schedules (if they exist). What was really nice was that it was just a convenient export of something users could do on their own.

However, this is no longer possible, and I'm not sure how to adapt it for the current 0.8 build. Since I don't know S, I don't know how to produce a unique &'static str / u64 for each label value. There's probably some way to do it, but having to do a special manual impl for a simple wrapper type feels ridiculous. As one of its main authors, I believe that what's coming with stageless will lead to apps having more labels, so increasing friction here by reducing what users can do with even POD types seems counterproductive.

It's true that schedule construction had too much boxing, but those costs were infrequent, basically once at startup, so I'm skeptical the tradeoff we made was worth it.

What solution would you like?

I haven't figured out an obvious solution yet, but the essence of the problem is that:

bevy wants plain integers for efficient schedule construction
user wants names for logging/debugging

All the boxing happened in the descriptor API methods, where we had to collect all SystemLabel trait objects into a struct before handing it to the SystemStage, so we ended up boxing each and every use of a label.

(edit) Removing that excessive boxing seems to have been the main reason people supported the new scheme.

Since add_system is a method on SystemStage, could we add a HashMap<SystemLabelId, Box<dyn SystemLabel>> field and somehow pass a &mut reference to it into the descriptor process? If we can do that, we'd only box each label once and different values could mean different labels again. The *LabelId types can just wrap (type_id, value_hash) tuples (currently same size as a u128).

What alternative(s) have you considered?

Same solution as above, except initialize the map as a static variable, e.g. using a OnceCell, and have the descriptor methods access that. It's fine for a system label value to have the same ID, program-wide.
Revert the changes. Slightly faster schedule construction isn't all that important if it only happens once or twice.
Stay the course and add some magic macro attributes to enable some common patterns.

The text was updated successfully, but these errors were encountered:

JoJoJet · 2022-08-03T19:43:30Z

Just initialize the map as a static variable, e.g. using a OnceCell, and have our descriptor methods use that.

This is possible under #5377, and it could be made much easier (and derivable) with some design changes. Most label types are incredibly simple and don't need boxing, so we should only box the types that actually need it IMO.

maniwani · 2022-08-03T20:33:44Z

This is possible under #5377.

#5377 continues #4957 though, doesn't it?

I want to return to a state where different values mean different labels, without macro attributes, etc. I think it's fine if label types have to be POD, but that's an irrelevant detail.

Alternative 1 is the same idea as the rest of the post, except we wouldn't pass around a reference to the map, it would just be a global variable.

Like, instead of trying to make the ID type handle everything awkwardly, I would prefer we intern one Box<dyn SystemLabel> for each label (but just one), so the ID can just be an ID, and we can lookup the boxed trait object if we need to print something.

// the `u64` here would be a hash of the value
pub struct SystemLabelId(TypeId, u64);`

JoJoJet · 2022-08-03T23:43:03Z

#5377 continues #4957 though, doesn't it?

If it solves the problem, I don't see why that's a concern.

Alternative 1 is the same idea as the rest of the post, except we wouldn't pass around a reference to the map, it would just be a global variable.

We don't need anything complex like that.

Here's an example of a basic label that we should all be familiar with. Since it's so basic, heap allocations aren't necessary. I believe that the vast majority of labels will look like this.

#[derive(SystemLabel)]
pub struct MyLabel;

Here's an (approximate) example of a more-complex label, which must be stored on the heap. We can do that without having to change the design of the label traits.

pub struct OurLabel(Vec<i32>);

static MAP: RwLock<HashMap<u64, Vec<i32>>> = stuff();

impl SystemLabel for OurLabel {
    fn data(&self) -> u64 {
        let hash = calculate_hash(&self.0);
        MAP.write().entry(hash).or_insert_with(|| self.0.clone());
        hash
    }
    fn fmt(data: u64, f: &mut Formatter) -> fmt::Result {
        let val = MAP.read().get(&data)?;
        write!(f, "{val:?}")
    }
}
// I believe this can be made even simpler even, hashing ID's might not be necessary.
// I'd have to iterate.

We don't need to force every label to allocate, we can do it as an implementation detail for some types. This isn't currently supported by the derive macro, but it can be.

Does this solve your problem?

Edit: Downcasting is even possible, and ergonomic.

JoJoJet · 2023-01-07T01:49:40Z

Gonna try and revive this thread -- I think we should consider making a decision on this issue. We need to resolve the current usability problems with labels, but should we improve upon #4957, or should we revert it?

Reverting it would make deriving labels really annoying again (#[derive(SystemLabel, Clone, PartialEq, Eq, Hash)]), but it would also remove all of the current limitations. If we want to keep #4957, we could try to make improvements to gradually lift limitations, but we could never lift them entirely.

I think either direction is acceptable.

maniwani · 2023-01-29T17:50:26Z

I'm thinking we should do this for all label traits:

Make them require Debug.
Use the u64 returned by DynHash (plus TypeId if there's risk of collision) to distinguish between labels.
Leak and intern the Debug string as a Cow<'static, str> in a static map to avoid all the boxing.

Something like this:

/// `TypeId` + output of `DynHash` given the `DefaultHasher`.
/// Is different for every value.
struct RawValueHash(TypeId, u64);

/// Lightweight + printable identifier.
pub struct MyLabelTraitId {
    hash: RawValueHash,
    str: &'static str, // Can also use the hash to pull from the map instead of caching it here.
}

Basically a mix of stuff we've talked about before. We'd allocate one string for every label (upon first use), except in cases where we already have a &'static str (e.g. system type names). The hashing can be a blanket impl and the interning can be a default impl, so the derivable label traits just have to provide a method for sticking them in strongly-typed wrappers.

maniwani · 2023-01-29T18:41:00Z

interning can be a default impl

Like this, I believe.

type Interner = RwLock<HashMap<RawValueHash, Cow<'static, str>>>;
static INTERNER: OnceCell<Interner> = OnceCell::new();

pub trait SystemSet: DynHash + Debug {
    fn id(&self) -> SystemSetId {       
        let key = self.hash();
        let mut map = INTERNER.get_or_init(|| default()).write().unwrap();
        let str = map
            .entry(key)
            .or_insert_with(|| {
                // Things like system-type sets can implement this differently to skip the heap allocation.
                let string = format!("{:?}", self);
                let str: &'static mut str = string.into_boxed_str().leak();
                str.into()
            })
            .deref();
            
        SystemSetId::new(key, str)
    }
}

Like I think minimizing heap allocations is a win even if there is still one per label. If done this way, manually implementing the trait is almost never required. All of our label traits can even share this one map. It sucks that the derive ends up being long, but I'd rather that than trying to workaround limitations.

Is it possible to have an attribute macro that fills in the missing derives?

JoJoJet · 2023-01-30T03:34:35Z

Is it possible to have an attribute macro that fills in the missing derives?

We could just make #[derive(SystemSet)] generate a basic Hash impl in addition to the SystemSet impl. That derive should be very trivial to include, although we'd probably want a way of opting out from deriving any extra traits.

# Objective First of all, this PR took heavy inspiration from #7760 and #5715. It intends to also fix #5569, but with a slightly different approach. This also fixes #9335 by reexporting `DynEq`. ## Solution The advantage of this API is that we can intern a value without allocating for zero-sized-types and for enum variants that have no fields. This PR does this automatically in the `SystemSet` and `ScheduleLabel` derive macros for unit structs and fieldless enum variants. So this should cover many internal and external use cases of `SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory will be allocated. - The interning returns a `Interned<dyn SystemSet>`, which is just a wrapper around a `&'static dyn SystemSet`. - `Hash` and `Eq` are implemented in terms of the pointer value of the reference, similar to my first approach of anonymous system sets in #7676. - Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`. - The debug output of `Interned<T>` is the same as the interned value. Edit: - `AppLabel` is now also interned and the old `derive_label`/`define_label` macros were replaced with the new interning implementation. - Anonymous set ids are reused for different `Schedule`s, reducing the amount of leaked memory. ### Pros - `InternedSystemSet` and `InternedScheduleLabel` behave very similar to the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied without an allocation. - Many use cases don't allocate at all. - Very fast lookups and comparisons when using `InternedSystemSet` and `InternedScheduleLabel`. - The `intern` module might be usable in other areas. - `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement `{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics. ### Cons - Implementors of `SystemSet` and `ScheduleLabel` still need to implement `Hash` and `Eq` (and `Clone`) for it to work. ## Changelog ### Added - Added `intern` module to `bevy_utils`. - Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`. ### Changed - Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with `InternedSystemSet` and `InternedScheduleLabel`. - Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`. - Replaced `AppLabelId` with `InternedAppLabel`. - Changed `AppLabel` to use `Debug` for error messages. - Changed `AppLabel` to use interning. - Changed `define_label`/`derive_label` to use interning. - Replaced `define_boxed_label`/`derive_boxed_label` with `define_label`/`derive_label`. - Changed anonymous set ids to be only unique inside a schedule, not globally. - Made interned label types implement their label trait. ### Removed - Removed `define_boxed_label` and `derive_boxed_label`. ## Migration guide - Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with `InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`. - Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with `InternedSystemSet` or `Interned<dyn SystemSet>`. - Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn AppLabel>`. - Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet` need to implement: - `dyn_hash` directly instead of implementing `DynHash` - `as_dyn_eq` - Pass labels to `World::try_schedule_scope`, `World::schedule_scope`, `World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`, `Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and `Schedules::get_mut` by value instead of by reference. --------- Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>

# Objective First of all, this PR took heavy inspiration from bevyengine#7760 and bevyengine#5715. It intends to also fix bevyengine#5569, but with a slightly different approach. This also fixes bevyengine#9335 by reexporting `DynEq`. ## Solution The advantage of this API is that we can intern a value without allocating for zero-sized-types and for enum variants that have no fields. This PR does this automatically in the `SystemSet` and `ScheduleLabel` derive macros for unit structs and fieldless enum variants. So this should cover many internal and external use cases of `SystemSet` and `ScheduleLabel`. In these optimal use cases, no memory will be allocated. - The interning returns a `Interned<dyn SystemSet>`, which is just a wrapper around a `&'static dyn SystemSet`. - `Hash` and `Eq` are implemented in terms of the pointer value of the reference, similar to my first approach of anonymous system sets in bevyengine#7676. - Therefore, `Interned<T>` does not implement `Borrow<T>`, only `Deref`. - The debug output of `Interned<T>` is the same as the interned value. Edit: - `AppLabel` is now also interned and the old `derive_label`/`define_label` macros were replaced with the new interning implementation. - Anonymous set ids are reused for different `Schedule`s, reducing the amount of leaked memory. ### Pros - `InternedSystemSet` and `InternedScheduleLabel` behave very similar to the current `BoxedSystemSet` and `BoxedScheduleLabel`, but can be copied without an allocation. - Many use cases don't allocate at all. - Very fast lookups and comparisons when using `InternedSystemSet` and `InternedScheduleLabel`. - The `intern` module might be usable in other areas. - `Interned{ScheduleLabel, SystemSet, AppLabel}` does implement `{ScheduleLabel, SystemSet, AppLabel}`, increasing ergonomics. ### Cons - Implementors of `SystemSet` and `ScheduleLabel` still need to implement `Hash` and `Eq` (and `Clone`) for it to work. ## Changelog ### Added - Added `intern` module to `bevy_utils`. - Added reexports of `DynEq` to `bevy_ecs` and `bevy_app`. ### Changed - Replaced `BoxedSystemSet` and `BoxedScheduleLabel` with `InternedSystemSet` and `InternedScheduleLabel`. - Replaced `impl AsRef<dyn ScheduleLabel>` with `impl ScheduleLabel`. - Replaced `AppLabelId` with `InternedAppLabel`. - Changed `AppLabel` to use `Debug` for error messages. - Changed `AppLabel` to use interning. - Changed `define_label`/`derive_label` to use interning. - Replaced `define_boxed_label`/`derive_boxed_label` with `define_label`/`derive_label`. - Changed anonymous set ids to be only unique inside a schedule, not globally. - Made interned label types implement their label trait. ### Removed - Removed `define_boxed_label` and `derive_boxed_label`. ## Migration guide - Replace `BoxedScheduleLabel` and `Box<dyn ScheduleLabel>` with `InternedScheduleLabel` or `Interned<dyn ScheduleLabel>`. - Replace `BoxedSystemSet` and `Box<dyn SystemSet>` with `InternedSystemSet` or `Interned<dyn SystemSet>`. - Replace `AppLabelId` with `InternedAppLabel` or `Interned<dyn AppLabel>`. - Types manually implementing `ScheduleLabel`, `AppLabel` or `SystemSet` need to implement: - `dyn_hash` directly instead of implementing `DynHash` - `as_dyn_eq` - Pass labels to `World::try_schedule_scope`, `World::schedule_scope`, `World::try_run_schedule`. `World::run_schedule`, `Schedules::remove`, `Schedules::remove_entry`, `Schedules::contains`, `Schedules::get` and `Schedules::get_mut` by value instead of by reference. --------- Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com> Co-authored-by: Carter Anderson <mcanders1@gmail.com>

Weibye removed the S-Needs-Triage This issue needs to be labelled label Aug 4, 2022

maniwani mentioned this issue Feb 20, 2023

Use string interning to avoid boxing labels #7760

Closed

geieredgar mentioned this issue Feb 20, 2023

Replace all labels with interned labels #7762

Merged

garryod mentioned this issue Mar 24, 2023

Provide abstraction for a channel johanhelsing/matchbox#160

Merged

cart closed this as completed in #7762 Oct 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revisit labels again #5569

Revisit labels again #5569

maniwani commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Aug 3, 2022 •

edited

Loading

maniwani commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Jan 7, 2023 •

edited

Loading

maniwani commented Jan 29, 2023 •

edited

Loading

maniwani commented Jan 29, 2023 •

edited

Loading

JoJoJet commented Jan 30, 2023 •

edited

Loading

Revisit labels again #5569

Revisit labels again #5569

Comments

maniwani commented Aug 3, 2022 • edited Loading

What problem does this solve or what need does it fill?

What solution would you like?

What alternative(s) have you considered?

JoJoJet commented Aug 3, 2022 • edited Loading

maniwani commented Aug 3, 2022 • edited Loading

JoJoJet commented Aug 3, 2022 • edited Loading

JoJoJet commented Jan 7, 2023 • edited Loading

maniwani commented Jan 29, 2023 • edited Loading

maniwani commented Jan 29, 2023 • edited Loading

JoJoJet commented Jan 30, 2023 • edited Loading

maniwani commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Aug 3, 2022 •

edited

Loading

maniwani commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Aug 3, 2022 •

edited

Loading

JoJoJet commented Jan 7, 2023 •

edited

Loading

maniwani commented Jan 29, 2023 •

edited

Loading

maniwani commented Jan 29, 2023 •

edited

Loading

JoJoJet commented Jan 30, 2023 •

edited

Loading