Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CtxRef::clone is a footgun #1005

Closed
danielkeller opened this issue Dec 27, 2021 · 10 comments · Fixed by #1050
Closed

CtxRef::clone is a footgun #1005

danielkeller opened this issue Dec 27, 2021 · 10 comments · Fixed by #1050

Comments

@danielkeller
Copy link
Contributor

The docs for CtxRef prominently mention that it's an Arc and that it's cloneable. But starting a new frame mutates, not the context, but the CtxRef, invalidating prior clones. I'm sure there's a good reason for this, but in most libraries that give cloneable "handle" types (eg winit, wgpu, etc), mutating methods only affect the thing the handle points to, not the handle itself. I think the API and docs shouldn't suggest that CtxRef and Context are handles, or at least mention that they're handles only for the duration of one frame.

@emilk
Copy link
Owner

emilk commented Dec 28, 2021

You are right in pointing this out - I thought I had documented it, but looking at it now it seems I have forgotten it.

The reason for it is performance and ergonomics. Everything in Context that changes needs to be protected by a mutex. Some things (input and fonts) doesn't change over a frame, so by using a "generational" Context we can avoid having a mutex on them. A mutex has a performance cost, but also an ergonomic cost, as one needs to take care of the lifetime of the lock and avoid double-locks.

There is perhaps some other design that can get the best of two worlds.

@danielkeller
Copy link
Contributor Author

I was actually thinking about the use of mutexes as well. Specifically, without multi_threaded set, the library is thread-safe in that it doesn't have UB if the Context is shared, but it's not semantically correct to share it; any concurrent use causes the AtomicRefCell to panic. I think that if the API makes sense with interior mutability, it would be simpler to embrace the interior mutability: Put everything in one big RefCell, borrow mutably in the implementation, and be !Sync. If the user wants to be thread safe, they just have to wrap it in their own mutex. Alternatively, if a mutable API makes sense, users can "bring their own" interior mutability with RefCell or mutex.
If I have some time, I'll pull the repo and let you know if I have any ideas.

@danielkeller
Copy link
Contributor Author

I tried it out and came up with this, which makes Context a plain struct and moves the interior mutability into CtxRef. The result is not terribly un-ergonomic I think. Hopefully, these ideas are helpful in some way.

@emilk
Copy link
Owner

emilk commented Dec 30, 2021

Nice! How does cargo bench compare before/after?

@danielkeller
Copy link
Contributor Author

I added another commit that properly integrates it with the thread safety feature flags. The cargo bench result is:

Gnuplot not found, using plotters backend
demo_with_tessellate__realistic                                                                            
                        time:   [299.36 us 300.16 us 301.17 us]
                        change: [-1.6892% -1.1828% -0.6514%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

demo_no_tessellate      time:   [153.54 us 153.96 us 154.43 us]                               
                        change: [-3.2983% -2.4609% -1.7028%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 17 outliers among 100 measurements (17.00%)
  6 (6.00%) low mild
  6 (6.00%) high mild
  5 (5.00%) high severe

demo_only_tessellate    time:   [141.72 us 143.14 us 145.63 us]                                 
                        change: [+1.1797% +2.3508% +4.0084%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high severe

label &str              time:   [750.54 ns 806.60 ns 926.40 ns]                        
                        change: [-5.1927% +8.3103% +23.811%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

label format!           time:   [764.43 ns 769.65 ns 775.11 ns]                           
                        change: [+6.3419% +8.5020% +10.931%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe

text_layout_uncached    time:   [148.01 us 148.40 us 148.84 us]                                 
                        change: [+0.0701% +0.9198% +1.7344%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  8 (8.00%) high severe

text_layout_cached      time:   [369.29 ns 370.63 ns 372.25 ns]                               
                        change: [+0.3301% +1.2641% +2.2045%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) high mild
  11 (11.00%) high severe

tessellate_text         time:   [13.662 us 13.775 us 13.899 us]                             
                        change: [+1.1666% +2.4728% +3.6767%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild

@emilk
Copy link
Owner

emilk commented Jan 3, 2022

Are the benchmarks change:s the difference compared to master?

@emilk
Copy link
Owner

emilk commented Jan 3, 2022

I created a draft PR for easier reviewing of your progress!

@emilk
Copy link
Owner

emilk commented Jan 6, 2022

I've run the benchmark a few times, and the numbers look good. I like this approach! I think I gave up on it because I didn't manage to find MappedRwLockWriteGuard and friends.

When I tested I found two issues:

  • cargo run -p egui_demo_app --release hangs, likely some deadlock in fn warm_up
  • cd egui_demo_app && cargo run doesn't compile

I also think the Arc -> Rc change could be a separate PR to keep the diff down. It seems orthogonal to the rest of the effort, and could be made before or after the One mutex per Context change.

Do you have more time to work on this?

@emilk
Copy link
Owner

emilk commented Jan 8, 2022

Switching from Arc to Rc is a good idea (and atomic_refcell -> Cell), but requires a lot more to change. For instance, we also need to make all the egui code be able to handle non-Sync in e.g. id_type_map.rs, which is a lot of head-ache. In particular, it needs to be Sync when the multi_threaded feature is on, but not be Sync when it is off.

That change is also not an ergonomic feature, but an optimization, so should be a separate PR/issue.

@emilk
Copy link
Owner

emilk commented Jan 8, 2022

I'm continuing this work in #1050 and will try to get it mergable asap!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants