Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some tests sometimes randomly fail #4996

Closed
LoipesMas opened this issue Jun 12, 2022 · 9 comments
Closed

Some tests sometimes randomly fail #4996

LoipesMas opened this issue Jun 12, 2022 · 9 comments
Labels
A-Build-System Related to build systems or continuous integration A-ECS Entities, components, systems, and events C-Bug An unexpected or incorrect behavior P-Crash A sudden unexpected crash S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it!

Comments

@LoipesMas
Copy link
Contributor

LoipesMas commented Jun 12, 2022

Bevy version

f969c62

What you did

Run cargo test (in crates/bevy_transform)

What went wrong

Sometimes (~10% of the time) test systems::test::correct_transforms_when_no_children or systems::test::did_propagate_command_buffer fail.

Additional information

Output:

---- systems::test::correct_transforms_when_no_children stdout ----
thread 'systems::test::correct_transforms_when_no_children' panicked at 'assertion failed: `(left == r
ight)`
  left: `1v0`,
 right: `2v0`: Malformed hierarchy. This probably means that your hierarchy has been improperly mainta
ined, or contains a cycle', crates/bevy_transform/src/systems.rs:69:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    systems::test::correct_transforms_when_no_children

Same assert fails for systems::test::did_propagate_command_buffer.
Considering it happens randomly it's probably a race-condition.

I'm on Linux, but it also happens on GitHub CI (see #4994), so I assume my system information is not relevant.

Also, during my testing it happened ~7 times for correct_transforms_when_no_children and once for systems::test::did_propagate_command_buffer.

@LoipesMas LoipesMas added C-Bug An unexpected or incorrect behavior S-Needs-Triage This issue needs to be labelled labels Jun 12, 2022
@LoipesMas
Copy link
Contributor Author

Some more testing:

  • Running only, for example, correct_transforms_when_no_children hasn't failed.
  • Running cargo test --jobs 1 can fail.

@bjorn3 bjorn3 added A-ECS Entities, components, systems, and events P-Crash A sudden unexpected crash S-Needs-Investigation This issue requires detective work to figure out what's going wrong and removed S-Needs-Triage This issue needs to be labelled labels Jun 12, 2022
@alice-i-cecile alice-i-cecile added the A-Build-System Related to build systems or continuous integration label Jun 12, 2022
@mockersf
Copy link
Member

git bisect seems to point to #2250 as causing the issue

@alice-i-cecile
Copy link
Member

Ping @james7132 and @hymm; any ideas?

@alice-i-cecile alice-i-cecile added this to the Bevy 0.8 milestone Jun 12, 2022
@hymm
Copy link
Contributor

hymm commented Jun 12, 2022

how do globals work in cargo tests? Does each test get it's own instance or are the sharing them?

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 12, 2022

They are shared as tests run in the same process on unique threads.

@hymm
Copy link
Contributor

hymm commented Jun 12, 2022

it looks like the tests don't explicitly order transform_propagate_system and parent_update, but the plugins do. Maybe they would run in the added order normally, but with #2250 the executors get starved and they can run in the reverse order sometimes.

@alice-i-cecile alice-i-cecile removed this from the Bevy 0.8 milestone Jun 12, 2022
@alice-i-cecile alice-i-cecile added S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it! and removed S-Needs-Investigation This issue requires detective work to figure out what's going wrong labels Jun 12, 2022
@hymm
Copy link
Contributor

hymm commented Jun 12, 2022

hmm, adding an explicit ordering didn't fix the problem.

I think it has something to do with the panicking test. If I #[ignore] that test the problem seems to go away. Maybe if one of the other tests tasks gets onto the local queue of the thread that panics that causes a problem? I'm not sure what will happen to the task in that scenario.

@hymm
Copy link
Contributor

hymm commented Jun 12, 2022

opened #4998 to fix this. @LoipesMas can you test the PR and see if it fixes things for you?

@LoipesMas
Copy link
Contributor Author

Yep, that fixed it!

@bors bors bot closed this as completed in 3217f21 Jun 20, 2022
robtfm added a commit to robtfm/bevy that referenced this issue Jun 22, 2022
commit 114d169dcec0e426e7815f184c0f85e67713c95b
Author: Robert Swain <robert.swain@gmail.com>
Date:   Tue Jun 21 20:50:06 2022 +0000

    Callable PBR functions (#4939)

    # Objective

    - Builds on top of #4938
    - Make clustered-forward PBR lighting/shadows functionality callable
    - See #3969 for details

    ## Solution

    - Add `PbrInput` struct type containing a `StandardMaterial`, occlusion, world_position, world_normal, and frag_coord
    - Split functionality to calculate the unit view vector, and normal-mapped normal into `bevy_pbr::pbr_functions`
    - Split high-level shading flow into `pbr(in: PbrInput, N: vec3<f32>, V: vec3<f32>, is_orthographic: bool)` function in `bevy_pbr::pbr_functions`
    - Rework `pbr.wgsl` fragment stage entry point to make use of the new functions
    - This has been benchmarked on an M1 Max using `many_cubes -- sphere`. `main` had a median frame time of 15.88ms, this PR 15.99ms, which is a 0.69% frame time increase, which is within noise in my opinion.

    ---

    ## Changelog

    - Added: PBR shading code is now callable. Import `bevy_pbr::pbr_functions` and its dependencies, create a `PbrInput`, calculate the unit view and normal-mapped normal vectors and whether the projection is orthographic, and call `pbr()`!

commit c98826418027362e6f58fae47ad18b9df7ab2df5
Author: James Liu <contact@jamessliu.com>
Date:   Tue Jun 21 20:35:26 2022 +0000

    Mark mutable APIs under ECS storage as pub(crate) (#5065)

    # Objective
    Closes #1557. Partially addresses #3362.

    Cleanup the public facing API for storage types. Most of these APIs are difficult to use safely when directly interfacing with these types, and is also currently impossible to interact with in normal ECS use as there is no `World::storages_mut`. The majority of these types should be easy enough to read, and perhaps mutate the contents, but never structurally altered without the same checks in the rest of bevy_ecs code. This both cleans up the public facing types and helps use unused code detection to remove a few of the APIs we're not using internally.

    ## Solution

     - Mark all APIs that take `&mut T` under `bevy_ecs::storage` as `pub(crate)` or `pub(super)`
     - Cleanup after it all.

    Entire type visibility changes:

     - `BlobVec` is `pub(super)`, only storage code should be directly interacting with it.
     - `SparseArray` is now `pub(crate)` for the entire type. It's an implementation detail for `Table` and `(Component)SparseSet`.
     - `TableMoveResult` is now `pub(crate)

    ---

    ## Changelog
    TODO

    ## Migration Guide
    Dear God, I hope not.

commit 389df183433c85520f7aeda7a46e6eb26859fd72
Author: James Liu <contact@jamessliu.com>
Date:   Tue Jun 21 18:10:27 2022 +0000

    Change check_visibility to use thread-local queues instead of a channel (#4663)

    # Objective
    Further speed up visibility checking by removing the main sources of contention for the system.

    ## Solution
     - ~~Make `ComputedVisibility` a resource wrapping a `FixedBitset`.~~
     - ~~Remove `ComputedVisibility` as a component.~~

    ~~This adds a one-bit overhead to every entity in the app world. For a game with 100,000 entities, this is 12.5KB of memory. This is still small enough to fit entirely in most L1 caches. Also removes the need for a per-Entity change detection tick. This reduces the memory footprint of ComputedVisibility 72x.~~

    ~~The decreased memory usage and less fragmented memory locality should provide significant performance benefits.~~

    ~~Clearing visible entities should be significantly faster than before:~~

     - ~~Setting one `u32` to 0 clears 32 entities per cycle.~~
     - ~~No archetype fragmentation to contend with.~~
     - ~~Change detection is applied to the resource, so there is no per-Entity update tick requirement.~~

    ~~The side benefit of this design is that it removes one more "computed component" from userspace.  Though accessing the values within it are now less ergonomic.~~

    This PR changes `crossbeam_channel` in `check_visibility` to use a `Local<ThreadLocal<Cell<Vec<Entity>>>` to mark down visible entities instead.

    Co-Authored-By: TheRawMeatball <therawmeatball@gmail.com>
    Co-Authored-By: Aevyrie <aevyrie@gmail.com>

commit 511bcc963335314c9f655c49faa273b43bac02b6
Author: Federico Rinaldi <gisquerin@gmail.com>
Date:   Tue Jun 21 15:29:22 2022 +0000

    Improve entity and component API docs (#4767)

    # Objective

    The descriptions included in the API docs of `entity` module, `Entity` struct, and `Component` trait have some issues:
    1. the concept of entity is not clearly defined,
    2. descriptions are a little bit out of place,
    3. in a case the description leak too many details about the implementation,
    4. some descriptions are not exhaustive,
    5. there are not enough examples,
    6. the content can be formatted in a much better way.

    ## Solution

    1. ~~Stress the fact that entity is an abstract and elementary concept. Abstract because the concept of entity is not hardcoded into the library but emerges from the interaction of `Entity` with every other part of `bevy_ecs`, like components and world methods. Elementary because it is a fundamental concept that cannot be defined with other terms (like point in euclidean geometry, or time in classical physics).~~ We decided to omit the definition of entity in the API docs ([see why]). It is only described in its relationship with components.
    2. Information has been moved to relevant places and links are used instead in the other places.
    3. Implementation details about `Entity` have been reduced.
    4. Descriptions have been made more exhaustive by stating how to obtain and use items. Entity operations are enriched with `World` methods.
    5. Examples have been added or enriched.
    6. Sections have been added to organize content. Entity operations are now laid out in a table.

    ### Todo list

    - [x] Break lines at sentence-level.

    ## For reviewers

    - ~~I added a TODO over `Component` docs, make sure to check it out and discuss it if necessary.~~ ([Resolved])
    - You can easily check the rendered documentation by doing `cargo doc -p bevy_ecs --no-deps --open`.

    [see why]: https://github.com/bevyengine/bevy/pull/4767#discussion_r875106329
    [Resolved]: https://github.com/bevyengine/bevy/pull/4767#discussion_r874127825

commit c4fc5d88f0ba2d86ff6c9a489154e4ca29666e3b
Author: Félix Lescaudey de Maneville <felix.maneville@gmail.com>
Date:   Mon Jun 20 20:32:19 2022 +0000

    Fixed bevy_ui touch input (#4099)

    # Objective

    `bevy_ui` doesn't support correctly touch inputs because of two problems in the focus system:
    - It attempts to retrieve touch input with a specific `0` id
    - It doesn't retrieve touch positions and bases its focus solely on mouse position, absent from mobile devices

    ## Solution

    I added a few methods to the `Touches` resource, allowing to check if **any** touch input was pressed, released or cancelled and to retrieve the *position* of the first pressed touch input and adapted the focus system.

    I added a test button to the *iOS* example and it works correclty on emulator. I did not test on a real touch device as:
    - Android is not working (https://github.com/bevyengine/bevy/issues/3249)
    - I don't have an iOS device

commit 30ca97e287de91b3fe6b4d5b1ce27126c1e3f73a
Author: Domi <f.d@posteo.net>
Date:   Mon Jun 20 19:06:38 2022 +0000

    Fix Nix section of linux_dependencies.md (#5050)

    # Objective

    `nix-shell` reported: ```error: 'x11' has been renamed to/replaced by 'xlibsWrapper'```.

    ## Solution

    Replacing `x11` with `xlibsWrapper` in the Nix section of linux_dependencies.md fixes the problem on my system, and bevy projects build fine.

commit 984ce3fa2289aaa680d820675f748d0858732780
Author: Hoidigan <57080125+Hoidigan@users.noreply.github.com>
Date:   Mon Jun 20 18:31:46 2022 +0000

    Add `Input::reset_all`  (#5015)

    Adds a `reset_all` method to reset `pressed`, `just_pressed`, and `just_released` on the `Input`.

    Fixes #3383

commit 9089c8b73e26c1d9f1054d23d9f762e49e04b1d9
Author: Mark Lodato <mlodato517@gmail.com>
Date:   Mon Jun 20 18:04:31 2022 +0000

    Fix redundant "have" in CONTRIBUTING (#5036)

    **This Commit**

    1. Makes it so the sentence doesn't read "are contributors who have Have
       actively ..."
    2. Makes it so all three bullet points end in punctuation

    **Notes**
    Could also remove the leading "Have" from all bullet points and leave it
    on the previous sentence. That's the least redundant but I guess this is
    more flexible if we want to add a sentence that doesn't start with
    "Have" later.

commit 515c8a3f505e43b29ef6ef584cbd26d62cb357ed
Author: Mark Lodato <mlodato517@gmail.com>
Date:   Mon Jun 20 18:04:29 2022 +0000

    Update `clap` to 3.2 in tools using `value_parser` (#5031)

    **Why?**
    The `value_parser` `clap` attribute was added in
    [version 3.2.0][0]. With the current version of `3.1.12` users can get
    errors like:

    ```
    error: unexpected attribute: value_parser
      --> tools/spancmp/src/main.rs:18:25
       |
    18 |     #[clap(short, long, value_parser, default_value_t = 0.0)]
       |                         ^^^^^^^^^^^^
    ```

    See https://github.com/bevyengine/bevy/pull/4944#issuecomment-1157704785 for more details.

    [0]: https://github.com/clap-rs/clap/blob/master/CHANGELOG.md#320---2022-06-13

commit 2ec5ff9652297ec29d8bb502fe2ac24c7e2fefe6
Author: Hoidigan <57080125+Hoidigan@users.noreply.github.com>
Date:   Mon Jun 20 17:35:56 2022 +0000

    Add a `release_all` function to `Input`. (#5011)

    Adds a `release_all` function to `Input` that releases all of the currently pressed inputs and marks them as just released.

commit 3217f216aaf18e855ad925079098c6c0535cf55d
Author: Mike <mike.hsu@gmail.com>
Date:   Mon Jun 20 17:35:55 2022 +0000

    change panicking test to not run on global task pool (#4998)

    # Objective

    - Fixes #4996

    ## Solution

    - Panicking on the global task pool is probably bad. This changes the panicking test to use a single threaded stage to run the test instead.
    - I checked the other #[should_panic]
    - I also added explicit ordering between the transform propagate system and the parent update system. The ambiguous ordering didn't seem to be causing problems, but the tests are probably more correct this way. The plugins that add these systems have an explicit ordering. I can remove this if necessary.

    ## Note

    I don't have a 100% mental model of why panicking is causing intermittent failures. It probably has to do with a task for one of the other tests landing on the panicking thread when it actually panics. Why this causes a problem I'm not sure, but this PR seems to fix things.

    ## Open questions

    - there are some other #[should_panic] tests that run on the task pool in stage.rs. I don't think we restart panicked threads, so this might be killing most of the threads on the pool. But since they're not causing test failures, we should probably decide what to do about that separately. The solution in this PR won't work since those tests are explicitly testing parallelism.

commit 92ea73036225ae47322b99577542afdaa58517d3
Author: JoJoJet <joe102000@gmail.com>
Date:   Mon Jun 20 17:35:54 2022 +0000

    Add benchmarks for schedule dependency resolution (#4961)

    # Objective

    - Add benchmarks to test the performance of `Schedule`'s system dependency resolution.

    ## Solution

    - Do a series of benchmarks while increasing the number of systems in the schedule to see how the run-time scales.
    - Split the benchmarks into a group with no dependencies, and a group with many dependencies.

commit 218b0fd3b6edebab28d8e74498ef9ba97c88c55e
Author: Jakob Hellermann <jakob.hellermann@protonmail.com>
Date:   Mon Jun 20 17:18:58 2022 +0000

    `bevy_reflect`: put `serialize` into external `ReflectSerialize` type (#4782)

    builds on top of #4780

    # Objective

    `Reflect` and `Serialize` are currently very tied together because `Reflect` has a `fn serialize(&self) -> Option<Serializable<'_>>` method. Because of that, we can either implement `Reflect` for types like `Option<T>` with `T: Serialize` and have `fn serialize` be implemented, or without the bound but having `fn serialize` return `None`.

    By separating `ReflectSerialize` into a separate type (like how it already is for `ReflectDeserialize`, `ReflectDefault`), we could separately `.register::<Option<T>>()` and `.register_data::<Option<T>, ReflectSerialize>()` only if the type `T: Serialize`.

    This PR does not change the registration but allows it to be changed in a future PR.

    ## Solution

    - add the type
    ```rust
    struct ReflectSerialize { .. }
    impl<T: Reflect + Serialize> FromType<T> for ReflectSerialize { .. }
    ```

    - remove `#[reflect(Serialize)]` special casing.

    - when serializing reflect value types, look for `ReflectSerialize` in the `TypeRegistry` instead of calling `value.serialize()`

commit bb1d5248339d7a8508c9f794a03b2553e61d17cb
Author: François <mockersf@gmail.com>
Date:   Mon Jun 20 17:02:25 2022 +0000

    Cleanups in diagnostics (#3871)

    - changed `EntityCountDiagnosticsPlugin` to not use an exclusive system to get its entity count
    - removed mention of `WgpuResourceDiagnosticsPlugin` in example `log_diagnostics` as it doesn't exist anymore
    - added ability to enable, disable ~~or toggle~~ a diagnostic (fix #3767)
    - made diagnostic values lazy, so they are only computed if the diagnostic is enabled
    - do not log an average for diagnostics with only one value
    - removed `sum` function from diagnostic as it isn't really useful
    - ~~do not keep an average of the FPS diagnostic. it is already an average on the last 20 frames, so the average FPS was an average of the last 20 frames over the last 20 frames~~
    - do not compute the FPS value as an average over the last 20 frames but give the actual "instant FPS"
    - updated log format to use variable capture
    - added some doc
    - the frame counter diagnostic value can be reseted to 0

commit 9095d2fb31114f36e6e00b3ae4f8d6fbd88ee5ac
Author: Aevyrie <aevyrie@gmail.com>
Date:   Mon Jun 20 11:19:58 2022 +0000

    Physical viewport calculation fix (#5055)

    # Objective

    - Fixes early return when viewport is not set. This now matches the description of the function.

    ## Solution

    - Remove errant try `?`.

commit 8e8cbcc623b1e998788d70ac701ea9c50dc7c3c4
Author: François <mockersf@gmail.com>
Date:   Mon Jun 20 10:32:44 2022 +0000

    gltf: do not import IoTaskPool in wasm (#5038)

    # Objective

    - Remove a warning when building for wasm

    ## Solution

    - Do not import the dependency when building for wasm

commit d717c63d341739433755a0858e7a1b4bc575062f
Author: François <mockersf@gmail.com>
Date:   Mon Jun 20 10:32:43 2022 +0000

    enable optional dependencies to stay optional (#5023)

    # Objective

    - Optional dependencies were enabled by some features as a side effect. for example, enabling the `webgl` feature enables the `bevy_pbr` optional dependency

    ## Solution

    - Use the syntax introduced in rust 1.60 to specify weak dependency features: https://blog.rust-lang.org/2022/04/07/Rust-1.60.0.html#new-syntax-for-cargo-features

    > Weak dependency features tackle the second issue where the `"optional-dependency/feature-name"` syntax would always enable `optional-dependency`. However, often you want to enable the feature on the optional dependency only if some other feature has enabled the optional dependency. Starting in 1.60, you can add a ? as in `"package-name?/feature-name"` which will only enable the given feature if something else has enabled the optional dependency.

commit 5dbb178d5d4459009c260293db0a7780f3e9b803
Author: Corey Farwell <coreyf@rwell.org>
Date:   Sun Jun 19 16:53:49 2022 +0000

    Implement `Eq` and `PartialEq` for `MouseScrollUnit` (#5048)

commit 8b27124a801d83a1e92f1136ebd2bc9752c3e9d7
Author: François <mockersf@gmail.com>
Date:   Sat Jun 18 07:41:54 2022 +0000

    WGSL: use correct syntax for matrix access (#5039)

    # Objective

    - `.x` is not the correct syntax to access a column in a matrix in WGSL: https://www.w3.org/TR/WGSL/#matrix-access-expr
    - naga accepts it and translates it correctly, but it's not valid when shaders are kept as is and used directly in WGSL

    ## Solution

    - Use the correct syntax

commit a62ff657fec0023ca7d2d2daefb4573fb2814571
Author: François <mockersf@gmail.com>
Date:   Fri Jun 17 22:34:58 2022 +0000

    update hashbrown to 0.12 (#5035)

    # Objective

    - Update hashbrown to 0.12

    ## Solution

    - Replace #4004
    - As the 0.12 is already in Bevy dependency tree, it shouldn't be an issue to update
    - The exception for the 0.11 should be removed once https://github.com/zakarumych/gpu-descriptor/pull/21 is merged and released
    - Also removed a few exceptions that weren't needed anymore

commit caa61c5fb7990a816766d0797a81a8a9ac07b3e8
Author: Robert Swain <robert.swain@gmail.com>
Date:   Fri Jun 17 00:14:02 2022 +0000

    bevy_render: Fix KTX2 UASTC format mapping (#4569)

    # Objective

    - KTX2 UASTC format mapping was incorrect. For some reason I had written it to map to a set of data formats based on the count of KTX2 sample information blocks, but the mapping should be done based on the channel type in the sample information.
    - This is a valid change pulled out from #4514 as the attempt to fix the array textures there was incorrect

    ## Solution

    - Fix the KTX2 UASTC `DataFormat` enum to contain the correct formats based on the channel types in section 3.10.2 of https://github.khronos.org/KTX-Specification/ (search for "Basis Universal UASTC Format")
    - Correctly map from the sample information channel type to `DataFormat`
    - Correctly configure transcoding and the resulting texture format based on the `DataFormat`

    ---

    ## Changelog

    - Fixed: KTX2 UASTC format handling

commit 14ed3b30cbbc4ab51de8a7fafce6baff1eb62b73
Author: Arnav Choubey <56453634+x-52@users.noreply.github.com>
Date:   Thu Jun 16 13:20:37 2022 +0000

    Add documentation comments to `bevy_window` (#4333)

    # Objective
    - Add documentation comments and `#![warn(missing_docs)]` to `bevy_window`.
    - Part of #3492

commit ab72c8368f542c516a494069752b369ba6c2d131
Author: François <mockersf@gmail.com>
Date:   Wed Jun 15 19:18:53 2022 +0000

    Fix ron deprecation (#5021)

    # Objective

    - Update to fix `ron` deprecation

commit 32cd9899c84bb42b3b560f3e3b88102babc8eec0
Author: Ben Reeves <benwolverine2019@gmail.com>
Date:   Wed Jun 15 06:29:52 2022 +0000

    bevy_render: Add `attributes` and `attributes_mut` methods to `Mesh`. (#3927)

    # Use Case

    Seems generally useful, but specifically motivated by my work on the [`bevy_datasize`](https://github.com/BGR360/bevy_datasize) crate.

    For that project, I'm implementing "heap size estimators" for all of the Bevy internal types. To do this accurately for `Mesh`, I need to get the lengths of all of the mesh's attribute vectors.

    Currently, in order to accomplish this, I am doing the following:

    * Checking all of the attributes that are mentioned in the `Mesh` class ([see here](https://github.com/BGR360/bevy_datasize/blob/0531ec2d026085a31e937b12d5ecf4109005e737/src/builtins/render/mesh.rs#L46-L54))

    * Providing the user with an option to configure additional attributes to check ([see here](https://github.com/BGR360/bevy_datasize/blob/0531ec2d026085a31e937b12d5ecf4109005e737/src/config.rs#L7-L21))

    This is both overly complicated and a bit wasteful (since I have to check every attribute name that I know about in case there are attributes set for it).

    Co-authored-by: Carter Anderson <mcanders1@gmail.com>

commit dc950a4d2ff45cb0b35f16ec271e45bf3b6e8545
Author: Alice Cecile <alice.i.cecile@gmail.com>
Date:   Tue Jun 14 16:14:33 2022 +0000

    Fix broken `WorldCell` test (#5009)

    # Objective

    Fixes #5008. Aliasing references is allowed under Rust if and only if they are immutable.

    This logic applies to `WorldCell` as well.

commit 915fa69b666eae1d04976995534ffae969973dde
Author: Aevyrie <aevyrie@gmail.com>
Date:   Tue Jun 14 02:07:40 2022 +0000

    Parallel Frustum Culling (#4489)

    # Objective

    Working with a large number of entities with `Aabbs`, rendered with an instanced shader, I found the bottleneck became the frustum culling system. The goal of this PR is to significantly improve culling performance without any major changes. We should consider constructing a BVH for more substantial improvements.

    ## Solution

    - Convert the inner entity query to a parallel iterator with `par_for_each_mut` using a batch size of 1,024.
    - This outperforms single threaded culling when there are more than 1,000 entities.
      - Below this they are approximately equal, with <= 10 microseconds of multithreading overhead.
      - Above this, the multithreaded version is significantly faster, scaling linearly with core count.
    - In my million-entity-workload, this PR improves my framerate by 200% - 300%.

    ## log-log of `check_visibility` time vs. entities for single/multithreaded
    ![image](https://user-images.githubusercontent.com/2632925/163709007-7eab4437-e9f9-4c06-bac0-250073885110.png)

    ---

    ## Changelog

    Frustum culling is now run with a parallel query. When culling more than a thousand entities, this is faster than the previous method, scaling proportionally with the number of available cores.

commit c6222f1acc4a5d4478b5cb188f0eda24afb6546a
Author: Robert Swain <robert.swain@gmail.com>
Date:   Tue Jun 14 00:58:30 2022 +0000

    Separate out PBR lighting, shadows, clustered forward, and utils from pbr.wgsl (#4938)

    # Objective

    - Builds on top of #4901
    - Separate out PBR lighting, shadows, clustered forward, and utils from `pbr.wgsl` as part of making the PBR code more reusable and extensible.
    - See #3969 for details.

    ## Solution

    - Add `bevy_pbr::utils`, `bevy_pbr::clustered_forward`, `bevy_pbr::lighting`, `bevy_pbr::shadows` shader imports exposing many shader functions for external use
    - Split `PI`, `saturate()`, `hsv2rgb()`, and `random1D()` into `bevy_pbr::utils`
    - Split clustered-forward-specific functions into `bevy_pbr::clustered_forward`, including moving the debug visualization code into a `cluster_debug_visualization()` function in that import
    - Split PBR lighting functions into `bevy_pbr::lighting`
    - Split shadow functions into `bevy_pbr::shadows`

    ---

    ## Changelog

    - Added: `bevy_pbr::utils`, `bevy_pbr::clustered_forward`, `bevy_pbr::lighting`, `bevy_pbr::shadows` shader imports exposing many shader functions for external use
      - Split `PI`, `saturate()`, `hsv2rgb()`, and `random1D()` into `bevy_pbr::utils`
      - Split clustered-forward-specific functions into `bevy_pbr::clustered_forward`, including moving the debug visualization code into a `cluster_debug_visualization()` function in that import
      - Split PBR lighting functions into `bevy_pbr::lighting`
      - Split shadow functions into `bevy_pbr::shadows`

commit b333386271b38d96a78702172d522981004a1f1c
Author: Robert Swain <robert.swain@gmail.com>
Date:   Tue Jun 14 00:32:33 2022 +0000

    Add reusable shader functions for transforming position/normal/tangent (#4901)

    # Objective

    - Add reusable shader functions for transforming positions / normals / tangents between local and world / clip space for 2D and 3D so that they are done in a simple and correct way
    - The next step in #3969 so check there for more details.

    ## Solution

    - Add `bevy_pbr::mesh_functions` and `bevy_sprite::mesh2d_functions` shader imports
      - These contain `mesh_` and `mesh2d_` versions of the following functions:
        - `mesh_position_local_to_world`
        - `mesh_position_world_to_clip`
        - `mesh_position_local_to_clip`
        - `mesh_normal_local_to_world`
        - `mesh_tangent_local_to_world`
    - Use them everywhere where it is appropriate
      - Notably not in the sprite and UI shaders where `mesh2d_position_world_to_clip` could have been used, but including all the functions depends on the mesh binding so I chose to not use the function there
    - NOTE: The `mesh_` and `mesh2d_` functions are currently identical. However, if I had defined only `bevy_pbr::mesh_functions` and used that in bevy_sprite, then bevy_sprite would have a runtime dependency on bevy_pbr, which seems undesirable. I also expect that when we have a proper 2D rendering API, these functions will diverge between 2D and 3D.

    ---

    ## Changelog

    - Added: `bevy_pbr::mesh_functions` and `bevy_sprite::mesh2d_functions` shader imports containing `mesh_` and `mesh2d_` versions of the following functions:
      - `mesh_position_local_to_world`
      - `mesh_position_world_to_clip`
      - `mesh_position_local_to_clip`
      - `mesh_normal_local_to_world`
      - `mesh_tangent_local_to_world`

    ## Migration Guide

    - The `skin_tangents` function from the `bevy_pbr::skinning` shader import has been replaced with the `mesh_tangent_local_to_world` function from the `bevy_pbr::mesh_functions` shader import

commit 407c080e599e98274290d2fc89739ecde0303b84
Author: Boxy <supbscripter@gmail.com>
Date:   Mon Jun 13 23:35:54 2022 +0000

    Replace `ReadOnlyFetch` with `ReadOnlyWorldQuery` (#4626)

    # Objective

    - Fix a type inference regression introduced by #3001
    - Make read only bounds on world queries more user friendly

    ptrification required you to write `Q::Fetch: ReadOnlyFetch` as `for<'w> QueryFetch<'w, Q>: ReadOnlyFetch` which has the same type inference problem as `for<'w> QueryFetch<'w, Q>: FilterFetch<'w>` had, i.e. the following code would error:
    ```rust
    #[derive(Component)]
    struct Foo;

    fn bar(a: Query<(&Foo, Without<Foo>)>) {
        foo(a);
    }

    fn foo<Q: WorldQuery>(a: Query<Q, ()>)
    where
        for<'w> QueryFetch<'w, Q>: ReadOnlyFetch,
    {
    }
    ```
    `for<..>` bounds are also rather user unfriendly..

    ## Solution

    Remove the `ReadOnlyFetch` trait in favour of a `ReadOnlyWorldQuery` trait, and remove `WorldQueryGats::ReadOnlyFetch` in favor of `WorldQuery::ReadOnly` allowing the previous code snippet to be written as:
    ```rust
    #[derive(Component)]
    struct Foo;

    fn bar(a: Query<(&Foo, Without<Foo>)>) {
        foo(a);
    }

    fn foo<Q: ReadOnlyWorldQuery>(a: Query<Q, ()>) {}
    ```
    This avoids the `for<...>` bound which makes the code simpler and also fixes the type inference issue.

    The reason for moving the two functions out of `FetchState` and into `WorldQuery` is to allow the world query `&mut T` to share a `State` with the `&T` world query so that it can have `type ReadOnly = &T`. Presumably it would be possible to instead have a `ReadOnlyRefMut<T>` world query and then do `type ReadOnly = ReadOnlyRefMut<T>` much like how (before this PR) we had a `ReadOnlyWriteFetch<T>`. A side benefit of the current solution in this PR is that it will likely make it easier in the future to support an API such as `Query<&mut T> -> Query<&T>`. The primary benefit IMO is just that `ReadOnlyRefMut<T>` and its associated fetch would have to reimplement all of the logic that the `&T` world query impl does but this solution avoids that :)

    ---

    ## Changelog/Migration Guide

    The trait `ReadOnlyFetch` has been replaced with `ReadOnlyWorldQuery` along with the `WorldQueryGats::ReadOnlyFetch` assoc type which has been replaced with `<WorldQuery::ReadOnly as WorldQueryGats>::Fetch`
    - Any where clauses such as `QueryFetch<Q>: ReadOnlyFetch` should be replaced with `Q: ReadOnlyWorldQuery`.
    - Any custom world query impls should implement `ReadOnlyWorldQuery` insead of `ReadOnlyFetch`

    Functions `update_component_access` and `update_archetype_component_access` have been moved from the `FetchState` trait to `WorldQuery`
    - Any callers should now call `Q::update_component_access(state` instead of `state.update_component_access` (and `update_archetype_component_access` respectively)
    - Any custom world query impls should move the functions from the `FetchState` impl to `WorldQuery` impl

    `WorldQuery` has been made an `unsafe trait`, `FetchState` has been made a safe `trait`. (I think this is how it should have always been, but regardless this is _definitely_ necessary now that the two functions have been moved to `WorldQuery`)
    - If you have a custom `FetchState` impl make it a normal `impl` instead of `unsafe impl`
    - If you have a custom `WorldQuery` impl make it an `unsafe impl`, if your code was sound before it is going to still be sound

commit 4050c8aa310b9c482f387d10c42aa4b59f399749
Author: Simonas Kazlauskas <github@kazlauskas.me>
Date:   Mon Jun 13 22:40:29 2022 +0000

    bevy_log: upgrade to tracing-tracy 0.10.0 (#4991)

    This upgrade should bring some significant performance improvements to
    instrumentation. These are mostly achieved by disabling features (by
    default) that are likely not widely used by default – collection of
    callstacks and support for fibers that wasn't used for anything in
    particular yet. For callstack collection it might be worthwhile to
    provide a mechanism to enable this at runtime by calling
    `TracyLayer::with_stackdepth`.

    These should bring the cost of a single span down from 30+µs per span to
    a more reasonable 1.5µs or so and down to the ns scale for events (on my
    1st gen Ryzen machine, anyway.) There is still a fair amount of overhead
    over plain tracy_client instrumentation in formatting and such, but
    dealing with it requires significant effort and this is a
    straightforward improvement to have for the time being.

    Co-authored-by: Simonas Kazlauskas <git@kazlauskas.me>

commit 0560aa58932bba5035fc7ebd0cdecfaa015dbb93
Author: Mike <mike.hsu@gmail.com>
Date:   Mon Jun 13 21:51:16 2022 +0000

    Fix clap for CI (#5005)

    # Objective

    - Fix CI
    - relevant clap issue https://github.com/clap-rs/clap/issues/3822

    ## Solution

    - slap `value_parser` on all the clap derives. This tells clap to use the default parser for the type.

commit b7d784de6e27c46cc42231e1dea94c3ba4de9463
Author: Chris Dawkins <dawkins.chris.dev@gmail.com>
Date:   Sun Jun 12 19:34:26 2022 +0000

    Bugfix `State::set` transition condition infinite loop (#4890)

    # Objective

    - Fixes #4271

    ## Solution

    - Check for a pending transition in addition to a scheduled operation.
    - I don't see a valid reason for updating the state unless both `scheduled` and `transition` are empty.

commit 5a09694dec4057e4c53351ef4de3fccf2252aaf3
Author: ickshonpe <david.curthoys@googlemail.com>
Date:   Sun Jun 12 19:14:48 2022 +0000

    Overflow::Hidden doesn't work correctly with scale_factor_override (#3854)

    # Objective

    Overflow::Hidden doesn't work correctly with scale_factor_override.
    If you run the Bevy UI example with scale_factor_override 3 you'll see half clipped text around the edges of the scrolling listbox.
    The problem seems to be that the corners of the node are transformed before the amount of clipping required is calculated. But then that transformed clip is compared to the original untransformed size of the node rect to see if it should be culled or not. With a higher scale factor the relative size of the untransformed node rect is going to be really big, so the overflow isn't culled.

    # Solution

    Multiply the size of the node rect by extracted_uinode.transform before the cull test.

commit f969c62f7bfaf1932fbb533dfd87467aeddf75cb
Author: François <mockersf@gmail.com>
Date:   Sat Jun 11 20:10:13 2022 +0000

    Fix wasm examples (#4967)

    # Objective

    Fix #4958

    There was 4 issues:

    - this is not true in WASM and on macOS: https://github.com/bevyengine/bevy/blob/f28b92120920f387020f3b3e858f0e7039b9c07e/examples/3d/split_screen.rs#L90
      - ~~I made sure the system was running at least once~~
      - I'm sending the event on window creation
    - in webgl, setting a viewport has impacts on other render passes
      - only in webgl and when there is a custom viewport, I added a render pass without a custom viewport
    - shaderdef NO_ARRAY_TEXTURES_SUPPORT was not used by the 2d pipeline
      - webgl feature was used but not declared in bevy_sprite, I added it to the Cargo.toml
    - shaderdef NO_STORAGE_BUFFERS_SUPPORT was not used by the 2d pipeline
      - I added it based on the BufferBindingType

    The last commit changes the two last fixes to add the shaderdefs in the shader cache directly instead of needing to do it in each pipeline

    Co-authored-by: Carter Anderson <mcanders1@gmail.com>

commit 772d15238c2edd6eb6006a71e74bc64fddaca7f1
Author: Aevyrie <aevyrie@gmail.com>
Date:   Sat Jun 11 09:13:37 2022 +0000

    Change default `Image` `FilterMode` to `Linear` (#4465)

    # Objective

    - Closes #4464

    ## Solution

    - Specify default mag and min filter types for `Image` instead of using `wgpu`'s defaults.

    ---

    ## Changelog

    ### Changed

    - Default `Image` filtering changed from `Nearest` to `Linear`.

    Co-authored-by: Carter Anderson <mcanders1@gmail.com>

commit 728d9696d73eef2675cce699fe1896f5c13701bd
Author: François <mockersf@gmail.com>
Date:   Sat Jun 11 08:56:26 2022 +0000

    fix nightly for miri to 2022-06-08 to avoid timeouts (#4984)

    # Objective

    - Fix timeout in miri

    ## Solution

    - Use a nightly version from before the issue happened: 2022-06-08
    - To be checked after https://github.com/rust-lang/miri/issues/2223 is fixed

commit 57b4620a7dc4acafc7ef00d5466caf4b09f9e3d2
Author: LoipesMas <46327403+LoipesMas@users.noreply.github.com>
Date:   Thu Jun 9 21:18:16 2022 +0000

    Fix Good-First-Issue label in  CONTRIBUTING.md (#4979)

    # Objective
    - CONTRIBUTING.md references wrong issue label, making it potentially confusing for new contributors.

    ## Solution
    - Update  CONTRIBUTING.md.

    ---
    I assume the label was changed recently.

commit e6f34ba47f5d8face2f154dce7bc93bc034017fb
Author: Gino Valente <gino.valente.code@gmail.com>
Date:   Thu Jun 9 21:18:15 2022 +0000

    bevy_reflect: Add statically available type info for reflected types (#4042)

    # Objective

    > Resolves #4504

    It can be helpful to have access to type information without requiring an instance of that type. Especially for `Reflect`, a lot of the gathered type information is known at compile-time and should not necessarily require an instance.

    ## Solution

    Created a dedicated `TypeInfo` enum to store static type information. All types that derive `Reflect` now also implement the newly created `Typed` trait:

    ```rust
    pub trait Typed: Reflect {
      fn type_info() -> &'static TypeInfo;
    }
    ```

    > Note: This trait was made separate from `Reflect` due to `Sized` restrictions.

    If you only have access to a `dyn Reflect`, just call `.get_type_info()` on it. This new trait method on `Reflect` should return the same value as if you had called it statically.

    If all you have is a `TypeId` or type name, you can get the `TypeInfo` directly from the registry using the `TypeRegistry::get_type_info` method (assuming it was registered).

    ### Usage

    Below is an example of working with `TypeInfo`. As you can see, we don't have to generate an instance of `MyTupleStruct` in order to get this information.

    ```rust
    #[derive(Reflect)]
    struct MyTupleStruct(usize, i32, MyStruct);

    let info = MyTupleStruct::type_info();
    if let TypeInfo::TupleStruct(info) = info {
      assert!(info.is::<MyTupleStruct>());
      assert_eq!(std::any::type_name::<MyTupleStruct>(), info.type_name());
      assert!(info.field_at(1).unwrap().is::<i32>());
    } else {
      panic!("Expected `TypeInfo::TupleStruct`");
    }
    ```

    ### Manual Implementations

    It's not recommended to manually implement `Typed` yourself, but if you must, you can use the `TypeInfoCell` to automatically create and manage the static `TypeInfo`s for you (which is very helpful for blanket/generic impls):

    ```rust
    use bevy_reflect::{Reflect, TupleStructInfo, TypeInfo, UnnamedField};
    use bevy_reflect::utility::TypeInfoCell;

    struct Foo<T: Reflect>(T);

    impl<T: Reflect> Typed for Foo<T> {
      fn type_info() -> &'static TypeInfo {
        static CELL: TypeInfoCell = TypeInfoCell::generic();
        CELL.get_or_insert::<Self, _>(|| {
          let fields = [UnnamedField::new::<T>()];
          let info = TupleStructInfo::new::<Self>(&fields);
          TypeInfo::TupleStruct(info)
        })
      }
    }
    ```

    ## Benefits

    One major benefit is that this opens the door to other serialization methods. Since we can get all the type info at compile time, we can know how to properly deserialize something like:

    ```rust
    #[derive(Reflect)]
    struct MyType {
      foo: usize,
      bar: Vec<String>
    }

    // RON to be deserialized:
    (
      type: "my_crate::MyType", // <- We now know how to deserialize the rest of this object
      value: {
        // "foo" is a value type matching "usize"
        "foo": 123,
        // "bar" is a list type matching "Vec<String>" with item type "String"
        "bar": ["a", "b", "c"]
      }
    )
    ```

    Not only is this more compact, but it has better compatibility (we can change the type of `"foo"` to `i32` without having to update our serialized data).

    Of course, serialization/deserialization strategies like this may need to be discussed and fully considered before possibly making a change. However, we will be better equipped to do that now that we can access type information right from the registry.

    ## Discussion

    Some items to discuss:

    1. Duplication. There's a bit of overlap with the existing traits/structs since they require an instance of the type while the type info structs do not (for example, `Struct::field_at(&self, index: usize)` and `StructInfo::field_at(&self, index: usize)`, though only `StructInfo` is accessible without an instance object). Is this okay, or do we want to handle it in another way?
    2. Should `TypeInfo::Dynamic` be removed? Since the dynamic types don't have type information available at runtime, we could consider them `TypeInfo::Value`s (or just even just `TypeInfo::Struct`). The intention with `TypeInfo::Dynamic` was to keep the distinction from these dynamic types and actual structs/values since users might incorrectly believe the methods of the dynamic type's info struct would map to some contained data (which isn't possible statically).
    4. General usefulness of this change, including missing/unnecessary parts.
    5. Possible changes to the scene format? (One possible issue with changing it like in the example above might be that we'd have to be careful when handling generic or trait object types.)

    ## Compile Tests

    I ran a few tests to compare compile times (as suggested [here](https://github.com/bevyengine/bevy/pull/4042#discussion_r876408143)). I toggled `Reflect` and `FromReflect` derive macros using `cfg_attr` for both this PR (aa5178e7736a6f8252e10e543e52722107649d3f) and main (c309acd4322b1c3b2089e247a2d28b938eb7b56d).

    <details>
    <summary>See More</summary>

    The test project included 250 of the following structs (as well as a few other structs):

    ```rust
    #[derive(Default)]
    #[cfg_attr(feature = "reflect", derive(Reflect))]
    #[cfg_attr(feature = "from_reflect", derive(FromReflect))]
    pub struct Big001 {
        inventory: Inventory,
        foo: usize,
        bar: String,
        baz: ItemDescriptor,
        items: [Item; 20],
        hello: Option<String>,
        world: HashMap<i32, String>,
        okay: (isize, usize, /* wesize */),
        nope: ((String, String), (f32, f32)),
        blah: Cow<'static, str>,
    }
    ```

    > I don't know if the compiler can optimize all these duplicate structs away, but I think it's fine either way. We're comparing times, not finding the absolute worst-case time.

    I only ran each build 3 times using `cargo build --timings` (thank you @devil-ira), each of which were preceeded by a `cargo clean --package bevy_reflect_compile_test`.

    Here are the times I got:

    | Test                             | Test 1 | Test 2 | Test 3 | Average |
    | -------------------------------- | ------ | ------ | ------ | ------- |
    | Main                             | 1.7s   | 3.1s   | 1.9s   | 2.33s   |
    | Main + `Reflect`                 | 8.3s   | 8.6s   | 8.1s   | 8.33s   |
    | Main + `Reflect` + `FromReflect` | 11.6s  | 11.8s  | 13.8s  | 12.4s   |
    | PR                               | 3.5s   | 1.8s   | 1.9s   | 2.4s    |
    | PR + `Reflect`                   | 9.2s   | 8.8s   | 9.3s   | 9.1s    |
    | PR + `Reflect` + `FromReflect`   | 12.9s  | 12.3s  | 12.5s  | 12.56s  |

    </details>

    ---

    ## Future Work

    Even though everything could probably be made `const`, we unfortunately can't. This is because `TypeId::of::<T>()` is not yet `const` (see https://github.com/rust-lang/rust/issues/77125). When it does get stabilized, it would probably be worth coming back and making things `const`.

    Co-authored-by: MrGVSV <49806985+MrGVSV@users.noreply.github.com>

commit 1679a9973825e5241ca91f33919e63667d4eb421
Author: Peter Hebden <peterhebden6@gmail.com>
Date:   Thu Jun 9 20:57:43 2022 +0000

    Fix typo in game_menu.rs (#4977)

    Should be trivial

    # Objective

    There is a typo in a comment that is fixed in this commit

    ## Solution

    Fix the typo
    `positionned` -> `positioned`

    ---

commit c6958b3056edfb7669da41ff9da103c379d97617
Author: François <mockersf@gmail.com>
Date:   Thu Jun 9 20:34:09 2022 +0000

    add a `SceneBundle` to spawn a scene (#2424)

    # Objective

    - Spawning a scene is handled as a special case with a command `spawn_scene` that takes an handle but doesn't let you specify anything else. This is the only handle that works that way.
    - Workaround for this have been to add the `spawn_scene` on `ChildBuilder` to be able to specify transform of parent, or to make the `SceneSpawner` available to be able to select entities from a scene by their instance id

    ## Solution

    Add a bundle
    ```rust
    pub struct SceneBundle {
        pub scene: Handle<Scene>,
        pub transform: Transform,
        pub global_transform: GlobalTransform,
        pub instance_id: Option<InstanceId>,
    }
    ```

    and instead of
    ```rust
    commands.spawn_scene(asset_server.load("models/FlightHelmet/FlightHelmet.gltf#Scene0"));
    ```
    you can do
    ```rust
    commands.spawn_bundle(SceneBundle {
        scene: asset_server.load("models/FlightHelmet/FlightHelmet.gltf#Scene0"),
        ..Default::default()
    });
    ```

    The scene will be spawned as a child of the entity with the `SceneBundle`

    ~I would like to remove the command `spawn_scene` in favor of this bundle but didn't do it yet to get feedback first~

    Co-authored-by: François <8672791+mockersf@users.noreply.github.com>
    Co-authored-by: Carter Anderson <mcanders1@gmail.com>

commit cdb62af4bf52f160399bd01936cf3fc5b7afbd81
Author: James Liu <contact@jamessliu.com>
Date:   Thu Jun 9 03:34:51 2022 +0000

    Replace ComponentSparseSet's internals with a Column (#4909)

    # Objective
    Following #4855, `Column` is just a parallel `BlobVec`/`Vec<UnsafeCell<ComponentTicks>>` pair, which is identical to the dense and ticks vecs in `ComponentSparseSet`, which has some code duplication with `Column`.

    ## Solution
    Replace dense and ticks in `ComponentSparseSet` with a `Column`.

commit f2b545049c9696bdce388e939022190c401a0dbc
Author: James Liu <contact@jamessliu.com>
Date:   Thu Jun 9 03:19:31 2022 +0000

    Implement FusedIterator for eligible Iterator types (#4942)

    # Objective
    Most of our `Iterator` impls satisfy the requirements of `std::iter::FusedIterator`, which has internal specialization that optimizes `Interator::fuse`. The std lib iterator combinators do have a few that rely on `fuse`, so this could optimize those use cases. I don't think we're using any of them in the engine itself, but beyond a light increase in compile time, it doesn't hurt to implement the trait.

    ## Solution
    Implement the trait for all eligible iterators in first party crates. Also add a missing `ExactSizeIterator` on an iterator that could use it.

commit 012ae07dc87d43a7437bd4200b4a7f04bf99dbf3
Author: James Liu <contact@jamessliu.com>
Date:   Thu Jun 9 02:43:24 2022 +0000

    Add global init and get accessors for all newtyped TaskPools (#2250)

    Right now, a direct reference to the target TaskPool is required to launch tasks on the pools, despite the three newtyped pools (AsyncComputeTaskPool, ComputeTaskPool, and IoTaskPool) effectively acting as global instances. The need to pass a TaskPool reference adds notable friction to spawning subtasks within existing tasks. Possible use cases for this may include chaining tasks within the same pool like spawning separate send/receive I/O tasks after waiting on a network connection to be established, or allowing cross-pool dependent tasks like starting dependent multi-frame computations following a long I/O load.

    Other task execution runtimes provide static access to spawning tasks (i.e. `tokio::spawn`), which is notably easier to use than the reference passing required by `bevy_tasks` right now.

    This PR makes does the following:

     * Adds `*TaskPool::init` which initializes a `OnceCell`'ed with a provided TaskPool. Failing if the pool has already been initialized.
     * Adds `*TaskPool::get` which fetches the initialized global pool of the respective type or panics. This generally should not be an issue in normal Bevy use, as the pools are initialized before they are accessed.
     * Updated default task pool initialization to either pull the global handles and save them as resources, or if they are already initialized, pull the a cloned global handle as the resource.

    This should make it notably easier to build more complex task hierarchies for dependent tasks. It should also make writing bevy-adjacent, but not strictly bevy-only plugin crates easier, as the global pools ensure it's all running on the same threads.

    One alternative considered is keeping a thread-local reference to the pool for all threads in each pool to enable the same `tokio::spawn` interface. This would spawn tasks on the same pool that a task is currently running in. However this potentially leads to potential footgun situations where long running blocking tasks run on `ComputeTaskPool`.

commit 5ace79ff09f47a41934b6efc299891706cf5cc89
Author: Cai Bingjun <62678643+C-BJ@users.noreply.github.com>
Date:   Wed Jun 8 17:55:57 2022 +0000

    Let the project page support GitHub's new ability to display open source licenses (#4966)

    Change _LICENSE-APACHE_ and _LICENSE-MIT_ file location
    Delete _LICENSE_
    You can make the license in about on bevy's GitHub page display as **Apache-2.0, MIT licenses found** instead of **View license**

commit f28b92120920f387020f3b3e858f0e7039b9c07e
Author: Carter Anderson <mcanders1@gmail.com>
Date:   Tue Jun 7 22:22:10 2022 +0000

    Add "depth_load_op" configuration to 3d Cameras (#4904)

    # Objective

    Users should be able to configure depth load operations on cameras. Currently every camera clears depth when it is rendered. But sometimes later passes need to rely on depth from previous passes.

    ## Solution

    This adds the `Camera3d::depth_load_op` field with a new `Camera3dDepthLoadOp` value. This is a custom type because Camera3d uses "reverse-z depth" and this helps us record and document that in a discoverable way. It also gives us more control over reflection + other trait impls, whereas `LoadOp` is owned by the `wgpu` crate.

    ```rust
    commands.spawn_bundle(Camera3dBundle {
        camera_3d: Camera3d {
            depth_load_op: Camera3dDepthLoadOp::Load,
            ..default()
        },
        ..default()
    });
    ```

    ### two_passes example with the "second pass" camera configured to the default (clear depth to 0.0)

    ![image](https://user-images.githubusercontent.com/2694663/171743172-46d4fdd5-5090-46ea-abe4-1fbc519f6ee8.png)

    ### two_passes example with the "second pass" camera configured to "load" the depth
    ![image](https://user-images.githubusercontent.com/2694663/171743323-74dd9a1d-9c25-4883-98dd-38ca0bed8c17.png)

    ---

    ## Changelog

    ### Added

    * `Camera3d` now has a `depth_load_op` field, which can configure the Camera's main 3d pass depth loading behavior.

commit cbf032419d3a04959cc83d2df64c5f53eeda606e
Author: Aevyrie <aevyrie@gmail.com>
Date:   Tue Jun 7 15:23:45 2022 +0000

    Refactor `Camera` methods and add viewport rect  (#4948)

    While working on a refactor of `bevy_mod_picking` to include viewport-awareness, I found myself writing these functions to test if a cursor coordinate was inside the camera's rendered area.

    # Objective

    - Simplify conversion from physical to logical pixels
    - Add methods that returns the dimensions of the viewport as a min-max rect

    ---

    ## Changelog

    - Added `Camera::to_logical`
    - Added `Camera::physical_viewport_rect`
    - Added `Camera::logical_viewport_rect`

commit d51a87cf2868596dffd7bf9cd0f77a4122165769
Author: Johan Klokkhammer Helsing <johanhelsing@gmail.com>
Date:   Tue Jun 7 08:14:10 2022 +0000

    Recommend posting new plugins in #crates discord channel (#4956)

    # Objective

    - Guide people to the right discord channel to post about their new plugin. #showcase was split into multiple channels.

    ## Solution

    - recommend posting in #crates

commit 73174730e44a9e2fea7f004e2b78581e8e33bddc
Author: François <mockersf@gmail.com>
Date:   Tue Jun 7 02:16:47 2022 +0000

    use the default() method in examples instead of Default::default() (#4952)

    # Objective

    - Use the `..default()` method in examples instead of `..Default::default()`

commit 649e30de0974fbba8477c5c21e5bea24d3efe4cd
Author: Danny <its.danny@hey.com>
Date:   Tue Jun 7 02:02:53 2022 +0000

    update system test example to include using events (#4951)

    # Objective

    - Adds an example of testing systems that handle events. I had a hard time figuring out how to do it a couple days ago so figured an official example could be useful.
    - Fixes #4936

    ## Solution

    - Adds a `Score` resource and an `EnemyDied` event. An `update_score` system updates the score when a new event comes through. I'm not sure the example is great, as this probably isn't how you'd do it in a real game, but I didn't want to change the existing example too much.

commit 39ea1bb9b7f03e9803648b2c28c269ce365e56e4
Author: François <mockersf@gmail.com>
Date:   Mon Jun 6 20:22:51 2022 +0000

    run examples in wasm in CI (#4818)

    # Objective

    - Run examples in WASM in CI
    - Fix #4817

    ## Solution

    - on feature `bevy_ci_testing`
      - add an extra log message before exiting
      - when building for wasm, read CI config file at compile time
    - add a simple [playwright](https://playwright.dev) test script that opens the browser then waits for the success log, and takes a screenshot
    - add a CI job that runs the playwright test for Chromium and Firefox on one example (lighting) and save the screenshots
      - Firefox screenshot is good (with some clusters visible)
      - Chromium screenshot is gray, I don't know why but it's logging `GPU stall due to ReadPixels`
      - Webkit is not enabled for now, to revisit once https://bugs.webkit.org/show_bug.cgi?id=234926 is fixed or worked around
    - the CI job only runs on bors validation

    example run: https://github.com/mockersf/bevy/actions/runs/2361673465. The screenshots can be downloaded

commit 193998b5d4f3f02d690f3b0fc475db9817ee56f6
Author: François <mockersf@gmail.com>
Date:   Mon Jun 6 20:00:30 2022 +0000

    add NO_STORAGE_BUFFERS_SUPPORT shaderdef when needed (#4949)

    # Objective

    - fix #4946
    - fix running 3d in wasm

    ## Solution

    - since #4867, the imports are splitter differently, and this shader def was not always set correctly depending on the shader used
    - add it when needed

commit 25219a4d18c262a4abf919d9fb73a957da5a028c
Author: Wybe Westra <dev@wwestra.nl>
Date:   Mon Jun 6 17:52:09 2022 +0000

    Add transparency examples (#3695)

    Adds examples demonstrating transparency for 2d, 3d and UI.

    Fixes #3215.

commit 92ddfe8ad40dc8a0270fe01a1d7dec83850e5ee0
Author: ira <JustTheCoolDude@gmail.com>
Date:   Mon Jun 6 16:09:16 2022 +0000

    Add methods for querying lists of entities. (#4879)

    # Objective
    Improve querying ergonomics around collections and iterators of entities.

    Example how queries over Children might be done currently.
    ```rust
    fn system(foo_query: Query<(&Foo, &Children)>, bar_query: Query<(&Bar, &Children)>) {
        for (foo, children) in &foo_query {
            for child in children.iter() {
                if let Ok((bar, children)) = bar_query.get(*child) {
                    for child in children.iter() {
                        if let Ok((foo, children)) = foo_query.get(*child) {
                            // D:
                        }
                    }
                }
            }
        }
    }
    ```
    Answers #4868
    Partially addresses #4864
    Fixes #1470
    ## Solution
    Based on the great work by @deontologician in #2563

    Added `iter_many` and `many_for_each_mut` to `Query`.
    These take a list of entities (Anything that implements `IntoIterator<Item: Borrow<Entity>>`).

    `iter_many` returns a `QueryManyIter` iterator over immutable results of a query (mutable data will be cast to an immutable form).

    `many_for_each_mut` calls a closure for every result of the query, ensuring not aliased mutability.
    This iterator goes over the list of entities in order and returns the result from the query for it. Skipping over any entities that don't match the query.

    Also added `unsafe fn iter_many_unsafe`.

    ### Examples
    ```rust
    #[derive(Component)]
    struct Counter {
        value: i32
    }

    #[derive(Component)]
    struct Friends {
        list: Vec<Entity>,
    }

    fn system(
        friends_query: Query<&Friends>,
        mut counter_query: Query<&mut Counter>,
    ) {
        for friends in &friends_query {
            for counter in counter_query.iter_many(&friends.list) {
                println!("Friend's counter: {:?}", counter.value);
            }

            counter_query.many_for_each_mut(&friends.list, |mut counter| {
                counter.value += 1;
                println!("Friend's counter: {:?}", counter.value);
            });
        }
    }

    ```

    Here's how example in the Objective section can be written with this PR.
    ```rust
    fn system(foo_query: Query<(&Foo, &Children)>, bar_query: Query<(&Bar, &Children)>) {
        for (foo, children) in &foo_query {
            for (bar, children) in bar_query.iter_many(children) {
                for (foo, children) in foo_query.iter_many(children) {
                    // :D
                }
            }
        }
    }
    ```
    ## Additional changes
    Implemented `IntoIterator` for `&Children` because why not.
    ## Todo
    - Bikeshed!

    Co-authored-by: deontologician <deontologician@gmail.com>

    Co-authored-by: devil-ira <justthecooldude@gmail.com>

commit b47291264b4d2508e288b4ec122589d3aae7a0e5
Author: dataphract <dataphract@gmail.com>
Date:   Mon Jun 6 15:47:52 2022 +0000

    diagnostics: meaningful error when graph node has wrong number of inputs (#4924)

    # Objective

    Currently, providing the wrong number of inputs to a render graph node triggers this assertion:

    ```
    thread 'main' panicked at 'assertion failed: `(left == right)`
      left: `1`,
     right: `2`', /[redacted]/bevy/crates/bevy_render/src/renderer/graph_runner.rs:164:13
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    ```

    This does not provide the user any context.

    ## Solution

    Add a new `RenderGraphRunnerError` variant to handle this case. The new message looks like this:

    ```
    ERROR bevy_render::renderer: Error running render graph:
    ERROR bevy_render::renderer: > node (name: 'Some("outline_pass")') has 2 input slots, but was provided 1 values
    ```

    ---

    ## Changelog

    ### Changed

    `RenderGraphRunnerError` now has a new variant, `MismatchedInputCount`.

    ## Migration Guide

    Exhaustive matches on `RenderGraphRunnerError` will need to add a branch to handle the new `MismatchedInputCount` variant.

commit c4080c68323c396261ef130d2eece4d8e5732a75
Author: Yutao Yuan <infmagic2047reg@outlook.com>
Date:   Mon Jun 6 15:47:51 2022 +0000

    Fix release workflow (#4903)

    # Objective

    While playing with the code, I found some problems in the recently merged version-bumping workflow:
    - Most importantly, now that we are using `0.8.0-dev` in development, the workflow will try to bump it to `0.9.0` :sob:
    - The crate filter is outdated now that we have more crates in `tools`.
    - We are using `bevy@users.noreply.github.com`, but according to [Github help](https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-personal-account-on-github/managing-email-preferences/setting-your-commit-email-address#about-commit-email-addresses), that email address means "old no-reply email format for the user `bevy`". It is currently not associated with any account, but I feel this is still not appropriate here.

    ## Solution

    - Create a new workflow, `Post-release version bump`, that should be run after a release and bumps version from `0.X.0` to `0.X+1.0-dev`. Unfortunately, cargo-release doesn't have a builtin way to do this, so we need to parse and increment the version manually.
    - Add the new crates in `tools` to exclusion list. Also removes the dependency version specifier from `bevy_ecs_compile_fail_tests`. It is not in the workspace so the dependency version will not get automatically updated by cargo-release.
    - Change the author email to `41898282+github-actions[bot]@users.noreply.github.com`. According to the discussion [here](https://github.com/actions/checkout/issues/13#issuecomment-724415212) and [here](https://github.saobby.my.eu.orgmunity/t/github-actions-bot-email-address/17204/6), this is the email address associated with the github-actions bot account.
    - Also add the workflows to our release checklist.

    See infmagic2047#5 and infmagic2047#6 for examples of release and post-release PRs.

commit 85cd0eb445b6fbad6bcff86a37ccb0e59cf948ea
Author: TheRawMeatball <therawmeatball@gmail.com>
Date:   Mon Jun 6 14:46:41 2022 +0000

    Add ParallelCommands system parameter (#4749)

    (follow-up to #4423)
    # Objective
    Currently, it isn't possible to easily fire commands from within par_for_each blocks. This PR allows for issuing commands from within parallel scopes.

commit 2f5a1c6e1699cae7791f8122598116e233894a72
Author: Yoshiera <yoshierahuang@126.com>
Date:   Mon Jun 6 14:24:41 2022 +0000

    remove redundant query parameters (#4945)

    # Objective

    In the `queue_custom` system in `shader_instancing` example, the query of `material_meshes`  has a redundant `With<Handle<Mesh>>` query filter because `Handle<Mesh>` is included in the component access.

    ## Solution

    Remove the `With<Handle<Mesh>>` filter

commit 765bd46c2eb853a6921c4de47c065e4850ede5fe
Author: Thierry Berger <contact@thierryberger.com>
Date:   Mon Jun 6 00:06:49 2022 +0000

    add a post-processing example (#4797)

    # Objective

    - Add an example showing a custom post processing effect, done after the first rendering pass.

    ## Solution

    - A simple post processing "chromatic aberration" effect. I mixed together examples `3d/render_to_texture`, and `shader/shader_material_screenspace_texture`
    - Reading a bit how https://github.com/bevyengine/bevy/pull/3430 was done gave me pointers to apply the main pass to the 2d render rather than using a 3d quad.

    This work might be or not be relevant to https://github.com/bevyengine/bevy/issues/2724

    <details>

    <summary> ⚠️ Click for a video of the render ⚠️ I’ve been told it might hurt the eyes 👀 , maybe we should choose another effect just in case ?</summary>

    https://user-images.githubusercontent.com/2290685/169138830-a6dc8a9f-8798-44b9-8d9e-449e60614916.mp4

    </details>

    # Request for feedbacks

    - [ ] Is chromatic aberration effect ok ? (Correct term, not a danger for the eyes ?) I'm open to suggestion to make something different.
    - [ ] Is the code idiomatic ? I preferred a "main camera -> **new camera with post processing applied to a quad**" approach to emulate minimum modification to existing code wanting to add global post processing.

    ---

    ## Changelog

    - Add a full screen post processing shader example

commit 5e2cfb2f19e0e2287fd48d55df92f78617acf831
Author: Carter Anderson <mcanders1@gmail.com>
Date:   Sun Jun 5 00:27:49 2022 +0000

    Camera Driven Viewports (#4898)

    # Objective

    Users should be able to render cameras to specific areas of a render target, which enables scenarios like split screen, minimaps, etc.

    Builds on the new Camera Driven Rendering added here: #4745
    Fixes: #202
    Alternative to #1389 and #3626 (which are incompatible with the new Camera Driven Rendering)

    ## Solution

    ![image](https://user-images.githubusercontent.com/2694663/171560044-f0694f67-0cd9-4598-83e2-a9658c4fed57.png)

    Cameras can now configure an optional "viewport", which defines a rectangle within their render target to draw to. If a `Viewport` is defined, the camera's `CameraProjection`, `View`, and visibility calculations will use the viewport configuration instead of the full render target.

    ```rust
    // This camera will render to the first half of the primary window (on the left side).
    commands.spawn_bundle(Camera3dBundle {
        camera: Camera {
            viewport: Some(Viewport {
                physical_position: UVec2::new(0, 0),
                physical_size: UVec2::new(window.physical_width() / 2, window.physical_height()),
                depth: 0.0..1.0,
            }),
            ..default()
        },
        ..default()
    });
    ```

    To account for this, the `Camera` component has received a few adjustments:

    * `Camera` now has some new getter functions:
      * `logical_viewport_size`, `physical_viewport_size`, `logical_target_size`, `physical_target_size`, `projection_matrix`
    *  All computed camera values are now private and live on the `ComputedCameraValues` field (logical/physical width/height, the projection matrix). They are now exposed on `Camera` via getters/setters  This wasn't _needed_ for viewports, but it was long overdue.

    ---

    ## Changelog

    ### Added

    * `Camera` components now have a `viewport` field, which can be set to draw to a portion of a render target instead of the full target.
    * `Camera` component has some new functions: `logical_viewport_size`, `physical_viewport_size`, `logical_target_size`, `physical_target_size`, and `projection_matrix`
    * Added a new split_screen example illustrating how to render two cameras to the same scene

    ## Migration Guide

    `Camera::projection_matrix` is no longer a public field. Use the new `Camera::projection_matrix()` method instead:

    ```rust

    // Bevy 0.7
    let projection = camera.projection_matrix;

    // Bevy 0.8
    let projection = camera.projection_matrix();
    ```

commit 8e08e26c253192e9d6f23cfbb8b140556bf0c659
Author: Henry Sloan <henryksloan@gmail.com>
Date:   Sat Jun 4 20:00:01 2022 +0000

    Update commented vsync code in example to use present_mode (#4926)

    # Objective

    - To fix the broken commented code in `examples/shader/compute_shader_game_of_life.rs` for disabling frame throttling

    ## Solution

    - Change the commented code from using the old `WindowDescriptor::vsync` to the new `WindowDescriptor::present_mode`

    ### Note
    I chose to use the fully qualified scope `bevy::window::PresentWindow::Immediate` rather than explicitly including `PresentWindow` to avoid an unused import when the code is commented.

commit 3a9383f9977bf16461565633426777a62cb51356
Author: Alice Cecile <alice.i.cecile@gmail.com>
Date:   Sat Jun 4 14:30:44 2022 +0000

    Revert ndk-glue to 0.5 to synchronize with winit (#4916)

    # Objective

    - Upgrading ndk-glue (our Android interop layer) desynchronized us from winit
    - This further broke Android builds, see #4905 (oops...)
    - Reverting to 0.5 should help with this, until the new `winit` version releases
    - Fixes #4774 and closes #4529

commit 1fcb7d0c2e7086f8c3162b8ea5f9db3ff5f829c9
Author: Matthias Deiml <matthias@deiml.net>
Date:…
james7132 pushed a commit to james7132/bevy that referenced this issue Jun 22, 2022
# Objective

- Fixes bevyengine#4996 

## Solution

- Panicking on the global task pool is probably bad. This changes the panicking test to use a single threaded stage to run the test instead.
- I checked the other #[should_panic]
- I also added explicit ordering between the transform propagate system and the parent update system. The ambiguous ordering didn't seem to be causing problems, but the tests are probably more correct this way. The plugins that add these systems have an explicit ordering. I can remove this if necessary.

## Note

I don't have a 100% mental model of why panicking is causing intermittent failures. It probably has to do with a task for one of the other tests landing on the panicking thread when it actually panics. Why this causes a problem I'm not sure, but this PR seems to fix things.

## Open questions

- there are some other #[should_panic] tests that run on the task pool in stage.rs. I don't think we restart panicked threads, so this might be killing most of the threads on the pool. But since they're not causing test failures, we should probably decide what to do about that separately. The solution in this PR won't work since those tests are explicitly testing parallelism.
inodentry pushed a commit to IyesGames/bevy that referenced this issue Aug 8, 2022
# Objective

- Fixes bevyengine#4996 

## Solution

- Panicking on the global task pool is probably bad. This changes the panicking test to use a single threaded stage to run the test instead.
- I checked the other #[should_panic]
- I also added explicit ordering between the transform propagate system and the parent update system. The ambiguous ordering didn't seem to be causing problems, but the tests are probably more correct this way. The plugins that add these systems have an explicit ordering. I can remove this if necessary.

## Note

I don't have a 100% mental model of why panicking is causing intermittent failures. It probably has to do with a task for one of the other tests landing on the panicking thread when it actually panics. Why this causes a problem I'm not sure, but this PR seems to fix things.

## Open questions

- there are some other #[should_panic] tests that run on the task pool in stage.rs. I don't think we restart panicked threads, so this might be killing most of the threads on the pool. But since they're not causing test failures, we should probably decide what to do about that separately. The solution in this PR won't work since those tests are explicitly testing parallelism.
james7132 pushed a commit to james7132/bevy that referenced this issue Oct 28, 2022
# Objective

- Fixes bevyengine#4996 

## Solution

- Panicking on the global task pool is probably bad. This changes the panicking test to use a single threaded stage to run the test instead.
- I checked the other #[should_panic]
- I also added explicit ordering between the transform propagate system and the parent update system. The ambiguous ordering didn't seem to be causing problems, but the tests are probably more correct this way. The plugins that add these systems have an explicit ordering. I can remove this if necessary.

## Note

I don't have a 100% mental model of why panicking is causing intermittent failures. It probably has to do with a task for one of the other tests landing on the panicking thread when it actually panics. Why this causes a problem I'm not sure, but this PR seems to fix things.

## Open questions

- there are some other #[should_panic] tests that run on the task pool in stage.rs. I don't think we restart panicked threads, so this might be killing most of the threads on the pool. But since they're not causing test failures, we should probably decide what to do about that separately. The solution in this PR won't work since those tests are explicitly testing parallelism.
bors bot pushed a commit that referenced this issue Nov 2, 2022
# Objective
Right now, the `TaskPool` implementation allows panics to permanently kill worker threads upon panicking. This is currently non-recoverable without using a `std::panic::catch_unwind` in every scheduled task. This is poor ergonomics and even poorer developer experience. This is exacerbated by #2250 as these threads are global and cannot be replaced after initialization.

Removes the need for temporary fixes like #4998. Fixes #4996. Fixes #6081. Fixes #5285. Fixes #5054. Supersedes #2307.

## Solution

The current solution is to wrap `Executor::run` in `TaskPool` with a `catch_unwind`, and discarding the potential panic. This was taken straight from [smol](https://github.com/smol-rs/smol/blob/404c7bcc0aea59b82d7347058043b8de7133241c/src/spawn.rs#L44)'s current implementation. ~~However, this is not entirely ideal as:~~
 
 - ~~the signaled to the awaiting task. We would need to change `Task<T>` to use `async_task::FallibleTask` internally, and even then it doesn't signal *why* it panicked, just that it did.~~ (See below).
 - ~~no error is logged of any kind~~ (See below)
 - ~~it's unclear if it drops other tasks in the executor~~ (it does not)
 - ~~This allows the ECS parallel executor to keep chugging even though a system's task has been dropped. This inevitably leads to deadlock in the executor.~~ Assuming we don't catch the unwind in ParallelExecutor, this will naturally kill the main thread.

### Alternatives
A final solution likely will incorporate elements of any or all of the following.

#### ~~Log and Ignore~~
~~Log the panic, drop the task, keep chugging. This only addresses the discoverability of the panic. The process will continue to run, probably deadlocking the executor. tokio's detatched tasks operate in this fashion.~~

Panics already do this by default, even when caught by `catch_unwind`.

#### ~~`catch_unwind` in `ParallelExecutor`~~
~~Add another layer catching system-level panics into the `ParallelExecutor`. How the executor continues when a core dependency of many systems fails to run is up for debate.~~

`async_task::Task`  bubbles up panics already, this will transitively push panics all the way to the main thread.

#### ~~Emulate/Copy `tokio::JoinHandle` with `Task<T>`~~
~~`tokio::JoinHandle<T>` bubbles up the panic from the underlying task when awaited. This can be transitively applied across other APIs that also use `Task<T>` like `Query::par_for_each` and `TaskPool::scope`, bubbling up the panic until it's either caught or it reaches the main thread.~~

`async_task::Task`  bubbles up panics already, this will transitively push panics all the way to the main thread.

#### Abort on Panic
The nuclear option. Log the error, abort the entire process on any thread in the task pool panicking. Definitely avoids any additional infrastructure for passing the panic around, and might actually lead to more efficient code as any unwinding is optimized out. However gives the developer zero options for dealing with the issue, a seemingly poor choice for debuggability, and prevents graceful shutdown of the process. Potentially an option for handling very low-level task management (a la #4740). Roughly takes the shape of:

```rust
struct AbortOnPanic;

impl Drop for AbortOnPanic {
   fn drop(&mut self) {
     abort!();
   }
}

let guard = AbortOnPanic;
// Run task
std::mem::forget(AbortOnPanic);
```

---

## Changelog

Changed: `bevy_tasks::TaskPool`'s threads  will no longer terminate permanently when a task scheduled onto them panics.
Changed: `bevy_tasks::Task` and`bevy_tasks::Scope` will propagate panics in the spawned tasks/scopes to the parent thread.
ItsDoot pushed a commit to ItsDoot/bevy that referenced this issue Feb 1, 2023
# Objective

- Fixes bevyengine#4996 

## Solution

- Panicking on the global task pool is probably bad. This changes the panicking test to use a single threaded stage to run the test instead.
- I checked the other #[should_panic]
- I also added explicit ordering between the transform propagate system and the parent update system. The ambiguous ordering didn't seem to be causing problems, but the tests are probably more correct this way. The plugins that add these systems have an explicit ordering. I can remove this if necessary.

## Note

I don't have a 100% mental model of why panicking is causing intermittent failures. It probably has to do with a task for one of the other tests landing on the panicking thread when it actually panics. Why this causes a problem I'm not sure, but this PR seems to fix things.

## Open questions

- there are some other #[should_panic] tests that run on the task pool in stage.rs. I don't think we restart panicked threads, so this might be killing most of the threads on the pool. But since they're not causing test failures, we should probably decide what to do about that separately. The solution in this PR won't work since those tests are explicitly testing parallelism.
ItsDoot pushed a commit to ItsDoot/bevy that referenced this issue Feb 1, 2023
# Objective
Right now, the `TaskPool` implementation allows panics to permanently kill worker threads upon panicking. This is currently non-recoverable without using a `std::panic::catch_unwind` in every scheduled task. This is poor ergonomics and even poorer developer experience. This is exacerbated by bevyengine#2250 as these threads are global and cannot be replaced after initialization.

Removes the need for temporary fixes like bevyengine#4998. Fixes bevyengine#4996. Fixes bevyengine#6081. Fixes bevyengine#5285. Fixes bevyengine#5054. Supersedes bevyengine#2307.

## Solution

The current solution is to wrap `Executor::run` in `TaskPool` with a `catch_unwind`, and discarding the potential panic. This was taken straight from [smol](https://github.com/smol-rs/smol/blob/404c7bcc0aea59b82d7347058043b8de7133241c/src/spawn.rs#L44)'s current implementation. ~~However, this is not entirely ideal as:~~
 
 - ~~the signaled to the awaiting task. We would need to change `Task<T>` to use `async_task::FallibleTask` internally, and even then it doesn't signal *why* it panicked, just that it did.~~ (See below).
 - ~~no error is logged of any kind~~ (See below)
 - ~~it's unclear if it drops other tasks in the executor~~ (it does not)
 - ~~This allows the ECS parallel executor to keep chugging even though a system's task has been dropped. This inevitably leads to deadlock in the executor.~~ Assuming we don't catch the unwind in ParallelExecutor, this will naturally kill the main thread.

### Alternatives
A final solution likely will incorporate elements of any or all of the following.

#### ~~Log and Ignore~~
~~Log the panic, drop the task, keep chugging. This only addresses the discoverability of the panic. The process will continue to run, probably deadlocking the executor. tokio's detatched tasks operate in this fashion.~~

Panics already do this by default, even when caught by `catch_unwind`.

#### ~~`catch_unwind` in `ParallelExecutor`~~
~~Add another layer catching system-level panics into the `ParallelExecutor`. How the executor continues when a core dependency of many systems fails to run is up for debate.~~

`async_task::Task`  bubbles up panics already, this will transitively push panics all the way to the main thread.

#### ~~Emulate/Copy `tokio::JoinHandle` with `Task<T>`~~
~~`tokio::JoinHandle<T>` bubbles up the panic from the underlying task when awaited. This can be transitively applied across other APIs that also use `Task<T>` like `Query::par_for_each` and `TaskPool::scope`, bubbling up the panic until it's either caught or it reaches the main thread.~~

`async_task::Task`  bubbles up panics already, this will transitively push panics all the way to the main thread.

#### Abort on Panic
The nuclear option. Log the error, abort the entire process on any thread in the task pool panicking. Definitely avoids any additional infrastructure for passing the panic around, and might actually lead to more efficient code as any unwinding is optimized out. However gives the developer zero options for dealing with the issue, a seemingly poor choice for debuggability, and prevents graceful shutdown of the process. Potentially an option for handling very low-level task management (a la bevyengine#4740). Roughly takes the shape of:

```rust
struct AbortOnPanic;

impl Drop for AbortOnPanic {
   fn drop(&mut self) {
     abort!();
   }
}

let guard = AbortOnPanic;
// Run task
std::mem::forget(AbortOnPanic);
```

---

## Changelog

Changed: `bevy_tasks::TaskPool`'s threads  will no longer terminate permanently when a task scheduled onto them panics.
Changed: `bevy_tasks::Task` and`bevy_tasks::Scope` will propagate panics in the spawned tasks/scopes to the parent thread.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Build-System Related to build systems or continuous integration A-ECS Entities, components, systems, and events C-Bug An unexpected or incorrect behavior P-Crash A sudden unexpected crash S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it!
Projects
None yet
5 participants