Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute pipelines never freed at runtime, leaking memory #2564

Closed
jimblandy opened this issue Mar 31, 2022 · 2 comments · Fixed by #2565
Closed

Compute pipelines never freed at runtime, leaking memory #2564

jimblandy opened this issue Mar 31, 2022 · 2 comments · Fixed by #2565
Labels
area: correctness We're behaving incorrectly type: bug Something isn't working

Comments

@jimblandy
Copy link
Member

See also #2546. The following program drops everything but the instance, and yet calling instance.poll_all(true) never causes the compute pipeline to be freed at the hal level:

use anyhow::Result; // anyhow = "1"
use futures_executor::block_on; // futures-executor = "0.3"

fn main() -> Result<()> {
    env_logger::init();
    
    let instance = wgpu::Instance::new(wgpu::Backends::VULKAN);

    {
        let (device, queue) = block_on(async {
            let adapter = instance.request_adapter(&wgpu::RequestAdapterOptions::default())
                .await
                .expect("request_adapter returned None");
            adapter.request_device(&wgpu::DeviceDescriptor {
                label: Some("play-wgpu device"),
                features: wgpu::Features::default(),
                limits: wgpu::Limits::default(),
            },
                                   None).await
        })?;

        let module = device.create_shader_module(&wgpu::ShaderModuleDescriptor {
            label: Some("play-wgpu shader module"),
            source: wgpu::ShaderSource::Wgsl("@stage(compute) @workgroup_size(32) fn main() {}".into())
        });

        let pipeline_layout = device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
            label: Some("play-wgpu pipeline layout descriptor"),
            bind_group_layouts: &[],
            push_constant_ranges: &[],
        });
        
        let compute_pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
            label: Some("play-wgpu compute pipeline"),
            layout: Some(&pipeline_layout),
            module: &module,
            entry_point: "main",
        });

        drop((compute_pipeline, queue));
    }
    
    instance.poll_all(true);
    log::info!("called instance.poll_all with everything out of scope");
    // compute pipeline has not yet been freed at this point

    Ok(())
}
@jimblandy jimblandy added type: bug Something isn't working area: correctness We're behaving incorrectly labels Mar 31, 2022
@jimblandy
Copy link
Member Author

Global::compute_pipeline_drop takes the LifeGuard's RefCount and drops it, and adds the pipeline's id to the device's suspected_resources, but because the pipeline was never added to the device's TrackerSet, when triage_suspected calls trackers.compute_pipes.remove_abandoned(id), it always gets false, so it never ends up adding the raw pipeline to free_resources (nor anyone's last_resources).

@jimblandy
Copy link
Member Author

Similarly for render pipelines.

jimblandy added a commit to jimblandy/wgpu that referenced this issue Mar 31, 2022
Without this change, `LifetimeTracker::triage_suspected` never notices
that compute or render pipelines are free, and they stick around until
the hub is destroyed.

Fixes gfx-rs#2564.
kvark pushed a commit that referenced this issue Apr 1, 2022
Without this change, `LifetimeTracker::triage_suspected` never notices
that compute or render pipelines are free, and they stick around until
the hub is destroyed.

Fixes #2564.
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 6, 2022
…r=jgilbert

New versions of several crates are introduced to third_party/rust, by
changing the versions requested in `gfx/wgpu_bindings/Cargo.toml` and
running `mach vendor rust`:

- `wgpu-core`, `wgpu-hal`, and `wgpu-types`, as used by `wgpu_bindings`
- `naga`, `ash`, and `metal`, as used by the above

These are all exact copies of the upstream sources, at the git
revisions listed in `.cargo/config.in`.

This brings in fixes for some upstream `wgpu` bugs that were fuzzblockers:

- Compute pipelines never freed at runtime, leaking memory #2564
  gfx-rs/wgpu#2564

- Device::drop doesn't actually free the device when using backend::direct::Context #2563
  gfx-rs/wgpu#2563

The Firefox sources also needed some adjustments to catch up with
upstream changes:

- The C type `mozilla::webgpu::ffi::WGPUTextureFormat` is now a struct
  containing a tag enum and a union, not just an enum. This is needed
  for [gfx-rs/wgpu#2477](gfx-rs/wgpu#2477).

  (Note that Firefox's `WebGPU.webidl` is behind the current spec,
  so even though the newest ASTC texture formats are supported in `wgpu`,
  they're not available in Firefox yet.)

- `wgpu` got a new feature, `id32`, which cbindgen needed to be told
  about so that it would generate preprocessor-protected code like
  this:

      #if defined(WGPU_FEATURE_ID32)
      typedef uint32_t WGPUNonZeroId;
      #endif

      #if !defined(WGPU_FEATURE_ID32)
      typedef uint64_t WGPUNonZeroId;
      #endif

  instead of just spitting out two conflicting definitions of
  `WGPUNonZeroId`.

- The `wgpu_core::hub::IdentityHandlerFactory` trait's `spawn` method
  no longer takes a `min_index` argument. (Our implementations of that
  trait never used that argument anyway, so this was easy to
  accommodate.)

Differential Revision: https://phabricator.services.mozilla.com/D142779
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Apr 6, 2022
…r=jgilbert

New versions of several crates are introduced to third_party/rust, by
changing the versions requested in `gfx/wgpu_bindings/Cargo.toml` and
running `mach vendor rust`:

- `wgpu-core`, `wgpu-hal`, and `wgpu-types`, as used by `wgpu_bindings`
- `naga`, `ash`, and `metal`, as used by the above

These are all exact copies of the upstream sources, at the git
revisions listed in `.cargo/config.in`.

This brings in fixes for some upstream `wgpu` bugs that were fuzzblockers:

- Compute pipelines never freed at runtime, leaking memory #2564
  gfx-rs/wgpu#2564

- Device::drop doesn't actually free the device when using backend::direct::Context #2563
  gfx-rs/wgpu#2563

The Firefox sources also needed some adjustments to catch up with
upstream changes:

- The C type `mozilla::webgpu::ffi::WGPUTextureFormat` is now a struct
  containing a tag enum and a union, not just an enum. This is needed
  for [gfx-rs/wgpu#2477](gfx-rs/wgpu#2477).

  (Note that Firefox's `WebGPU.webidl` is behind the current spec,
  so even though the newest ASTC texture formats are supported in `wgpu`,
  they're not available in Firefox yet.)

- `wgpu` got a new feature, `id32`, which cbindgen needed to be told
  about so that it would generate preprocessor-protected code like
  this:

      #if defined(WGPU_FEATURE_ID32)
      typedef uint32_t WGPUNonZeroId;
      #endif

      #if !defined(WGPU_FEATURE_ID32)
      typedef uint64_t WGPUNonZeroId;
      #endif

  instead of just spitting out two conflicting definitions of
  `WGPUNonZeroId`.

- The `wgpu_core::hub::IdentityHandlerFactory` trait's `spawn` method
  no longer takes a `min_index` argument. (Our implementations of that
  trait never used that argument anyway, so this was easy to
  accommodate.)

Differential Revision: https://phabricator.services.mozilla.com/D142779
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 7, 2022
…r=jgilbert

New versions of several crates are introduced to third_party/rust, by
changing the versions requested in `gfx/wgpu_bindings/Cargo.toml` and
running `mach vendor rust`:

- `wgpu-core`, `wgpu-hal`, and `wgpu-types`, as used by `wgpu_bindings`
- `naga`, `ash`, and `metal`, as used by the above

These are all exact copies of the upstream sources, at the git
revisions listed in `.cargo/config.in`.

This brings in fixes for some upstream `wgpu` bugs that were fuzzblockers:

- Compute pipelines never freed at runtime, leaking memory #2564
  gfx-rs/wgpu#2564

- Device::drop doesn't actually free the device when using backend::direct::Context #2563
  gfx-rs/wgpu#2563

The Firefox sources also needed some adjustments to catch up with
upstream changes:

- The C type `mozilla::webgpu::ffi::WGPUTextureFormat` is now a struct
  containing a tag enum and a union, not just an enum. This is needed
  for [gfx-rs/wgpu#2477](gfx-rs/wgpu#2477).

  (Note that Firefox's `WebGPU.webidl` is behind the current spec,
  so even though the newest ASTC texture formats are supported in `wgpu`,
  they're not available in Firefox yet.)

- `wgpu` got a new feature, `id32`, which cbindgen needed to be told
  about so that it would generate preprocessor-protected code like
  this:

      #if defined(WGPU_FEATURE_ID32)
      typedef uint32_t WGPUNonZeroId;
      #endif

      #if !defined(WGPU_FEATURE_ID32)
      typedef uint64_t WGPUNonZeroId;
      #endif

  instead of just spitting out two conflicting definitions of
  `WGPUNonZeroId`.

- The `wgpu_core::hub::IdentityHandlerFactory` trait's `spawn` method
  no longer takes a `min_index` argument. (Our implementations of that
  trait never used that argument anyway, so this was easy to
  accommodate.)

Differential Revision: https://phabricator.services.mozilla.com/D142779
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Apr 7, 2022
…r=jgilbert

New versions of several crates are introduced to third_party/rust, by
changing the versions requested in `gfx/wgpu_bindings/Cargo.toml` and
running `mach vendor rust`:

- `wgpu-core`, `wgpu-hal`, and `wgpu-types`, as used by `wgpu_bindings`
- `naga`, `ash`, and `metal`, as used by the above

These are all exact copies of the upstream sources, at the git
revisions listed in `.cargo/config.in`.

This brings in fixes for some upstream `wgpu` bugs that were fuzzblockers:

- Compute pipelines never freed at runtime, leaking memory #2564
  gfx-rs/wgpu#2564

- Device::drop doesn't actually free the device when using backend::direct::Context #2563
  gfx-rs/wgpu#2563

The Firefox sources also needed some adjustments to catch up with
upstream changes:

- The C type `mozilla::webgpu::ffi::WGPUTextureFormat` is now a struct
  containing a tag enum and a union, not just an enum. This is needed
  for [gfx-rs/wgpu#2477](gfx-rs/wgpu#2477).

  (Note that Firefox's `WebGPU.webidl` is behind the current spec,
  so even though the newest ASTC texture formats are supported in `wgpu`,
  they're not available in Firefox yet.)

- `wgpu` got a new feature, `id32`, which cbindgen needed to be told
  about so that it would generate preprocessor-protected code like
  this:

      #if defined(WGPU_FEATURE_ID32)
      typedef uint32_t WGPUNonZeroId;
      #endif

      #if !defined(WGPU_FEATURE_ID32)
      typedef uint64_t WGPUNonZeroId;
      #endif

  instead of just spitting out two conflicting definitions of
  `WGPUNonZeroId`.

- The `wgpu_core::hub::IdentityHandlerFactory` trait's `spawn` method
  no longer takes a `min_index` argument. (Our implementations of that
  trait never used that argument anyway, so this was easy to
  accommodate.)

Differential Revision: https://phabricator.services.mozilla.com/D142779
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: correctness We're behaving incorrectly type: bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant