Lavapipe memory leak reproduction #306
Closed · wants to merge 7 commits

Conversation

@Korijn (Collaborator) commented Nov 11, 2022

This is a follow-up to pygfx/pygfx#362

I wanted to know if the lavapipe memory leak was hiding in pygfx's statefulness or not, and the quickest way to be sure of that was to try and reproduce the issue in this repo.

Round 0 means before doing anything.

Output on WSL2 using lavapipe:

memory usage (round: 0): 53.707 MB
memory usage (round: 1): 132.238 MB
memory usage (round: 2): 142.789 MB
memory usage (round: 3): 152.613 MB
memory usage (round: 4): 162.695 MB
memory usage (round: 5): 173.035 MB
memory usage (round: 6): 186.828 MB
memory usage (round: 7): 196.641 MB
memory usage (round: 8): 206.719 MB
memory usage (round: 9): 216.531 MB
memory usage (round: 10): 226.336 MB

Reusing the canvas object only:

memory usage (round: 0): 67.988 MB
memory usage (round: 1): 144.867 MB
memory usage (round: 2): 155.559 MB
memory usage (round: 3): 161.363 MB
memory usage (round: 4): 167.188 MB
memory usage (round: 5): 173.012 MB
memory usage (round: 6): 178.816 MB
memory usage (round: 7): 184.863 MB
memory usage (round: 8): 192.656 MB
memory usage (round: 9): 198.469 MB
memory usage (round: 10): 204.281 MB

Reusing the device object only (so canvas does not seem to be leaking):

memory usage (round: 0): 123.086 MB
memory usage (round: 1): 146.574 MB
memory usage (round: 2): 148.547 MB
memory usage (round: 3): 148.668 MB
memory usage (round: 4): 149.008 MB
memory usage (round: 5): 148.996 MB
memory usage (round: 6): 148.977 MB
memory usage (round: 7): 148.961 MB
memory usage (round: 8): 148.938 MB
memory usage (round: 9): 148.914 MB
memory usage (round: 10): 148.898 MB

Reusing both the device and canvas objects:

memory usage (round: 0): 117.270 MB
memory usage (round: 1): 143.203 MB
memory usage (round: 2): 144.629 MB
memory usage (round: 3): 144.746 MB
memory usage (round: 4): 144.734 MB
memory usage (round: 5): 144.719 MB
memory usage (round: 6): 144.699 MB
memory usage (round: 7): 144.684 MB
memory usage (round: 8): 144.660 MB
memory usage (round: 9): 144.637 MB
memory usage (round: 10): 144.617 MB
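
For context, the difference between the runs above comes down to whether the adapter/device (and canvas) are created fresh every round or hoisted out of the loop. Below is a minimal sketch of the device-reuse case, with the canvas and the per-round rendering omitted, so it illustrates the pattern rather than reproducing the repo's actual example:

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

p = psutil.Process()

# "Reusing the device": request the adapter and device once, outside the loop.
# In the leaking scenarios these two lines sit inside the loop instead.
adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
device = adapter.request_device()

print(f"memory usage (round: 0): {p.memory_info().rss / 1024**2:.3f} MB")
for i in range(10):
    # Per-round work (canvas creation, rendering) omitted in this sketch.
    gc.collect()
    print(f"memory usage (round: {i + 1}): {p.memory_info().rss / 1024**2:.3f} MB")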

@Korijn changed the title from "check if leak is in wgpu-py" to "Lavapipe memory leak reproduction" on Nov 11, 2022
@almarklein (Member) commented:

So it seems to be the device ...

@Korijn (Collaborator, author) commented Nov 11, 2022

Making this even more minimal. All I'm doing is requesting a device. Nothing else.

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()


p = psutil.Process()


def print_mem_usage(i):
    # Resident memory (RSS) of this process
    megs = p.memory_info().rss / 1024**2
    print(f"memory usage (round: {i}): {megs:.3f} MB")


if __name__ == "__main__":
    print_mem_usage(0)
    for i in range(10):
        # A fresh adapter and device every round; the previous round's
        # objects become garbage when the names are rebound.
        adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
        device = adapter.request_device()
        gc.collect()
        print_mem_usage(i + 1)

Output:

memory usage (round: 0): 27.520 MB
memory usage (round: 1): 82.320 MB
memory usage (round: 2): 85.195 MB
memory usage (round: 3): 87.473 MB
memory usage (round: 4): 89.773 MB
memory usage (round: 5): 92.289 MB
memory usage (round: 6): 94.598 MB
memory usage (round: 7): 97.117 MB
memory usage (round: 8): 101.395 MB
memory usage (round: 9): 103.914 MB
memory usage (round: 10): 106.203 MB

If I only request an adapter, there is no leakage.
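
For comparison, the adapter-only observation corresponds to the same loop with the request_device() call removed; a minimal sketch:

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

p = psutil.Process()

print(f"memory usage (round: 0): {p.memory_info().rss / 1024**2:.3f} MB")
for i in range(10):
    # Only an adapter is requested, no device; per the observation above,
    # memory usage stays flat with this variant.
    adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
    gc.collect()
    print(f"memory usage (round: {i + 1}): {p.memory_info().rss / 1024**2:.3f} MB")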

@Korijn (Collaborator, author) commented Nov 11, 2022

I think this is a symptom of various bugs in wgpu/wgpu-native, actually.

You can actually find a lot more memory leak bugs, which is ironic since it is all Rust. :)

We may get a better experience once we upgrade to the latest version of wgpu-native.

@almarklein (Member) commented:

I guess this counts as good news! 🍰

@Korijn (Collaborator, author) commented Dec 5, 2022

I'd be curious to see if we can also reproduce this with the same code in Rust, to check whether the problem is in our wrappers or not.

@almarklein (Member) commented Jan 26, 2023

AttributeError: module 'wgpu.backends.rs' has no attribute 'device_dropper'

Renamed that to `delayed_dropper`, because it now also drops adapters (and maybe more at some point).
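
For readers unfamiliar with the term: a delayed dropper defers releasing native handles to a controlled moment instead of doing it directly when the Python wrappers are garbage collected. The sketch below only illustrates that general pattern; the names are hypothetical and this is not the actual wgpu.backends.rs implementation.

# Illustrative sketch of a delayed-drop pattern (hypothetical names).
class DelayedDropper:
    def __init__(self):
        self._pending = []  # list of (label, release_function) pairs

    def queue_drop(self, label, release_fn):
        # Called when a wrapper object (device, adapter, ...) is garbage
        # collected; releasing the native resource is deferred rather than
        # done inside __del__.
        self._pending.append((label, release_fn))

    def drop_all_pending(self):
        # Called at a controlled moment, e.g. before creating new objects.
        while self._pending:
            label, release_fn = self._pending.pop()
            release_fn()


delayed_dropper = DelayedDropper()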

@Korijn (Collaborator, author) commented Feb 17, 2023

Current output:

❯ python examples/memtest.py
memory usage (round: 0): 31.480 MB
memory usage (round: 1): 85.023 MB
memory usage (round: 2): 103.195 MB
memory usage (round: 3): 123.359 MB
memory usage (round: 4): 141.039 MB
memory usage (round: 5): 159.051 MB
memory usage (round: 6): 177.285 MB
memory usage (round: 7): 197.090 MB
memory usage (round: 8): 214.926 MB
memory usage (round: 9): 233.688 MB
memory usage (round: 10): 253.035 MB

@Korijn (Collaborator, author) commented Feb 28, 2023

I'm abandoning this. Someone needs to try to reproduce this at the wgpu-native layer. I don't think our wrappers are the cause of this memory pattern.

@Korijn closed this on Feb 28, 2023
@almarklein (Member) commented:

Latest status:

  • There is still a big leak when running on Lavapipe, triggered solely by instantiating and deleting an adapter and device.
  • When I run memtest.py (on macOS) with the number of iterations set to 2000 or so, the memory very slowly climbs to about 220 MB, but at some point it also drops back to about 207 MB, so it does not look like a memory leak (in wgpu) to me.
  • Is this bad? This issue mostly affects what we do on CI. It is also likely that it will be fixed at some point.

Anything to add @Korijn ?

@Korijn (Collaborator, author) commented Feb 28, 2023

On my machine (Windows 11) this happens when I run 2000 iterations:

❯ .venv/Scripts/python .\examples\memtest.py
memory usage (round: 0): 31.539 MB
memory usage (round: 1): 84.938 MB
memory usage (round: 2): 103.070 MB
memory usage (round: 3): 123.062 MB
memory usage (round: 4): 140.855 MB
memory usage (round: 5): 158.680 MB
memory usage (round: 6): 177.039 MB
memory usage (round: 7): 196.820 MB
memory usage (round: 8): 214.621 MB
memory usage (round: 9): 233.270 MB
memory usage (round: 10): 252.785 MB
memory usage (round: 11): 272.219 MB
memory usage (round: 12): 290.516 MB
memory usage (round: 13): 308.684 MB
memory usage (round: 14): 328.559 MB
memory usage (round: 15): 347.172 MB
memory usage (round: 16): 365.258 MB
memory usage (round: 17): 384.004 MB
memory usage (round: 18): 403.688 MB
memory usage (round: 19): 421.660 MB
memory usage (round: 20): 441.699 MB
memory usage (round: 21): 459.203 MB
memory usage (round: 22): 484.004 MB
memory usage (round: 23): 502.707 MB
memory usage (round: 24): 521.070 MB
memory usage (round: 25): 541.070 MB
memory usage (round: 26): 559.723 MB
memory usage (round: 27): 578.074 MB
memory usage (round: 28): 596.566 MB
memory usage (round: 29): 616.914 MB
memory usage (round: 30): 634.898 MB
memory usage (round: 31): 653.207 MB
memory usage (round: 32): 673.762 MB
memory usage (round: 33): 694.543 MB
memory usage (round: 34): 712.656 MB
memory usage (round: 35): 730.969 MB
memory usage (round: 36): 748.664 MB
memory usage (round: 37): 769.293 MB
memory usage (round: 38): 787.816 MB
memory usage (round: 39): 806.270 MB
memory usage (round: 40): 826.395 MB
memory usage (round: 41): 845.262 MB
memory usage (round: 42): 863.387 MB
memory usage (round: 43): 901.039 MB
memory usage (round: 44): 923.969 MB
memory usage (round: 45): 942.598 MB
memory usage (round: 46): 961.023 MB
memory usage (round: 47): 980.031 MB
memory usage (round: 48): 1001.082 MB
memory usage (round: 49): 1020.277 MB
memory usage (round: 50): 1039.008 MB
memory usage (round: 51): 1057.801 MB
memory usage (round: 52): 1078.867 MB
memory usage (round: 53): 1097.836 MB
memory usage (round: 54): 1116.832 MB
memory usage (round: 55): 1137.848 MB
memory usage (round: 56): 1158.691 MB
memory usage (round: 57): 1177.633 MB
memory usage (round: 58): 1196.777 MB
memory usage (round: 59): 1217.512 MB
memory usage (round: 60): 1236.738 MB
memory usage (round: 61): 1254.988 MB
memory usage (round: 62): 1274.188 MB
memory usage (round: 63): 1295.066 MB
Unrecognized device error ERROR_INITIALIZATION_FAILED
Exception ignored from cffi callback <function GPUAdapter._request_device.<locals>.callback at 0x000001A8702E5CA0>:
Traceback (most recent call last):
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 741, in callback
    raise RuntimeError(f"Request device failed ({status}): {msg}")
RuntimeError: Request device failed (1): DeviceLost
Traceback (most recent call last):
  File "C:\Users\kvang\dev\wgpu-py\examples\memtest.py", line 22, in <module>
    device = adapter.request_device()
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 618, in request_device
    return self._request_device(
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 748, in _request_device
    assert device_id is not None
AssertionError

@almarklein (Member) commented:

That's not good ... I can indeed reproduce that, with WGPU_BACKEND_TYPE set to both "Vulkan" and "D3D12".
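
For anyone trying to reproduce this: WGPU_BACKEND_TYPE is an environment variable, so one way to switch backends for the memtest script (it can equally be set in the shell before running) is sketched below:

import os

# Select the native backend before anything wgpu-related is imported or an
# adapter is requested; "Vulkan" and "D3D12" are the values mentioned above.
os.environ["WGPU_BACKEND_TYPE"] = "Vulkan"  # or "D3D12"

import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
device = adapter.request_device()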

@Korijn (Collaborator, author) commented Feb 28, 2023

By the way, crashing right before 64 seems like too much of a coincidence.

@almarklein mentioned this pull request on Feb 28, 2023
@almarklein (Member) commented:

> crashing right before 64 seems like too much of a coincidence.

Seeing the same, consistently. Should help trace the cause ...

I made an issue to track mem leaks in general: #353

@crjenkins commented:

LOL
