Lavapipe memory leak reproduction #306
Closed · wants to merge 7 commits

Conversation

@Korijn (Collaborator) commented Nov 11, 2022

This is a follow-up to pygfx/pygfx#362

I wanted to know if the lavapipe memory leak was hiding in pygfx's statefulness or not, and the quickest way to be sure of that was to try and reproduce the issue in this repo.

Round 0 means before doing anything.

Output on WSL2 using lavapipe:

memory usage (round: 0): 53.707 MB
memory usage (round: 1): 132.238 MB
memory usage (round: 2): 142.789 MB
memory usage (round: 3): 152.613 MB
memory usage (round: 4): 162.695 MB
memory usage (round: 5): 173.035 MB
memory usage (round: 6): 186.828 MB
memory usage (round: 7): 196.641 MB
memory usage (round: 8): 206.719 MB
memory usage (round: 9): 216.531 MB
memory usage (round: 10): 226.336 MB

Reusing the canvas object only:

memory usage (round: 0): 67.988 MB
memory usage (round: 1): 144.867 MB
memory usage (round: 2): 155.559 MB
memory usage (round: 3): 161.363 MB
memory usage (round: 4): 167.188 MB
memory usage (round: 5): 173.012 MB
memory usage (round: 6): 178.816 MB
memory usage (round: 7): 184.863 MB
memory usage (round: 8): 192.656 MB
memory usage (round: 9): 198.469 MB
memory usage (round: 10): 204.281 MB

Reusing the device object only (so canvas does not seem to be leaking):

memory usage (round: 0): 123.086 MB
memory usage (round: 1): 146.574 MB
memory usage (round: 2): 148.547 MB
memory usage (round: 3): 148.668 MB
memory usage (round: 4): 149.008 MB
memory usage (round: 5): 148.996 MB
memory usage (round: 6): 148.977 MB
memory usage (round: 7): 148.961 MB
memory usage (round: 8): 148.938 MB
memory usage (round: 9): 148.914 MB
memory usage (round: 10): 148.898 MB

Reusing both the device and canvas objects:

memory usage (round: 0): 117.270 MB
memory usage (round: 1): 143.203 MB
memory usage (round: 2): 144.629 MB
memory usage (round: 3): 144.746 MB
memory usage (round: 4): 144.734 MB
memory usage (round: 5): 144.719 MB
memory usage (round: 6): 144.699 MB
memory usage (round: 7): 144.684 MB
memory usage (round: 8): 144.660 MB
memory usage (round: 9): 144.637 MB
memory usage (round: 10): 144.617 MB
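
For context, the difference between the runs above comes down to whether the adapter/device (and canvas) are created fresh every round or hoisted out of the loop. Below is a minimal sketch of the device-reuse case, with the canvas and the per-round rendering omitted, so it illustrates the pattern rather than reproducing the repo's actual example:

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

p = psutil.Process()

# "Reusing the device": request the adapter and device once, outside the loop.
# In the leaking scenarios these two lines sit inside the loop instead.
adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
device = adapter.request_device()

print(f"memory usage (round: 0): {p.memory_info().rss / 1024**2:.3f} MB")
for i in range(10):
    # Per-round work (canvas creation, rendering) omitted in this sketch.
    gc.collect()
    print(f"memory usage (round: {i + 1}): {p.memory_info().rss / 1024**2:.3f} MB")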

@Korijn changed the title from "check if leak is in wgpu-py" to "Lavapipe memory leak reproduction" on Nov 11, 2022
@almarklein (Member) commented:

So it seems to be the device ...

@Korijn (Collaborator, author) commented Nov 11, 2022

Making this even more minimal. All I'm doing is requesting a device. Nothing else.

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()


p = psutil.Process()


def print_mem_usage(i):
    # Resident memory (RSS) of this process
    megs = p.memory_info().rss / 1024**2
    print(f"memory usage (round: {i}): {megs:.3f} MB")


if __name__ == "__main__":
    print_mem_usage(0)
    for i in range(10):
        # A fresh adapter and device every round; the previous round's
        # objects become garbage when the names are rebound.
        adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
        device = adapter.request_device()
        gc.collect()
        print_mem_usage(i + 1)

Output:

memory usage (round: 0): 27.520 MB
memory usage (round: 1): 82.320 MB
memory usage (round: 2): 85.195 MB
memory usage (round: 3): 87.473 MB
memory usage (round: 4): 89.773 MB
memory usage (round: 5): 92.289 MB
memory usage (round: 6): 94.598 MB
memory usage (round: 7): 97.117 MB
memory usage (round: 8): 101.395 MB
memory usage (round: 9): 103.914 MB
memory usage (round: 10): 106.203 MB

If I only request an adapter, there is no leakage.
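
For comparison, the adapter-only observation corresponds to the same loop with the request_device() call removed; a minimal sketch:

import gc

import psutil
import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

p = psutil.Process()

print(f"memory usage (round: 0): {p.memory_info().rss / 1024**2:.3f} MB")
for i in range(10):
    # Only an adapter is requested, no device; per the observation above,
    # memory usage stays flat with this variant.
    adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
    gc.collect()
    print(f"memory usage (round: {i + 1}): {p.memory_info().rss / 1024**2:.3f} MB")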

@Korijn (Collaborator, author) commented Nov 11, 2022

I think this is a symptom of various bugs in wgpu/wgpu-native, actually.

You can actually find a lot more memory leak bugs, which is ironic since it is all Rust. :)

We may get a better experience once we upgrade to the latest version of wgpu-native.

@almarklein (Member) commented:

I guess this counts as good news! 🍰

@Korijn (Collaborator, author) commented Dec 5, 2022

I'd be curious to see if we can also reproduce this with the same code in Rust, to check whether the problem is in our wrappers or not.

@almarklein (Member) commented Jan 26, 2023

AttributeError: module 'wgpu.backends.rs' has no attribute 'device_dropper'

Renamed that to `delayed_dropper`, because it now also drops adapters (and maybe more at some point).
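
For readers unfamiliar with the term: a delayed dropper defers releasing native handles to a controlled moment instead of doing it directly when the Python wrappers are garbage collected. The sketch below only illustrates that general pattern; the names are hypothetical and this is not the actual wgpu.backends.rs implementation.

# Illustrative sketch of a delayed-drop pattern (hypothetical names).
class DelayedDropper:
    def __init__(self):
        self._pending = []  # list of (label, release_function) pairs

    def queue_drop(self, label, release_fn):
        # Called when a wrapper object (device, adapter, ...) is garbage
        # collected; releasing the native resource is deferred rather than
        # done inside __del__.
        self._pending.append((label, release_fn))

    def drop_all_pending(self):
        # Called at a controlled moment, e.g. before creating new objects.
        while self._pending:
            label, release_fn = self._pending.pop()
            release_fn()


delayed_dropper = DelayedDropper()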

@Korijn (Collaborator, author) commented Feb 17, 2023

Current output:

❯ python examples/memtest.py
memory usage (round: 0): 31.480 MB
memory usage (round: 1): 85.023 MB
memory usage (round: 2): 103.195 MB
memory usage (round: 3): 123.359 MB
memory usage (round: 4): 141.039 MB
memory usage (round: 5): 159.051 MB
memory usage (round: 6): 177.285 MB
memory usage (round: 7): 197.090 MB
memory usage (round: 8): 214.926 MB
memory usage (round: 9): 233.688 MB
memory usage (round: 10): 253.035 MB

@Korijn (Collaborator, author) commented Feb 28, 2023

I'm abandoning this. Someone needs to try to reproduce this at the wgpu-native layer. I don't think our wrappers are the cause of this memory pattern.

@Korijn closed this on Feb 28, 2023
@almarklein (Member) commented:

Latest status:

  • There is still a big leak when running on Lavapipe, triggered solely by instantiating and deleting an adapter and device.
  • When I run memtest.py (on macOS) with the number of iterations set to 2000 or so, the memory very slowly climbs to about 220 MB, but at some point it also drops back to about 207 MB, so it does not look like a memory leak (in wgpu) to me.
  • Is this bad? This issue mostly affects what we do on CI. It is also likely that it will be fixed at some point.

Anything to add @Korijn ?

@Korijn (Collaborator, author) commented Feb 28, 2023

On my machine (Windows 11) this happens when I run 2000 iterations:

❯ .venv/Scripts/python .\examples\memtest.py
memory usage (round: 0): 31.539 MB
memory usage (round: 1): 84.938 MB
memory usage (round: 2): 103.070 MB
memory usage (round: 3): 123.062 MB
memory usage (round: 4): 140.855 MB
memory usage (round: 5): 158.680 MB
memory usage (round: 6): 177.039 MB
memory usage (round: 7): 196.820 MB
memory usage (round: 8): 214.621 MB
memory usage (round: 9): 233.270 MB
memory usage (round: 10): 252.785 MB
memory usage (round: 11): 272.219 MB
memory usage (round: 12): 290.516 MB
memory usage (round: 13): 308.684 MB
memory usage (round: 14): 328.559 MB
memory usage (round: 15): 347.172 MB
memory usage (round: 16): 365.258 MB
memory usage (round: 17): 384.004 MB
memory usage (round: 18): 403.688 MB
memory usage (round: 19): 421.660 MB
memory usage (round: 20): 441.699 MB
memory usage (round: 21): 459.203 MB
memory usage (round: 22): 484.004 MB
memory usage (round: 23): 502.707 MB
memory usage (round: 24): 521.070 MB
memory usage (round: 25): 541.070 MB
memory usage (round: 26): 559.723 MB
memory usage (round: 27): 578.074 MB
memory usage (round: 28): 596.566 MB
memory usage (round: 29): 616.914 MB
memory usage (round: 30): 634.898 MB
memory usage (round: 31): 653.207 MB
memory usage (round: 32): 673.762 MB
memory usage (round: 33): 694.543 MB
memory usage (round: 34): 712.656 MB
memory usage (round: 35): 730.969 MB
memory usage (round: 36): 748.664 MB
memory usage (round: 37): 769.293 MB
memory usage (round: 38): 787.816 MB
memory usage (round: 39): 806.270 MB
memory usage (round: 40): 826.395 MB
memory usage (round: 41): 845.262 MB
memory usage (round: 42): 863.387 MB
memory usage (round: 43): 901.039 MB
memory usage (round: 44): 923.969 MB
memory usage (round: 45): 942.598 MB
memory usage (round: 46): 961.023 MB
memory usage (round: 47): 980.031 MB
memory usage (round: 48): 1001.082 MB
memory usage (round: 49): 1020.277 MB
memory usage (round: 50): 1039.008 MB
memory usage (round: 51): 1057.801 MB
memory usage (round: 52): 1078.867 MB
memory usage (round: 53): 1097.836 MB
memory usage (round: 54): 1116.832 MB
memory usage (round: 55): 1137.848 MB
memory usage (round: 56): 1158.691 MB
memory usage (round: 57): 1177.633 MB
memory usage (round: 58): 1196.777 MB
memory usage (round: 59): 1217.512 MB
memory usage (round: 60): 1236.738 MB
memory usage (round: 61): 1254.988 MB
memory usage (round: 62): 1274.188 MB
memory usage (round: 63): 1295.066 MB
Unrecognized device error ERROR_INITIALIZATION_FAILED
Exception ignored from cffi callback <function GPUAdapter._request_device.<locals>.callback at 0x000001A8702E5CA0>:
Traceback (most recent call last):
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 741, in callback
    raise RuntimeError(f"Request device failed ({status}): {msg}")
RuntimeError: Request device failed (1): DeviceLost
Traceback (most recent call last):
  File "C:\Users\kvang\dev\wgpu-py\examples\memtest.py", line 22, in <module>
    device = adapter.request_device()
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 618, in request_device
    return self._request_device(
  File "C:\Users\kvang\dev\wgpu-py\wgpu\backends\rs.py", line 748, in _request_device
    assert device_id is not None
AssertionError

@almarklein (Member) commented:

That's not good ... I can indeed reproduce that, with WGPU_BACKEND_TYPE set to both "Vulkan" and "D3D12".
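
For anyone trying to reproduce this: WGPU_BACKEND_TYPE is an environment variable, so one way to switch backends for the memtest script (it can equally be set in the shell before running) is sketched below:

import os

# Select the native backend before anything wgpu-related is imported or an
# adapter is requested; "Vulkan" and "D3D12" are the values mentioned above.
os.environ["WGPU_BACKEND_TYPE"] = "Vulkan"  # or "D3D12"

import wgpu
import wgpu.backends.rs  # importing the backend enables wgpu.request_adapter()

adapter = wgpu.request_adapter(canvas=None, power_preference="high-performance")
device = adapter.request_device()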

@Korijn (Collaborator, author) commented Feb 28, 2023

By the way, crashing right before 64 seems like too much of a coincidence.

@almarklein mentioned this pull request on Feb 28, 2023
@almarklein (Member) commented:

> crashing right before 64 seems like too much of a coincidence.

Seeing the same, consistently. Should help trace the cause ...

I made an issue to track mem leaks in general: #353

@crjenkins commented:

LOL
