
RenderingDevice cannot release texture resource when window is minimized #98044

Closed
hippogyz opened this issue Oct 10, 2024 · 17 comments · Fixed by #100257

Comments

@hippogyz

hippogyz commented Oct 10, 2024

Tested versions

4.3.stable.mono

System information

Godot v4.3.stable.mono - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 4070 (NVIDIA; 31.0.15.3161) - AMD Ryzen 9 7900X 12-Core Processor (24 Threads)

Issue description

When the game window is minimized, the texture created by RenderingDevice.create_texture() won't be freed by calling RenderingDevice.free_rid().

[screenshot] Memory usage when the game window is not minimized

[screenshot] Memory usage when the game window has been minimized for a while

Incidentally, when this bug is triggered, the memory reported by the editor debugger grows much more slowly than the memory reported by Windows Task Manager.
[screenshot] Editor debugger memory, not minimized
[screenshot] Editor debugger memory, minimized for a while

Steps to reproduce

extends Node2D

var rd: RenderingDevice
var texture: RID

func _ready() -> void:
	rd = RenderingServer.get_rendering_device()

func _process(delta: float) -> void:
	#RenderingServer.call_on_render_thread(create_texture);
	create_texture()

func create_texture() -> void:
	if texture.is_valid():
		rd.free_rid(texture)
	
	var format = RDTextureFormat.new()
	format.width = 1024
	format.height = 1024
	format.depth = 1
	format.format = RenderingDevice.DATA_FORMAT_R8G8B8A8_UNORM
	format.array_layers = 1
	format.mipmaps = 1
	format.usage_bits = RenderingDevice.TEXTURE_USAGE_SAMPLING_BIT | RenderingDevice.TEXTURE_USAGE_CAN_UPDATE_BIT | RenderingDevice.TEXTURE_USAGE_STORAGE_BIT
	
	texture = rd.texture_create(format, RDTextureView.new(), [])

  1. This script creates a texture per frame and releases the texture created the previous frame.
  2. Attach it to a node and run the scene.
  3. Minimize the game window (not the editor). Merely hiding the window behind other windows does not trigger the bug.

Minimal reproduction project (MRP)

See 'Steps to reproduce'

@hippogyz
Author

Possibly related to #90030 and #90017.

A similar situation also occurs when a NoiseTexture resource is modified every frame by a script (I know this is not recommended).
As I understand it, NoiseTexture creates a new internal texture whenever a parameter is modified, so this bug can also be triggered that way while the window is minimized.

@hippogyz
Author

A workaround is to check whether the game window is minimized before creating resources with RenderingDevice.

Such as:

if Engine.get_main_loop().root.mode == Window.MODE_MINIMIZED:
	return
# ... original code

@clayjohn
Member

Do the resources get freed once the window is no longer minimized?

@hippogyz
Author

hippogyz commented Oct 10, 2024

Do the resources get freed once the window is no longer minimized?

Probably not; it seems only some of them are freed.

In my project, I found this bug with a compute shader that creates a texture each frame. In that case, the behavior after restoring the window is easier to observe than with the example code above.

The example code above leaks so severely that memory runs out before I can observe the same partial freeing :(

[screenshot] Observed in my compute shader case: I reopened the window three times, corresponding to the three sudden drops in memory.

@Uumutunal
Contributor

When free_rid() is called, the resources are not released immediately; they are queued to be freed in a later frame. But when the window is minimized, the GPU frame cycle is paused, so frames are no longer processed: resources keep being queued but are never released. When the window is restored, frame processing resumes and the queue is cleared.

As a test, I added this inside the add_frame_delay() method in os.cpp:

if (!p_can_draw) {
	RD::get_singleton()->_flush_and_stall_for_all_frames();
}

It solved this problem, #90030 and also #90017, but I'm not sure what the proper implementation would be, because _flush_and_stall_for_all_frames() is a private method (I changed it to public for testing) and calling it every frame might be problematic.

@clayjohn
Member

@Uumutunal Great work! That seems like an acceptable workaround. We could probably store a variable to ensure that we only call _flush_and_stall_for_all_frames() once when the window is first minimized.

However, the way I understand things, resources should get cleaned up on the next frame, which should happen once the window is opened back up. Hippogyz's comment seems to indicate that, even after the GPU starts processing frames again, the resources are not cleaned up. That makes me think there may still be a problem somewhere else, and the resources are somehow getting orphaned before they are actually cleaned up.

@DarioSamo what do you think?

@DarioSamo DarioSamo self-assigned this Nov 23, 2024
@Uumutunal
Contributor

We could probably store a variable to ensure that we only call _flush_and_stall_for_all_frames() once when the window is first minimized.

I'm guessing calling it once won't work, because new resources keep getting created while the window is minimized. We would have to call it at regular intervals at least, something like once a second.

even after the GPU starts processing frames again, it doesn't clean up the resources.

After testing it more, yes, you are right: it seems only some of the resources are freed, unless it is a very small amount, like 1-2 MB. I have no idea why this happens, though.

@DarioSamo
Contributor

Resources are not freed until there's a guarantee the GPU is no longer executing the frame. The engine only knows that once it has waited on the frame's fence, which happens when it wants to begin drawing on that frame slot in the queue. Destruction of resources is therefore delayed until that frame begins recording.

void RenderingDevice::_begin_frame() {
	if (frames[frame].fence_signaled) {
		driver->fence_wait(frames[frame].fence);
		frames[frame].fence_signaled = false;
	}

	...

	_free_pending_resources(frame);

	...
}

Calling _flush_and_stall_for_all_frames() does indeed fix most of the issue, since the method waits for the current frame to finish executing and then begins the next one.

We're indeed lacking a way to force this. There are other methods, such as buffer_get_data(), that also trigger it, but we don't have a method publicly exposed for it. sync() is the closest we have, but the use of submit() and sync() is reserved for local devices only.

It seems the only options are to allow sync() in this situation or to add a different method altogether.

@Uumutunal
Contributor

Asking just to learn, because I'm pretty new to this: why can't we simply add a public method that calls _flush_and_stall_for_all_frames() or even _free_pending_resources()? The reason buffer_get_data(), sync() and submit() work is that they all call _free_pending_resources(), and I'm guessing we'll have to call it one way or another.

@DarioSamo
Contributor

DarioSamo commented Nov 26, 2024

Adding public methods to certain classes is usually done with extreme care, because these classes are exposed to users via GDScript or extensions and require documentation. It also represents a commitment to support the new method in future versions, and we must make sure not to break compatibility if possible.

If we come to situations like this where there's no other solution, we can discuss it and add the method, but it's not usually something that we should think of as the first solution if possible.

As for _free_pending_resources(), it's unlikely we'd ever want to expose that. It's an internal implementation detail that end users should not concern themselves with; from their perspective, resources are deleted when they call free().

@Uumutunal
Contributor

Alright, that makes sense. Thanks for the explanation.

@clayjohn
Member

@DarioSamo What do you think about the fact that not all resources are getting freed once the window is opened again? It looks like in hippogyz's case a lot of the memory is never freed, even once the window is opened again.

@clayjohn
Member

This might be the same issue as #84212

@DarioSamo
Contributor

@DarioSamo What do you think about the fact that not all resources are getting freed once the window is opened again? It looks like in hippogyz' case a lot of the memory is never freed, even once the window is opened again

Stale references perhaps? Could be a problem caused by either RenderingDevice or something else. I think that'd warrant investigation separate from the issue that was suspected.

@DarioSamo
Contributor

I'm a bit confused about the possible implications on how to solve this properly. There's something that rings wrong to me about the fact that we allow GDScript to call code on instances where the frame queue is not active because it's no longer presenting to the screen.

While the OP has shown a fairly simple script that demonstrates the issue, it stands to reason there will be many other errors of this kind, for example attempting to draw something with RD while the window is minimized. That kind of situation will lead to an endless number of commands being queued and never executed, eventually overflowing the maximum command list size allowed by the GPU.

The only solution I see is running the frame queue in the background regardless, but if presentation is skipped, that means resources will be saturated while the window is minimized and the GPU will keep doing a ton of work.

We'd need some sort of mixed solution where we allow running frames only if the user has requested something from RenderingDevice while the window is minimized, and of course we don't present to the swap chain. Do we want to go for a solution like that?

@DarioSamo
Contributor

After looking into it further, I see we already have logic that handles the fact that viewports are not drawn while the window is minimized. We might be able to build on that to make RD execute the command queue when it's not empty (e.g. RS has a has_changed() method that might come in handy).

@DarioSamo
Contributor

Adding something like the following, with a method for RenderingServer to check whether it has any queued commands to process, is probably our best bet here.

bool buffers_swapped = false; // New
if ((DisplayServer::get_singleton()->can_any_window_draw() || DisplayServer::get_singleton()->has_additional_outputs()) &&
		RenderingServer::get_singleton()->is_render_loop_enabled()) {
	if ((!force_redraw_requested) && OS::get_singleton()->is_in_low_processor_usage_mode()) {
		if (RenderingServer::get_singleton()->has_changed()) {
			RenderingServer::get_singleton()->draw(true, scaled_step); // flush visual commands
			Engine::get_singleton()->increment_frames_drawn();
			buffers_swapped = true; // New
		}
	} else {
		RenderingServer::get_singleton()->draw(true, scaled_step); // flush visual commands
		Engine::get_singleton()->increment_frames_drawn();
		force_redraw_requested = false;
		buffers_swapped = true; // New
	}
}

if (!buffers_swapped) {
	// New
	RenderingServer::get_singleton()->command_that_flushes_rd();
}

The script posted by the OP has no interaction with RenderingServer whatsoever to find out if there's anything queued up, so we'd basically need some way for RenderingServer to redirect and perform a check on RD if there's anything that needs to be processed or freed.

Like I said, this issue is not entirely about the fact that resources are not freed, but a script that constantly fills and submits drawing lists will also run into the same issue and eventually run out of command buffer memory.

darksylinc added a commit to darksylinc/godot that referenced this issue Dec 10, 2024
Fixes godotengine#90017
Fixes godotengine#90030
Fixes godotengine#98044

This PR makes the following changes:

# Force processing of GPU commands for frame_count frames

The variable `frames_pending_resources_for_processing` is added to track
this.

The ticket godotengine#98044 suggested to use `_flush_and_stall_for_all_frames()`
while minimized.

Technically this works and is a viable solution.

However I noticed that this issue was happening because Logic/Physics
continue to work "business as usual" while minimized(\*). Only Graphics
was being deactivated (which caused commands to accumulate until window
is restored).

To continue this behavior of "business as usual", I decided that GPU
work should also "continue as usual" by buffering commands in a double
or triple buffer scheme until all commands are done processing (if they
ever stop coming). This is especially important if the app specifically
intends to keep processing while minimized.

Calling `_flush_and_stall_for_all_frames()` would fix the leak, but it
would make Godot's behavior different while minimized vs while the
window is presenting.

\* `OS::add_frame_delay` _does_ consider being minimized, but it just
throttles CPU usage. Some platforms such as Android completely disable
processing because the higher level code stops being called when the app
goes into background. But this seems like an implementation-detail that
diverges from the rest of the platforms (e.g. Windows, Linux & macOS
continue to process while minimized).

# Rename p_swap_buffers for p_present

**This is potentially a breaking change** (whether it actually breaks
anything, I don't know, but I strongly suspect it doesn't).

"Swap Buffers" is a concept carried from OpenGL, where a frame is "done"
when `glSwapBuffers()` is called, which basically means "present to the
screen".

However it _also_ means that OpenGL internally swaps its internal
buffers in a double/triple buffer scheme (in Vulkan, we do that
ourselves and is tracked by `RenderingDevice::frame`).

Modern APIs like Vulkan differentiate between "submitting GPU work" and
"presenting".

Before this PR, calling `RendererCompositorRD::end_frame(false)` would
literally do nothing. This is often undesired and the cause of the leak.
After this PR, calling `RendererCompositorRD::end_frame(false)` will now
process commands, swap our internal buffers in a double/triple buffer
scheme **but avoid presenting to the screen**.

Hence the rename of the variable from `p_swap_buffers` to `p_present`
(which slightly alters its behavior).
If we want `RendererCompositorRD::end_frame(false)` to do nothing, then
we should not call it at all.

This PR reflects such change: When we're minimized **_and_**
`has_pending_resources_for_processing()` returns false, we don't call
`RendererCompositorRD::end_frame()` at all.

But if `has_pending_resources_for_processing()` returns true, we will
call it, but with `p_present = false` because we're minimized.

There's still the issue that Godot keeps processing work (logic,
scripts, physics) while minimized, which we shouldn't do by default. But
that's work for a follow-up PR.
darksylinc added a commit to darksylinc/godot that referenced this issue Dec 11, 2024
(same commit message as above)
@AThousandShips AThousandShips added this to the 4.4 milestone Dec 12, 2024
WhalesState pushed a commit to mounirtohami/godot that referenced this issue Dec 15, 2024
(same commit message as above)
jirisvd pushed a commit to jirisvd/godot that referenced this issue Dec 16, 2024
(same commit message as above)
alessand10 pushed a commit to alessand10/godot that referenced this issue Dec 17, 2024
(same commit message as above)