Only copy the relevant portion of the screen when copying to backbuffer in Compatibility backend #84733

Merged
merged 1 commit into from
Jan 8, 2024

Conversation

clayjohn
Member

@clayjohn clayjohn commented Nov 10, 2023

Partially addresses: #79439

Implements the first point from #79439 (comment)

Previously we copied the entire backbuffer; now we copy only the relevant portion. This matches the behavior of the RD backends.

Opening as a draft, as I need to do more testing.

On my laptop with Intel integrated graphics, I get roughly double the performance with this change (~160 FPS to ~320 FPS) using the MRP from #79439. On a Pixel 4 there is no difference (likely because it is not bottlenecked by fill rate).
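As a rough illustration of why copying only the relevant portion helps when fill rate is the bottleneck, here is a minimal sketch that clips a region to the screen and reports what fraction of a full-screen copy it represents. This is not the code from the PR: `Rect`, `clip_to_screen`, `copied_fraction`, and the example region size are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned pixel rectangle (hypothetical helper, not a Godot type)."""
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


def clip_to_screen(region: Rect, screen_w: int, screen_h: int) -> Rect:
    """Intersect `region` with the screen; the result is empty if they don't overlap."""
    left = max(region.x, 0)
    top = max(region.y, 0)
    right = min(region.x + region.width, screen_w)
    bottom = min(region.y + region.height, screen_h)
    return Rect(left, top, max(right - left, 0), max(bottom - top, 0))


def copied_fraction(region: Rect, screen_w: int, screen_h: int) -> float:
    """Share of a full-screen copy that remains when only `region` is copied."""
    return clip_to_screen(region, screen_w, screen_h).area / (screen_w * screen_h)


# Illustrative numbers only: a 256x256 region on a 1152x648 screen.
group = Rect(100, 100, 256, 256)
print(f"{copied_fraction(group, 1152, 648):.1%} of the screen's pixels are copied")
```

The smaller the region is relative to the screen, the less memory bandwidth the copy needs, which is consistent with the gain showing up on fill-rate-bound hardware and not elsewhere.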

@clayjohn clayjohn added the bug, topic:rendering, topic:2d, performance, cherrypick:4.1, and cherrypick:4.2 labels Nov 10, 2023
@clayjohn clayjohn added this to the 4.3 milestone Nov 10, 2023
@Calinou
Member

Calinou commented Nov 24, 2023

Tested locally (rebased on top of master 671c04f), it works as expected. Visuals look identical in the MRP when Clip Children is enabled (compared to before this PR).

Exported projects (for quick testing)

Press Enter/Space, left-click or tap the screen to toggle Clip Children.

Android

Web

Benchmark

Linux

OS: Fedora 39
CPU: Intel Core i9-13900K
GPU: GeForce RTX 4090 (NVIDIA 535.129.03)

1152×648 window

| No Clip Children | Clip Children Before | Clip Children After (this PR) |
| --- | --- | --- |
| 11594 FPS (0.09 mspf) | 859 FPS (1.16 mspf) | 942 FPS (1.06 mspf) |

3840×2160 fullscreen

| No Clip Children | Clip Children Before | Clip Children After (this PR) |
| --- | --- | --- |
| 6611 FPS (0.15 mspf) | 393 FPS (2.54 mspf) | 882 FPS (1.13 mspf) |
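For readers comparing the columns, the "mspf" figures are milliseconds per frame, i.e. 1000 divided by the FPS. The short sketch below recomputes the 3840×2160 Linux numbers from the table and derives the speedup; the function and variable names are my own, not from the PR.

```python
def frame_time_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# FPS values taken from the 3840x2160 Linux results above.
before_fps = 393  # Clip Children, before this PR
after_fps = 882   # Clip Children, after this PR

before_ms = frame_time_ms(before_fps)
after_ms = frame_time_ms(after_fps)
speedup = after_fps / before_fps

print(f"before: {before_ms:.2f} mspf, after: {after_ms:.2f} mspf")
print(f"speedup: {speedup:.2f}x, frame time saved: {before_ms - after_ms:.2f} ms")
```

Frame time is the more useful figure when comparing costs, since the saving per frame is additive while FPS is not.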

Web on Linux

Browser: Chromium 118
CPU: Intel Core i9-13900K
GPU: GeForce RTX 4090 (NVIDIA 535.129.03)

Start Chromium with the --disable-frame-rate-limit --disable-gpu-vsync CLI arguments to disable any FPS limiting from the browser.

3840×2160 fullscreen

| No Clip Children | Clip Children Before | Clip Children After (this PR) |
| --- | --- | --- |
| 2431 FPS (0.41 mspf) | 282 FPS (3.55 mspf) | 447 FPS (2.24 mspf) |

Android

OS: Android 13
Device: Samsung Galaxy Z Fold4 (Snapdragon 8 Gen 1+, Adreno 730)

2176×1812

| No Clip Children | Clip Children Before | Clip Children After (this PR) |
| --- | --- | --- |
| 120 FPS (8.33 mspf, capped by V-Sync) | 62 FPS (16.13 mspf) | 120 FPS (8.33 mspf, capped by V-Sync) |

Web on Android

Browser: Bromite 108
Device: Samsung Galaxy Z Fold4 (Snapdragon 8 Gen 1+, Adreno 730)

2176×1812

| No Clip Children | Clip Children Before | Clip Children After (this PR) |
| --- | --- | --- |
| 120 FPS (8.33 mspf, capped by V-Sync) | 10 FPS (100.00 mspf) | 13 FPS (76.92 mspf) |

I'm not sure where this large performance discrepancy between native and web comes from. It's not new, but I'm puzzled that it's this large: desktop Chromium is "only" 2× slower than native desktop, while Android Bromite is roughly 10× slower than native Android.
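The 2× and 10× figures can be checked against the "Clip Children After" columns of the tables above. A minimal sketch follows, with names of my own choosing. Note that the native Android result is capped by V-Sync at 120 FPS, so the Android ratio computed here is only a lower bound.

```python
def slowdown(native_fps: float, web_fps: float) -> float:
    """How many times slower the web export is than the native build."""
    return native_fps / web_fps


# (native FPS, web FPS) with Clip Children enabled, after this PR.
results = {
    "Linux desktop, 3840x2160": (882, 447),
    "Android, 2176x1812": (120, 13),  # native value is V-Sync capped
}

for platform, (native_fps, web_fps) in results.items():
    print(f"{platform}: web is {slowdown(native_fps, web_fps):.1f}x slower than native")
```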

@joined72
Contributor

Any news? These seem to be really good performance improvements! ;)

@clayjohn clayjohn marked this pull request as ready for review January 6, 2024 00:44
@clayjohn clayjohn requested a review from a team as a code owner January 6, 2024 00:44
@clayjohn
Member Author

clayjohn commented Jan 6, 2024

Thank you @Calinou for testing. I wanted to ensure that a performance difference could be reproduced on Android; since you already did that, I am happy to mark this as ready for review.

Member

@lawnjelly lawnjelly left a comment

Looks fine to me.

@akien-mga akien-mga merged commit 774c463 into godotengine:master Jan 8, 2024
15 checks passed
@akien-mga
Member

Thanks!

@YuriSizov YuriSizov removed the cherrypick:4.2 label Jan 24, 2024
@YuriSizov
Contributor

Cherry-picked for 4.2.2.

@YuriSizov
Contributor

Can't get a clean cherry-pick for 4.1 because it partially depends on changes from #78168.

@YuriSizov YuriSizov removed the cherrypick:4.1 label Jan 24, 2024
@Nicholas3413

Is there any way to include this improvement in the next Godot 4.1.5? I tested the FPS difference between Godot 4.3 (stable) and Godot 4.1.4, and there is a gap of about 15 FPS (145 vs 131).

@clayjohn clayjohn deleted the GL-CanvasGroup-performance branch September 2, 2024 03:51
@clayjohn
Member Author

clayjohn commented Sep 2, 2024

> Is there any way to include this improvement in the next Godot 4.1.5? I tested the FPS difference between Godot 4.3 (stable) and Godot 4.1.4, and there is a gap of about 15 FPS (145 vs 131).

I'm not sure if we will be releasing another 4.1.x version.

In either case, a new PR targeting 4.1.x directly would be needed, as the source diverged enough between 4.1 and 4.2 that this can't be cleanly cherry-picked. In other words, you will have to copy these changes manually into the 4.1 branch if you want to use them there.

7 participants