Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: renderer: micro-optimize CPU culling and other things #1125

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

illwieckz
Copy link
Member

@illwieckz illwieckz commented May 10, 2024

This code is a hot spot, we better avoid computing useless things if we can return early and do bitwise operations instead of relying on the branch prediction being right.

Commits are meant to be squashed, they are just small steps I did one by one for testing I was not breaking anything.

While I'm at it I'm also do minor improvements there and there in tr_world.cpp.

@illwieckz
Copy link
Member Author

illwieckz commented May 10, 2024

Before:

20240510-123313-000 unvanquished-orbit

After:

20240510-124016-000 unvanquished-orbit

The engine now spends 7.5% of the time in RE_RenderScene instead of 9%.

BoxOnPlaneSide is also used in other functions (outside of the screenshot).

Edit: This is a release-like RelWithDebInfo build with LTO enabled in both cases.

@illwieckz illwieckz force-pushed the illwieckz/cpu-culling branch 3 times, most recently from 042f364 to 3849aa9 Compare May 10, 2024 11:52
@illwieckz
Copy link
Member Author

illwieckz commented May 10, 2024

The screenshots were done over #1043 because this branch was first written over it, and the code looks to already be bit faster when #1043 is merged:

The current PR is bringing some extra performance boost over it. While the small performance boost of #1043 was not the purpose of that other PR, this one was, and it looks like both are useful for gaining extra CPU performance small steps after small steps.

@illwieckz illwieckz force-pushed the illwieckz/cpu-culling branch 9 times, most recently from db321e9 to 31949f7 Compare May 10, 2024 21:40
@illwieckz illwieckz changed the title renderer: micro-optimize CPU culling WIP: renderer: micro-optimize CPU culling and other things May 10, 2024
@illwieckz illwieckz marked this pull request as draft May 10, 2024 22:02
@illwieckz illwieckz force-pushed the illwieckz/cpu-culling branch 3 times, most recently from d8a7050 to 20e405a Compare May 11, 2024 05:36
@illwieckz
Copy link
Member Author

illwieckz commented May 12, 2024

I tested this branch rebased over for-0.55.0/sync branch and I also see a speed bump.

Before:

20240512-191032-000 unvanquished-orbit

After:

20240512-191513-000 unvanquished-orbit

So, I'll start to clean-up things and submit the patches for merging in a near future.

Edit: the percentage is higher than the ones in records from previous comments because this time I used a lower graphics preset so less time is spent on other parts of the code.

@@ -1061,10 +1036,8 @@ void R_AddPrecachedWorldInteractions( trRefLight_t *light )
{
continue;
}
else
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of stuff here, for example this, or changing if (p == nullptr) to if (!p), is just cosmetic changes which don't change the meaning of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants