Add 'tools' section to the optimization article. #10

Ivorforce · 2025-10-12T10:51:25Z

Follow-up of Engine optimization PR guidelines #7

Improvements added:

I've added a 'tools' section. This gives us the opportunity to go more in depth about the tools at your disposal, and to link relevant resources. I've linked some (agner fog, compiler explorer, and our C++ profiling article), but more could be added.
I removed some detail from other sections, because I think the information fits better in the new sections.

I've tried to be brief, though I felt like some examples were needed to really get the points across.

engine/guidelines/optimization.rst

Mickeon · 2025-10-12T13:55:49Z

engine/guidelines/optimization.rst

+    For micro-optimizations, C++ compilers will often be aware of basic tricks and
+    will already perform them in optimized builds.


This note could use a few examples, maybe. Constant unfolding is later mentioned, so it may not be necessary, after all. Would be nice if someone chimed in. I suppose things like Common subexpression elimination count as part of this?
Either way, perhaps something along these lines to hammer the point home?

Suggested change

For micro-optimizations, C++ compilers will often be aware of basic tricks and

will already perform them in optimized builds.

C++ compilers already perform many basic optimization tricks when requested. Therefore, without proper care, micro-optimizing code can often reduce readability while providing no practical benefit in optimized builds.

Feel free to rephrase.

Yea, good point, the info box needs a conclusion. I've added a sentence about not all changes being good.

engine/guidelines/optimization.rst

Mickeon · 2025-10-12T15:02:22Z

engine/guidelines/optimization.rst

+looks to test the code faithfully, but the benchmarks show no improvement. This might be
+indicative of a poorly written benchmark, which the compiler is able to 'optimize away' by using


Suggested change

looks to test the code faithfully, but the benchmarks show no improvement. This might be

indicative of a poorly written benchmark, which the compiler is able to 'optimize away' by using

aims to test the code faithfully, but the benchmarks show no improvement. This might be

indicative of a poorly written benchmark, which the compiler is able to "optimize away" by performing

Also, alternatively, but maybe it makes the sentence more complex:

Suggested change

looks to test the code faithfully, but the benchmarks show no improvement. This might be

indicative of a poorly written benchmark, which the compiler is able to 'optimize away' by using

aims to test the code faithfully, but the benchmarks show no improvement. This might suggest the benchmarks are not written to account for optimizations performed by the compiler, such as

What I wanted to say isn't that one tries to test the code faithfully, but that the benchmark looks like it should test the code faithfully. I'm not sure what the difference is, but it feels more accurate in my mind. I changed the phrasing to make this clearer, but feel free to challenge my changes if you think it's weird 😄

engine/guidelines/optimization.rst

Mickeon · 2025-10-12T15:05:42Z

engine/guidelines/optimization.rst

+Assembly viewers show the final compiled version of your code in readable
+assembly format. This can be an effective way to optimize low-level code. It is not effective


Assembly viewers, plural, so it's "They're", I think.

Suggested change

Assembly viewers show the final compiled version of your code in readable

assembly format. This can be an effective way to optimize low-level code. It is not effective

Assembly viewers show the final compiled version of your code in a readable

format. This information can greatly help optimizing low-level code. They're not effective

Alternatively, but I personally avoid "but"s. I don't know if it's even necessary to specify "assembly" again.

Suggested change

Assembly viewers show the final compiled version of your code in readable

assembly format. This can be an effective way to optimize low-level code. It is not effective

Assembly viewers show the final compiled version of your code in a readable, but still assembly

format. This information can greatly help optimizing low-level code. They're not effective

I wanted to comment that "examining assembly" can be an effective way to optimize the code (not necessarily the viewers). I changed the wording accordingly.

engine/guidelines/optimization.rst

Mickeon · 2025-10-12T15:19:19Z

engine/guidelines/optimization.rst

+`software optimization resources <https://www.agner.org/optimize/>`__ have emerged as an invaluable
+tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.


It's more than one ("resources"), so it's plural "tools".

Suggested change

`software optimization resources <https://www.agner.org/optimize/>`__ have emerged as an invaluable

tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

`software optimization resources <https://www.agner.org/optimize/>`__ have released invaluable

tools for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

But I'm not sure that fits quite right, either?

Suggested change

`software optimization resources <https://www.agner.org/optimize/>`__ have emerged as an invaluable

tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

`software optimization resources <https://www.agner.org/optimize/>`__ are invaluable

for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

"emerged" is often used in scientific context in this way. It means that something that somebody made has slowly gained attention and is, by now, a useful tool known in the field. Something that somebody just made is never useful because its usefulness has to be proven first, by people using it regularly over time. Hence, they say "tools emerged as useful" instead of "somebody created useful tools".
That being said, if you are not familiar with this phrasing, it's probably best to change it. I'll go ahead with your second suggestion.

With your explanation the term does make sense. If you want to make it clearer that's what happened, you could make it more explicit too.

Suggested change

`software optimization resources <https://www.agner.org/optimize/>`__ have emerged as an invaluable

tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

`software optimization resources <https://www.agner.org/optimize/>`__ have gained popularity as an invaluable

tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

or similar! But either way, it's a nitpick.

engine/guidelines/optimization.rst

lawnjelly · 2025-10-13T11:39:52Z

Before going in more detail, I do think we should consider what we hope to achieve:

Is the aim in the contributor documentation:

provide already competent optimization programmers the information they need to make PRs specific to Godot
provide a comprehensive resource teaching beginners how to optimize code

In the first draft I tried to mostly cover (1), with the idea that we have tried to introduce the beginner skills for optimization in the main documentation, as they are relevant not just to people making PRs, but to users optimizing their own GDScript, C# and c++ modules and extensions.

So in my mind there is a reasonable argument to shifting as much information as possible to the main documentation.

We should also consider there are also extensive external resources on teaching optimization / references - do we intend to become a go to guide for learning optimization on the internet, or should we limit things as to what is relevant to the engine (i.e. is there is a risk of going out of scope here and maintaining far more documentation than we need 🤔 )?

Also at the risk of bringing up controversial philosophy / gatekeeping, should we be encouraging newbies to submit optimization PRs until they have some skills / experience? Core maintainers already have to spend significant time reviewing low quality PRs, and the submitters can end up being disappointed / disillusioned when they realise the subject isn't as simple as they thought.

Ivorforce · 2025-10-13T12:26:32Z

Is the aim in the contributor documentation:

provide already competent optimization programmers the information they need to make PRs specific to Godot

provide a comprehensive resource teaching beginners how to optimize code

Definitely just 1., not 2. I think we'd all agree to that.

In the first draft I tried to mostly cover (1), with the idea that we have tried to introduce the beginner skills for optimization in the main documentation, as they are relevant not just to people making PRs, but to users optimizing their own GDScript, C# and c++ modules and extensions.

So in my mind there is a reasonable argument to shifting as much information as possible to the main documentation.

I'm not sure if the information added here is relevant to users outside of those looking to optimizing the engine.
I would be on board with shortening these sections, but I think they all have a role to fulfill within the optimization PR workflow:

"Low level" vs "high level" clarifies our difference of expectations for both kinds of optimizations. Though I could see this information being shortened and interweaved into the "PR requirements" sections.
"Tools" enables us to provide a short explanation of our preferences (for PR purposes), to link resources, and to not have to intersperse information into other sections. The rest of the text is 'fluff' trying to rationalize our preferences, but I suppose it could be left out.

We should also consider there are also extensive external resources on teaching optimization / references - do we intend to become a go to guide for learning optimization on the internet, or should we limit things as to what is relevant to the engine (i.e. is there is a risk of going out of scope here and maintaining far more documentation than we need 🤔 )?

What we've provided here so far is very far from being a "go-to" guide. I'd love for our article to be a link hub to good resources instead, if we know any. The only ones I've used extensively so far are Agner Fog's and Compiler Explorer (both of which I've linked).

Also at the risk of bringing up controversial philosophy / gatekeeping, should we be encouraging newbies to submit optimization PRs until they have some skills / experience? Core maintainers already have to spend significant time reviewing low quality PRs, and the submitters can end up being disappointed / disillusioned when they realise the subject isn't as simple as they thought.

Hrm, yea, I can see us discouraging optimization PRs altogether for new contributors. But someone getting started on a performance PR can still save us some time, even if it doesn't turn out mergable on first try. Finding a good formulation for that would be nice.

engine/guidelines/optimization.rst

lawnjelly · 2025-10-13T13:00:32Z

engine/guidelines/optimization.rst

+
+    For micro-optimizations, C++ compilers will often be aware of basic tricks and
+    will already perform them in optimized builds. Therefore, not all changes that
+    look like they should optimize the code will actually make the code faster.


This is why viewing assembly can be important (particularly when benchmarks aren't super convincing).

Actually, good point. I'll try to weave this info in somehow.

lawnjelly · 2025-10-13T13:02:22Z

engine/guidelines/optimization.rst

+Profilers are the most important tool for everyone optimizing code.
+They show you which parts of the code are responsible for slow execution or heavy CPU load,
+and are therefore excellent for identifying what needs to be optimized. Profilers can
+also be used to identify whether the problem has been resolved, by profiling again after


"identify whether performance has been increased"?

See https://github.com/godotengine/godot-contributing-docs/pull/10/files#r2424446604

I ended up shortening this as well. Probably needs another round of feedback.

engine/guidelines/optimization.rst

lawnjelly · 2025-10-13T13:09:25Z

engine/guidelines/optimization.rst

+Assembly viewers
+^^^^^^^^^^^^^^^^
+
+Assembly viewers show the final compiled version of your code in a readable


Assembly is a human readable form of the final compiled machine code, specific to the architecture being used.

Yea, that's what I wanted to say. Any idea how to improve the wording?

I would say something like:

Assembly
^^^^^^^^^
In some cases when making low level optimizations, it can be a good idea to view the assembly language generated by the compiler. Assembly language is a human readable form of machine code, which the final binary instructions used to drive the CPU. As such the assembly language will depend on the architecture, the compiler and the compiler settings.

Viewing assembly allows you to compare the before / after result, and to confirm hypotheses used to guide optimization, and to see what the compiler is doing in general. It can also be helpful to reviewers in relevant pull requests, so please post assembly with your PR where appropriate.

There are many ways to view assembly, including:

godbolt

setting clang / gcc / cl to output assembly when compiling the file of interest

inside your IDE debugger when running Godot

inside your profiling tool (if supported, e.g. Callgrind)

You may find the following resources useful:
...

I'll see if I can dig out some example for compiling a single file to assembly, it's a little bit fiddly...

clang++ -S -o clang_asm.s -c -std=gnu++14 -O3 -pipe -fpie -fno-exceptions -DNDEBUG -DNO_EDITOR_SPLASH -DTOUCH_ENABLED -DALSA_ENABLED -DALSAMIDI_ENABLED -DPULSEAUDIO_ENABLED -D_REENTRANT -DSPEECHD_ENABLED -DJOYDEV_ENABLED -DUDEV_ENABLED -DX11_ENABLED -DUNIX_ENABLED -DOPENGL_ENABLED -DGLES_ENABLED -D_FILE_OFFSET_BITS=64 -DPTRCALL_ENABLED -DMINIZIP_ENABLED -DZSTD_STATIC_LINKING_ONLY -DGLAD_ENABLED -DGLES_OVER_GL -DBT_USE_OLD_DAMPING_METHOD -DBT_THREADSAFE -Ithirdparty/bullet -Ithirdparty/libpng -Ithirdparty/glad -Ithirdparty/zstd -Ithirdparty/zlib -Iplatform/x11 -I. -I/usr/include/speech-dispatcher -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include modules/bullet/area_bullet.cpp

I think from memory it involved seeing the verbose command line for compiling the file, then adding the necessary flags to output assembly (-S maybe?).

That sounds nice, but going back to your own original point — should we be going into that much detail? Most contributors won't (and imo, shouldn't) be looking at assembly, so maybe we should leave detailed explanations to other articles. My main idea with mentioning it at all is that we can link resources that do go into detail, and to state our opinion about it (i.e. "can be an effective way to optimize low-level code").

Yeah, we could leave out the "many ways to view assembly" bit I guess.

I'm kind of the opposite 😀 , I think we should be encouraging anyone making low level PRs to investigate and learn some basic assembly (but then I wrote in assembly about a decade before learning c++). Admittedly I know very little x86, but enough to get a gist of what it is doing.

Even in low level PRs, I use my knowledge of assembly only to shortcut investigations, but end up optimizing the PR solely based on measurement results. That's why I think this knowledge is useful, but optional. (I suppose this is more relevant though to the point below, where we might ask people to provide their assembly investigations in the PR).

As for this discussion, I used your text as the general framework, but very harshly cut it to be minimal. Let me know if it works for you!

engine/guidelines/optimization.rst

lawnjelly · 2025-10-13T13:13:42Z

engine/guidelines/optimization.rst

 - Show that you improved the code either by profiling again, or running systematic benchmarks.
+  See `tools <#tools-for-optimization>`__ for more info.
 - Test on multiple platforms where appropriate, especially mobile.
- When micro-optimizing, show assembly before / after where appropriate.


I wouldn't remove this, this is part of the reason for the whole document.

We were getting people submitting PRs without assembly in cases where it was needed to show what was happening, to show that their PR worked as described. The "where appropriate" is vague enough that it shows not needed in a lot of cases.

I removed this point because I don't think it's relevant for most PRs, even most performance optimization PRs. Looking at assembly can be helpful, but it's far less important than actually measuring performance.

Ivorforce requested review from AThousandShips, Calinou, clayjohn and lawnjelly October 12, 2025 10:51

Ivorforce added the enhancement New feature or request label Oct 12, 2025

Ivorforce force-pushed the high-low-level-tools branch 2 times, most recently from 5b596cd to 878cdca Compare October 12, 2025 10:56

Ivorforce requested a review from a team October 12, 2025 11:25

Mickeon reviewed Oct 12, 2025

View reviewed changes

Ivorforce force-pushed the high-low-level-tools branch from 878cdca to 31bdaa1 Compare October 12, 2025 16:07

AThousandShips reviewed Oct 13, 2025

View reviewed changes

engine/guidelines/optimization.rst Outdated Show resolved Hide resolved

engine/guidelines/optimization.rst Outdated Show resolved Hide resolved

engine/guidelines/optimization.rst Outdated Show resolved Hide resolved

Ivorforce force-pushed the high-low-level-tools branch from 31bdaa1 to a357a9f Compare October 13, 2025 09:22

lawnjelly reviewed Oct 13, 2025

View reviewed changes

Ivorforce force-pushed the high-low-level-tools branch 2 times, most recently from 1e19224 to 2fa83a5 Compare October 13, 2025 16:08

Ivorforce changed the title ~~Add 'high level / low level' and 'tools' sections to the optimization article.~~ Add 'tools' sections to the optimization article. Oct 13, 2025

Ivorforce force-pushed the high-low-level-tools branch from 2fa83a5 to 5b622d5 Compare October 13, 2025 16:12

Ivorforce changed the title ~~Add 'tools' sections to the optimization article.~~ Add 'tools' section to the optimization article. Oct 13, 2025

Ivorforce force-pushed the high-low-level-tools branch 2 times, most recently from d8a0e26 to 0e2f6e3 Compare October 13, 2025 17:41

Add a 'tools' sections to the optimization article.

cc7d357

Ivorforce force-pushed the high-low-level-tools branch from 0e2f6e3 to cc7d357 Compare October 13, 2025 17:43

		For micro-optimizations, C++ compilers will often be aware of basic tricks and
		will already perform them in optimized builds.

	For micro-optimizations, C++ compilers will often be aware of basic tricks and
	will already perform them in optimized builds.
	C++ compilers already perform many basic optimization tricks when requested. Therefore, without proper care, micro-optimizing code can often reduce readability while providing no practical benefit in optimized builds.

		looks to test the code faithfully, but the benchmarks show no improvement. This might be
		indicative of a poorly written benchmark, which the compiler is able to 'optimize away' by using

	looks to test the code faithfully, but the benchmarks show no improvement. This might be
	indicative of a poorly written benchmark, which the compiler is able to 'optimize away' by using
	aims to test the code faithfully, but the benchmarks show no improvement. This might suggest the benchmarks are not written to account for optimizations performed by the compiler, such as

		Assembly viewers show the final compiled version of your code in readable
		assembly format. This can be an effective way to optimize low-level code. It is not effective

		`software optimization resources <https://www.agner.org/optimize/>`__ have emerged as an invaluable
		tool for this, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.

Uh oh!

Add 'tools' section to the optimization article. #10

Are you sure you want to change the base?

Add 'tools' section to the optimization article. #10

Uh oh!

Conversation

Ivorforce commented Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ivorforce Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lawnjelly commented Oct 13, 2025

Uh oh!

Ivorforce commented Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lawnjelly Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ivorforce Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ivorforce Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ivorforce commented Oct 12, 2025 •

edited

Loading

Ivorforce Oct 12, 2025 •

edited

Loading

Ivorforce commented Oct 13, 2025 •

edited

Loading

lawnjelly Oct 13, 2025 •

edited

Loading

Ivorforce Oct 13, 2025 •

edited

Loading

Ivorforce Oct 13, 2025 •

edited

Loading