Animation Performance Problems #101494
Comments
@mrjustaguy Can you see if there is a version of godot 4 where it has better performance? We can't fix this in the Godot 4.4 release, but we could look into this for Godot Engine 4.5. |
Are you expecting Godot Engine to handle 2_400 animated characters? An older comparison on https://vxtwitter.com/duroxxigar/status/1779234802048053443 says Test case 500 skeleton meshes playing on loop: Flax: 132 FPS A faster animation subsystem requires some changes to the architecture if you want it to be like 5x faster. |
I'm not expecting 2.4k animated characters, and this skeleton issue is a separate one: the skeletons themselves are just expensive, which has a lot to do with their rendering code being expensive on the CPU for some reason (reducing the number of skeletal mesh surfaces rendered per frame helps heavily here). What I am saying is that the animation system is obscenely heavy for something that's mostly just doing some math blending between various plugged-in values, while much more complicated logic and math (most of it implemented in a language not oriented toward performance, with no attempt at optimizing the logic) runs orders of magnitude faster. I've tested 3.6 now, and it's running MUUUUCH better with the same 2.4k animation player & tree setup, and isn't a slide show like 4.4 is. Here's the 3.6 project. For reference, I've gotten to 3x the balls in 3.6 before it got into slide show territory, and I'd likely match 4.4 with 4x-5x the balls, but the editor is having trouble spawning them in 3.6, and it froze when trying to add more :/ |
Tested on an M2 MBP: It is approximately twice as fast on my device. There is a lot of room for improvement |
I'd like to see if we can get the 4.5 to match godot 3.6 on the vrm with an animation player & tree. Thanks for creating a minimal reproduction project. |
The major change from 3.x to 4.0 here is the use of the time struct NodeTimeInfo; is that relevant? I also remember that a more recent change affecting the whole thing is the generation of AnimationInstance, which was implemented in 4.3. Maybe we should compare 4.2 and 4.3. I also think the performance impact of string processing and NodePath scene search could be particularly significant, but this has existed since 3.x, so it may not matter much. For now, if the problem lies with structs or other qualifiers, some improvements may be possible through C++ language features. Of course, some manual optimizations such as reducing push_back calls, caching pointers, etc. should also be made. |
Possibly, I'm seeing a lot of time is spent in Overall there appears to be a lot of object creation happening on the heap for stuff that is being created and destroyed regularly. Ideally, the animation system wouldn't allocate from the heap regularly like that. In particular, I see the cost of creating Vectors and Strings coming up a lot. I also see a lot of indexing into hash maps. We should investigate using a better data structure wherever we can to avoid the cost of hashing so frequently. |
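To illustrate the point about regular heap allocation, here is a minimal sketch in standard C++ rather than Godot's own containers. `BlendScratch`, `blend_allocating` and `blend_reusing` are hypothetical names made up for this example and do not exist in the engine. It only shows the general idea: a buffer that is cleared and reused keeps its capacity, so steady-state frames do not touch the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scratch storage owned by a long-lived object (not Godot code).
struct BlendScratch {
    std::vector<float> weights; // reused every frame; clear() keeps the capacity
};

// Allocating version: a fresh vector is built and destroyed on every call.
float blend_allocating(const std::vector<float> &inputs) {
    std::vector<float> weights; // heap allocation happens as elements are pushed
    for (float v : inputs) {
        weights.push_back(v * 0.5f);
    }
    float sum = 0.0f;
    for (float w : weights) {
        sum += w;
    }
    return sum;
}

// Reusing version: same arithmetic, but the buffer outlives the call.
float blend_reusing(const std::vector<float> &inputs, BlendScratch &scratch) {
    scratch.weights.clear();                // size becomes 0, capacity is kept
    scratch.weights.reserve(inputs.size()); // no-op once capacity is large enough
    for (float v : inputs) {
        scratch.weights.push_back(v * 0.5f);
    }
    assert(scratch.weights.size() == inputs.size());
    float sum = 0.0f;
    for (float w : scratch.weights) {
        sum += w;
    }
    return sum;
}
```

After the first call with a given input size, later calls to `blend_reusing` should not need to reallocate. Whether this maps cleanly onto the engine's actual hot paths is something only profiling can confirm.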
HashMap hashes
Here is some low hanging fruit. There is a pattern of checking if a hashmap
godot/scene/animation/animation_mixer.cpp Lines 402 to 406 in 24d7451
(Even worse, in some places we call (same problem with
Using String addition
The String operations are really deadly. Over 10% of the time in the MRP is spent here:
godot/scene/animation/animation_tree.cpp Lines 296 to 304 in 24d7451
|
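As a rough illustration of the two patterns described above, here is a sketch in standard C++ that uses `std::unordered_map` and `std::string` as stand-ins for Godot's `HashMap` and `String`. All names (`TrackMap`, `get_track_double_lookup`, `get_track_single_lookup`, `NodeEntry`) are hypothetical and are not taken from the referenced source lines. The first pair of functions contrasts checking for a key and then indexing (two hashes) with a single `find`. The struct shows one possible way to avoid rebuilding a concatenated path on every call by computing it once.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

using TrackMap = std::unordered_map<std::string, int>;

// Hashes the key twice: once in count(), once in operator[].
int get_track_double_lookup(TrackMap &tracks, const std::string &path) {
    if (tracks.count(path) == 0) {
        return -1;
    }
    return tracks[path];
}

// Hashes the key once: the iterator returned by find() is reused.
int get_track_single_lookup(const TrackMap &tracks, const std::string &path) {
    auto it = tracks.find(path);
    return it == tracks.end() ? -1 : it->second;
}

// Hypothetical node record that builds its full path once, at construction,
// instead of concatenating strings every time the path is needed.
struct NodeEntry {
    std::string base_path;
    std::string name;
    std::string full_path; // cached result of base_path + name + "/"

    NodeEntry(std::string p_base, std::string p_name) :
            base_path(std::move(p_base)),
            name(std::move(p_name)),
            full_path(base_path + name + "/") {}
};
```

The cached path only stays valid as long as `base_path` and `name` do not change. A real implementation would have to refresh it on rename or reparent, which is an assumption this sketch does not handle.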
You need to disable
output.mp4
But while I noticed it, I found |
I've been unable to reproduce that change on my end |
If you tested in the editor, then this is a problem of poor performance in the editor. |
No, I've run the project itself, since I'm getting the data from Adrenalin, which requires the game to be running in fullscreen. Disabling it didn't change a thing for me; I've even restarted Godot and run it again, with no difference... |
How much fps do you have in 4.4 and 3.6? |
in 3.6 I have over 3x better performance vs 4.4 |
I will also ask @clayjohn to test it. If the result is the same as mine, then you'll have to do the profiling https://docs.godotengine.org/en/stable/contributing/development/debugging/using_cpp_profilers.html yourself because it's unknown what might be slow in 4.x for you. |
What's your CPU? It may be that mine is much more memory bound in 4.4, with something like 7 MB of cache and 2x 3200 MHz RAM. IIRC the M2 has quite a bit more memory bandwidth, and it also uses a different instruction set (ARM vs x86). |
Testing with 4.4 dev7, I get 10.3 mspf with AnimationMixer |
Godot v4.4.dev7 - Linux Mint 21.3 (Virginia) on X11 - X11 display driver, Multi-window, 1 monitor - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 3050 Laptop GPU - 11th Gen Intel(R) Core(TM) i5-11400H @ 2.70GHz (12 threads) |
Yeah, you have significantly more cache than I do, so the over 3x improvement I see in 3.6 could just be 3.6 being less bandwidth hungry. Though that would only hold if the uplift you're seeing from 4.4 to 3.6 is smaller than mine, like clay's is. As clay is also not getting much of a difference, I guess there's something else that's making the mixer cost significantly more for you. |
Edit: Oh nice, this is already addressed by #101548. Another fun one: we store the AnimationBlendTree nodes in an RBMap (it requires an O(log n) traversal for each lookup). For convenience, we do that same godot/scene/animation/animation_blend_tree.cpp Lines 1472 to 1476 in 4ce466d
5% of the CPU time in the MRP is spent in |
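The cost of repeated ordered-map lookups can be sketched as follows, using `std::map` (typically a red-black tree) as a stand-in for `RBMap`. The names `BlendNode`, `Connection`, `lookup`, `resolve` and `process_frame` are hypothetical, and resolving a pointer once is just one possible mitigation assumed for illustration. It is not necessarily what the linked PRs or commits do.

```cpp
#include <cassert>
#include <map>
#include <string>

struct BlendNode {
    float weight = 0.0f;
};

using NodeMap = std::map<std::string, BlendNode>; // ordered tree: O(log n) per lookup

struct Connection {
    std::string input_name;
    BlendNode *cached = nullptr; // resolved once, then reused every frame
};

// Tree traversal with string comparisons at each step.
BlendNode *lookup(NodeMap &nodes, const std::string &name) {
    auto it = nodes.find(name);
    return it == nodes.end() ? nullptr : &it->second;
}

// Done when the graph changes, not on every processed frame.
void resolve(NodeMap &nodes, Connection &c) {
    c.cached = lookup(nodes, c.input_name);
}

// Per-frame path: no map traversal, just a pointer dereference.
float process_frame(const Connection &c) {
    assert(c.cached != nullptr);
    return c.cached->weight;
}
```

Pointers to `std::map` elements stay valid when other elements are inserted, which is what makes the cache safe in this sketch. It would still have to be invalidated whenever the referenced node is removed or renamed.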
@Nazarwadim After some more testing, including your PRs, I think I am actually hitting some type of display sync issue. I can't get process time below 10 ms even with other optimizations. I think I am capped at 100 FPS for some reason. I will profile on a different machine to see if I can replicate your results. Edit: Tested on a Windows machine with a Ryzen 3600 CPU. Applying #101564 and #101548 gives me about a 10% boost (30 mspf -> 27 mspf). Also tested with clayjohn@f5f3138, which reduced it to about 24 mspf. |
I have almost completed my 3rd PR, part of which is reducing the hash calculation. |
Great! Feel free to copy from clayjohn@f5f3138 if you want to |
Tested versions
Reproduced 4.4 Dev 7, 4.3 Stable
Not Reproduced 3.6 Stable
System information
Windows 11, i3 10105f
Issue description
I've noticed that Animation Tree and Animation Player seem to have a ludicrously high cost.
For reference, to hold 60 FPS in my game I can have ~100 AI characters with active animation players and trees, but 3-4 times as many with them disabled.
Each of the AIs is dynamically typed GDScript that isn't even all that optimized, gathering data about its environment and communicating with the others, moving around with Godot's physics and navigation calculations running in the background of all of that.
The GPU is not the limitation, given that it takes only a couple of milliseconds per frame to render and changing GPU-taxing settings does nothing.
Steps to reproduce
Open the MRP and observe the poor performance. Then go to the instance, set the process mode of the animation tree and player to disabled, and observe the massive improvement.
Note: The MRP goes a bit extreme (some 2.4k instances) to tank the performance even harder on more powerful systems (it's a slide show on mine in the editor with animations activated in the instances).
Minimal reproduction project (MRP)
Animation.zip