-
-
Notifications
You must be signed in to change notification settings - Fork 21k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential use-after-free in AnimationMixer due to dereferencing pointers to foreign objects, which may have been freed. #85365
Comments
I found another segfault, and I am wondering if it could be related (because of HashMap being mentioned in both). Run this command to reproduce:
Very quickly this causes a segfault after the game level is loaded: Thread 1 "godot.linuxbsd." received signal SIGSEGV, Segmentation fault.
HashSet<RendererSceneCull::Instance*, HashMapHasherDefault, HashMapComparatorDefault<RendererSceneCull::Instance*> >::insert (this=this@entry=0x38, p_key=@0x7fffffffb9f8: 0x5555665dd838) at ./core/templates/hash_set.h:408
408 uint32_t pos = _insert(p_key);
(gdb) bt
#0 HashSet<RendererSceneCull::Instance*, HashMapHasherDefault, HashMapComparatorDefault<RendererSceneCull::Instance*> >::insert (this=this@entry=0x38, p_key=@0x7fffffffb9f8: 0x5555665dd838) at ./core/templates/hash_set.h:408
#1 0x0000555559316b14 in RendererSceneCull::instance_geometry_set_lightmap (this=0x55555ca40270, p_instance=..., p_lightmap=..., p_lightmap_uv_scale=..., p_slice_index=0) at servers/rendering/renderer_scene_cull.cpp:1461
#2 0x0000555557ff5de7 in LightmapGI::_assign_lightmaps (this=0x55556a26fb00) at scene/3d/lightmap_gi.cpp:1337
#3 0x00005555599bcdf0 in Object::notification (this=0x55556a26fb00, p_notification=27, p_reversed=false) at core/object/object.cpp:837
#4 0x0000555557ac0d4a in Node::_propagate_ready (this=0x55556a26fb00) at scene/main/node.cpp:227
#5 0x0000555557ac0d24 in Node::_propagate_ready (this=this@entry=0x55555df01b40) at scene/main/node.cpp:222
#6 0x0000555557ac4417 in Node::_set_tree (this=0x55555df01b40, p_tree=0x55555da009d0) at scene/main/node.cpp:2938
#7 0x0000555557ad7396 in Node::_add_child_nocheck (this=0x55555e0bb8e0, p_child=<optimized out>, p_name=..., p_internal_mode=Node::INTERNAL_MODE_DISABLED) at scene/main/node.cpp:1394
#8 0x0000555557aeacbc in call_with_variant_args_helper<__UnexistingClass, Node*, bool, Node::InternalMode, 0ul, 1ul, 2ul> (r_error=..., p_args=0x7fffffffbc80, p_method=<optimized out>, p_instance=<optimized out>) at ./core/variant/binder_common.h:303
#9 call_with_variant_args_dv<__UnexistingClass, Node*, bool, Node::InternalMode> (default_values=..., r_error=..., p_argcount=<optimized out>, p_args=<optimized out>, p_method=<optimized out>, p_instance=<optimized out>) at ./core/variant/binder_common.h:450
#10 MethodBindT<Node*, bool, Node::InternalMode>::call (this=<optimized out>, p_object=<optimized out>, p_args=<optimized out>, p_arg_count=<optimized out>, r_error=...) at ./core/object/method_bind.h:335
#11 0x0000555555e2d700 in GDScriptFunction::call (this=<optimized out>, p_instance=<optimized out>, p_instance@entry=0x55555dfa27b0, p_args=p_args@entry=0x7fffffffc3a8, p_argcount=<optimized out>, r_err=..., p_state=<optimized out>) at modules/gdscript/gdscript_vm.cpp:1796
#12 0x0000555555ce9f2f in GDScriptInstance::callp (this=0x55555dfa27b0, p_method=..., p_args=<optimized out>, p_argcount=<optimized out>, r_error=...) at modules/gdscript/gdscript.cpp:1937
#13 0x00005555599bf01c in Object::callp (this=0x55555e0bb8e0, p_method=..., p_args=0x7fffffffc3a8, p_argcount=1, r_error=...) at core/object/object.cpp:753
#14 0x000055555976337a in Variant::callp (this=<optimized out>, p_method=..., p_args=0x7fffffffc3a8, p_argcount=1, r_ret=..., r_error=...) at core/variant/variant_call.cpp:1168
#15 0x0000555555e404dc in GDScriptFunction::call (this=<optimized out>, p_instance=<optimized out>, p_instance@entry=0x55555dfa27b0, p_args=p_args@entry=0x7fffffffc948, p_argcount=<optimized out>, r_err=..., p_state=<optimized out>) at modules/gdscript/gdscript_vm.cpp:1704
#16 0x0000555555ce9f2f in GDScriptInstance::callp (this=0x55555dfa27b0, p_method=..., p_args=<optimized out>, p_argcount=<optimized out>, r_error=...) at modules/gdscript/gdscript.cpp:1937
#17 0x00005555599bf01c in Object::callp (this=0x55555e0bb8e0, p_method=..., p_args=0x7fffffffc948, p_argcount=1, r_error=...) at core/object/object.cpp:753
#18 0x000055555976337a in Variant::callp (this=<optimized out>, p_method=..., p_args=0x7fffffffc948, p_argcount=1, r_ret=..., r_error=...) at core/variant/variant_call.cpp:1168
#19 0x0000555555e404dc in GDScriptFunction::call (this=<optimized out>, p_instance=<optimized out>, p_instance@entry=0x55555dfa27b0, p_args=p_args@entry=0x0, p_argcount=<optimized out>, r_err=..., p_state=<optimized out>) at modules/gdscript/gdscript_vm.cpp:1704
#20 0x0000555555ce9f2f in GDScriptInstance::callp (this=0x55555dfa27b0, p_method=..., p_args=<optimized out>, p_argcount=<optimized out>, r_error=...) at modules/gdscript/gdscript.cpp:1937
#21 0x0000555557ad631b in Node::_gdvirtual__ready_call<false> (this=0x55555e0bb8e0) at scene/main/node.h:322
#22 Node::_notification (this=0x55555e0bb8e0, p_notification=<optimized out>) at scene/main/node.cpp:187
#23 0x000055555806550d in Node::_notificationv (p_reversed=false, p_notification=13, this=0x55555e0bb8e0) at ./scene/main/node.h:49
#24 Node3D::_notificationv (this=0x55555e0bb8e0, p_notification=13, p_reversed=<optimized out>) at scene/3d/node_3d.h:52
#25 0x00005555599bcdf0 in Object::notification (this=0x55555e0bb8e0, p_notification=13, p_reversed=false) at core/object/object.cpp:837
#26 0x0000555557ac0d86 in Node::_propagate_ready (this=this@entry=0x55555e0bb8e0) at scene/main/node.cpp:231
#27 0x0000555557ac4417 in Node::_set_tree (this=0x55555e0bb8e0, p_tree=0x55555da009d0) at scene/main/node.cpp:2938
#28 0x0000555557ad7396 in Node::_add_child_nocheck (this=0x55555da01560, p_child=<optimized out>, p_name=..., p_internal_mode=Node::INTERNAL_MODE_DISABLED) at scene/main/node.cpp:1394
#29 0x0000555557aeacbc in call_with_variant_args_helper<__UnexistingClass, Node*, bool, Node::InternalMode, 0ul, 1ul, 2ul> (r_error=..., p_args=0x7fffffffcea0, p_method=<optimized out>, p_instance=<optimized out>) at ./core/variant/binder_common.h:303
#30 call_with_variant_args_dv<__UnexistingClass, Node*, bool, Node::InternalMode> (default_values=..., r_error=..., p_argcount=<optimized out>, p_args=<optimized out>, p_method=<optimized out>, p_instance=<optimized out>) at ./core/variant/binder_common.h:450
#31 MethodBindT<Node*, bool, Node::InternalMode>::call (this=<optimized out>, p_object=<optimized out>, p_args=<optimized out>, p_arg_count=<optimized out>, r_error=...) at ./core/object/method_bind.h:335
#32 0x00005555599beeb7 in Object::callp (this=0x55555da01560, p_method=..., p_args=0x7fffffffd000, p_argcount=<optimized out>, r_error=...) at core/object/object.cpp:775
#33 0x000055555970bdc4 in Callable::callp (this=0x55555d7994c8, p_arguments=0x7fffffffd000, p_argcount=1, r_return_value=..., r_call_error=...) at core/variant/callable.cpp:69
#34 0x00005555599b73a2 in CallQueue::_call_function (this=this@entry=0x55555c5f6580, p_callable=..., p_args=p_args@entry=0x55555d7994e0, p_argcount=1, p_show_error=true) at core/object/message_queue.cpp:219
#35 0x00005555599ba90d in CallQueue::flush (this=0x55555c5f6580) at core/object/message_queue.cpp:324
#36 0x0000555557b0a65b in SceneTree::physics_process (this=0x55555da009d0, p_time=0.016666666666666666) at scene/main/scene_tree.cpp:471
#37 0x0000555555b21a74 in Main::iteration () at main/main.cpp:3598
#38 0x0000555555ab5531 in OS_LinuxBSD::run (this=this@entry=0x7fffffffd270) at platform/linuxbsd/os_linuxbsd.cpp:933
#39 0x0000555555aa4f58 in main (argc=<optimized out>, argv=0x7fffffffd848) at platform/linuxbsd/godot_linuxbsd.cpp:7 |
cc @TokageItLab |
I can't tell anything without MRP. Guessing from the stack trace, there possibly could be a problem with the Continuous interpolation of StringName #84815. Or it could be hashmap-related, and the cache could be corrupted when switching animations by method track or something. In any case, I cannot say for sure since we do not know what you are doing. |
AnimationMixer::_blend_apply
The full project is open source and reproduces the crash, I just tested to confirm following the steps to reproduce in the OP. So that can be used to debug, even if it's a bit heavy, at least it lets you get into a debugger and inspect what is wrong. @unfa If the issue is confirmed to always happen when firing the Plasma Gun, maybe you can extract a more minimal project with just a bot shooting this gun, with their animations. |
Hi! I always try to create a minimal reproduction project for the bugs we find on Liblast, which is made in a big part to test the engine and help with the reporting, testing, feedback etc. However, we are mostly 2 learning devs and reproducing some of this more technical issues is out of my scope, as sometimes i'm unable to separate the problematic part of the game from the other parts without the project breaking in other way or i can not get the same error while trying to reproduce them from scratch because i can not replicate it. So i apology for the not ideal conditions, but please have in mind this is not because i didn't bother to, but because i could not. While i understand the project is heavy, it is totally open source and so is the toolchain used, so it can be used for this scenarios without any trouble with licensing and having the whole picture available, from the asset sources to the final project, so complex issues can be tracked to any point. For testing purposes, all you need from the repo is contained inside the game folder, so you can get that and skip all the other files, which are a big part of the total size. |
Yeah, I totally agree with you, @yosoyfreeman - the size of Liblast as a project allows us to find such weird engine bugs - thanks to the relative complexity - but it also makes it harder to isolate the issues. @akien-mga I will try to do that, and see what can I get. |
Ah, I had never heard that there was such an open source project. But it takes a lot of effort to isolate the source of a bug from a large project, so the reproduction project is the smaller the better. BTW, this may be a similar/related issue? |
@TokageItLab Here's an MRP: I've ripped out the guts from Liblast and managed to stop all the bleeding for long enough to reproduce this crash. |
@unfa Thanks! I tried that, but it crashed even without playing the animation, so the animation may not be the actual cause.. |
I tested this hypothesis, it doesn't fix this crash, but indeed the final crash location might be related. In some logs I've seen The PlasmaSecondaryCrash MRP crashes for me with this stacktrace:
What stacktrace do you get? There might be several bugs, but at least the crash originating in |
Since the backtrace isn't super helpful to really understand what's causing the crash exactly, I added some good old printf: diff --git a/core/object/object.cpp b/core/object/object.cpp
index 40df13849b..6437ebca05 100644
--- a/core/object/object.cpp
+++ b/core/object/object.cpp
@@ -298,6 +298,7 @@ void Object::set(const StringName &p_name, const Variant &p_value, bool *r_valid
#endif
// Something inside the object... :|
+ print_line(vformat("Object: %s\tProperty: %s\tValue: %s", to_string(), p_name, p_value.operator String()));
bool success = _setv(p_name, p_value);
if (success) {
if (r_valid) { That spams a lot, but lets me see what object/property/value combo it's crashing on (which gdb seemed to have trouble extracting, or I don't know how to use it :D):
Reproduced a few times, it seems pretty consistent (with different Node IDs, but somehow the same numbers of AnimationLibrary). The warning also appears before in the logs, but seems to consistently appear before the last Here's the full output: More printf debugging, got a bit further, it's crashing after setting a script property. That's probably still not what causes the crash, it's just how close I manage to get to the issue: diff --git a/core/object/object.cpp b/core/object/object.cpp
index 40df13849b..f0cb37bdde 100644
--- a/core/object/object.cpp
+++ b/core/object/object.cpp
@@ -227,6 +227,7 @@ void Object::set(const StringName &p_name, const Variant &p_value, bool *r_valid
#endif
if (script_instance) {
+ print_line(vformat("script_instance->set: Object: %s\tProperty: %s\tValue: %s", to_string(), p_name, p_value.operator String()));
if (script_instance->set(p_name, p_value)) {
if (r_valid) {
*r_valid = true;
@@ -298,7 +299,11 @@ void Object::set(const StringName &p_name, const Variant &p_value, bool *r_valid
#endif
// Something inside the object... :|
+ print_line(vformat("_setv: Object: %s\tProperty: %s\tValue: %s", to_string(), p_name, p_value.operator String()));
bool success = _setv(p_name, p_value);
+ if (p_name == "libraries") {
+ print_line("library");
+ }
if (success) {
if (r_valid) {
*r_valid = true;
|
I couldn't figure out how to get a stack trace in my environment, so I send a modified project. It is not always reproducible, but sometimes crashes randomly after a few attempts at this. |
That BTW the crash in both MRPs seems to always happen at the 6th explosion. I tried poking at the scenes in the MRP but many of the affected scenes and scripts are broken due to the process of gutting them out of the full game. So I'm not sure how representative the MRP still is of an actionable problem beside "things might crash when half the scenes and scripts are broken". |
Tested again with the full Liblast project (and #85388 merged locally as it seemed possibly related), and I still get the crash.
That It's starting to sound related to #85155. Overall this kind of crash is likely some dangling pointers to invalid data or memory corruption. |
I wonder... IIRC the animations for the PlasmaExplosion hit effect call |
I tested the Line 226 in eda44bf
The more relevant call is a few stack levels below in godot/scene/animation/animation_mixer.cpp Line 1711 in eda44bf
More detailed stack report below, uncollapse to see it. Seems like one potential hint is that the memory was allocated in ASAN crash report
|
… freed Works around godotengine#85365, but it's likely only a partial fix. The proper fix would be to remove the Object pointer from the TrackCache and always go back to the ObjectID before doing operations like this.
Thanks @bitsawer! I could work around the issue in #85461, which should be "good enough" for 4.2 hopefully. At least it fixes Liblast. I'll repurpose this issue to focus on the fact that AnimationMixer stores Object pointers in its TrackCache, and as seen here, this is really unsafe. We should remove |
Thanks a lots @bitsawer @akien-mga! I don't remember how long |
Another crash likely caused by the same We should aim to fix this sooner than later, ideally for 4.2.1. |
Using it could easily lead to use-after-free crashes, as the object may have been freed. Caching object pointers like this is unsafe. Instead, we rely only on the ObjectID and query ObjectDB with it each time we need access to the instance. - Fixes godotengine#85365. - Fixes godotengine#85572.
… freed Works around godotengine#85365, but it's likely only a partial fix. The proper fix would be to remove the Object pointer from the TrackCache and always go back to the ObjectID before doing operations like this.
Godot version
git master [1ba920f]
System information
Arch Linux, KDE 5, Radeon RX6800XT, X11
Issue description
The act of shooting a Plasma Gun in Liblast causes the engine to segfault not long after.
The backtrace point to AnimationMixer.
I a,m seeing a lot of errors related to animations for that weapon, but I don't expect the engine to segfault from that.
Backtrace:
Steps to reproduce
Clone Liblast repository:
git clone https://codeberg.org/Liblast/Liblast.git cd Liblast git lfs install git pull
Open the game in Godot 4.2 editor to import all media (if they all fail, check LFS again)
After the game is imported, run via GDB automatically starting a game:
Once the map loads, click on the "start now" button to avoid waiting 10 sec.
After that, the bots will do their thing and shoot their Plasma Guns causing a crash.
The crash seems to most often happen when shooting that weapon - hitscan weapons don't seem to cause it.
Minimal reproduction project
I am afraid I don't have one just yet.
The text was updated successfully, but these errors were encountered: