Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GDScript leak avoidance (3.2) #42323

Merged

Conversation

RandomShaper
Copy link
Member

@RandomShaper RandomShaper commented Sep 25, 2020

Commit message:

Modify usage of types so that the Ref created from base_type.script_type doesn't involve converting first to Variant, which will use the constructor for Object *, as if the argument wasn't a Reference, and therefore will convert back to null.

For the records, this issue was much more complicated than it seems. Let's tell the full story:

On the surface, we could just say that just if we had been enforcing explicit, we would have detected that unintended conversion to Variant.

However, we can ask, why Variant wasn't doing the right thing? Answer: Variant has a constructor for Object * that is intended to be used only for non-Reference classes. For References you have to initialize it from a RefPtr.

Before the dangling Variant fix, the only issue with using the Object * constructor for a Reference was that the Variant could become dangling at a later point, when that shouldn't happen at all for Reference: the object would have just been treated as non ref-counted, so that its reference count wouldn't have been updated. If it was freed, yet the Variant would still believe it was around.

After the dangling Variant fix, in debug mode even non-References have a reference count mechanism. So if you construct a Variant with a Reference via the Object * constructor, you run into a very weird situation: the ref counting mechanism of Reference wouldn't be used at all, but the one for the other kinds of objects would be! And that's indeed what was happening in this bug: the Variant was initialized with a Script, but treated as a non-Reference, so when it was being asked back to be converted to RefPtr it was returning null, since that conversion only applies to References. And so the crash.

At first I thought that to make the fix complete, Variant should have an Object * constructor like this:

Variant::Variant(const Object *p_object) {

	type = OBJECT;
	Object *obj = const_cast<Object *>(p_object);

	memnew_placement(_data._mem, ObjData);
	Reference *ref = Object::cast_to<Reference>(obj);
#ifdef DEBUG_ENABLED
	_get_obj().rc = likely(obj && !ref) ? obj->_use_rc() : NULL;
#else
	_get_obj().obj = obj;
#endif
	if (unlikely(ref)) {
		*reinterpret_cast<Ref<Reference> *>(_get_obj().ref.get_data()) = Ref<Reference>(ref);
	}
}

That way it would realize p_object is a Reference and do the right thing.

But... The GDScript class is already taking advantage of the "feature": it has a Variant _static_ref member which is initialized to this. If Variant is fixed as in the code snippet above, a cyclic reference between GDscript and its member _static_ref is created, that makes it impossible for the script resource to be released ever.

And I haven't been able to come up with a reasonable fix that doesn't add hacks to fix that. Discarded ideas have been messing with the reference count, and creating special utility, high danger functions to initialize and clear a Reference Variant without touching the reference count, to let GDScript::_static_ref don't worry about the lifetime of the script by itself, because, as a member, it's lifetime is tied to the one of the script.

So my take has been to do the easy fix, document the problem and leave the door open for a deeper fix that addresses the whole problem up to its roots.

NOTE: Godot 4.0 is already fine.

Fixes #42305.

Modify usage of types so that the `Ref` created from `base_type.script_type` doesn't involve converting first to `Variant`, which will use the constructor for `Object *`, as if the argument wasn't a `Reference`, and therefore will convert back to null.
@RandomShaper RandomShaper added this to the 3.2 milestone Sep 25, 2020
@RandomShaper RandomShaper changed the title Fix GDScript leak avoidance Fix GDScript leak avoidance (3.2) Sep 25, 2020
@akien-mga akien-mga merged commit 4162df4 into godotengine:3.2 Sep 25, 2020
@akien-mga
Copy link
Member

Thanks!

@vnen
Copy link
Member

vnen commented Sep 25, 2020

But... The GDScript class is already taking advantage of the "feature": it has a Variant _static_ref member which is initialized to this. If Variant is fixed as in the code snippet above, a cyclic reference between GDscript and its member _static_ref is created, that makes it impossible for the script resource to be released ever.

And I haven't been able to come up with a reasonable fix that doesn't add hacks to fix that. Discarded ideas have been messing with the reference count, and creating special utility, high danger functions to initialize and clear a Reference Variant without touching the reference count, to let GDScript::_static_ref don't worry about the lifetime of the script by itself, because, as a member, it's lifetime is tied to the one of the script.

BTW this is changed in Godot 4.0, it doesn't have a _static_ref anymore (cf. #36358). This was moved to the function call exactly to remove the circular reference, since Variant now properly keeps a reference in those cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants