Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessing Dictionary from inside of Array results in a memory leak #1240

Closed
Lasuch69 opened this issue Sep 11, 2023 · 14 comments
Closed

Accessing Dictionary from inside of Array results in a memory leak #1240

Lasuch69 opened this issue Sep 11, 2023 · 14 comments
Labels
bug This has been identified as a bug confirmed
Milestone

Comments

@Lasuch69
Copy link

Lasuch69 commented Sep 11, 2023

Godot version

4.1.1

godot-cpp version

4.1

System information

Fedora Silverblue 38

Issue description

image
This function results in a memory leak.

Recreating this function line by line in GDScript doesn't result in memory leak.

I've tried to return Dictionary as is:
image
Which works fine.

Callable and callv() are affected too.

Steps to reproduce

N/A

Minimal reproduction project

gdextension-memory-leak-main.zip

You can also view it here: https://github.com/Lasuch69/gdextension-memory-leak

@Lasuch69
Copy link
Author

Lasuch69 commented Sep 13, 2023

image
I'd like to add that in this function, leak is also present. It seems like, accessing Dictionary from Array or TypedArray in any way leaks memory.

@dsnopek
Copy link
Collaborator

dsnopek commented Sep 13, 2023

Thanks for the report!

Can you explain how you're detecting the memory leak?

@Lasuch69
Copy link
Author

Lasuch69 commented Sep 13, 2023

Memory graph rises indefinitely. I can check it with Valgrind like I did originaly.

Edit:
image
Valgrind confirms my theory.

@Lasuch69 Lasuch69 changed the title Returning Dictionary from inside of Array results in a memory leak Accessing Dictionary from inside of Array results in a memory leak Sep 16, 2023
@Lasuch69 Lasuch69 changed the title Accessing Dictionary from inside of Array results in a memory leak Accessing Dictionary from inside of Array results in a memory leak Sep 16, 2023
@cyberpuffin-digital
Copy link

I've also hit this memory leak accessing an array of arrays. I narrowed it down by commenting and retrying >,<

/* Check if test input conflicts with other entries within a given column of the playfield */
bool GridCalc::valid_in_grid_col(const Array &grid, int col, int row, int row_count, int test)
{
    // std::vector<int> grid_col = (std::vector<int>)grid[col];
    const Array &grid_col = (const Array &)grid[col];
    godot::UtilityFunctions::print(vformat("GridCalc::valid_in_grid_col Checking grid_col vector: %d", (int)grid_col[0]));
    for (int check = 0; check < row_count; check++)
    {
        if (check == row)
        {
            continue;
        }
        // if ((int)grid_col[check] == test)
        // {
        //     return false;
        // }
    }
    return true;
}

grid is a 2d array of ints (0-9) and my function is one of several that checks if a test value has any conflicts in the grid along the column. I've gone through multiple iterations of trying to cast the sub-array as an Array or a Vector and the memory graph continually climbs when the code is invoked, until the system runs out of memory and crashes.

Based on my testing (left the commented code in for reference), all it takes is initializing the grid_col variable to trigger the leak.

Here is the end of my valgrind log (full log is >100mb, I can pastebin it if desired):

==98178== 25,468,800 bytes in 318,360 blocks are definitely lost in loss record 2,588 of 2,589
==98178==    at 0x86050C5: malloc (vg_replace_malloc.c:442)
==98178==    by 0x4E316E2: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x4758CD1: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0xF419C15: godot::Variant::operator godot::Array() const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0xF40252C: godot::GridCalc::valid_in_grid_table(godot::Array const&, int, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0xF403209: godot::MethodBindTR<bool, godot::Array const&, int, int, int, int, int, int, int>::ptrcall(void*, void const* const*, void*) const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0x129A8FD: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x47DEFBA: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x4640C79: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x128B573: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==
==98178== 53,375,200 bytes in 667,190 blocks are definitely lost in loss record 2,589 of 2,589
==98178==    at 0x86050C5: malloc (vg_replace_malloc.c:442)
==98178==    by 0x4E316E2: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x4758CD1: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0xF419C15: godot::Variant::operator godot::Array() const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0xF402419: godot::GridCalc::valid_in_grid_row(godot::Array const&, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0xF4026EC: godot::GridCalc::valid_in_grid(godot::Array const&, int, int, int, int, int, int, int) (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0xF403209: godot::MethodBindTR<bool, godot::Array const&, int, int, int, int, int, int, int>::ptrcall(void*, void const* const*, void*) const (in ~/git/godot-sudoku/GDExtension/bin/libgridcalc.linux.template_debug.x86_64.so)
==98178==    by 0x129A8FD: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x118EB27: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x47DEFBA: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x4640C79: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==    by 0x128B573: ??? (in /opt/godot/releases/Godot_v4.1.3-stable_linux.x86_64)
==98178==
==98178== LEAK SUMMARY:
==98178==    definitely lost: 103,695,444 bytes in 1,296,367 blocks
==98178==    indirectly lost: 1,780 bytes in 8 blocks
==98178==      possibly lost: 171,184 bytes in 827 blocks
==98178==    still reachable: 670,687 bytes in 3,406 blocks
==98178==         suppressed: 0 bytes in 0 blocks
==98178==
==98178== For lists of detected and suppressed errors, rerun with: -s
==98178== ERROR SUMMARY: 176 errors from 176 contexts (suppressed: 0 from 0)

* NB: I ran valgrind with 4.1.3, but 4.2 hasn't alleviated the issue.

@dsnopek
Copy link
Collaborator

dsnopek commented Dec 21, 2023

I tried the MRP in the original description, and while Valgrind isn't reporting anything useful to me, I do see the memory rising constantly when this code is included:

		# Leaks
		var data: Dictionary = get_data()

... but the memory remains stable if I comment that out.

So, I'm seeing the leak too!

@dsnopek dsnopek added bug This has been identified as a bug confirmed labels Dec 21, 2023
@dsnopek dsnopek added this to the 4.x milestone Dec 21, 2023
@allenwp
Copy link
Contributor

allenwp commented Jan 26, 2024

System Info:

  • Godot v4.2.1.stable - Windows 10.0.22621 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)
  • GDExtension built with MSVC using godot-cpp 4.2.1 branch.

I have experienced a similar issue that I believe shares the same root cause. From looking at the allocation call stack of my example and the original example, it seems that the leak is caused by casting from Variant to Dictionary, Array, or TypedArray (and possibly other classes), but only in GDExtension.

The following code leaks, but only in C++ GDExtension. If I compile this code as a part of the main Godot source, it does not leak:

Array original_array;
for (int i = 0; i < 1000; i++) {
	Variant my_variant = original_array;
	Array my_array = my_variant; // This line causes memory to leak
}

The leak also happens if I used Dictionary or TypedArray instead of Array. I am able to see the leak in both the Godot editor as well as in Visual Studio when debugging my GDExtension:

image

Visual Studio Allocation Call Stack

Here is the Visual Studio Allocation Call Stack (as seen using a local MSVC build of Godot 4.2.1):

image
image

Comparison to Core Engine

I see that GDExtension uses this sort of cast operator in variant.cpp of godot-cpp:

Variant::operator Array() const {
	Array result;
	to_type_constructor[ARRAY](result._native_ptr(), _native_ptr());
	return result;
}

...which is introduces additional code paths compared to the variant.cpp of the core engine:

Variant::operator Array() const {
	if (type == ARRAY) {
		return *reinterpret_cast<const Array *>(_data._mem);
	} else {
		return _convert_array_from_variant<Array>(*this);
	}
}

My example of the leak does not trigger in the core engine because the Variant is correctly checked to see if it's already an Array, and because it is one, it simply does the return *reinterpret_cast<const Array *>(_data._mem); and everything's a-OK.

...But in GDExtension, although it seems like it does this check, it still eventually ends up at _p = memnew(ArrayPrivate); in Array::Array() to make a whole new Array instead of just referencing the old one. I'm still not sure why the reference counting fails and the leak happens, though...

@allenwp
Copy link
Contributor

allenwp commented Jan 29, 2024

It's possible this leak has some similarities to a chain of leaks from way back when ( #490 #356 #355). Is it possible some of these sorts of issues were reintroduced with e4ed489 or some other major change? (I'm just trying to do some git log detective work here 🕵️)

I don't have a good grasp of C++ rvalues yet, so I'm not sure I can debug this any further on my own...

I think this is a good time for me to learn about rvalues, Variant casting, Array ref counting, and how the GDExtension ABI works, so give me a week or two and I feel like I should be able to at least figure out exactly what's going on.

Leak Details

Here's a very boiled down version of the leak:

Array original_array;
Variant my_variant = original_array;
Array my_array = my_variant; // This line causes memory to leak

The cast that happens during that assignment creates an additional Array object that has its refcount go up to 2 and then back down to 1, but it never reaches 0. Here's a part of why that happens:

  1. GDExtension's Variant::operator Array() const is called
  2. GDExtension's Array::Array() is called, which creates a couple of new temporary Array:
  3. Array::Array(const Array &p_from) is called, which allocates a new ArrayPrivate. The refcount is now 1. Let's call this ref "A"
  4. Array::Array(const Array &p_from) is called, which adds to the refcount, which is now at 2. Let's call this ref "B"
  5. Array::~Array() is called, which unreferences and takes the refcount down to 1. I've analyzed the memory and found that this is clearing out ref "A".
  6. Array::Array(const Array &p_from) is called, which overwrites the _p pointer to the ArrayPrivate originally held by ref "B"!

Eventually another call to GDExtension's Array::~Array() is called, but _p is nullptr. I'm not sure what this is about, but I don't think it's the issue.

It seems clear that ref "B" in the above scenario is never unreferencing and causing the leak...

@allenwp
Copy link
Contributor

allenwp commented Jan 31, 2024

Alright, I understand what's happening now and I might have a fix(?) Here are the details:

I am going to use the Dictionary class for this example, as that is the class that was described in this original github issue, but this applies identically to the Array and TypedArray classes, and possibly others.

The leak happens when calling the Dictionary cast operator on GDExtension's Variant:

Dictionary original_dictionary;
Variant my_variant = original_dictionary;
Dictionary my_dictionary = my_variant; // The cast operator of GDExtenion's Variant causes a leak

When this cast happens, a temporary Dictionary will be created and then a placement new will be performed on this temporary Dictionary. Here's what GDExtension's current Variant cast operator looks like:

Variant::operator Dictionary() const {
	Dictionary result;
	to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr());
	return result;
}

The to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr()); line seems to perform a placement new using result, which has already been initialized with a DictionaryPrivate. When the placement new happens, it calls the Dictionary copy constructor, which reasonably assumes that the Dictionary has not been initialized and overwrites the DictionaryPrivate pointer with the new one, causing the old one to leak:

Dictionary::Dictionary(const Dictionary &p_from) {
	_p = nullptr;
	_ref(p_from);
}

If calling to_type_constructor with those two _native_ptr() will always perform a placement new, then a solution might be to do something like calling Dictionary::~Dictionary(), to make sure that the Dictionary has cleaned up its references and is ready to have its copy constructor called...

Variant::operator Dictionary() const {
	Dictionary result;
	result.~Dictionary(); // Adding this line fixes the leak
	to_type_constructor[DICTIONARY](result._native_ptr(), _native_ptr());
	return result;
}

I have tested it and found that it successfully fixes the leak that was demonstrated in the original issue as well as the leak that I was experiencing.

But, this seems sub-optimal to be constructing a Dictionary just to destruct it immediately afterwards. And maybe even risky to be calling a destructor on an object with the expectation that it will always put everything in a state where it is valid to call the copy constructor again... So I'm currently looking into seeing if there is a safe way to do this by allocating some memory on the stack and then returning it once it has been filled in with the Dictionary data...

@dsnopek
Copy link
Collaborator

dsnopek commented Jan 31, 2024

@allenwp Thanks for digging into this!

Can you try my draft PR #1378 and see if it fixes the leak for you?

I tested that it passed the automated tests, but I didn't have a chance to actually re-test the leak. If it works, I can try to use it in all the other applicable situations too.

@allenwp
Copy link
Contributor

allenwp commented Jan 31, 2024

Thanks @dsnopek!

Unfortunately it looks like this method leaks (it leaves an ArrayPrivate in memory with a refcount of 1). I'm going to look into why this is, probably tomorrow.

@dsnopek
Copy link
Collaborator

dsnopek commented Jan 31, 2024

Unfortunately it looks like this method leaks (it leaves an ArrayPrivate in memory with a refcount of 1). I'm going to look into why this is, probably tomorrow.

The draft PR only did Dictionary, given that that was the topic of your last comment, so if you were testing Array that could certainly be why. :-) I'll update the PR to do Array too.

@allenwp
Copy link
Contributor

allenwp commented Jan 31, 2024

Oh, sorry, my mistake on my comment — I adapted your PR code to Array, since that’s what I’ve been testing with, so that’s why I typed that. So I believe the issue I described does exist for all of these internally reference counted types.

Your proposed approach seems to work similar to some code I was fiddling with using uninitialized memory, but there seems to be an extra constructor call that I don’t quite understand yet.

EDIT: I've moved discussion of a fix to the PR draft #1378.

@allenwp
Copy link
Contributor

allenwp commented May 4, 2024

I've tried the original post's minimal reproduction project with the 4.2.2 godot-cpp release and:

Godot v4.2.2.stable - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)

And it is no longer leaking:

image

I also tried with the old versions:

4.1.1 godot-cpp release and:

Godot v4.1.1.stable - Windows 10.0.22631 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 980 Ti (NVIDIA; 31.0.15.3598) - 13th Gen Intel(R) Core(TM) i7-13700K (24 Threads)

image

This issue can now be closed! :)

@AThousandShips
Copy link
Member

Thank you for updating!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This has been identified as a bug confirmed
Projects
None yet
Development

No branches or pull requests

5 participants