Hard crash without any output when running specific scene in Godot 3.5 on Windows 11. #61499

Closed
EzraT opened this issue May 28, 2022 · 41 comments

Comments

@EzraT

EzraT commented May 28, 2022

Godot version

All of the 3.5 test releases to date

System information

Windows 11 only(?) - GLES2/GLES3

Issue description

All of the Godot 3.5 test releases (from beta to rc) crash consistently upon running a specific scene in one of my projects.
It runs for about 2 seconds, then the audio starts to cut out, and the engine freezes and crashes shortly after. There is no information about the crash anywhere: not in the console output with --verbose enabled, not in the logs, and not in the editor.
The scene loads absolutely fine in 3.4.

Video

I've tried changing/re-importing/removing/toggling a lot of things in the scene to try to isolate a cause, but with no luck whatsoever.
It happens in both GLES2 and GLES3.
Resetting all project settings to default did not make a difference either.

The only thing I do know about this crash is that it only seems to happen on my Windows 11 system. I tested it on Linux and the crash does not occur there.
Others have also tested it on Windows 10, and it does not seem to crash there either.

If anyone is interested in taking a look at this, let me know so I can share the project files privately.

Steps to reproduce

Open the included exported project (linked below), run it, and wait a bit for the crash to occur.

Reproduction binary.

Exported Project

@Calinou
Member

Calinou commented May 28, 2022

Can you try compiling a debug build of the latest 3.x branch with MSVC? This should give you a crash backtrace when Godot crashes (only for editor or debug export template binaries).

@Calinou Calinou added this to the 3.5 milestone May 28, 2022
@EzraT
Author

EzraT commented May 28, 2022

I have not compiled anything on Windows before, only on Linux so I'm not sure, but I'll give it a try.

@timothyqiu
Member

@EzraT The MRP seems to have some permission issue. Its download permission is "viewers cannot download" :(

@EzraT
Author

EzraT commented May 28, 2022

@timothyqiu Should be fixed now.

@univeous
Contributor

An exported project may not provide much help.
If you want, you can send me the project file somehow and I can help you print the error stack.

@EzraT
Author

EzraT commented May 28, 2022

@univeous @Calinou
Well, I just compiled a debug binary of the 3.x branch on Windows, and it does not seem to crash when I try to run the project on that.
So I guess it was either fixed recently, or something else is going on.
Could this be related to the type of build? release/debug?
I will try testing with a release_debug build and see if it crashes with that.

@EzraT
Author

EzraT commented May 28, 2022

Tested again using target=release_debug, and it runs without crashing just as with the debug build.
Any thoughts?

@univeous
Contributor

You can compile the crashed debug version (e.g. 3.5rc2) to print the crash backtrace (if available) to locate the problem and confirm if it has indeed been fixed.

@EzraT
Author

EzraT commented May 28, 2022

@univeous
Okay, I compiled another debug build from the 3.5rc2 source, and it still does not crash like the official RC2 build does.
I'm not experienced with C++ programming, but could this be related to how the official builds are compiled?
Are they cross-compiled on Linux or something like that?
Could that explain why they don't crash on Linux?

@timothyqiu
Member

Yeah, I think that's the difference. The official builds are cross-compiled from Linux with MinGW.

@EzraT
Author

EzraT commented May 28, 2022

@univeous Do you have the capability to compile in the same manner as the official builds?
If so, I could send you the project files.

@univeous
Contributor

> @univeous Do you have the capability to compile in the same manner as the official builds?
> If so, I could send you the project files.

I can, but I may not be available until tomorrow :(
If you're not in a particular hurry, you can send it to my email
univeous@gmail.com

@Calinou
Member

Calinou commented May 28, 2022

I'm compiling debug builds of 3.x b541b57 with MinGW right now.
Edit: Done in #61499 (comment).

Note that you can't use the MSVC debugger or WinDbg with those builds, and the crash handler won't work until #61006 is merged. You need to install GDB or LLDB and use it from the command line. You can install GDB using Scoop (scoop install gdb), then run it from a terminal:

gdb path/to/godot.exe

Start the project with the run command at the GDB prompt and let it run until it crashes, then enter the following line at the GDB prompt to get a backtrace:

bt

@Calinou
Member

Calinou commented May 28, 2022

Windows 64-bit builds of b541b57 compiled with MinGW on Fedora 36 (GCC 12.1.1), including full debugging symbols:

@EzraT
Author

EzraT commented May 28, 2022

@Calinou
Okay, tested again with the build you provided, and it does not crash, so I guess this is possibly fixed in 3.x.
I'll test this again in RC3 when that releases just to make sure.
Thanks for uploading the build.

@EzraT
Author

EzraT commented Jun 1, 2022

@Calinou @univeous @timothyqiu

Tested again with the official RC3 build, and it still crashes.
I'm not sure what else I can do here, it seems to be very specific to the official 3.5 builds, since the ones Calinou provided don't crash at all, and they are supposedly compiled in the same manner?

I could still send you the project files univeous, but I get the feeling that won't yield any results since it seems only the official builds crash.

@MmAaXx500
Contributor

MmAaXx500 commented Jun 4, 2022

@EzraT I tried what you uploaded as "Minimal reproduction project", but I can't reproduce the problem. It's working fine for me. (see attached video)
It's running on: Windows 10 (21H2), i7-8700K, RTX 2080

gd_61499-6000_v6_30.mp4

@EzraT
Author

EzraT commented Jun 4, 2022

@MmAaXx500 @Calinou @univeous @timothyqiu
That is very strange. Perhaps it's a Windows 11 specific issue then; it's the only thing I can think of at this point.

Calinou, univeous, timothyqiu, if you tested this on Windows, on what version of Windows did you test the binary?

I updated the issue a little to avoid confusion about the reproduction binary.

@univeous
Contributor

univeous commented Jun 5, 2022

> That is very strange, perhaps it's a Windows 11 specific issue then, it's the only thing I can think of at this point.

I tested your export project on Windows 11 and it does crash.

@EzraT EzraT changed the title from "Hard crash without any output when running specific scene in Godot 3.5 on Windows." to "Hard crash without any output when running specific scene in Godot 3.5 on Windows 11." Jun 5, 2022
@EzraT
Author

EzraT commented Jun 7, 2022

I can confirm it is a Windows 11 issue, I tested it on Windows 10 on the same system, and it does not crash.

@akien-mga
Member

@RPicster had a similar issue with Beat Invaders and had to compile custom templates with MSVC.

CC @bruvzg @hpvb

I guess I'll have to upgrade one of my laptops to Win11 :(

@bruvzg
Member

bruvzg commented Jun 8, 2022

I can reproduce it with the official RC3 export templates and editor binary (64-bit only, in both debug and release variants; the 32-bit versions are OK), but not with custom builds (neither MSVC nor MinGW).

Backtrace without symbols is pretty useless:

#0  0x00007ffcabf6fbed in ntdll!RtlRaiseStatus () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007ffcabfa03be in ntdll!RtlNotifyFeatureUsage () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007ffcabf1523d in ntdll!RtlUnwind () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x00007ffcaa13686b in msvcrt!_setjmpex () from C:\WINDOWS\System32\msvcrt.dll
#4  0x00007ff79effaf6f in godot.windows.opt.debug.64!_Z5_mainv ()
#5  0x00007ff79f04bdb2 in godot.windows.opt.debug.64!_Z5_mainv ()
#6  0x00007ff79f04befc in godot.windows.opt.debug.64!_Z5_mainv ()
#7  0x00007ff79eff020f in godot.windows.opt.debug.64!_Z5_mainv ()
#8  0x00007ff79f003fdb in godot.windows.opt.debug.64!_Z5_mainv ()
#9  0x00007ff79f00417e in godot.windows.opt.debug.64!_Z5_mainv ()
#10 0x00007ff79f00489a in godot.windows.opt.debug.64!_Z5_mainv ()
#11 0x00007ff79f03f74b in godot.windows.opt.debug.64!_Z5_mainv ()
#12 0x00007ff79f03ec6f in godot.windows.opt.debug.64!_Z5_mainv ()
#13 0x00007ff79f07f6af in godot.windows.opt.debug.64!_Z5_mainv ()
#14 0x00007ff79e441772 in godot.windows.opt.debug.64!_Z5_mainv ()
#15 0x00007ff79e445fca in godot.windows.opt.debug.64!_Z5_mainv ()
#16 0x00007ff79e4462e7 in godot.windows.opt.debug.64!_Z5_mainv ()
#17 0x00007ff79de07ac9 in godot.windows.opt.debug.64!_Z5_mainv ()
#18 0x00007ff79e19be02 in godot.windows.opt.debug.64!_Z5_mainv ()
#19 0x00007ff79d4c07b9 in godot.windows.opt.debug.64!_Z5_mainv ()
#20 0x00007ff79ecc07ae in godot.windows.opt.debug.64!_Z5_mainv ()
#21 0x00007ff79f14a5d4 in godot.windows.opt.debug.64!_Z5_mainv ()
#22 0x00007ff79ecc64e7 in godot.windows.opt.debug.64!_Z5_mainv ()
#23 0x00007ff79dcd7585 in godot.windows.opt.debug.64!_Z5_mainv ()
#24 0x00007ff79d4bbee1 in godot.windows.opt.debug.64!_Z5_mainv ()
#25 0x00007ff79d47f472 in godot.windows.opt.debug.64!_Z13widechar_mainiPPw ()
#26 0x00007ff79d48ca20 in godot.windows.opt.debug.64!_Z5_mainv ()
#27 0x00007ff79d4513c1 in ?? ()
#28 0x00007ff79d4514d6 in ?? ()
#29 0x00007ffcab5d54e0 in KERNEL32!BaseThreadInitThunk () from C:\WINDOWS\System32\kernel32.dll
#30 0x00007ffcabee485b in ntdll!RtlUserThreadStart () from C:\WINDOWS\SYSTEM32\ntdll.dll
#31 0x0000000000000000 in ?? ()

@bruvzg
Member

bruvzg commented Jun 8, 2022

OK, this seems to be an issue with LTO and FreeType. A custom MinGW build with LTO enabled also crashes:

Thread 1 received signal ?, Unknown signal.
0x00007ffcabf6fbed in ntdll!RtlRaiseStatus () from C:\WINDOWS\SYSTEM32\ntdll.dll
(gdb) bt
#0  0x00007ffcabf6fbed in ntdll!RtlRaiseStatus () from C:\WINDOWS\SYSTEM32\ntdll.dll
#1  0x00007ffcabfa03be in ntdll!RtlNotifyFeatureUsage () from C:\WINDOWS\SYSTEM32\ntdll.dll
#2  0x00007ffcabf1523d in ntdll!RtlUnwind () from C:\WINDOWS\SYSTEM32\ntdll.dll
#3  0x00007ffcaa13686b in msvcrt!_setjmpex () from C:\WINDOWS\System32\msvcrt.dll
#4  0x00007ff78894507f in gray_set_cell ()
#5  0x00007ff78898bdf0 in gray_render_line.lto_priv ()
#6  0x00007ff78898bf60 in gray_line_to ()
#7  0x00007ff78893897f in FT_Outline_Decompose ()
#8  0x00007ff78894f239 in gray_convert_glyph_inner ()
#9  0x00007ff78894f3d9 in gray_convert_glyph ()
#10 0x00007ff78894fb1a in gray_raster_render.lto_priv ()
#11 0x00007ff78897f4ab in ft_smooth_render.lto_priv ()
#12 0x00007ff78897e924 in FT_Render_Glyph_Internal ()
#13 0x00007ff788a95fee in FT_Glyph_To_Bitmap.constprop.0 ()
#14 0x00007ff787ed34ff in DynamicFontAtSize::_make_outline_char(int) ()
#15 0x00007ff787edcf4e in DynamicFontAtSize::draw_char(RID, Vector2 const&, wchar_t, wchar_t, Color const&, Vector<Ref<DynamicFontAtSize> > const&, bool, bool) const ()
#16 0x00007ff787edd21e in DynamicFont::draw_char(RID, Vector2 const&, wchar_t, wchar_t, Color const&, bool) const ()
#17 0x00007ff787943595 in Label::_notification(int) ()
#18 0x00007ff787c79b42 in CanvasItem::_update_callback() ()
#19 0x00007ff78725b47c in MethodBind0::call(Object*, Variant const**, int, Variant::CallError&) ()
#20 0x00007ff7886264c3 in Object::call(StringName const&, Variant const**, int, Variant::CallError&) ()
#21 0x00007ff788b533b3 in MessageQueue::_call_function(Object*, StringName const&, Variant const*, int, bool) [clone .constprop.0] ()
#22 0x00007ff788624b07 in MessageQueue::flush() ()
#23 0x00007ff78780c8da in SceneTree::idle(float) ()
#24 0x00007ff7871a949f in Main::iteration() ()
#25 0x00007ff78716f4ef in widechar_main(int, wchar_t**) ()
#26 0x00007ff7871711de in _main() ()
#27 0x00007ff7871413ae in __tmainCRTStartup () at C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:321
#28 0x00007ff7871414c6 in WinMainCRTStartup () at C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:176

@bruvzg
Member

bruvzg commented Jun 8, 2022

Not sure what exactly is going wrong, but it works with FreeType 2.10.4 and not with 2.11.0+. I'll try to bisect it.

@akien-mga
Member

akien-mga commented Jun 8, 2022

> All of the Godot 3.5 test releases (from alpha to rc) crash consistently upon running a specific scene in one of my projects.

To clarify, we didn't have alphas for 3.5 but can you confirm that 3.5.beta1 crashes and 3.4.4.stable is fine?

I'm trying to think about what could have caused the regression, and I think in 3.5.beta1 it's still using the exact same MinGW version and build options as 3.4.4 used. For the record, subsequent builds were made with these versions:

  • 3.4.4.stable: MinGW-GCC 10.2.1 on Fedora 34
  • 3.5.beta1 to 3.5.beta5: MinGW-GCC 10.2.1 on Fedora 34
  • 3.5.rc1: MinGW-GCC 11.2.1 on Fedora 35
  • 3.5.rc2 to 3.5.rc3: MinGW-GCC 11.2.1 on Fedora 36

We did update FreeType twice during the 3.5 development:

And just as I was typing this @bruvzg confirmed that it seems to be a regression from FreeType 2.11.0+ so this checks out :)

But that should mean that 3.5.beta1 should not crash, but 3.5.beta2+ should.

@bruvzg
Member

bruvzg commented Jun 8, 2022

Bad commit seems to be - freetype/freetype@80bda80

@EzraT
Author

EzraT commented Jun 8, 2022

> > All of the Godot 3.5 test releases (from alpha to rc) crash consistently upon running a specific scene in one of my projects.
>
> To clarify, we didn't have alphas for 3.5 but can you confirm that 3.5.beta1 crashes and 3.4.4.stable is fine?
>
> I'm trying to think about what could have caused the regression, and I think in 3.5.beta1 it's still using the exact same MinGW version and build options as 3.4.4 used. For the record, subsequent builds were made with these versions:
>
>   • 3.4.4.stable: MinGW-GCC 10.2.1 on Fedora 34
>   • 3.5.beta1 to 3.5.beta5: MinGW-GCC 10.2.1 on Fedora 34
>   • 3.5.rc1: MinGW-GCC 11.2.1 on Fedora 35
>   • 3.5.rc2 to 3.5.rc3: MinGW-GCC 11.2.1 on Fedora 36
>
> We did update FreeType twice during the 3.5 development:
>
> And just as I was typing this @bruvzg confirmed that it seems to be a regression from FreeType 2.11.0+ so this checks out :)
>
> But that should mean that 3.5.beta1 should not crash, but 3.5.beta2+ should.

I'm currently unable to test because I recently factory reset my Windows laptop back to Windows 10 for unrelated reasons.
But I do remember testing this in RC1 and having the same crash on Windows 11, even the later beta versions crashed, although I do not remember which ones specifically.

I also know with 100% certainty that 3.4.4 does not crash.

Edit: Corrected alpha to beta in the OP.

@akien-mga
Member

akien-mga commented Jun 8, 2022

> Bad commit seems to be - freetype/freetype@80bda80

The goto does seem like something that can trip up LTO, though it was just moved to a different function. But what seems interesting in the diff is the removal of the return; before the goto label:
image

I wonder if adding it back solves the crash?

Edit: Never mind, that would break the intended logic. Maybe just this?

diff --git a/thirdparty/freetype/src/smooth/ftgrays.c b/thirdparty/freetype/src/smooth/ftgrays.c
index 622035aa79..19cf41c1fe 100644
--- a/thirdparty/freetype/src/smooth/ftgrays.c
+++ b/thirdparty/freetype/src/smooth/ftgrays.c
@@ -605,7 +605,10 @@ typedef ptrdiff_t  FT_PtrDist;
           break;
 
         if ( cell->x == ex )
-          goto Found;
+        {
+          ras.cell = cell;
+          return;
+        }
 
         pcell = &cell->next;
       }
@@ -622,7 +625,6 @@ typedef ptrdiff_t  FT_PtrDist;
       cell->next  = *pcell;
       *pcell      = cell;
 
-    Found:
       ras.cell = cell;
     }
   }

Either way, this would be good to report upstream https://gitlab.freedesktop.org/freetype/freetype so we can make sure that it gets solved eventually there too - or reported further to binutils or GCC devs.
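For readers who have not opened ftgrays.c: the loop touched by the patch above is a find-or-insert over a linked list of cells sorted by x. The following is a minimal sketch of that pattern written in the goto-free style the patch proposes. It is not FreeType's code; `Cell`, `Row`, `row_set_cell`, `MAX_CELLS` and the `-1` error return are all hypothetical names and choices made for illustration.

```c
#include <stddef.h>

#define MAX_CELLS 8  /* hypothetical fixed pool size */

typedef struct Cell {
    int          x;      /* horizontal position, the sort key */
    int          cover;  /* placeholder payload */
    struct Cell *next;
} Cell;

typedef struct {
    Cell   pool[MAX_CELLS]; /* storage for newly inserted cells */
    size_t used;            /* number of pool entries handed out */
    Cell  *head;            /* list of cells, sorted by ascending x */
    Cell  *current;         /* cell selected by the last call */
} Row;

/* Selects the cell at position ex, inserting it if it does not exist yet.
 * Returns 0 on success and -1 if the pool is exhausted (an assumption of
 * this sketch; it simply reports the error to the caller). */
int row_set_cell(Row *row, int ex)
{
    Cell **pcell = &row->head;
    Cell  *cell;

    for (;;) {
        cell = *pcell;
        if (cell == NULL || cell->x > ex)
            break;                 /* insertion point found */

        if (cell->x == ex) {
            row->current = cell;   /* early return replaces "goto Found" */
            return 0;
        }

        pcell = &cell->next;
    }

    if (row->used >= MAX_CELLS)
        return -1;

    /* Insert a new cell in front of *pcell, keeping the list sorted. */
    cell        = &row->pool[row->used++];
    cell->x     = ex;
    cell->cover = 0;
    cell->next  = *pcell;
    *pcell      = cell;

    row->current = cell;
    return 0;
}
```

Both the early return and a jump to a shared label end with the same assignment to the current cell, so the two forms are equivalent at the source level.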

@bruvzg
Member

bruvzg commented Jun 8, 2022

> Maybe just this?

No, still crashes.

@akien-mga
Member

Just to check if this would be fixed in newer mingw-binutils or newer GCC, I made a test build on Fedora 37: https://downloads.tuxfamily.org/godotengine/testing/Godot_v3.5-rc3-f37_win64.exe.zip

  • Fedora 36 (3.5.rc2 and 3.5.rc3): MinGW 9.0.0, GCC 11.2.1-5, binutils 2.37-4
  • Fedora 37 (test build "3.5-rc3-f37"): MinGW 10.0.0, GCC 12.1.1-1, binutils 2.38-2

@bruvzg
Member

bruvzg commented Jun 8, 2022

> Just to check if this would be fixed in newer mingw-binutils or newer GCC, I made a test build on Fedora 37: https://downloads.tuxfamily.org/godotengine/testing/Godot_v3.5-rc3-f37_win64.exe.zip

Broken.

@bruvzg
Member

bruvzg commented Jun 8, 2022

Not sure why it was working before @80bda80, but it seems like LTO is removing setjmp-related code. Fix: #61803.
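As background for the "setjmp-related code" mentioned above: FreeType's smooth rasterizer uses setjmp/longjmp to bail out of deep rendering code, and the backtrace earlier in this thread ends in gray_set_cell calling into the C runtime's unwinding machinery. Below is a minimal sketch of that style of error recovery, loosely modelled on a renderer that retries with a smaller band when a fixed pool overflows. It is an illustration, not FreeType's implementation; `Worker`, `record_value`, `try_range`, `process_in_bands` and `POOL_SIZE` are hypothetical.

```c
#include <setjmp.h>
#include <stddef.h>

#define POOL_SIZE 4  /* deliberately tiny so the overflow path is exercised */

typedef struct {
    jmp_buf jump_buffer;      /* where to resume when the pool overflows */
    int     pool[POOL_SIZE];  /* fixed-size working storage */
    size_t  used;
} Worker;

/* Deep helper: stores a value, or bails out non-locally if the pool is full. */
static void record_value(Worker *w, int value)
{
    if (w->used >= POOL_SIZE)
        longjmp(w->jump_buffer, 1);
    w->pool[w->used++] = value;
}

/* Tries to store `count` consecutive values starting at `first`.
 * Returns 0 on success, 1 if record_value() jumped back here. */
static int try_range(Worker *w, int first, int count)
{
    w->used = 0;
    if (setjmp(w->jump_buffer) == 0) {
        for (int i = 0; i < count; i++)
            record_value(w, first + i);
        return 0;
    }
    return 1;  /* reached only via longjmp */
}

/* Processes `total` values, halving the band size whenever the pool
 * overflows. Returns the number of bands that completed successfully. */
int process_in_bands(int total)
{
    Worker w;
    int bands = 0;
    int first = 0;
    int band  = total;

    while (first < total) {
        if (band > total - first)
            band = total - first;

        if (try_range(&w, first, band) != 0) {
            band /= 2;  /* overflow: retry the same start with a smaller band */
            continue;
        }

        first += band;
        bands++;
    }
    return bands;
}
```

With a pool of 4, calling `process_in_bands(10)` overflows at band sizes 10 and 5 and then completes in five bands of 2. The point of the sketch is that correct behaviour depends on the code after a longjmp seeing the state it expects, which is the kind of assumption an aggressive optimizer can break.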

@akien-mga
Member

Awesome, thanks!

Would be good to report this upstream to https://gitlab.freedesktop.org/freetype/freetype so that they're aware of the issue, even if it's a toolchain problem and not directly their code being wrong.

And then further up I'm not sure if it's a GCC or a binutils issue. Maybe @marxin can advise us?

@akien-mga
Member

Fixed by #61803.

@akien-mga
Member

> Awesome, thanks!
>
> Would be good to report this upstream to gitlab.freedesktop.org/freetype/freetype so that they're aware of the issue, even if it's a toolchain problem and not directly their code being wrong.

FreeType bug report: https://gitlab.freedesktop.org/freetype/freetype/-/issues/1164

The real problem is probably still in either MinGW, GCC or binutils, but I don't know which yet and how to give them a useful reproducer.

@akien-mga
Member

Here's a compiled version of current Godot 3.x (19fec70) with @bruvzg's fix: https://downloads.tuxfamily.org/godotengine/testing/Godot_v3.5-rc-freetype-fix_win64.exe.zip

Could you confirm that it solves your crash @EzraT @univeous?

@EzraT
Author

EzraT commented Jun 9, 2022

I am currently unable to test. I had to send my Windows laptop back to the manufacturer for repairs (for a problem unrelated to this issue), and I am not planning on upgrading it to Windows 11 again once I get it back, at least not in the foreseeable future.
Poor timing I know, but there wasn't much else I could do sadly.

If you want to test this @univeous, let me know and I'll send you the project files.

@univeous
Contributor

univeous commented Jun 9, 2022

> If you want to test this @univeous, let me know and I'll send you the project files.

If you want, of course. I can test it tomorrow or the day after.

Edit:
I can confirm that the project provided by @EzraT crashes in 3.5 rc3 (as expected) and does not crash in the version provided above (19fec70).

@marxin
Contributor

marxin commented Jun 15, 2022

Hard to guess what's responsible for that. Am I right that it only affects the Windows target?
Would it be possible to construct a self-reproducible test case? Thanks.

@akien-mga
Member

It's only reproducible on Windows 11 so far. It was bisected to be a regression in FreeType, which was fixed by adding a volatile: https://gitlab.freedesktop.org/freetype/freetype/-/issues/1164
But it's not clear why this got optimized out and why this only happens with MinGW-GCC with LTO when running on Windows 11 (same binary works fine on Windows 10).

Making a self-reproducible case sounds quite difficult, we experienced this FreeType bug in Godot in scenes using specific fonts at big sizes - to narrow this down one would have to make a new test program using only FreeType to see if that can trigger the bug, and then dissect FreeType to try to extract something more minimal.
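On why "adding a volatile" is a plausible fix: the C standard says that a non-volatile automatic variable in the function that called setjmp has an indeterminate value after longjmp if it was modified between the two calls. The sketch below shows the rule in isolation. It is not the FreeType patch, and `recovery_point`, `fail` and `count_steps_before_failure` are hypothetical names.

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* Simulates a failure deep in a call chain by jumping back to the
 * setjmp() call in the caller. */
static void fail(void)
{
    longjmp(recovery_point, 1);
}

/* Returns the number of steps completed before fail() jumped back. */
int count_steps_before_failure(void)
{
    /* volatile forces every write to reach memory, so the value read
     * after longjmp() is the last one stored. Without volatile, the
     * value would be indeterminate according to the C standard. */
    volatile int steps_done = 0;

    if (setjmp(recovery_point) == 0) {
        steps_done = 1;
        steps_done = 2;
        fail();
        steps_done = 3;  /* never reached */
    }

    return steps_done;
}
```

As written, `count_steps_before_failure()` returns 2. Dropping the `volatile` qualifier may still appear to work with many compilers and optimization levels, which is consistent with a bug that only shows up in one particular toolchain and build configuration.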

@marxin
Contributor

marxin commented Jun 20, 2022

Yeah, I fully understand that test case isolation would be a pretty tough job there.
