Segmentation fault on Exit (Qt) #1095
Comments
Same error here after Upgrade from Ubuntu 20.04 LTS to Ubuntu 22.04 LTS. |
Thanks for reporting. But please add some more context information
|
Can you please report the backintime versions that are used in those two Ubuntu releases? Thanks. |
Hi @Codeberg-AsGithubAlternative-buhtz In both installations I used |
Dear Benjamin, |
I've also been seeing segfaults upon exiting backintime-qt on Kubuntu as well as Manjaro (both using Python 3.10). They seem to have no impact on functionality, and I'm not yet sure if they're 100% reproducible. Any good ideas on debugging? |
There is further debugging information in #1227, and in the downstream bug (mentioned above): https://bugzilla.redhat.com/show_bug.cgi?id=1844781 |
Another stacktrace is reported in #1271, which I'm closing as a duplicate. |
@emtiu Good work (sharp eyes ;-) I didn't realize that there are duplicated issues. I have just briefly checked the stack traces and think these issues are really duplicates, since the segfault is caused in the same location: #1095: #0 0x00007f20a6756be0 in PyCFunction_Type () from /lib64/libpython3.9.so.1.0 I have searched all issues containing the keyword
Debugging of segfaults is difficult (requires eg. Also the developer/debugger needs to understand the segfault-causing library a little bit to debug it a) try to make the crash reproducible with a minimal reproducible (code) example (MRE) |
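One low-effort way to get at least a Python-level stack trace out of a crash like this is the standard library's `faulthandler` module. The following is only a minimal sketch and is not taken from the Back In Time code base: the helper name `run_with_faulthandler` is hypothetical, and the child process simulates a crash by sending itself `SIGSEGV` (POSIX only) instead of running the real GUI.

```python
import signal
import subprocess
import sys

# Child program: enable faulthandler, then simulate a crash by sending
# SIGSEGV to itself. In a real session this would be the GUI entry point.
CRASHING_CHILD = (
    "import faulthandler, os, signal\n"
    "faulthandler.enable()\n"
    "os.kill(os.getpid(), signal.SIGSEGV)\n"
)


def run_with_faulthandler(code):
    """Run `code` in a fresh interpreter and capture what it writes to stderr."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )


result = run_with_faulthandler(CRASHING_CHILD)
# On POSIX a negative return code means "terminated by that signal".
crashed = result.returncode == -signal.SIGSEGV
print("segfault:", crashed)
print(result.stderr)
```

The same handler can be switched on without touching any code by starting the interpreter with `python3 -X faulthandler` or by setting `PYTHONFAULTHANDLER=1`. The dump only covers Python frames, so for a crash inside a C/C++ library it narrows down where the interpreter was, not the faulting native code.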
Hmm, that's tricky. From what I've seen, it's not 100% reproducible, but it happens almost every time that backintime exits. My working hypothesis is: backintime segfaults almost always on GUI quit, but most users don't notice, because: a) there's no loss of functionality, and b) the graphical desktop environment doesn't show the error when it happens. |
If I find the time I could write some basic GUI-related unit tests (just checking for the best tools to do this) and hopefully one unit test provokes this segfault... |
Digging around the discussions in https://bugzilla.redhat.com/show_bug.cgi?id=1844781 and https://forum.manjaro.org/t/python-crash-when-exiting-back-in-time/102856/11, I see three hints on a possible root of the problem:
I don't know enough about Qt or C++ to make sense of any of it, but maybe it helps someone else looking at this. |
Good summary, and yes, Matt Fagnani did a damn good job narrowing down the problem. In the end the challenge is (if it really is caused by "use-after-free") that at the moment the error happens (segfault), the stack trace does not directly point to the code line that freed the object earlier (because freeing memory happens non-deterministically and most probably async - edit: not true for C). This is why I suggest I am out, I know Qt, C, C++, Valgrind and gdb, but this bug is too time consuming for me ATM, I hope it will be diagnosed and fixed upstream. |
Adding
Python-Specific
|
Here's another piece to the puzzle – but be careful, there's a good chance it will melt your brain: 1d63ced#commitcomment-21596448 |
In my Manjaro VM I also get a crash with memory dump now-and-then when I exit BiT-qt. Again in PyCFunction_Type with Python 3.10.7:
|
If sip is really the (only) reason for the segfaults (which I am not sure in case of the https://www.riverbankcomputing.com/hg/sip/rev/072b8949de41 It was fixed in all sip versions (or at least in |
Yes, and I would like to keep this issue open until I have enough time for a multi-day trace & debug session, since I can reproduce the problem on several distros (almost deterministically)... |
I wrote a script to run and close backintime-qt 10 times, logging to a file. I consistently get the segfault message about 5 out of 10 times. This is with the distro installation of 1.2.1 on Linux Mint 21.2. If I add this line anywhere in
|
Hello Derek, I remember the Can you provide that script? I would like to increase the repetitions. Based on earlier experiments with the segfault problem, I am assuming that the "connection" between that problem and the mouse event filter is only a coincidence. But we should give it a try of course. Best, EDIT: The event filter instance is a member of the MainWindow but it is installed on the QApplication object. When the window closes, Python might garbage collect its members, including the filter. The QApplication is destroyed later. At that point the filter is still installed but has already been garbage collected by the Python interpreter. The QApplication object might touch the event filter again at its end and segfault because the filter no longer exists. EDIT2: I support Derek's proposal. Would you like to provide a PR for this? In the long run we will encapsulate the widgets of the main window into their own classes. Then we might get rid of the need for a global event filter and then can simply use |
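The lifetime problem described in the EDIT can be modeled in plain Python without Qt. This is only an illustrative sketch under stated assumptions: `EventFilter`, `Application`, `MainWindow` and `exits_cleanly` are hypothetical stand-ins, the `alive` flag imitates destruction of the filter object, and a `RuntimeError` takes the place of the segfault. In real PyQt code the corresponding calls would be `installEventFilter()` and `removeEventFilter()` on the `QApplication` object.

```python
class EventFilter:
    """Stand-in for a QObject-based event filter."""

    def __init__(self):
        self.alive = True

    def handle(self, event):
        if not self.alive:
            # In native code this would be a use of a destroyed object,
            # i.e. a possible segfault.
            raise RuntimeError("event filter used after it was destroyed")
        return False  # False means: do not swallow the event


class Application:
    """Stand-in for QApplication: outlives the window, keeps a filter list."""

    def __init__(self):
        self.filters = []

    def install_event_filter(self, event_filter):
        self.filters.append(event_filter)

    def remove_event_filter(self, event_filter):
        self.filters.remove(event_filter)

    def shutdown(self):
        # The application may still dispatch events to installed filters
        # while it is being torn down.
        for event_filter in self.filters:
            event_filter.handle("final-event")


class MainWindow:
    def __init__(self, app, remove_on_close):
        self.app = app
        self.remove_on_close = remove_on_close
        self.mouse_filter = EventFilter()  # owned by the window
        app.install_event_filter(self.mouse_filter)

    def close(self):
        if self.remove_on_close:
            # Assumed remedy: uninstall the filter before the window's
            # members go away.
            self.app.remove_event_filter(self.mouse_filter)
        # Model the window's members being destroyed together with it.
        self.mouse_filter.alive = False


def exits_cleanly(remove_on_close):
    app = Application()
    window = MainWindow(app, remove_on_close)
    window.close()
    try:
        app.shutdown()
    except RuntimeError:
        return False
    return True


print(exits_cleanly(remove_on_close=False))  # False
print(exits_cleanly(remove_on_close=True))   # True
```

The point of the model is the ordering: whoever installs a filter on a longer-lived object is also responsible for removing it before the filter itself disappears.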
I was looking for something that might be destroyed out of order, and seeing that filter added without a corresponding removal looked like a possibility. Here is the test script I'm using: run_and_quit_backintime.sh You can call it with a numeric argument for how many times to run and close. It uses xdotool to close. It's using I would be happy to make a PR for it. |
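For readers who cannot use the linked shell script, the counting part of such a test can be sketched in Python. This is not the script from the comment above: `count_segfaults` is a hypothetical helper, it does not close any GUI window (the original uses xdotool for that), and it assumes the command under test terminates on its own. It only works on POSIX, where a process killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys


def count_segfaults(command, runs):
    """Run `command` `runs` times and count how often it dies from SIGSEGV."""
    segfaults = 0
    for _ in range(runs):
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a return code of -N means the process was killed by signal N.
        if completed.returncode == -signal.SIGSEGV:
            segfaults += 1
    return segfaults


if __name__ == "__main__":
    # Placeholder command that always "segfaults"; replace it with the
    # program under test.
    demo = [sys.executable, "-c",
            "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    print(count_segfaults(demo, runs=3), "of 3 runs ended with SIGSEGV")
```

If the program is started through a shell wrapper, the crash typically surfaces as exit status 139 (128 + 11) rather than -11, so the comparison would need to be adapted in that case.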
Wonderful! This problem has been bugging us for so long. Thanks for the good work :) |
Great. Let me know if you need assistance. Don't hesitate to ask. |
Thanks. I tried to conform my commit message and such, but I would be glad to get any pointers on my first PR here. |
…entFilter when closing main window (bit-team#1095) Fix bit-team#1095
@DerekVeit Excellent work (and approach), thanks a lot for helping us! My take-away from this issue is that debugging, while hard-core, is not always the most efficient way to track down non-deterministic segfaults; white-box code analysis and perhaps even disabling some code (ideally via bisecting) is another excellent way to find the culprit 😄 |
@aryoda Thanks! That's what I was thinking after reading your posts and Benjamin's. I had just started along that strategy of methodically removing things. But since the widget structure should probably all get torn down in the normal way, I was starting by looking for anything special outside of that, and I've been using Back In Time for some years, so I'm grateful and glad to help too. |
Hello Derek, It seems there are still segmentation faults. This was reported as a side problem in #1828. I can confirm and reproduce it with the latest dev version. Steps to reproduce:
|
I was able to reproduce this too. I've made a modified version of the testing script for it. run_and_quit_backintime_no-config.sh And I see the reason. I added the fix in Moving the If this sounds good, I can make another PR for this improved version of the fix. |
Sounds like a good solution to me. "Make it so." 🚀 Thank you very much for your efforts to help with this. |
We got a downstream bug report that backintime fails with a segmentation fault whenever it is closed in the current Fedora Rawhide.
Please see the downstream bug report at
https://bugzilla.redhat.com/show_bug.cgi?id=1844781