terminate called after throwing an instance of 'BadMagicNumberException' #231
First observations: I tried it on my desktop PC where I run OpenSUSE Leap 15.5 (rpm based), using a freshly built QDirStat from the current master branch. I could not reproduce the problem. I tried it with I tried it again, restarting I'll try again with my Laptop that still has an ancient Xubuntu 18.04 LTS (yes, out of support, I know), but also the latest QDirStat. |
It's the same on the Laptop with Xubuntu 18.04 LTS (dpkg based) and a freshly built QDirStat from master: I cannot reproduce this. Please try a completely fresh rebuild; sometimes even today's sophisticated build systems fail in subtle ways and link some ancient object file to a binary. Please make sure everything you want to keep is checked in, then
( And then try again with that freshly built QDirStat. |
Just to confirm since I didn't make it explicit. This is using a dpkg-based install. Probably not relevant, but who knows. Some more attempts today. A package filter that returns no packages (eg. "xorg1") can also cause the exception. The package query doesn't have to be against the same package that was opened in the full tree. Might be a red herring, but going with packages near the end of the alphabet seems to help. Still haven't ever got this with firefox-esr or linux-image which sit right up at the top of the list. Sorting the list first (twice, to get alphabetical descending by name) might help. Just once, after much ctrl-P-ing and opening of packages without hitting the exception, qdirstat just disappeared with no hint in the log. The problem might be entirely elsewhere, but the last logged line is: With a completely fresh build, the only difference is the line number for the exception is 588, which matches the code. Obviously lots of library differences, but the Qt version might make a big difference. I have 5.15.8 here. I have about 1,500 packages which is fairly modest. I have amd64 and i386 architectures set in dpkg. Can't think what else might be relevant and major. I have the treemap open. Closed it and it took several attempts but eventually got the exception. Open the treemap pane again and it crashed on the first attempt. Not a statistical sample, but does seem to make a difference. |
OK. I tried all kinds of things on both machines, but no problem so far. My Leap 15.5 uses Qt 5.15, the Xubuntu 18.04 LTS a much older version, of course. But I really don't think that this makes any difference in this context. The obvious suspect is some That's why all of those objects have a Now, rebuilding the Such things should happen only very rarely, and definitely not when the whole tree is re-read; that should clear everything. In particular, no dangling And from your description, that seems to be what happened to you: Somehow, one or more When, how and why that happens is the question now. But that will be difficult to investigate if the problem is hard to reproduce. Maybe start the whole thing in a debugger and set a breakpoint to the code location where that exception is thrown to get a backtrace? |
You didn't by any chance trigger a new read while one was still running? I just saw that the actions are not disabled, as they should be, while reading. |
According to your log snippet in the first comment here, the code location should really be this: https://github.com/shundhammer/qdirstat/blob/master/src/DirTreeModel.cpp#L588 This is quite some lines away from |
The problem with this code location is that it's a reimplemented virtual method that is called a gazillion times inside the Qt code, so it's not at all clear when, how and why it was called. The Qt model/view classes for those widgets all support one internal pointer in the central The And somewhere along that way, it gives us a It is not completely impossible that a newer Qt version holds on to some more |
Yes, line 588 with the latest build. The Debian build claims to be 1.8.1, but the changelog for it hails from January. I don't think I was interrupting ongoing reads. The treemap had been displayed, which pretty much guarantees the read is complete. Then I started opening up the tree, then reloaded with a package filter. I've tried quite a lot and can't reproduce this with anything except the package queries. |
I'll have a play with this on a different system at the weekend. I can go with a rather old Ubuntu. I might have to build a new qdirstat for it. |
So I tried an Ubuntu machine. The distro version was 1.6.1, but I thought I'd give that a try anyway. Got the exception first time:
Magic number exception. I had the details pane and treemap open for time out, but the same thing happens with them closed. |
So I've managed to catch this thing in the debugger. Not sure how much it helps. It is being called from collapseAll() from readPkg() in MainWindow. Hasn't the tree already been cleared at that point? Looks like Qt didn't get the memo in time? I can dig out more data about the state of variables if you want, but I'm not sure how much it will help. |
Thanks for investigating this further. I'll have a look. |
This is the code of https://github.com/qt/qtbase/blob/5.15/src/widgets/itemviews/qtreeview.cpp#L2740-L2755 I don't see where this would call Maybe a Debian patch on top of that Qt version does? |
Anyway, commit e4b3f11 is a more defensive approach to checking the internalPointer of that It does leave a bad taste in my mouth, though, since somewhere somehow somebody holds on to a Why? I don't know. Is it dangerous, or is it only in that transitional phase where the Please test this with the current master; I could still not reproduce the problem, not on my (aging, admittedly) Xubuntu 18.04 LTS nor on my openSUSE Leap 15.5 (which has that latest Qt 5.15 LTS). |
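For readers following along, here is a minimal sketch of what a more defensive magic number check can look like. It is not the code from that commit: `Item`, `ItemArena`, `isValid()` and `payloadOr()` are hypothetical names, the magic value is made up, and no Qt is involved. The arena exists only so that a released item stays readable in the demo; with real heap objects, reading freed memory is undefined behavior, so a check like this can only ever be a best-effort safety net.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a tree item; this is NOT QDirStat's FileInfo.
constexpr unsigned short kMagic = 4242;   // made-up magic value

struct Item {
    unsigned short magic   = 0;   // equals kMagic only while the item is alive
    int            payload = 0;
};

// Fixed-size arena (an assumption for the demo): a released slot keeps its
// storage, so checking its magic number afterwards is well-defined.
class ItemArena {
public:
    Item* create(int payload) {
        assert(used_ < slots_.size());
        Item* item    = &slots_[used_++];
        item->magic   = kMagic;
        item->payload = payload;
        return item;
    }

    // Mimics a destructor that invalidates the magic number.
    void release(Item* item) { item->magic = 0; }

private:
    std::array<Item, 8> slots_{};
    std::size_t         used_ = 0;
};

// Defensive check: report a null or stale pointer instead of throwing.
bool isValid(const Item* item) {
    return item != nullptr && item->magic == kMagic;
}

// Returns the payload, or `fallback` when the pointer is null or stale.
int payloadOr(const Item* item, int fallback) {
    return isValid(item) ? item->payload : fallback;
}
```

The point of the design is that a caller holding on to an outdated pointer gets a harmless fallback value back rather than an uncaught exception that terminates the program.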
I tested this for a few minutes and nothing bad happened. I'm fairly sure the problem I was seeing is gone. I also had a quick look in the Debian patch tracker. Lots of Qt patches, but I couldn't see anything that looked like it affected QTreeView. In the debugger, there was a fairly substantial stack trace of Qt function calls between collapseAll() and the magic check, but not very informative without a debug version of the library. |
OK. Thank you for coming back to this. So, for the time being, I'll close this, even if it's just a workaround, not a real fix. Let's reopen it if the problem reappears. And let's have a watchful eye on it. |
I can now reproduce it, too; on my openSUSE Leap 15.5 with RPM. The problem moved on, of course, to the next
It fails pretty reliably when I do this:
-> Exception and crash. |
I was trying to reconstruct how the signal / slot protocol works between the I saw that the In past attempts, I had probably always used the Working Theory: Maybe the |
After enabling core dumps on that systemd-controlled system, And it happens reliably since I added that And it appears that those paint operations, needing data from the model, use stored Backtrace
|
Commit 4ac7e75 changes the order in
|
I couldn't reproduce this by the same method as you, but it does seem like the same thing. Just taking out the collapseAll() call in the original code made my issue go away. I can't provoke it at all in the latest code. Do you think this fixes the whole problem? |
AFAICS yes. A look into the Qt lib sources seems to indicate that it's a problem of persistent indexes where the |
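As an illustration of the ordering hazard discussed in this thread, the toy model below shows a view that keeps handles between calls and touches them in `collapseAll()`. This is only a sketch under loud assumptions: `ToyModel`, `ToyView` and `Handle` are hypothetical, the generation counter is an invented way to make staleness visible, and none of it reflects how Qt actually stores persistent indexes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: every clear() bumps a generation counter, so handles created
// before the clear can be recognized as stale afterwards.
struct ToyModel {
    unsigned         generation = 1;
    std::vector<int> items;

    void clear() {
        items.clear();
        ++generation;
    }
};

// Hypothetical analog of a stored index: a row plus the model generation
// in which it was created.
struct Handle {
    std::size_t row;
    unsigned    generation;
};

struct ToyView {
    std::vector<Handle> expanded;   // handles the view keeps between calls

    void expand(const ToyModel& model, std::size_t row) {
        assert(row < model.items.size());
        expanded.push_back({row, model.generation});
    }

    // Visits every stored handle and returns how many of them were stale.
    std::size_t collapseAll(const ToyModel& model) {
        std::size_t stale = 0;
        for (const Handle& h : expanded) {
            if (h.generation != model.generation || h.row >= model.items.size())
                ++stale;
        }
        expanded.clear();
        return stale;
    }
};
```

Calling `collapseAll()` before `clear()` visits only valid handles; calling it after `clear()` visits handles that refer to items that no longer exist, which is the situation where a real model would be handed a dangling pointer.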
So, the latest thing. I don't know if it is the same or not.
|
No, that's definitely something completely different; it's a different code path. Opening a large directory tree to tree level 5 is asking for trouble: It needs to iterate over all directories in the entire tree down to that level and sort the items in that subdirectory (which costs O(n * ln(n))) and store the sorted If you have enough RAM and swap space, it will return eventually, but it will take a looong time, and it will consume a lot of RAM. It might begin to swap like crazy. Initially, that "open to tree level" action was available even down to tree level 9; I reduced it to that much more reasonable level 5. Still, if your tree is large, it might explode in your face. That is a function to be used with caution. And a lot of RAM. ;-) |
Opening the tree I can do, it takes no time at all. I have RAM to burn although it doesn't seem to use up that much. I can navigate in the opened tree and select items in the DirTreeView, but as soon as I click on a tile, it freezes. I waited it out and it did come back eventually. The tree was closed back down again except for the selected branch, as expected I suppose, but I think that is what is taking the time. It is closing the open branches one by one and there are thousands. See closeAllExcept() in DirTreeView. |
For This appears to me to be a clear case of "Doctor, it hurts when I do that!" - "Then don't do that." ;-) |
To be specific, from the log:
2023-12-03 21:39:19.922 [30721] DirTreeModel.cpp:645 parent(): THROW Exception: Magic number check failed for address 0x58fb8f221ac0
Note that the line number seems to have changed since the Debian version was built, but the location in parent() should be obvious. I can also reproduce it with a build from master.
To reproduce, fairly reliably but not every time:
It doesn't always crash, but it seems to be much more reliable after opening some packages than others. I don't see what they have in common. Or it might be just packages that I happen to have been looking at.