Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

GC debugging aids#3523

Merged
RazvanN7 merged 8 commits intodlang:masterfrom
CyberShadow:pull-20210724-141933
Aug 1, 2021
Merged

GC debugging aids#3523
RazvanN7 merged 8 commits intodlang:masterfrom
CyberShadow:pull-20210724-141933

Conversation

@CyberShadow
Copy link
Member

These tweaks were useful while debugging dlang-community/libdparse#445.

They might as well be upstreamed.

CC @rainers

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @CyberShadow!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + druntime#3523"

@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch from 2f1d92c to e4e2732 Compare July 24, 2021 15:35
&_d_gcx_atfork_prepare,
&_d_gcx_atfork_parent,
&_d_gcx_atfork_child);
version (COLLECT_PARALLEL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the fork handlers only needed when marking in parallel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You tell me. _d_gcx_atfork_prepare etc. are inside the version (COLLECT_PARALLEL): section.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially the fork handler only dealt with variables for parallel marking, but #2817 made it more general, so should have moved it out of the version.
It currently doesn't make much difference, though, as version COLLECT_PARALLEL is available unconditionally ATM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. So, I screwed up then by adding these handlers in the wrong place.

Also I had completely forgotten that it was me who added them. :)

I will move them out.

COLLECT_PARALLEL is unconditional in that it is always turned on at the top of the file, but the point of such switches is to toggle them during debugging, so I think turning it off in this way ought to either work or the switch should just be removed (and the behavior controlled through the runtime setting).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will move them out.

Done.

if (hasFinalizer)
sweepBin!true();
else
sweepBin!false();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this compile to nothing if all debug options are off? I doubt that, so maybe prepend debug.

Copy link
Member Author

@CyberShadow CyberShadow Jul 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In DMD, it doesn't quite compile to nothing. There is an empty loop, and some O(1) setup. I think it should compile to nothing in LDC/GDC, but I haven't checked.

debug isn't quite right, there's a few conditions affecting this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In DMD, it doesn't quite compile to nothing.

I think that would be a blocker for adding the else path unconditionally. An appropriate compile time condition should be possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it would be inobservable in benchmarks, because the loop just increments registers.

But, I think it would be better to just split this into several loops, each with a top-level conditional.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should compile to nothing in LDC/GDC, but I haven't checked.

It does not:

https://forum.dlang.org/post/aejdlmkvuwgbtxktpjby@forum.dlang.org

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, I think it would be better to just split this into several loops, each with a top-level conditional.

Done.

The best solution I could come up with here is a string mixin.



private void log_free(void *p) nothrow @nogc
private void log_free(void *p, size_t size) nothrow @nogc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size unused? I suspect you added some additional checks in your local copy. Add it to the printfs here instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the commit message.

This patch isn't quite right yet (as the failing tests show).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add it to the printfs here instead?

Good idea, done.

This patch isn't quite right yet (as the failing tests show).

Fixed.

@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch from e4e2732 to 8a28ad5 Compare July 25, 2021 06:51
Move the fork handlers out of the version(COLLECT_PARALLEL): section,
and only version-out the code dealing with parallel scanning.
@CyberShadow CyberShadow marked this pull request as draft July 30, 2021 14:53
@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch 5 times, most recently from 9849efb to b9093d6 Compare July 31, 2021 07:08
@CyberShadow
Copy link
Member Author

OK, I think this is good to go.

  • Addressed suggestions from code review.
  • Fixed some bitrot in old tests.
  • Fixed and moved an old commented-out unittest fragment to its own test.
  • Added a test for the debug hooks fix.

@CyberShadow CyberShadow marked this pull request as ready for review July 31, 2021 07:46
@CyberShadow CyberShadow requested a review from rainers July 31, 2021 07:46
runLocked!(go, otherTime, numOthers)(gcx);
}

debug (GC_RECURSIVE_LOCK) static bool lockedOnThisThread;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new condition to the list of conditions at the top with a short description.

BTW: does this change really help? IIRC throwing involves capturing the stack which is saved to memory allocated from the GC, so a staticError is not really @nogc. Maybe this has changed in the mean time, though.

Copy link
Member Author

@CyberShadow CyberShadow Jul 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done. (Edit: and pushed.) (Edit 2: moved it the right commit. 🙃 )

Yep, it does seem to work! The tests I added make use of it to catch the Error.

Copy link
Member Author

@CyberShadow CyberShadow Jul 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC throwing involves capturing the stack which is saved to memory allocated from the GC

Yeah, InvalidMemoryOperationError errors do not have a stack trace, which incidentally makes them difficult to debug. Room for improvement there...

@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch from b9093d6 to cffb146 Compare July 31, 2021 12:01
When a debug GC build fails an assertion, this manifests as the
program freezing. This is because the failed assertion causes an
allocation (to allocate memory for AssertError), however, as the GC
lock was held at the time that the assertion failed, the allocation
attempt results in another attempt at acquiring the GC lock. As the
locking mechanism used by the GC does not support same-thread
re-entry, the program hangs in a deadlock.

This is acceptable for casual debugging, as it's easy enough to attach
a debugger and examine the stack trace. However, it does make it
complicated to write automated tests which verify that the GC code is
asserting when it should.

For this purpose, introduce debug=GC_RECURSIVE_LOCK, which adds
runtime checking if we are attempting to re-acquire the GC lock in the
same thread, and call onInvalidMemoryOperationError if so.
Because this test involves crashing the GC and catching an Error, this
requires it to be a stand-alone test, so that the undefined state of
the GC left behind by the thrown Error does not interfere with other
tests.
@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch from cffb146 to befef1a Compare July 31, 2021 12:03
Copy link
Member

@rainers rainers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but the mixin

}
}`;

debug (SENTINEL)
Copy link
Member

@rainers rainers Jul 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid the string mixin, it makes the code undebuggable.

I was rather thinking of:

debug(SENTINEL)
   enum doLoop = true;
else version(assert)
   enum doLoop = true;
else debug(...)
   ...
else
   bool doLoop = hasFinalizer;
if (doLoop)

followed by the existing code. If any of the debug conditions are enabled, performance is not a goal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undebuggable how? You mean with an interactive debugger?

I wanted to avoid duplicating the condition and putting the other instance too far away from the first one, as they will inevitably go out of sync, as they already have in this very case.

Copy link
Member

@rainers rainers Jul 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undebuggable how? You mean with an interactive debugger?

Yes.

BTW: I just noticed that hasFinalizer is already defined a few lines above as I proposed. Maybe it is just missing the SENTINEL condition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: I just noticed that hasFinalizer is already defined a few lines above as I proposed.

Hah, well spotted.

OK, I'll change it to the old style. But, if this bitrots again...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. No guarantees, though ;-)

The `if (hasFinalizer)` condition wraps a lot more than just calling
finalizers.

Update the list of conditions, and rename the boolean variable to a
more descriptive name, with the hope that this will help avoid future
bitrot.
Tests the fix in the previous commit.
This function is a good site to add debug hooks to trace GC
deallocations. However, though the size of the allocation is known to
the GC in all deallocation circumstances, this information is not
passed along.

Add a parameter to log_free to thus aid debugging.  Without
debug=LOGGING, the value is unused (and therefore optimized out).
With debug=LOGGING, we can additionally print the allocation size in
case LeakDetector detects a double free.
@CyberShadow CyberShadow force-pushed the pull-20210724-141933 branch from befef1a to 4b401a8 Compare July 31, 2021 12:52
@RazvanN7 RazvanN7 merged commit 6abd0a9 into dlang:master Aug 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants