-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new version of the ContentionStart event #72627
Add new version of the ContentionStart event #72627
Conversation
In this new event, we add the lock object id and the id of the thread that owns it.
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
@@ -1742,6 +1742,21 @@ | |||
</UserData> | |||
</template> | |||
|
|||
<template tid="ContentionStart_V2"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if putting the LockId and OwnerId wouldn't be better on the Stop event, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes more sense on the start event, since you want to know what was holding it that triggered the contention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good to me, but @kouvel should take a look before we merge. He is out of the office for a few more days though
src/coreclr/vm/syncblk.cpp
Outdated
@@ -2549,14 +2554,9 @@ BOOL AwareLock::EnterEpilogHelper(Thread* pCurThread, INT32 timeOut) | |||
QueryPerformanceCounter(&startTicks); | |||
|
|||
// Fire a contention start event for a managed contention | |||
FireEtwContentionStart_V1(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId()); | |||
FireEtwContentionStart_V2(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId(), OBJECTREFToObject(obj), m_HoldingThread); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than writing out m_HoldingThread
which is the address of the Thread
object in memory, can we change this to m_HoldingThread->GetOSThreadId64()
? This will allow a profiler to display what the thread that currently holds the lock is doing. With the pointer value, you would need to provide another event that correlates between Thread*
and the actual OS thread ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another potential issue here is that the value of obj
can change during GC relocation. Is the goal to also watch the GC survival and movement events, or do you want to try and have a more durable ID? I'm wondering if AwareLock
should also have a set of creation/destruction events so that you can see the full lifetime of the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brianrob good point. I'll make the change to replace the pointer by the OS Thread id.
Another potential issue here is that the value of obj can change during GC relocation.
Does it mean that the address is changed also ?
The goal, at least in this PR, is not to watch the lock object lifetime, but more having an ID to do some correlation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean that the address is changed also ?
Yes the object address also changes when it's moved. Could pass in this
instead to include the AwareLock
pointer as just an identifier for the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using this
is a good idea. Worth calling out though that it is possible to destroy and then create a new lock at the same address. To make sure that you know about this, you may wish to write an event in the AwareLock
ctor so that you know when the aware lock is created. This type of pattern is other places such as module and type loads. We use the address of the module or type in memory, but handle questions of lifetime using load/unload events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info.
So if I understand correctly, in the AwareLock
ctor I emit an event (we know the lock is created) and I will have to emit an event in the dtor to know when the lock is deleted, right ?
I'll do the change taking into account your comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid emitting too many events, I think you can safely emit just the creation event. If you see two creation events at the same address, you know that it's a new lock.
c0b7d3c
to
d6d6398
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of fixups, but overall coming together.
src/coreclr/vm/ClrEtwAll.man
Outdated
<template tid="ContentionStart_V2"> | ||
<data name="ContentionFlags" inType="win:UInt8" map="ContentionFlagsMap" /> | ||
<data name="ClrInstanceID" inType="win:UInt16" /> | ||
<data name="LockObjectID" inType="win:Pointer" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend calling this LockID
, and then match with the same name in the AwareLockCreated
event.
src/coreclr/vm/syncblk.cpp
Outdated
{ | ||
QueryPerformanceCounter(&startTicks); | ||
|
||
// Fire a contention start event for a managed contention | ||
FireEtwContentionStart_V1(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId()); | ||
FireEtwContentionStart_V2(ETW::ContentionLog::ContentionStructs::ManagedContention, GetClrInstanceId(), OBJECTREFToObject(obj), this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two new parameters are LockID and OwningThreadID, so the first new parameter should be this
, and the second should be pCurThread->GetOSThreadId64()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah you are right... I forgot this one. I was focused on the AwareLock event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be m_HoldingThread
instead of pCurThread
, because we want the thread that currently holds the lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right.
src/coreclr/vm/ClrEtwAll.man
Outdated
@@ -3652,6 +3685,12 @@ | |||
task="Contention" | |||
symbol="ContentionStop_V1" message="$(string.RuntimePublisher.ContentionStop_V1EventMessage)"/> | |||
|
|||
<event value="414" version="0" level="win:Informational" template="AwareLockCreated" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like event ID 90
is not used:
<event value="414" version="0" level="win:Informational" template="AwareLockCreated" | |
<event value="90" version="0" level="win:Informational" template="AwareLockCreated" |
src/coreclr/vm/ClrEtwAll.man
Outdated
@@ -211,6 +211,7 @@ | |||
value="8" eventGUID="{561410f5-a138-4ab3-945e-516483cddfbc}" | |||
message="$(string.RuntimePublisher.ContentionTaskMessage)"> | |||
<opcodes> | |||
<opcode name="AwareLockCreated" message="$(string.RuntimePublisher.AwareLockCreatedOpcodeMessage)" symbol="CLR_AWARELOCK_CREATED_OPCODE" value="10"> </opcode> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a new opcode is not necessary, the win:Start
opcode can be used instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think win:Info is a good choice here since we are not starting an action, and there won't be a corresponding stop.
src/coreclr/vm/ClrEtwAll.man
Outdated
@@ -3652,6 +3685,12 @@ | |||
task="Contention" | |||
symbol="ContentionStop_V1" message="$(string.RuntimePublisher.ContentionStop_V1EventMessage)"/> | |||
|
|||
<event value="414" version="0" level="win:Informational" template="AwareLockCreated" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name AwareLock
is specific to the current implementation. The implementation may be changed, or another implementation may be added in the future (eg. #34812) and may want to reuse the same events, maybe just LockCreated
would be more generic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this is a good idea.
Not yet. This is waiting for the most recent round of code review feedback to be addressed. |
Thanks for moving the milestone to 8.0 @kouvel. Given that this PR is still under iteration moving to 8.0 is correct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @kouvel, what do you think?
@davmason, it looks like the test failures are related to the new |
Looking at it, it should be possible to declare that it is legal to take them in that order and everything should work. Our crst code tries to make everything illegal unless specifically whitelisted. @gleocadie can you do the following to update it:
|
There are two failures, for the second failure it looks like the place where the LockCreated event is emitted needs to change, which would also make the above lock ordering unnecessary. I also found a separate minor issue. I'll go ahead and post my fixes. |
Changes in the latest commit:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. A couple of small things.
src/coreclr/vm/syncblk.cpp
Outdated
} | ||
|
||
// Fire the LockCreated event last, such that it is outside the FAULT_FORBID region above and outside the SyncBlockCache | ||
// lock. The eventing system may need to allocate. | ||
syncBlock->FireLockCreatedEvent(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this guarantee that we only fire this event when a lock is created? There are other paths that create syncblocks, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see any other paths that create sync blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see, there are others. Here's an example: https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/object.cpp#L987
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this path also goes through GetSyncBlock()
, which is modified here, so it would also emit the event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. There are others too, so I don't think this is the right place to emit the event, as it will be potentially much more verbose (and duplicative) than we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the sync block would be created once per object and no more? I
That's only true for precious syncblocks. Syncblocks that are not marked as precious can be reclaimed and re-created multiple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's only true for precious syncblocks. Syncblocks that are not marked as precious can be reclaimed and re-created multiple times.
Wouldn't that mean that each time a sync block is recreated for an object, that it may have a different pointer to the AwareLock
and hence warrant emitting the event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's correct.
I think that Brian's concern is that we will emit this event for all SyncBlocks even the ones that are not ever used for locks. SyncBlocks are created for number of different reasons (the name "syncblock" is misleading). For example, interop creates them a lot for COM objects and marshalled delegates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. What do you think about having the AwareLock
log that it was created lazily - on first acquisition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Yes, it could probably be emitted lazily. Taking a step back, is the lock ID really necessary? The stack trace of the contention event would probably be sufficient to identify the lock. I figure the LockCreated
event's stack trace would provide similar information. A benefit may be to have only a few LockCreated
events and avoid taking stack traces for contention events, but at the cost of missing those created events in some cases. Is there any other benefit?
src/coreclr/vm/syncblk.cpp
Outdated
ETW::ContentionLog::ContentionStructs::ManagedContention, | ||
GetClrInstanceId(), | ||
this, | ||
m_HoldingThread); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last argument here should actually be the OS thread ID. This allows users of the event to match the lock up with the thread that is holding it in a trace. The memory address of the lock won't allow for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on this in #72627 (comment). It looks like the holding thread pointer cannot be dereferenced here because there's not guarantee that it would be alive. The ThreadRunning
event is emitted from a newly started thread and can be used to map the Thread
pointer to the OS thread ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For threads that are created from outside the runtime and then enter the runtime, I believe the ThreadCreated
event can be used to do the same mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can depend on the ThreadRunning
event because there is nothing that guarantees that this event is present in the trace. If the thread already exists when the trace is started, then we have no way to correlate it back to the OS thread. That said, this call is made in a function that requires cooperative mode. Given this, even if the thread dies, the memory shouldn't be reclaimed until after the next GC when the managed object Finalizer executes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can depend on the ThreadRunning event because there is nothing that guarantees that this event is present in the trace. If the thread already exists when the trace is started, then we have no way to correlate it back to the OS thread.
That's also true for the new LockCreated
event. It seems like the ordering of those events would matter, so having a rundown of thread and lock objects may not help much to identify the thread. Perhaps the thread object is stored as the holding thread because it is easily accessible and is the minimum work necessary to identify the holding thread. Storing the OS thread ID instead would involve an indirection, and OS thread IDs may also be reused at some point, I suspect this kind of issue would exist in multiple places.
That said, this call is made in a function that requires cooperative mode. Given this, even if the thread dies, the memory shouldn't be reclaimed until after the next GC when the managed object Finalizer executes, right?
I'm not sure about that, coop GC mode would ensure that the managed thread object would be alive if accessible, but once the managed thread object is deemed unreachable and the finalizer runs, perhaps the native thread object can be deleted irrespective of coop GC mode. Maybe there are further constraints that I'm not aware of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emitting the LockCreated event more often than we should (e.g. every time there is a call to GetSyncBlock)
Why would the LockCreated
event be emitted every time GetSyncBlock()
is called?
With regard to the OS thread ID. You're right that these can be re-used. This is addressed because the OS will tell us when a thread is created/destroyed. Thus, if there is a re-use within a trace, we'd see this from the OS thread events.
Why would there not be a similar problem to what you mentioned before regarding missing events when starting a trace on an already-running process? For instance, the create/destroy OS events for threads may not be captured by the trace.
A way to address this would be for the thread that acquires the lock to save its thread ID into the lock when tracing is on and clear it when the lock is relinquished. What do you think of that?
The information stored is probably minimal, in the sense what is stored is what is most easily accessible for correct behavior, a different implementation may store something else and that should be ok. As I mentioned before, storing the thread ID would involve an indirection. Changing the implementation for this particular reason is probably something that would need a lot more scrutiny, I don't see enough reason for it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the LockCreated event be emitted every time GetSyncBlock() is called?
You're right, it shouldn't. I missed the fact that the cache lookup occurs at the beginning of this function, and returns early.
Why would there not be a similar problem to what you mentioned before regarding missing events when starting a trace on an already-running process? For instance, the create/destroy OS events for threads may not be captured by the trace.
You technically don't need to see the thread creation event to know that it existed at the beginning of the trace, as long as you see samples or events on the thread. What you do need is to have a thread creation event during the trace to know that the thread ID has been re-used. That said, the OS provides thread/process rundown events at the beginning of the trace to tell profilers about the existence of all processes and traces.
The information stored is probably minimal, in the sense what is stored is what is most easily accessible for correct behavior, a different implementation may store something else and that should be ok. As I mentioned before, storing the thread ID would involve an indirection. Changing the implementation for this particular reason is probably something that would need a lot more scrutiny, I don't see enough reason for it yet.
I'm not sure I follow. Can you tell me a little bit more here? Are you saying that you don't think having the OS Thread ID is worthwhile here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, the OS provides thread/process rundown events at the beginning of the trace to tell profilers about the existence of all processes and traces.
Curious, does this work with circular buffers? More generally, is there some mechanism by which a trace taken with a circular buffer, where the buffer fills up and is overwritten, can still have an accurate snapshot of live objects through events at the beginning of the trace?
Are you saying that you don't think having the OS Thread ID is worthwhile here?
No, mapping to a thread ID would probably be the most useful part. I don't think we should change or add to what is stored for the lock-holding thread. Perhaps taking the thread store lock could enable emitting the OS thread ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A thread that orphans a lock does not clear the holding thread pointer before its Thread
object is destroyed, so acquiring the thread store lock still would not make it feasible to dereference the holding thread pointer. There seem to be a few other related issues and fixing them properly would likely be more involved. For now, we could also store the OS thread ID when the event keyword is enabled as you suggested and update it after the other issues are fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in the latest commit to store and send the OS thread ID, since I think it would be possible to continue doing so after the complete fixes. Still curious about the question in #72627 (comment), as I've run into that issue before and a solution to it may be useful elsewhere.
Updated in the latest commit to store and send the OS thread ID. I think the main remaining question is #72627 (comment). The |
Sorry, this got lost in my inbox. I think that the latest change @kouvel added is great. As for the remaining question, I am not quite sure how to address this. Let's say that we don't have a creation event. We will never know how many locks we're dealing with. For example, let's say that there is a type of object that gets processed, and some of the work that happens must happen under lock. The processing is likely to happen with the same stack over and over, but there may be a particular instance or instances that have more contention than others just based on the data. Given that sync blocks can be re-used, it would be quite difficult for us to know how many locks are in play here - we'd have the LockID that would help, but as those IDs are re-used over time, we would not be able to tell. This is a fairly classic pattern that we see across the runtime - other examples of this type of thing include assembly loads being assigned address space, jitted code for collectible assemblies, and GC heap survival events to report object movement during compaction. I think that if we don't have a LockID or LockCreated event, and just try to depend upon the stack, the value of the additional information decays fairly heavily because you can no longer determine lock lifetime or which stacks are acquiring which lock. |
Having the |
Should the lock created event have |
Just paging this back into my memory, so let me know if I miss something or state the opposite of what I was wanting earlier. I think using the
If this is easily available, I think it's a great idea. For long-running processes, capturing a dump and a trace together could provide some great insight into contention patterns. |
I guess having both the |
The one thing we'll need to be aware of is that the objectID can change and so we may need to use something like the heap survival and movement events to make sure that we get the right object id. That said, I don't think we should spec that out completely just yet. |
Updated and added PerfView changes in microsoft/perfview#1731 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The goal of this PR is to enrich the information provided in the ContentionStart event.
Today, we have the callstack (attached to the ContentionStart event), the contention duration (with the ContentionEnd event).
By adding the ObjectID of the lock object and the ThreadID of the thread that currently holds the lock, we will be able to build the view
which thread is blocking other threads on which lock object
.