8212107: VMThread issues and cleanup #228
Conversation
👋 Welcome back rehn! A progress list of the required criteria for merging this PR into
@robehn The following label will be automatically applied to this pull request: When this pull request is ready to be reviewed, an RFR email will be sent to the corresponding mailing list. If you would like to change these labels, use the
Webrevs
I find juggling the _next_vm_operation a bit confusing at first glance, but it seems superficially okay.
I probably should have waited to review this after all of Aleksey's
comments were resolved. I'm gonna have to take a look at
src/hotspot/share/runtime/vmThread.cpp again via a webrev;
it's just too hard to review via this snippet UI.
I'll re-review after all of Aleksey's changes are done.
Mailing list message from David Holmes on hotspot-dev: Hi Robbin, On 18/09/2020 6:34 am, Robbin Ehn wrote:
Can you explain why it was necessary to remove the queue and exactly Thanks,
VM operations are now rare, and when we do them they are also faster than when the queue was introduced. So we replace the queue with a "next safepoint operation".
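A minimal, self-contained sketch of that idea (standard C++, not HotSpot code; all names here are made up for illustration): a single guarded "next operation" slot replaces the queue, with one lock standing in for VMOperation_lock.
```
#include <condition_variable>
#include <functional>
#include <mutex>

class VMThreadSketch {
  std::mutex               _lock;                        // stands in for VMOperation_lock
  std::condition_variable  _cv;
  std::function<void()>*   _next_vm_operation = nullptr; // single slot, no queue

public:
  // Requesting thread: publish an operation and wait until it has been run.
  void execute(std::function<void()>* op) {
    std::unique_lock<std::mutex> ul(_lock);
    _cv.wait(ul, [&] { return _next_vm_operation == nullptr; }); // wait for a free slot
    _next_vm_operation = op;
    _cv.notify_all();                                            // wake the "VM thread"
    _cv.wait(ul, [&] { return _next_vm_operation != op; });      // wait for completion
  }

  // "VM thread" loop body: wait for an operation, run it, then clear the slot.
  void loop_once() {
    std::function<void()>* op;
    {
      std::unique_lock<std::mutex> ul(_lock);
      _cv.wait(ul, [&] { return _next_vm_operation != nullptr; });
      op = _next_vm_operation;
    }
    (*op)();                                  // run outside the lock, e.g. at a safepoint
    std::lock_guard<std::mutex> lg(_lock);
    _next_vm_operation = nullptr;             // free the slot
    _cv.notify_all();                         // wake both the requester and other waiters
  }
};
```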
Looks like a nice cleanup. I had a couple of questions.
@robehn This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for more details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 28 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the
This generally looks good. Mapping between the old way and the new way is a little tricky but I think I made all the connections.
One thing I did notice is that it seems that nested VM operations are now restricted to a nesting depth of one - is that correct? (And the code could be a lot simpler if nesting was not needed. :) ).
A couple of minor comments/suggestions below.
Thanks.
David
Hi David. The support should be the same as before: the previous operation is stored on the stack while we change to the nested operation. Thanks, Robbin
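A minimal, self-contained sketch of the save-on-the-stack pattern described above (standard C++, not the actual vmThread.cpp code; the Op type and all names are made up for illustration):
```
#include <cstdio>
#include <functional>

// Hypothetical operation type, just for illustration.
struct Op {
  const char* name;
  std::function<void()> work;     // may itself execute a nested operation
};

static Op* _current_op = nullptr; // operation the "VM thread" is currently executing

// Each call saves the enclosing operation in a local variable, i.e. on the
// stack, so nesting works to arbitrary depth without any queue.
void inner_execute(Op* op) {
  Op* prev = _current_op;         // remember the operation we are nested inside
  _current_op = op;
  std::printf("begin %s (inside %s)\n", op->name, prev ? prev->name : "none");
  op->work();
  std::printf("end   %s\n", op->name);
  _current_op = prev;             // restore the enclosing operation on return
}

int main() {
  Op inner{"inner", [] {}};
  Op outer{"outer", [&] { inner_execute(&inner); }}; // outer triggers a nested op
  inner_execute(&outer);
  return 0;
}
```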
Sorry, my mistake. Thanks.
I have only minor comments, without diving into the logic machinery. I am relying on others to review this more substantially.
Looks good!
Mailing list message from David Holmes on hotspot-dev: On 23/09/2020 9:27 pm, Robbin Ehn wrote:
I envisaged simply moving the nesting check out of inner_execute and // pseudo-code Cheers,
I'm looking at vmThread.cpp via the webrev and the "next" button https://openjdk.github.io/cr/?repo=jdk&pr=228&range=05#frames-6 The "Scroll Down" button is working so I'll push thru it...
```
// Check for a cleanup before SafepointALot to keep stats correct.
long interval_ms = SafepointTracing::time_since_last_safepoint_ms();
bool max_time_exceeded = GuaranteedSafepointInterval != 0 &&
                         (interval_ms >= GuaranteedSafepointInterval);
if (max_time_exceeded && SafepointSynchronize::is_cleanup_needed()) {
  return &cleanup_op;
if (!max_time_exceeded) {
```
You've changed the meaning of SafepointALot here. If max_time_exceeded
is false, then you never check the SafepointALot flag and miss causing a
safepointALot_op to happen next.
Here's the old code:
```
if (max_time_exceeded && SafepointSynchronize::is_cleanup_needed()) {
  return &cleanup_op;
}
if (SafepointALot) {
  return &safepointALot_op;
}
```
In the old code if max_time_exceeded and we need a cleanup,
then cleanup_op is the priority, but if that wasn't the case, then
we always checked the SafepointALot flag.
The old behavior could create a SafepointALot when we had no 'safepoint priority' ops in the queue when woken.
To get this behavior we would need more logic to avoid back-to-back SafepointALots, and we would need to peek at _next_vm_operation to determine whether it's a safepoint op or not (handshake).
During a normal test run the old behavior only creates around 1% more safepoints.
And if you want more safepoints you can decrease GuaranteedSafepointInterval (not exactly the same).
So I didn't think adding that complexity to exactly mimic the old behavior was worth it.
What do you want me to do?
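A hypothetical sketch of the extra logic mentioned above (not code from the PR; the helper and its parameters are invented for illustration): skip back-to-back SafepointALot safepoints and peek at the pending operation so no extra safepoint is added when one will run anyway.
```
bool should_emit_safepoint_a_lot(bool safepoint_a_lot_flag,
                                 bool last_op_was_safepoint_a_lot,
                                 bool next_op_is_safepoint_op) {
  if (!safepoint_a_lot_flag)       return false;
  if (last_op_was_safepoint_a_lot) return false;  // avoid back-to-back SafepointALot
  if (next_op_is_safepoint_op)     return false;  // a safepoint is coming anyway
  return true;
}
```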
Hmmm.... The old SafepointALot was intended to safepoint as frequently
as possible to stress the system. Now we do very little at safepoints, so
maybe it is time for SafepointALot to evolve. Can you make it so that a
SafepointALot happens at some fraction of GuaranteedSafepointInterval, e.g.
(GuaranteedSafepointInterval / 4), so four times as often?
All tests using SafepointALot already set GuaranteedSafepointInterval to a low value, in the range of ~1-300 ms
(except for the VM boolean flag test, which uses SafepointALot to test a boolean flag).
For example, jni/FastGetField sets GuaranteedSafepointInterval to 1.
The only case where it would really differ is when ad hoc adding SafepointALot without GuaranteedSafepointInterval.
If GuaranteedSafepointInterval is set to a lower value than the default on the command line, then I'm okay if SafepointALot does not do anything extra. However, if GuaranteedSafepointInterval is either the default value or is set to a higher value, then I would like SafepointALot to cause a safepoint more frequently than the GuaranteedSafepointInterval. Every GuaranteedSafepointInterval/4 would be a fine definition of "a lot".
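A hypothetical sketch of that suggestion (not code from the PR; the function and constant are invented, and the 1000 ms default of GuaranteedSafepointInterval is an assumption):
```
#include <cstdint>

// Assumed default of GuaranteedSafepointInterval (1000 ms).
const uint64_t kDefaultGuaranteedIntervalMs = 1000;

// If the interval was lowered on the command line, SafepointALot adds nothing
// extra; otherwise it safepoints four times as often as the guaranteed interval.
uint64_t safepoint_a_lot_interval_ms(uint64_t guaranteed_interval_ms) {
  if (guaranteed_interval_ms < kDefaultGuaranteedIntervalMs) {
    return guaranteed_interval_ms;    // user already asked for frequent safepoints
  }
  return guaranteed_interval_ms / 4;  // "a lot" = GuaranteedSafepointInterval / 4
}
```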
Mulling on this more... is it too radical to consider that we no longer need SafepointALot?
I would like SafepointALot (and HandshakeALot) to be executed in a separate thread that randomly requests a safepoint (preferably with some validation inside the operation).
Since the VM thread now handles this, it cannot make that request while busy.
Also, having the VM thread wake up 'more' sporadically would be confusing for the VM thread loop.
So I agree with you that we need a better SafepointALot, but I think it is wrong to use the VM thread to drive it.
I suggest we create an enhancement for it.
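A minimal, self-contained sketch of that direction (standard C++ threads, not HotSpot code; all names, including the placeholder request function, are made up for illustration): a dedicated stress thread sleeps a random interval and then issues the safepoint/handshake request, so the VM thread never has to drive the stress itself.
```
#include <atomic>
#include <chrono>
#include <cstdio>
#include <random>
#include <thread>

std::atomic<bool> stop_stress{false};

// Placeholder for issuing a real safepoint or handshake request.
void request_safepoint_or_handshake() {
  std::puts("stress thread: requesting safepoint/handshake");
}

void safepoint_a_lot_thread(unsigned max_sleep_ms) {
  std::mt19937 rng{std::random_device{}()};
  std::uniform_int_distribution<unsigned> dist(1, max_sleep_ms);
  while (!stop_stress.load()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(dist(rng)));
    request_safepoint_or_handshake();  // validation could run inside the operation
  }
}
```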
Most of my comments this round are not critical. The only real issue
Mailing list message from David Holmes on hotspot-dev: <trimming> Hi Dan, On 25/09/2020 6:39 am, Daniel D.Daugherty wrote:
This is the whole premise of making this change: we no longer need a Cheers,
@dholmes-ora and @robehn - I'm good with the rationale about
```
@@ -58,7 +58,7 @@ extern Monitor* CodeSweeper_lock;         // a lock used by the sweeper o
 extern Mutex*   MethodData_lock;          // a lock on installation of method data
 extern Mutex*   TouchedMethodLog_lock;    // a lock on allocation of LogExecutedMethods info
 extern Mutex*   RetData_lock;             // a lock on installation of RetData inside method data
-extern Monitor* VMOperationQueue_lock;    // a lock on queue of vm_operations waiting to execute
+extern Monitor* VMOperation_lock;         // a lock on queue of vm_operations waiting to execute
 extern Monitor* VMOperationRequest_lock;  // a lock on Threads waiting for a vm_operation to terminate
```
Can the declaration of VMOperationRequest_lock be removed now too, since it's no longer being defined in mutexLocker.cpp?
Fixing, pushing later.
Still LGTM.
I'm okay with leaving SafepointALot as you have it now
and leaving any future cleanup/refinement to a new RFE.
Thanks all! /integrate
@robehn Since your change was applied there have been 37 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 431338b. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
We simplify the vmThread by removing the queue and refactoring the main loop.
This solves the issues listed:
Passes t1-8, and a benchmark run.
If you want a smaller diff, the commits contain the incremental progress and each commit passed t1.
Progress
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/228/head:pull/228
$ git checkout pull/228