-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8277850: C2: optimize mask checks in counted loops #6697
Conversation
👋 Welcome back roland! A progress list of the required criteria for merging this PR into |
Webrevs
|
@rwestrel there is issue with build on Windows:
|
src/hotspot/share/opto/mulnode.cpp
Outdated
return TypeInt::ZERO; | ||
} | ||
|
||
return MulINode::Value(phase); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no MulINode::value()
or MulLNode::value()
, only based class MulNode
have it.
You missing initial check done by MulNode::value()
.
I suggest to move this code into AndINode::mul_ring()
and AndLNode::mul_ring()
.
Also in mul_ring()
you can pass r1->get_con()
as mask
value to AndIL_shift_and_mask()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing this.
MulNode::AndIL_shift_and_mask() needs the shift node so it can test for shift->Opcode() == Op_ConvI2L. mult_ring() only gets the Type* as input. Are you suggesting to change mul_ring's signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I missed that.
It is unfortunate - we will have many duplicated checks.
src/hotspot/share/opto/mulnode.cpp
Outdated
@@ -1683,3 +1713,58 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const { | |||
return TypeLong::LONG; | |||
} | |||
} | |||
|
|||
bool MulNode::AndIL_shift_and_mask(PhaseGVN* phase, Node* mask, Node* shift, BasicType bt) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass mask
as value here.
|
||
Node* MulNode::AndIL_add_shift_and_mask(PhaseGVN* phase, BasicType bt) { | ||
Node* in1 = in(1); | ||
Node* in2 = in(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caller already determine that in2
is const int mask = t2->get_con()
. You can pass it here.
@@ -1683,3 +1713,58 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const { | |||
return TypeLong::LONG; | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment here too which pattern it is looking for.
|
||
bool MulNode::AndIL_shift_and_mask(PhaseGVN* phase, Node* mask, Node* shift, BasicType bt) const { | ||
if (mask == NULL || shift == NULL) { | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to check shift
for TOP
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code below:
const TypeInteger* shift_t = phase->type(shift)->isa_integer(bt);
if (mask_t == NULL || shift_t == NULL) {
catches the case where shift is top, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. It is type check.
src/hotspot/share/opto/mulnode.cpp
Outdated
return TypeInt::ZERO; | ||
} | ||
|
||
return MulINode::Value(phase); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I missed that.
It is unfortunate - we will have many duplicated checks.
src/hotspot/share/opto/mulnode.hpp
Outdated
@@ -82,6 +82,9 @@ class MulNode : public Node { | |||
virtual int min_opcode() const = 0; | |||
|
|||
static MulNode* make(Node* in1, Node* in2, BasicType bt); | |||
|
|||
bool AndIL_shift_and_mask(PhaseGVN* phase, Node* mask, Node* shift, BasicType bt) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one could be static method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I submitted testing.
|
||
/* | ||
* @test | ||
* @bug JDK-8277850 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be @bug 8277850
(not sure if that is a requirement but all other tests use that format).
* @run driver compiler.c2.irTests.TestShiftAndMask | ||
*/ | ||
|
||
public class TestShiftAndMask { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests!
@@ -675,7 +699,7 @@ Node *AndLNode::Ideal(PhaseGVN *phase, bool can_reshape) { | |||
const jlong mask = t2->get_con(); | |||
|
|||
Node* in1 = in(1); | |||
uint op = in1->Opcode(); | |||
int op = in1->Opcode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got curious and checked the code, we have similar patterns everywhere. Filed JDK-8278328 to clean up this mess.
return false; | ||
} | ||
|
||
Node* MulNode::AndIL_add_shift_and_mask(PhaseGVN* phase, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment describing the pattern.
Build on Windows failed:
|
@rwestrel Something went wrong with rebasing. It shows changes in 533 files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good. Let me test it before approval.
Unfortunately I can't test these changes because they depend on #6607 |
I did the same as Tobias and applied #6607 changes first. |
The dependent pull request has now been integrated, and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork: git checkout JDK-8262341
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push |
@rwestrel this pull request can not be integrated into git checkout JDK-8277850
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks good to me. Tests are running.
@rwestrel This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 5 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build issue solved. Thanks.
Testing looks good too so far. It is still running but should be finished soon.
@rwestrel, please, push it today before RDP1 cut off.
@vnkozlov @TobiHartmann thanks for the reviews. |
/integrate |
Going to push as commit 8af3b27.
Your commit was automatically rebased without conflicts. |
// patterns similar to (v << 2) & 3 to 0 | ||
// and | ||
// patterns similar to (v1 + (v2 << 2)) & 3 transformed to v1 & 3 | ||
bool MulNode::AndIL_shift_and_mask(PhaseGVN* phase, Node* mask, Node* shift, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The operand order is distracting and perhaps confusing, since mask is in(2) and shift is in(1).
- The mask is not necessarily in(2), because it doesn't have to be a constant; it can be a bounded value.
- I wrote an expanded comment.
- There's a redundant NULL check.
- You should use a different name for the adjusted
bt
value.
Here are the changes I suggest to this function:
// Given an expression (AndX shift mask) or (AndX mask shift),
// determine if the AndX must always produce zero, because the
// the shift (x<<N) is bitwise disjoint from the mask #M.
// The X in AndX must be I or L, depending on bt.
// Specifically, the following cases fold to zero,
// when the shift value N is large enough to zero out
// all the set positions of the and-mask M.
// (AndI (LShiftI _ #N) #M) => #0
// (AndL (LShiftL _ #N) #M) => #0
// (AndL (ConvI2L (LShiftI _ #N)) #M) => #0
// The M and N values must satisfy ((-1 << N) & M) == 0.
// Because the optimization might work for a non-constant
// mask M, we check the AndX for both operand orders.
bool MulNode::AndIL_shift_and_mask(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) const {
if (mask == NULL || shift == NULL) {
return false;
}
const TypeInteger* mask_t = phase->type(mask)->isa_integer(bt);
const TypeInteger* shift_t = phase->type(shift)->isa_integer(bt);
if (mask_t == NULL || shift_t == NULL) {
return false;
}
+ BasicType shift_bt = bt;
+ if (bt == T_LONG && shift->Opcode() == Op_ConvI2L) {
Node* val = shift->in(1);
if (val == NULL) {
return false;
}
+ if (val->Opcode() == Op_LShiftI) {
+ shift_bt = T_INT;
+ shift = val;
+ }
}
#s/bt/shift_bt/
if (shift->Opcode() != Op_LShift(shift_bt)) {
+ if (check_reverse &&
+ (mask->Opcode() == Op_LShift(bt) ||
+ (bt == T_LONG && mask->Opcode() == Op_ConvI2L))) {
+ // try it the other way around
+ return AndIL_shift_and_mask(phase, shift, mask, bt, false);
+ }
return false;
}
Node* shift2 = shift->in(2);
if (shift2 == NULL) {
return false;
}
const Type* shift2_t = phase->type(shift2);
if (!shift2_t->isa_int() || !shift2_t->is_int()->is_con()) {
return false;
}
#s/bt/shift_bt/
jint shift_con = shift2_t->is_int()->get_con() & ((shift_bt == T_INT ? BitsPerJavaInteger : BitsPerJavaLong) - 1);
if ((1L << shift_con) > mask_t->hi_as_long() && mask_t->lo_as_long() >= 0) {
return true;
}
return false;
}
Suggest signature change:
|
||
// Helper method to transform: | ||
// patterns similar to (v1 + (v2 << 2)) & 3 to v1 & 3 | ||
Node* MulNode::AndIL_add_shift_and_mask(PhaseGVN* phase, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comments for this function also.
// Given an expression (AndX (AddX v1 (LShiftX v2 #N)) #M)
// determine if the AndX must always produce (AndX v1 #M),
// because the shift (v2<<N) is bitwise disjoint from the mask #M.
// The X in AndX will be I or L, depending on bt.
// Specifically, the following cases fold,
// when the shift value N is large enough to zero out
// all the set positions of the and-mask M.
// (AndI (AddI v1 (LShiftI _ #N)) #M) => v1
// (AndL (AddI v1 (LShiftL _ #N)) #M) => v1
// (AndL (AddL v1 (ConvI2L (LShiftI _ #N))) #M) => v1
// The M and N values must satisfy ((-1 << N) & M) == 0.
// Because the optimization might work for a non-constant
// mask M, and because the AddX operands can come in either
// order, we check for every operand order.
Node* MulNode::AndIL_add_shift_and_mask(PhaseGVN* phase, BasicType bt) {
Node* add = in(1);
Node* mask = in(2);
+ if (add == NULL || mask == NULL) {
+ return NULL;
+ }
+ int addidx = 0;
if (add->Opcode() == Op_Add(bt)) {
+ addidx = 1;
+ } else if (mask->Opcode() == Op_Add(bt)) {
+ mask = add;
+ addidx = 2;
+ add = in(addidx);
+ }
+ if (addidx > 0) {
Node* add1 = add->in(1);
Node* add2 = add->in(2);
if (add1 != NULL && add2 != NULL) {
if (AndIL_shift_and_mask(phase, add1, mask, bt, false)) {
set_req_X(addidx, add2, phase);
return this;
} else if (AndIL_shift_and_mask(phase, add2, mask, bt, false)) {
set_req_X(addidx, add1, phase);
return this;
}
}
}
return NULL;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good change. I have made some stylistic suggestions in context, and also suggested some further inspection of reordered operands.
You could do a bit more with the same analysis by checking for AndNot as well as And operations. (AndNot is not in our IR, but you can look at complemented constant masks, and look for Xors that perform complement non-constant ones.) Wherever an And would return identically zero, the corresponding AndNot would return its unmasked operand unchanged. This would allow you to simplify (x << 2) & -4
to x
, etc., with only a modest extension on the present analysis. Perhaps a followup RFE could cover this, or it could be folded in to this work with a few more "mode bits" in the function signatures.
Much of this analysis would be subsumed by a more general bitwise type analysis which would track each bit position in every integral type, whether it is constant zero or constant one. Such an analysis is straightforward to develop (extending the _hi
and _lo
fields with _up
and _dn` fields), and the rules are math that any Hacker's Delight reader can develop. Although this is way too much to consider as a point fix, it will be useful at some point in the future, especially when we begin to optimize on types larger than 64 bits. For the record, you can see the math of bitwise inference in action in this test class (which contains formulas to infer up/dn masks from arithmetic expressions, and hi/lo bounds from bitwise ones):
Extends the optimization of masked sums introduced in openjdk#6697 to cover constant values, which currently break the optimization. Such constant values arise in an expression of the following form, for example from MemorySegmentImpl#isAlignedForElement: (base + (index + 1) << 8) & 255 => MulNode (base + (index << 8 + 256)) & 255 => AddNode ((base + index << 8) + 256) & 255 Currently, "256" is not being recognized as a shifted value. This PR enables: ((base + index << 8) + 256) & 255 => MulNode (base + index << 8) & 255 => MulNode (PR openjdk#6697) base & 255
This is another fix that addresses a performance issue with panama and that was brought up by Maurizio. The pattern to optimize is:
if ((base + (offset << 2)) & 3) != 0) {
}
where base is loop independent but offset depends on a loop variable. This can be transformed to:
if ((base & 3) != 0) {
That check becomes loop independent and be optimized by loop predication (or I suppose loop unswitching but that wasn't the case of the micro benchmark I worked on).
This change also optimizes the pattern:
(offset << 2) & 3
to return 0.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6697/head:pull/6697
$ git checkout pull/6697
Update a local copy of the PR:
$ git checkout pull/6697
$ git pull https://git.openjdk.java.net/jdk pull/6697/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6697
View PR using the GUI difftool:
$ git pr show -t 6697
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6697.diff