@e1iu
Member

@e1iu e1iu commented Nov 13, 2020

This patch is a supplement for
https://bugs.openjdk.java.net/browse/JDK-8248830.

With the implementation of rotate node in IR, this patch:

  1. canonicalizes RotateLeft into RotateRight when shift is a constant,
    so that GVN could identify the pre-existing node better.
  2. implements scalar rotate match rules and removes the original
    combinations of Or and Shifts on AArch64.
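
As a rough illustration of item 1 (not code from the patch), a left rotation by a constant can be rewritten as a right rotation by the complementary amount, which is why both forms can share one node. The class and method names below are hypothetical.

```java
// Hypothetical illustration; names are not taken from the patch.
final class RotateCanonDemo {
    // A left rotation of an int by s equals a right rotation by (32 - s) mod 32.
    static int rotlAsRotr(int x, int s) {
        return Integer.rotateRight(x, (32 - s) & 31);
    }

    public static void main(String[] args) {
        int x = 0x12345678;
        for (int s = 0; s < 32; s++) {
            if (Integer.rotateLeft(x, s) != rotlAsRotr(x, s)) {
                throw new AssertionError("mismatch at s=" + s);
            }
        }
        System.out.println("rotateLeft(x, s) matched rotateRight(x, 32 - s) for every s in [0, 32)");
    }
}
```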

This patch doesn't implement vector rotate due to the lack of
corresponding vector instructions on AArch64.

The test case below illustrates the effect of this patch.

    public static int test(int i) {
        int a =  (i >>> 29) | (i << -29);
        int b = i << 3;
        int c = i >>> -3;
        int d = b | c;
        return a ^ d;
    }

Before:

    lsl     w12, w1, #3
    lsr     w10, w1, #29
    add     w11, w10, w12
    orr     w12, w12, w10
    eor     w0, w11, w12

After:

    ror     w10, w1, #29
    eor     w0, w10, w10

Tested jtreg TestRotate.java, hotspot::hotspot_all_no_apps,
jdk::jdk_core, langtools::tier1.
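
For readers who want to check the test case by hand, the following sketch (the class name and sample values are assumptions, not part of the patch) verifies that both `a` and `d` are the same rotation of `i`, so the method always returns 0.

```java
// Hypothetical harness around the test method from the description.
final class RotateTestDemo {
    static int test(int i) {
        int a = (i >>> 29) | (i << -29); // -29 is masked to 3
        int b = i << 3;
        int c = i >>> -3;                // -3 is masked to 29
        int d = b | c;
        return a ^ d;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int i : samples) {
            int a = (i >>> 29) | (i << -29);
            if (a != Integer.rotateRight(i, 29)) {
                throw new AssertionError("a is not a rotate for i=" + i);
            }
            if (test(i) != 0) {
                throw new AssertionError("test(i) != 0 for i=" + i);
            }
        }
        System.out.println("a == rotateRight(i, 29) and test(i) == 0 for all samples");
    }
}
```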


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1199/head:pull/1199
$ git checkout pull/1199

Change-Id: Id7d00935945f1697247fff7041b0707107862786
@bridgekeeper

bridgekeeper bot commented Nov 13, 2020

👋 Welcome back erik1iu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 13, 2020
@openjdk

openjdk bot commented Nov 13, 2020

@erik1iu The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Nov 13, 2020
@mlbridge

mlbridge bot commented Nov 13, 2020

Webrevs

Contributor

@theRealAph theRealAph left a comment


Looks very nice in general, but please add one more case for the EOR (shifted register) instructions with rotation. At present we only do LSR, LSL, and ASR. You could add ROR to

define(`ALL_SHIFT_KINDS',
`BOTH_SHIFT_INSNS($1, $2, URShift, LSR)
BOTH_SHIFT_INSNS($1, $2, RShift, ASR)
BOTH_SHIFT_INSNS($1, $2, LShift, LSL)')dnl

This is used in, for example, Java code for SHA 3.
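
To make the request concrete, here is a minimal sketch of the kind of Java expression an "EOR with rotated operand" rule would presumably target: a value XORed with a rotation of another value, as in the theta step of Keccak (SHA-3). The class and method names are hypothetical, and whether C2 actually emits a single instruction for this depends on the match rules in place.

```java
// Hypothetical illustration; not taken from the JDK's SHA-3 implementation.
final class XorRotateDemo {
    // Shape: a ^ rotate(b, n). On AArch64 this could map to one EOR with a
    // ROR-shifted register operand if a matching rule exists.
    static long thetaLike(long a, long b) {
        return a ^ Long.rotateLeft(b, 1);
    }

    public static void main(String[] args) {
        long r = thetaLike(0x0123456789ABCDEFL, 0x8000000000000001L);
        System.out.println(Long.toHexString(r));
    }
}
```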

@mlbridge

mlbridge bot commented Nov 13, 2020

Mailing list message from John Rose on hotspot-compiler-dev:

On Nov 13, 2020, at 2:39 AM, Eric Liu <github.com+10482586+erik1iu at openjdk.java.net> wrote:

This patch is a supplement for
https://bugs.openjdk.java.net/browse/JDK-8248830.

With the implementation of rotate node in IR, this patch:

1. canonicalizes RotateLeft into RotateRight when shift is a constant,
so that GVN could identify the pre-existing node better.
2. implements scalar rotate match rules and removes the original
combinations of Or and Shifts on AArch64.

This patch doesn't implement vector rotate due to the lack of
corresponding vector instructions on AArch64.

Test case below is an explanation for this patch.

    public static int test(int i) {
        int a = (i >>> 29) | (i << -29);
        int b = i << 3;
        int c = i >>> -3;
        int d = b | c;
        return a ^ d;
    }

Because of shift-count masking, this parses to nodes
equivalent to:

public static int test(int i) {
int a = (i >>> 29) | (i << 3);
int d = (i << 3) | (i >>> 29);
// not detected: a == d
int r = a ^ d;
// not detected: r == 0
return r;
}
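
The shift-count masking mentioned above can be checked directly. This is a small sketch with a hypothetical class name: Java uses only the low 5 bits of an `int` shift count (low 6 bits for `long`), so a count of -29 behaves like 3 and -3 behaves like 29.

```java
// Hypothetical illustration of Java shift-count masking.
final class ShiftMaskDemo {
    // Effective shift distance for an int operand: only the low 5 bits are used.
    static int maskedCount(int n) {
        return n & 31;
    }

    public static void main(String[] args) {
        int i = 0xCAFEBABE;
        long l = 0xCAFEBABEDEADBEEFL;
        if ((i << -29) != (i << 3)) throw new AssertionError("int left shift");
        if ((i >>> -3) != (i >>> 29)) throw new AssertionError("int unsigned right shift");
        if ((l << -61) != (l << 3)) throw new AssertionError("long left shift");
        System.out.println("-29 masks to " + maskedCount(-29) + ", -3 masks to " + maskedCount(-3));
    }
}
```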

If we were to work a little harder at canonicalizing
commutative expressions in IGVN, we could detect
that a==d. (See AddNode::Ideal.) It's tempting to
pull on this very long string, but it's not clear when
to stop, if not now.

In this case the better road is to canonicalize both
a and d to the same rotate node. But maybe there's
some benefit in reordering x|y|z and x^y^z when
x and z could combine to a rotate node. (This isn't
your problem!)

Before:

    lsl     w12, w1, #3
    lsr     w10, w1, #29
    add     w11, w10, w12
    orr     w12, w12, w10
    eor     w0, w11, w12

After:

    ror     w10, w1, #29
    eor     w0, w10, w10

Amazingly, w10^w10 does not GVN to zero!

Your test appears to rely on that weakness.
I think the weakness should be fixed in a
separate investigation.

Anyway, none of these remarks reflects
on your patch.

-- John

@mlbridge

mlbridge bot commented Nov 16, 2020

Mailing list message from Eric Liu on hotspot-compiler-dev:

Hi Andrew,

Thanks for your review. I will update those cases soon.

B&R
Eric
-------------------------------------------------------------------------------
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of Andrew Haley <aph at openjdk.java.net>
Sent: 14 November 2020 1:56
To: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8254872: Optimize Rotate on AArch64

-------------

Changes requested by aph (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/1199

@mlbridge

mlbridge bot commented Nov 16, 2020

Mailing list message from Eric Liu on hotspot-compiler-dev:

Hi John,

Thanks for your review. I'd like to take some time to investigate the weakness.

B&R
Eric
-----------------------------------------------------------------------------------

From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of John Rose <john.r.rose at oracle.com>
Sent: 14 November 2020 2:43
To: Eric Liu <github.com+10482586+erik1iu at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8254872: Optimize Rotate on AArch64

@e1iu
Member Author

e1iu commented Nov 16, 2020

Hi Andrew,

I just thought about this a bit more.

Looks very nice in general, but please add one more case for the EOR (shifted register) instructions with rotation. At present we only do LSR, LSL, and ASR. You could add ROR to

define(`ALL_SHIFT_KINDS',
`BOTH_SHIFT_INSNS($1, $2, URShift, LSR)
BOTH_SHIFT_INSNS($1, $2, RShift, ASR)
BOTH_SHIFT_INSNS($1, $2, LShift, LSL)')dnl

This is used in, for example, Java code for SHA 3.

I prefer to integrate this patch first.

  • I added those cases for the EOR instructions locally. They work in general, but I think they still need stricter regression testing.
  • For another rule, dst (AddI (LShiftI src1 lshift) (URShiftI src2 rshift)), I presume it can be transformed to a Rotate in the middle-end if lshift + rshift = 0, but I did not implement that in this patch.
  • @vnkozlov (Vladimir Kozlov) left an open issue (https://bugs.openjdk.java.net/browse/JDK-8252776) that asks for refactoring the test cases in TestRotate.java. That is also a good opportunity to add other new test cases.

This patch is self-contained and introduces no regressions, so I would prefer to finish the tasks above together in a follow-up patch.
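
As a sanity check on the AddI rule mentioned in the second bullet, the sketch below (hypothetical names, same source register for both shifts) shows why an add of the two shifted halves behaves like a rotate: for a non-zero shift the two halves occupy disjoint bits, so the addition never carries and equals the OR. Note that the check deliberately skips a shift of 0, where the add would double the value instead.

```java
// Hypothetical illustration; not code from the patch.
final class AddRotateDemo {
    // Java form of AddI(LShiftI x s, URShiftI x -s). The count -s is masked
    // to 5 bits, so it acts as 32 - s for s in [1, 31].
    static int addForm(int x, int s) {
        return (x << s) + (x >>> -s);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, Integer.MIN_VALUE};
        for (int x : samples) {
            for (int s = 1; s < 32; s++) {
                if (addForm(x, s) != Integer.rotateLeft(x, s)) {
                    throw new AssertionError("x=" + x + " s=" + s);
                }
            }
        }
        System.out.println("add form matched rotateLeft for every s in [1, 31]");
    }
}
```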

Contributor

@theRealAph theRealAph left a comment


On Fri, 13 Nov 2020 17:54:34 GMT, Andrew Haley aph@openjdk.org wrote:

Looks very nice in general, but please add one more case for the EOR (shifted register) instructions with rotation. At present we only do LSR, LSL, and ASR. You could add ROR to

define(`ALL_SHIFT_KINDS',
`BOTH_SHIFT_INSNS($1, $2, URShift, LSR)
BOTH_SHIFT_INSNS($1, $2, RShift, ASR)
BOTH_SHIFT_INSNS($1, $2, LShift, LSL)')dnl

This is used in, for example, Java code for SHA 3.
I prefer to integrate this patch first.

  • I added those cases for the EOR instructions locally. They work in general, but I think they still need stricter regression testing.

Perhaps so.

  • For another rule dst (AddI (LShiftI src1 lshift) (URShiftI src2 rshift)), I presume it can be transformed to Rotate in middle-end if lshift + rshift = 0

and src1 is the same register as src2.

but I didn't implement it in this patch.

I see.

This patch is self-contained and introduces no regressions, so I would prefer to finish the tasks above together in a follow-up patch.

I suppose that's OK, but it seems to me odd to do half the job.

@openjdk

openjdk bot commented Nov 16, 2020

@erik1iu This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8254872: Optimize Rotate on AArch64

Reviewed-by: aph, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 101 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @vnkozlov) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 16, 2020
@e1iu
Member Author

e1iu commented Nov 16, 2020

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Nov 16, 2020
@openjdk

openjdk bot commented Nov 16, 2020

@erik1iu
Your change (at version be71bf6) is now ready to be sponsored by a Committer.

Contributor

@vnkozlov vnkozlov left a comment


opto/ changes look good to me

@nick-arm
Contributor

/sponsor

@openjdk openjdk bot closed this Nov 17, 2020
@openjdk openjdk bot added integrated Pull request has been integrated and removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 17, 2020
@openjdk

openjdk bot commented Nov 17, 2020

@nick-arm @erik1iu Since your change was applied there have been 102 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

Pushed as commit 30a2ad5.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk-notifier bot referenced this pull request Nov 17, 2020
@mlbridge

mlbridge bot commented Nov 19, 2020

Mailing list message from Eric Liu on hotspot-compiler-dev:

Hi John,

I thought about this a bit more.

From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of John Rose <john.r.rose at oracle.com>
Sent: 14 November 2020 2:43
To: Eric Liu <github.com+10482586+erik1iu at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8254872: Optimize Rotate on AArch64
On Nov 13, 2020, at 2:39 AM, Eric Liu <github.com+10482586+erik1iu at openjdk.java.net> wrote:

This patch is a supplement for
https://bugs.openjdk.java.net/browse/JDK-8248830.

With the implementation of rotate node in IR, this patch:

1. canonicalizes RotateLeft into RotateRight when shift is a constant,
   so that GVN could identify the pre-existing node better.
2. implements scalar rotate match rules and removes the original
   combinations of Or and Shifts on AArch64.

This patch doesn't implement vector rotate due to the lack of
corresponding vector instructions on AArch64.

Test case below is an explanation for this patch.

    public static int test(int i) {
        int a = (i >>> 29) | (i << -29);
        int b = i << 3;
        int c = i >>> -3;
        int d = b | c;
        return a ^ d;
    }

Because of shift-count masking, this parses to nodes
equivalent to:

    public static int test(int i) {
        int a = (i >>> 29) | (i << 3);
        int d = (i << 3) | (i >>> 29);
        // not detected: a == d
        int r = a ^ d;
        // not detected: r == 0
        return r;
    }

If we were to work a little harder at canonicalizing
commutative expressions in IGVN, we could detect
that a==d. (See AddNode::Ideal.) It's tempting to
pull on this very long string, but it's not clear when
to stop, if not now.

In this case the better road is to canonicalize both
a and d to the same rotate node. But maybe there's
some benefit in reordering x|y|z and x^y^z when
x and z could combine to a rotate node. (This isn't
your problem!)

Before:

    lsl     w12, w1, #3
    lsr     w10, w1, #29
    add     w11, w10, w12
    orr     w12, w12, w10
    eor     w0, w11, w12

After:

    ror     w10, w1, #29
    eor     w0, w10, w10

Amazingly, w10^w10 does not GVN to zero!

I think this was caused by the lack of an Ideal transformation for the Xor
node. Perhaps we can add some rules for it:

1). x ^ x ==> 0, which would solve the above issue.
2). Const ^ x ==> x ^ Const, so that GVN could replace it with
the pre-existing node.
3). x ^ y ==> x, if y is the constant zero.
4). x ^ y ==> ~x, if y is the constant all-ones bit mask (-1).
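
The four proposed rules rest on simple XOR identities, which the sketch below checks at the Java level. It says nothing about how the rules would be implemented in C2, and the class and method names are hypothetical.

```java
// Hypothetical illustration of the identities behind the proposed Xor rules.
final class XorIdentityDemo {
    static boolean holds(int x, int c) {
        return (x ^ x) == 0          // rule 1: x ^ x is zero
            && (c ^ x) == (x ^ c)    // rule 2: operands can be swapped
            && (x ^ 0) == x          // rule 3: zero is the identity
            && (x ^ -1) == ~x;       // rule 4: all-ones mask gives bitwise NOT
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 0x12345678, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int c : samples) {
                if (!holds(x, c)) {
                    throw new AssertionError("x=" + x + " c=" + c);
                }
            }
        }
        System.out.println("all four identities held for every sample pair");
    }
}
```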

Your test appears to rely on that weakness.
I think the weakness should be fixed in a
separate investigation.

Anyway, none of these remarks reflects
on your patch.

-- John

Thanks,
Eric

@e1iu e1iu deleted the rotate branch December 2, 2020 07:48
