8311207: Cleanup for Optimization for UUID.toString #14745

wenshao · 2023-07-01T01:44:15Z

PR 14578 still has unresolved discussions; this PR continues to make improvements.

Benchmark Result

sh make/devkit/createJMHBundle.sh
bash configure --with-jmh=build/jmh/jars
make test TEST="micro:java.util.UUIDBench.toString"

1. aliyun_ecs_c8i.xlarge

cpu : intel xeon sapphire rapids (x64)

-Benchmark           (size)   Mode  Cnt   Score   Error   Units (baseline)
-UUIDBench.toString   20000  thrpt   15  62.019 ± 0.622  ops/us

+Benchmark           (size)   Mode  Cnt   Score   Error   Units
+UUIDBench.toString   20000  thrpt   15  82.998 ± 0.739  ops/us (+33.82%)

2. aliyun_ecs_c8a.xlarge

cpu : amd epyc genoa (x64)

-Benchmark           (size)   Mode  Cnt   Score   Error   Units (baseline)
-UUIDBench.toString   20000  thrpt   15  88.668 ± 0.672  ops/us

+Benchmark           (size)   Mode  Cnt   Score   Error   Units
+UUIDBench.toString   20000  thrpt   15  89.229 ± 0.271  ops/us (+0.63%)

3. aliyun_ecs_c8y.xlarge

cpu : aliyun yitian 710 (aarch64)

-Benchmark           (size)   Mode  Cnt   Score   Error   Units (baseline)
-UUIDBench.toString   20000  thrpt   15  49.382 ± 2.160  ops/us

+Benchmark           (size)   Mode  Cnt   Score   Error   Units
+UUIDBench.toString   20000  thrpt   15  49.636 ± 1.974  ops/us (+0.51%)

4. MacBookPro M1 Pro

-Benchmark           (size)   Mode  Cnt    Score   Error   Units (baseline)
-UUIDBench.toString   20000  thrpt   15  103.543 ± 0.963  ops/us

+Benchmark           (size)   Mode  Cnt    Score   Error   Units
+UUIDBench.toString   20000  thrpt   15  110.976 ± 0.685  ops/us (+7.17%)

5. Orange Pi 5 Plus

-Benchmark           (size)   Mode  Cnt   Score   Error   Units (baseline)
-UUIDBench.toString   20000  thrpt   15  33.532 ± 0.396  ops/us

+Benchmark           (size)   Mode  Cnt   Score   Error   Units (PR)
+UUIDBench.toString   20000  thrpt   15  33.054 ± 0.190  ops/us (-1.43%)
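
The percentages in parentheses are relative throughput changes of the PR run against its baseline. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and only illustrate the calculation, using the scores from the aliyun_ecs_c8a.xlarge run.

```java
// Minimal sketch: relative throughput change of a PR run against its baseline.
final class ThroughputDelta {
    /** Returns the change from baseline to candidate, in percent. */
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Scores (ops/us) taken from the aliyun_ecs_c8a.xlarge run above.
        double delta = percentChange(88.668, 89.229);
        System.out.printf("%+.2f%%%n", delta);
    }
}
```

Because the error bars on several runs are of the same order as the reported change, small positive or negative percentages are best read as "no measurable difference".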

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8311207: Cleanup for Optimization for UUID.toString (Enhancement - P4)

Reviewers

Chen Liang (@liach - Author) ⚠️ Review applies to e5274ddc
Roger Riggs (@RogerRiggs - Reviewer) ⚠️ Review applies to 384354d9

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14745/head:pull/14745
$ git checkout pull/14745

Update a local copy of the PR:
$ git checkout pull/14745
$ git pull https://git.openjdk.org/jdk.git pull/14745/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14745

View PR using the GUI difftool:
$ git pr show -t 14745

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14745.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2023-07-01T01:45:16Z

👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2023-07-01T01:46:49Z

@wenshao The following label will be automatically applied to this pull request:

core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2023-07-01T01:53:16Z

Webrevs

liach · 2023-07-01T02:09:36Z

Is using Unsafe directly consistently faster than using ByteArray? It should have similar performance as ByteArray's VarHandle is simply a wrapper around Unsafe's put/get methods.

src/java.base/share/classes/java/util/HexDigits.java

wenshao · 2023-07-01T02:33:11Z

> Is using Unsafe directly consistently faster than using ByteArray? It should have similar performance as ByteArray's VarHandle is simply a wrapper around Unsafe's put/get methods.

Using Unsafe on aliyun_ecs_c8i.xlarge and MacBookPro M1 Pro is faster than ByteArray, and I haven't figured out why.

liach · 2023-07-01T02:35:35Z

> > Is using Unsafe directly consistently faster than using ByteArray? It should have similar performance as ByteArray's VarHandle is simply a wrapper around Unsafe's put/get methods.
>
> Using Unsafe on aliyun_ecs_c8i.xlarge and MacBookPro M1 Pro is faster than ByteArray, and I haven't figured out why

Then it's probably VarHandle's overhead. No worries; your change to use Unsafe is totally fine.
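
The exchange above compares direct Unsafe access with the JDK-internal ByteArray helper, which liach describes as a VarHandle wrapper. Neither is available to ordinary application code, so the sketch below uses the public `MethodHandles.byteArrayViewVarHandle` as a stand-in to show what a VarHandle-based multi-byte write into a `byte[]` looks like. The class and method names are hypothetical and are not part of the PR.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class VarHandleWriteSketch {
    // View a byte[] as a sequence of ints with an explicit big-endian byte order.
    private static final VarHandle INT_BE =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    /** Writes the four bytes of value into buf at offset, most significant byte first. */
    static void putInt(byte[] buf, int offset, int value) {
        INT_BE.set(buf, offset, value);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        putInt(buf, 0, 0x61626364); // the ASCII codes of 'a', 'b', 'c', 'd'
        System.out.println(new String(buf, StandardCharsets.US_ASCII));
    }
}
```

Whether such a write costs more than a raw Unsafe store depends on how well the JIT inlines the VarHandle call, which is consistent with the overhead liach suspects but is not something this sketch measures.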

Meanwhile, can you enable GitHub actions on your fork, so it can detect compile and test errors? Like this:

wenshao · 2023-07-01T02:40:22Z

It's enabled.

liach · 2023-07-01T02:45:58Z

@wenshao I have made my suggestions into a patch for you: wenshao#1
Feel free to review.

Suggested HexDigits change

openjdk · 2023-07-01T02:54:41Z

⚠️ @wenshao This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

liach · 2023-07-01T03:06:31Z

> > @wenshao I have made my suggestions into a patch for you: wenshao#1 Feel free to review.
>
> Your version's performance is a bit worse. If I can't find the reason, I will restore the previous version.

Can you post the benchmark results? And do you have like a baseline for the benchmarks, as there may be other factors that affect performance from run to run?

Also, my fault for actions: can you go to your actions tab and enable actions for your fork like this?

wenshao · 2023-07-01T03:08:57Z

@liach Your version's performance is a bit worse. If I can't find the reason, I will revert to the previous version.

wenshao · 2023-07-01T03:43:19Z

@liach

MacBookPro M1 Pro

Update 02
current version 76dba964

Benchmark           (size)   Mode  Cnt    Score   Error   Units 
UUIDBench.toString   20000  thrpt   15  104.262 ± 2.199  ops/us

Update 01
your version 425559f6

Benchmark           (size)   Mode  Cnt   Score   Error   Units
UUIDBench.toString   20000  thrpt   15  81.622 ± 0.194  ops/us

wenshao · 2023-07-01T03:54:47Z

@liach I guess your version is slower because it doesn't support out-of-order execution.

src/java.base/share/classes/java/util/UUID.java

Co-authored-by: liach <liach@users.noreply.github.com>

wenshao · 2023-07-01T16:02:56Z

/integrate

openjdk · 2023-07-01T16:03:14Z

@wenshao This pull request has not yet been marked as ready for integration.

openjdk · 2023-09-08T20:37:18Z

@wenshao
Your change (at version 46b4b05) is now ready to be sponsored by a Committer.

src/java.base/share/classes/java/util/HexDigits.java

wenshao · 2023-09-10T18:47:56Z

/sponsor

openjdk · 2023-09-10T18:49:24Z

@wenshao Only Committers are allowed to sponsor changes.

RogerRiggs · 2023-09-10T18:54:33Z

Given the endianness issues with https://git.openjdk.org/jdk/pull/14699, I'll need to run a more complete set of tests first (before it's sponsored).

wenshao · 2023-09-11T10:09:13Z

@TheRealMDoerr Can you help me test this PR on AIX (big endian)?

TheRealMDoerr · 2023-09-11T13:32:07Z

I have run a couple of tests on big-endian Linux. They have passed, so it's probably correct. However, I can't tell if it's good to use ByteArrayLittleEndian. I don't really like such platform details in the Java classes. Is that necessary for better performance on x86?

wenshao · 2023-09-11T13:58:22Z

@RogerRiggs Can it be merged now?

liach · 2023-09-11T14:20:03Z

@TheRealMDoerr ByteArrayLittleEndian only means that the input long/int/short/char will be seen as little-endian when written to a byte array; do you mean that assuming little-endian writes are faster is too platform-specific?

An alternative approach tried before is to pack the digits platform-specifically and use Unsafe (which bypasses platform-endianness reversals) to write directly; I recall it was rejected before, for using unsafe directly seems... unsafe :)
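
liach's point is that "little-endian" here describes the byte order of the value being written, not the CPU it runs on. The sketch below illustrates this with the public `ByteBuffer` API rather than the internal ByteArrayLittleEndian class. The pair layout (first hex digit in the low byte) and all names are assumptions made for illustration, not copied from HexDigits.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class LittleEndianHexSketch {
    private static final String HEX = "0123456789abcdef";

    /** Packs the two hex digits of b into a short: first digit in the low byte. */
    static short packPair(int b) {
        int hi = HEX.charAt((b >> 4) & 0xf); // digit printed first
        int lo = HEX.charAt(b & 0xf);        // digit printed second
        return (short) (hi | (lo << 8));
    }

    /** Writes both digits of b at buf[offset] and buf[offset + 1] with one little-endian store. */
    static void putHex(byte[] buf, int offset, int b) {
        ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).putShort(offset, packPair(b));
    }

    public static void main(String[] args) {
        byte[] buf = new byte[2];
        putHex(buf, 0, 0xa7);
        System.out.println(new String(buf, StandardCharsets.US_ASCII));
    }
}
```

Because the byte order is stated explicitly, the resulting bytes are the same on x64, aarch64, s390x, or AIX; only the cost of the store may differ between platforms.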

TheRealMDoerr · 2023-09-11T14:24:31Z

I think making sure C2 optimizes it would be a better approach. IMHO, Java classes shouldn't be optimized for performance on any particular endianness, but rather for readability.
@offamitkumar, @deepa181, @JoKern65, @TOatGithub: You may want to check performance impact on s390x and AIX.

offamitkumar · 2023-09-11T14:38:29Z

> @offamitkumar, @deepa181, @JoKern65, @TOatGithub: You may want to check performance impact on s390x and AIX.

@TheRealMDoerr Testing on s390 is not possible for now, as build is broken due to field resolution changes.

# Conflicts: # src/java.base/share/classes/java/util/UUID.java # src/java.base/share/classes/jdk/internal/util/HexDigits.java

RogerRiggs · 2023-09-12T19:23:13Z

src/java.base/share/classes/jdk/internal/util/HexDigits.java

+        return DIGITS[b0 & 0xff]
+                | (DIGITS[b1 & 0xff] << 16)
+                | (((long) DIGITS[b2 & 0xff]) << 32)
+                | (((long) DIGITS[b3 & 0xff]) << 48);


Can you reverse the order of these source lines to put the shifts of the higher-order bits before the lower-order bit shifts (3333222211110000)? It's easier to understand where the bits end up in the long.
The rest of the change is better focused.

If the packDigits order is reversed, performance gets worse; I don't know why yet.

The following is the data from running on a MacBookPro M1 Max:

make test TEST="micro:java.util.UUIDBench.toString"

Benchmark           (size)   Mode  Cnt   Score   Error   Units (current order 4f6ed3e6)
UUIDBench.toString   20000  thrpt   15  96.396 ± 0.946  ops/us

Benchmark           (size)   Mode  Cnt   Score   Error   Units (reverse packDigits order)
UUIDBench.toString   20000  thrpt   15  86.496 ± 0.542  ops/us

Looks like something that might be an interesting puzzler for JIT compiler folks. Perhaps the added implicit casts to long mess something up?
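
The review thread above concerns only the source order of the OR-ed terms; both orders compute the same long. The sketch below shows the two variants side by side and checks that they agree. The DIGITS table layout and all names other than `DIGITS` are assumptions for illustration, and the sketch says nothing about the throughput difference reported in the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class PackDigitsSketch {
    // Hypothetical table: entry b holds the two ASCII hex digits of b,
    // first digit in the low byte, second digit in the high byte.
    private static final short[] DIGITS = new short[256];
    static {
        String hex = "0123456789abcdef";
        for (int b = 0; b < 256; b++) {
            DIGITS[b] = (short) (hex.charAt(b >> 4) | (hex.charAt(b & 0xf) << 8));
        }
    }

    /** Low-order terms first, as in the quoted diff. */
    static long packLowFirst(int b0, int b1, int b2, int b3) {
        return DIGITS[b0 & 0xff]
                | (DIGITS[b1 & 0xff] << 16)
                | (((long) DIGITS[b2 & 0xff]) << 32)
                | (((long) DIGITS[b3 & 0xff]) << 48);
    }

    /** High-order terms first (3333222211110000), as requested in review. */
    static long packHighFirst(int b0, int b1, int b2, int b3) {
        return (((long) DIGITS[b3 & 0xff]) << 48)
                | (((long) DIGITS[b2 & 0xff]) << 32)
                | (DIGITS[b1 & 0xff] << 16)
                | DIGITS[b0 & 0xff];
    }

    /** Stores the packed long little-endian and decodes the eight bytes as ASCII. */
    static String toText(long packed) {
        byte[] buf = new byte[8];
        ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).putLong(0, packed);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        long a = packLowFirst(0x12, 0x34, 0x56, 0x78);
        long b = packHighFirst(0x12, 0x34, 0x56, 0x78);
        System.out.println(toText(a) + " equal=" + (a == b));
    }
}
```

Since the two methods are semantically identical, any measured gap between them would have to come from how the JIT compiles each expression shape, which matches the "puzzler" remark above.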

wenshao · 2023-09-13T00:41:20Z

/integrate

openjdk · 2023-09-13T00:41:58Z

@wenshao
Your change (at version 4f6ed3e) is now ready to be sponsored by a Committer.

cl4es · 2023-09-13T08:23:22Z

/sponsor

openjdk · 2023-09-13T08:25:21Z

Going to push as commit f8df754.
Since your change was applied there have been 16 commits pushed to the master branch:

fecd2fd: 8315898: Open source swing JMenu tests
bb6b3f2: 8315761: Open source few swing JList and JMenuBar tests
2d168c5: 8313202: MutexLocker should disallow null Mutexes
36552e7: 8316123: ProblemList serviceability/dcmd/gc/RunFinalizationTest.java on AIX
fe5ef5f: 8315677: Open source few swing JFileChooser and other tests
ece9bdf: 8299614: Shenandoah: STW mark should keep nmethod/oops referenced from stack chunk alive
a36f5a5: 8315663: Open source misc awt tests
cbbfa0d: 8315652: RISC-V: Features string uses wrong separator for jtreg
1ebf510: 8315743: RISC-V: Cleanup masm lr()/sc() methods
bd52bbf: 8316060: test/hotspot/jtreg/runtime/reflect/ReflectOutOfMemoryError.java may fail if heap is huge
... and 6 more: https://git.openjdk.org/jdk/compare/e0845163aa57cc8f68b11e1a553885676358f2a6...master

Your commit was automatically rebased without conflicts.

openjdk · 2023-09-13T08:25:33Z

@cl4es @wenshao Pushed as commit f8df754.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

8311207: Optimization for j.u.UUID.toString

468bc61

openjdk bot added the rfr Pull request is ready for review label Jul 1, 2023

openjdk bot added the core-libs core-libs-dev@openjdk.org label Jul 1, 2023

use big endian

244c0eb

wenshao mentioned this pull request Jul 1, 2023

JDK-8310502 : Optimization for j.l.Long.fastUUID() #14578

Closed

3 tasks

liach reviewed Jul 1, 2023

src/java.base/share/classes/java/util/HexDigits.java (outdated)

src/java.base/share/classes/java/util/HexDigits.java (outdated)

Suggested HexDigits change

b6c59e0

Merge pull request #1 from liachmodded/optimization_for_uuid_to_string

425559f

Suggested HexDigits change

revert to the previous version

76dba96

ExE-Boss reviewed Jul 1, 2023

src/java.base/share/classes/java/util/UUID.java (outdated)

wenshao and others added 3 commits July 1, 2023 17:41

private static final Unsafe

43ec99d

Explain the rationale

6abf314

Co-authored-by: liach <liach@users.noreply.github.com>

code style

e5274dd

liach approved these changes Jul 1, 2023

wenshao changed the title ~~8311207: Optimization for j.u.UUID.toString~~ 8311207: Cleanup for Optimization for UUID.toString Jul 1, 2023

openjdk bot added the sponsor Pull request is ready to be sponsored label Sep 8, 2023

ExE-Boss reviewed Sep 8, 2023

src/java.base/share/classes/java/util/HexDigits.java

cl4es reviewed Sep 9, 2023

src/java.base/share/classes/java/util/HexDigits.java

wenshao mentioned this pull request Sep 10, 2023

8315968: Move java.util.Digits to jdk.internal.util and refactor to reduce duplication #15651

Closed

3 tasks

wenshao added 2 commits September 13, 2023 00:41

Merge branch 'master' into optimization_for_uuid_to_string

10b1e6e

# Conflicts: # src/java.base/share/classes/java/util/UUID.java # src/java.base/share/classes/jdk/internal/util/HexDigits.java

merge from master

4f6ed3e

openjdk bot removed the sponsor Pull request is ready to be sponsored label Sep 12, 2023

RogerRiggs reviewed Sep 12, 2023

openjdk bot added the sponsor Pull request is ready to be sponsored label Sep 13, 2023

openjdk bot added the integrated Pull request has been integrated label Sep 13, 2023

openjdk bot closed this Sep 13, 2023

openjdk bot removed the ready (Pull request is ready to be integrated), rfr (Pull request is ready for review), and sponsor (Pull request is ready to be sponsored) labels Sep 13, 2023

liach mentioned this pull request Jul 26, 2024

Please help nominate an OpenJDK Committer liachmodded/talk#21

Closed
