Skip to content

Conversation

@wenshao
Copy link
Contributor

@wenshao wenshao commented Jun 27, 2024

Currently, the java.util.Formatter$Conversion::isValid method is implemented based on switch, which cannot be inlined because codeSize > 325. This problem can be avoided by implementing it with ImmutableBitSetPredicate.

use -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see the master branch:

@ 109   java.util.Formatter$Conversion::isValid (358 bytes)   failed to inline: hot method too big

current version

@ 109   java.util.Formatter$Conversion::isValid (10 bytes)   inline (hot)
  @ 4   jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate::test (50 bytes)   inline (hot)

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8335252: Reduce size of j.u.Formatter.Conversion#isValid (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19926/head:pull/19926
$ git checkout pull/19926

Update a local copy of the PR:
$ git checkout pull/19926
$ git pull https://git.openjdk.org/jdk.git pull/19926/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19926

View PR using the GUI difftool:
$ git pr show -t 19926

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19926.diff

Webrev

Link to Webrev Comment

@wenshao wenshao changed the title use ImmutableBitSetPredicate optimize Formatter.Conversion#isValid Use ImmutableBitSetPredicate optimize Formatter.Conversion#isValid Jun 27, 2024
@bridgekeeper
Copy link

bridgekeeper bot commented Jun 27, 2024

👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 27, 2024

@wenshao This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8335252: Reduce size of j.u.Formatter.Conversion#isValid

Reviewed-by: redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 25 new commits pushed to the master branch:

  • 486aa11: 8335237: ubsan: vtableStubs.hpp is_vtable_stub exclude from ubsan checks
  • f4d8c00: 8334562: Automate com/sun/security/auth/callback/TextCallbackHandler/Default.java test
  • 49eb00d: 8299813: java/nio/channels/DatagramChannel/Disconnect.java fails with jtreg test timeout due to lost datagram
  • 8ec378a: 8277949: (dc) java/nio/channels/DatagramChannel/AdaptorBasic.java failed in timeout
  • c798316: 8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message
  • 99d2bbf: 8334433: jshell.exe runs an executable test.exe on startup
  • 6f4ddc2: 8335142: compiler/c1/TestTraceLinearScanLevel.java occasionally times out with -Xcomp
  • 3b3a19e: 8335314: Problem list compiler/uncommontrap/DeoptReallocFailure.java
  • d457609: 8319947: Recursive lightweight locking: s390x implementation
  • c47a0e0: 8334147: Shenandoah: Avoid taking lock for disabled free set logging
  • ... and 15 more: https://git.openjdk.org/jdk/compare/79a23017fc7154738c375fbb12a997525c3bf9e7...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@cl4es) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Jun 27, 2024

@wenshao The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Jun 27, 2024
@wenshao
Copy link
Contributor Author

wenshao commented Jun 27, 2024

@cl4es This is a scenario optimized using ImmutableBitSetPredicate

@wenshao
Copy link
Contributor Author

wenshao commented Jun 27, 2024

Execute the following command, we can see that the java.util.Formatter$Conversion.isValid method generates a tableswitch of size 85. This large tableswitch makes the code size 358 bytes. Can C2 be optimized to use jdk.internal.util.ImmutableBitSetPredicate$SmallImmutableBitSetPredicate?

javap -c java.util.Formatter.Conversion
  • byte codes of Conversion.isValid
static boolean isValid(char);
    Code:
       0: iload_0
       1: tableswitch   { // 37 to 120
                    37: 352
                    38: 356
                    39: 356
                    40: 356
                    41: 356
                    42: 356
                    43: 356
                    44: 356
                    45: 356
                    46: 356
                    47: 356
                    48: 356
                    49: 356
                    50: 356
                    51: 356
                    52: 356
                    53: 356
                    54: 356
                    55: 356
                    56: 356
                    57: 356
                    58: 356
                    59: 356
                    60: 356
                    61: 356
                    62: 356
                    63: 356
                    64: 356
                    65: 352
                    66: 352
                    67: 352
                    68: 356
                    69: 352
                    70: 356
                    71: 352
                    72: 352
                    73: 356
                    74: 356
                    75: 356
                    76: 356
                    77: 356
                    78: 356
                    79: 356
                    80: 356
                    81: 356
                    82: 356
                    83: 352
                    84: 356
                    85: 356
                    86: 356
                    87: 356
                    88: 352
                    89: 356
                    90: 356
                    91: 356
                    92: 356
                    93: 356
                    94: 356
                    95: 356
                    96: 356
                    97: 352
                    98: 352
                    99: 352
                   100: 352
                   101: 352
                   102: 352
                   103: 352
                   104: 352
                   105: 356
                   106: 356
                   107: 356
                   108: 356
                   109: 356
                   110: 352
                   111: 352
                   112: 356
                   113: 356
                   114: 356
                   115: 352
                   116: 356
                   117: 356
                   118: 356
                   119: 356
                   120: 352
               default: 356
          }
     352: iconst_1
     353: goto          357
     356: iconst_0
     357: ireturn

@wenshao wenshao changed the title Use ImmutableBitSetPredicate optimize Formatter.Conversion#isValid 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid Jun 27, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 27, 2024
@mlbridge
Copy link

mlbridge bot commented Jun 27, 2024

Webrevs

@cl4es
Copy link
Member

cl4es commented Jun 27, 2024

  • byte codes of Conversion.isValid

It's surprising that javac generates synthetic cases identical to the default case for every char in the 37 to 120 range (the bounds seem to be the lowest and highest of the constants).

Is there a benchmark where you see a benefit from this?

EDIT: Not surprising once I was reminded that tableswitches need consecutive ranges.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 27, 2024

I found using @ForceInline gave me better performance with smaller code changes,

Here are the performance numbers running on Mac M1 Max:

# baseline
Benchmark                     Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  93.299 ? 1.268  ns/op

# WebRevs 00 fb5c8001
Benchmark                     Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  89.164 ? 2.188  ns/op

# ForceInline
Benchmark                     Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat  avgt   15  84.473 ? 4.287  ns/op

@wenshao wenshao changed the title 8335252: Use ImmutableBitSetPredicate optimize j.u.Formatter.Conversion#isValid 8335252: ForceInline j.u.Formatter.Conversion#isValid Jun 27, 2024
@PaulSandoz
Copy link
Member

PaulSandoz commented Jun 27, 2024

Can you provide some additional context here? I think we need to be very careful about the general use @ForceInline in core libraries. While you show a modest performance benefit using a the micro benchmark, will it actually make any difference overall given formatting strings is not particular efficient?

String templates, currently removed, provided good string formatting performance, and the redesign will i think also provide good performance.

@cl4es
Copy link
Member

cl4es commented Jun 28, 2024

So the compound issue here is that there's a heuristic which prevents inlining of large methods, and that a tableswitch on inputs spanning from '%' (37) to 'x' (120) become expansive enough (350bytes) to make the JIT enforce that rule.

As an alternative to @ForceInline I tested splitting uppercase and lowercase chars into two switches. This looks a bit nasty, but actually reduces the size of the method:

diff --git a/src/java.base/share/classes/java/util/Formatter.java b/src/java.base/share/classes/java/util/Formatter.java
index a7d95ee5780..50419f1ea23 100644
--- a/src/java.base/share/classes/java/util/Formatter.java
+++ b/src/java.base/share/classes/java/util/Formatter.java
@@ -4947,30 +4947,36 @@ static class Conversion {
         static final char PERCENT_SIGN        = '%';

         static boolean isValid(char c) {
-            return switch (c) {
-                case BOOLEAN,
-                     BOOLEAN_UPPER,
-                     STRING,
-                     STRING_UPPER,
-                     HASHCODE,
-                     HASHCODE_UPPER,
-                     CHARACTER,
-                     CHARACTER_UPPER,
-                     DECIMAL_INTEGER,
-                     OCTAL_INTEGER,
-                     HEXADECIMAL_INTEGER,
-                     HEXADECIMAL_INTEGER_UPPER,
-                     SCIENTIFIC,
-                     SCIENTIFIC_UPPER,
-                     GENERAL,
-                     GENERAL_UPPER,
-                     DECIMAL_FLOAT,
-                     HEXADECIMAL_FLOAT,
-                     HEXADECIMAL_FLOAT_UPPER,
-                     LINE_SEPARATOR,
-                     PERCENT_SIGN -> true;
-                default -> false;
-            };
+            if (c >= 'a') {
+                return switch (c) {
+                    case BOOLEAN,
+                         STRING,
+                         HASHCODE,
+                         CHARACTER,
+                         DECIMAL_INTEGER,
+                         OCTAL_INTEGER,
+                         HEXADECIMAL_INTEGER,
+                         SCIENTIFIC,
+                         GENERAL,
+                         DECIMAL_FLOAT,
+                         HEXADECIMAL_FLOAT,
+                         LINE_SEPARATOR -> true;
+                    default -> false;
+                };
+            } else {
+                return switch (c) {
+                    case BOOLEAN_UPPER,
+                         STRING_UPPER,
+                         HASHCODE_UPPER,
+                         CHARACTER_UPPER,
+                         HEXADECIMAL_INTEGER_UPPER,
+                         SCIENTIFIC_UPPER,
+                         GENERAL_UPPER,
+                         HEXADECIMAL_FLOAT_UPPER,
+                         PERCENT_SIGN -> true;
+                    default -> false;
+                };
+            }
         }

         // Returns true iff the Conversion is applicable to all objects.

javap -v java.lang.Formatter\$Conversion

  static boolean isValid(char);
    descriptor: (C)Z
    flags: (0x0008) ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: iload_0
         1: bipush        97
         3: if_icmplt     122
         6: iload_0
         7: tableswitch   { // 97 to 120
                      97: 116
                      98: 116
                      99: 116
                     100: 116
                     101: 116
                     102: 116
                     103: 116
                     104: 116
                     105: 120
                     106: 120
                     107: 120
                     108: 120
                     109: 120
                     110: 116
                     111: 116
                     112: 120
                     113: 120
                     114: 120
                     115: 116
                     116: 120
                     117: 120
                     118: 120
                     119: 120
                     120: 116
                 default: 120
            }
       116: iconst_1
       117: goto          121
       120: iconst_0
       121: ireturn
       122: iload_0
       123: lookupswitch  { // 9
                      37: 204
                      65: 204
                      66: 204
                      67: 204
                      69: 204
                      71: 204
                      72: 204
                      83: 204
                      88: 204
                 default: 208
            }
       204: iconst_1
       205: goto          209
       208: iconst_0
       209: ireturn

And sees an improvement similar to @ForceInline on the micro:

Name            Cnt   Base   Error    Test   Error  Unit  Change
stringIntFormat  15 92,447 ± 2,497  82,912 ± 3,317 ns/op   1,11x (p = 0,000*)
 * = significant

Shrinking the size of the .class file by 121 bytes is a fun bonus.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 28, 2024

@cl4es provided a better idea, which is to change the branch of switch instead of using ForceInline. I have another approach:

            return switch (c) {
                case BOOLEAN,
                     BOOLEAN_UPPER,
                     STRING,
                     STRING_UPPER,
                     HASHCODE,
                     HASHCODE_UPPER,
                     CHARACTER,
                     CHARACTER_UPPER,
                     DECIMAL_INTEGER,
                     OCTAL_INTEGER,
                     HEXADECIMAL_INTEGER,
                     HEXADECIMAL_INTEGER_UPPER,
                     SCIENTIFIC,
                     SCIENTIFIC_UPPER,
                     GENERAL,
                     GENERAL_UPPER,
                     DECIMAL_FLOAT,
                     HEXADECIMAL_FLOAT,
                     HEXADECIMAL_FLOAT_UPPER,
                     LINE_SEPARATOR -> true;
                default -> c == PERCENT_SIGN;
            };

@cl4es
Copy link
Member

cl4es commented Jun 28, 2024

This looks better. Avoids a branch up front, minimized code changes. The comment is good, but perhaps move it down into the default case?

Update bug description to something like "Reduce size of j.u.Formatter.Conversion#isValid" ?

@wenshao wenshao changed the title 8335252: ForceInline j.u.Formatter.Conversion#isValid 8335252: Reduce size of j.u.Formatter.Conversion#isValid Jun 28, 2024
Copy link
Member

@cl4es cl4es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. I think this is now a harmless and straightforward improvement.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 28, 2024
@wenshao
Copy link
Contributor Author

wenshao commented Jun 28, 2024

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 28, 2024
@openjdk
Copy link

openjdk bot commented Jun 28, 2024

@wenshao
Your change (at version 5707dc6) is now ready to be sponsored by a Committer.

@PaulSandoz
Copy link
Member

That's a clever change and more preferable (but still potentially fragile in the source to bytecode translation process). However I still don't understand the overall motivation, why does it matter?

@wenshao
Copy link
Contributor Author

wenshao commented Jun 28, 2024

@PaulSandoz
String.format is widely used. In some performance-critical scenarios, when doing performance profiling, it is often found that String.format calls consume a lot of time. My suggestion is to replace String.format with String concat or StringBuilder.

String info = String.format("xxxx %s", str);

replace to

String info = "xxxx %s" + str;

or

String info = new StringBuilder("xxxx ").append(str).toString();

Let's do a Benchmark comparison:

class StringFormat{
    @Benchmark
    public String stringIntFormat() {
        return "%s %d".formatted(s, i);
    }

    @Benchmark
    public String stringIntIndy() {
        return s + " " + i;
    }

    @Benchmark
    public String stringIntStringBuilder() {
        return new StringBuilder(s).append(" ").append(i).toString();
    }    
}
Benchmark                            Mode  Cnt   Score   Error  Units
StringFormat.stringIntFormat         avgt   15  87.451 ? 2.890  ns/op
StringFormat.stringIntIndy           avgt   15   6.471 ? 0.089  ns/op
StringFormat.stringIntStringBuilder  avgt   15   7.949 ? 0.067  ns/op

It can be seen that the performance of using StringConcat and StringBuilder is 10 times that of using String.format.

StringTemplate (JEP 430/459/465) was an attempt to improve performance, but the API was not friendly.

I was going to contribute to StringTemplate, (PR #16044 / PR #16021/PR #16017,) but StringTemplate has been removed.

I'm working on improving the performance of String.format.

The PR I submitted before: 8316704: Regex-free parsing of Formatter and FormatProcessor specifiers

This PR is a very minor improvement, We may also need to do further work to provide a fastpath for common scenarios to improve performance. I also plan to do such work.

@liach
Copy link
Member

liach commented Jun 28, 2024

Thanks Wen Shaojin. Though in the long term we would avoid repeated parsing, this little optimization to speed up parsing is still very useful.

/sponsor

@openjdk
Copy link

openjdk bot commented Jun 28, 2024

Going to push as commit 5d866bf.
Since your change was applied there have been 29 commits pushed to the master branch:

  • 166f9d9: 8335221: Some C2 intrinsics incorrectly assume that type argument is compile-time constant
  • 3e23e9c: 8335344: test/jdk/sun/security/tools/keytool/NssTest.java fails to compile
  • 79a3554: 8335124: com/sun/management/ThreadMXBean/ThreadCpuTimeArray.java failed with CPU time out of expected range
  • 45c4eaa: 8335274: SwitchBootstraps.ResolvedEnumLabels.resolvedEnum should be final
  • 486aa11: 8335237: ubsan: vtableStubs.hpp is_vtable_stub exclude from ubsan checks
  • f4d8c00: 8334562: Automate com/sun/security/auth/callback/TextCallbackHandler/Default.java test
  • 49eb00d: 8299813: java/nio/channels/DatagramChannel/Disconnect.java fails with jtreg test timeout due to lost datagram
  • 8ec378a: 8277949: (dc) java/nio/channels/DatagramChannel/AdaptorBasic.java failed in timeout
  • c798316: 8269657: Test java/nio/channels/DatagramChannel/Loopback.java failed: Unexpected message
  • 99d2bbf: 8334433: jshell.exe runs an executable test.exe on startup
  • ... and 19 more: https://git.openjdk.org/jdk/compare/79a23017fc7154738c375fbb12a997525c3bf9e7...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 28, 2024
@openjdk openjdk bot closed this Jun 28, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jun 28, 2024
@openjdk
Copy link

openjdk bot commented Jun 28, 2024

@liach @wenshao Pushed as commit 5d866bf.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@cl4es
Copy link
Member

cl4es commented Jun 29, 2024

I agree with improving the legacy String.format code, within reason due it's wide use and appeal. Doing so does not detract from the goal of providing smarter and faster formatting via StringTemplates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

4 participants