Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Unsafe with VarHandle #67

Open
qweek opened this issue Sep 13, 2021 · 7 comments
Open

Replace Unsafe with VarHandle #67

qweek opened this issue Sep 13, 2021 · 7 comments
Labels

Comments

@qweek
Copy link

qweek commented Sep 13, 2021

I tested my implementation for Java 9+ with VarHandles instead of Unsafe.
In my test case (hashing multiple files with a total size of ~ 1 GB) I noticed a huge speed improvement

  • XxHash: Unsafe 1220 ms vs VarHandle 630 ms
  • XxHash3: Unsafe 1280 ms vs VarHandle 720 ms

Is it possible to publish separate version for modern JDKs without dependency on Unsafe?

@gzm55
Copy link
Collaborator

gzm55 commented Sep 13, 2021

@qweek
Impressively, may be all hash would be improved with VarHandle which is designed to replace operations in Unsafe.

I suggest, in a new pr, create a dedicated Access implementation with VarHandle, and detect a better access class at runtime, in case of portability problems on the mobile jdk. After passing all the ut, it is better to attach jmh profiling benchmark results on major hashs and LTS JDK.

@plokhotnyuk
Copy link

@qweek It can be published as a multi-release JAR in the same artifact that is still compatible with Java 8: https://in.relation.to/2017/02/13/building-multi-release-jars-with-maven/

@qweek
Copy link
Author

qweek commented Sep 13, 2021

@gzm55 I'll try to look at it, but it won't be any time soon. It may be easier for you to start by checking how much it speeds up your use cases.

@plokhotnyuk I thought about multi-release JAR, but it have some drawbacks:

  • we still will have dependency on internal DirectBuffer and we need to open access to this module,
    but with VarHandle you can have direct access to buffer with same speed.
  • with VarHandle we don't need reverse Access anymore, this will make existing code more confusing
  • we need to change XxHash3 implementation because now it depends on specific offsets and BYTE_BASE

With such a structure that is required to maintain backward compatibility, the code will turn into mess

@develar
Copy link

develar commented Nov 24, 2021

@qweek I see that in your fork there is no support for seed. It was difficult to implement?

@gzm55
Copy link
Collaborator

gzm55 commented Feb 7, 2022

I try to benchmark the Unsafe.getLong and VarHandle.get on a x64 server with LTS JDK 11 and 17. The Unsafe.getLong performs about 17% better than VarHandle on JDK 11, and 50% better on JDK 17. On another macOS with JDK11 I get nearly the same results.

@qweek, in your implementation and benchmark, is there other difference between two compared version?

# JMH version: 1.34
# VM version: JDK 17.0.2, Java HotSpot(TM) 64-Bit Server VM, 17.0.2+8-LTS-86
# VM invoker: /opt/jdk-17.0.2/bin/java
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 3 iterations, 3 s each
# Measurement: 3 iterations, 3 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations

JDK 11 Result

Benchmark                   Mode  Cnt       Score       Error  Units
VarHandleJmh.getLongPlain   avgt    3  480385.742 ± 45707.927  ns/op
VarHandleJmh.getLongUnsafe  avgt    3  405600.434 ±   609.814  ns/op

JDK 17 Result

Benchmark                   Mode  Cnt       Score     Error  Units
VarHandleJmh.getLongPlain   avgt    3  165615.717 ± 387.519  ns/op
VarHandleJmh.getLongUnsafe  avgt    3   80043.996 ± 102.376  ns/op

Code

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.Random;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import sun.misc.Unsafe;
import java.lang.reflect.Field;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

import static java.nio.ByteOrder.LITTLE_ENDIAN;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 3, time = 3)
@Measurement(iterations = 3, time = 3)
@Fork(1)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class VarHandleJmh {
        private final static int L = 1024 * 1024;
        byte data[] = new byte[L];

        private static final Unsafe UNSAFE;

        private static final long BYTE_BASE;

        private static final VarHandle LONG_HANDLE = MethodHandles.byteArrayViewVarHandle(long[].class, LITTLE_ENDIAN);

        static {
                Unsafe u = null;
                try {
                        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
                        theUnsafe.setAccessible(true);
                        u = (Unsafe) theUnsafe.get(null);
                } catch (final Exception e) {
                }

                UNSAFE = u;

                BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);
        }

        @Setup
        public void init() {
                new Random().nextBytes(data);
        }

       @Benchmark
        public void getLongUnsafe(Blackhole bh) {
                for (long i = BYTE_BASE; i < BYTE_BASE + L - 8; i += 8) {
                        final long d = UNSAFE.getLong(data, i);
                        bh.consume(d);
                }
        }

        @Benchmark
        public void getLongPlain(Blackhole bh) {
                for (long i = 0; i < L - 8; i += 8) {
                        final long d = (long)LONG_HANDLE.get(data, (int)i);
                        bh.consume(d);
                }
        }

        public static void main(String[] args) throws Exception {
                org.openjdk.jmh.Main.main(args);
        }
}

@gzm55
Copy link
Collaborator

gzm55 commented Feb 8, 2022

I also benchmarked a dedicated VarHandleAccess vs UnsafeAccess, the latter one still performs about 5% better.

@skazis
Copy link

skazis commented Jun 1, 2023

Just wanted to let you know that on Java 17 I stumbled upon a strange problem with hashing, MD5 in particular. It is hard to provide a repeatable scenario as this happened on PROD system without any clear circumstances. And one of the attempted solutions was to switch to this library and XXH3 algorithm since we care about speed and not crypto.
The problem I mention is that many threads got stuck around VarHandle and CPU was overwhelmed, example stack trace (there is not a single hotspot line, but for sure they all are within method implCompress0):

"http-nio-8080-exec-19" #934 daemon prio=5 os_prio=0 cpu=395884.88ms elapsed=55340.79s tid=0x00007f6a44011d00 nid=0x20b850 runnable [0x00007f6a58a0a000] java.lang.Thread.State: RUNNABLE at jdk.internal.util.Preconditions.checkIndex(java.base@17.0.7/Preconditions.java:267) at java.lang.invoke.VarHandleByteArrayAsInts$ArrayHandle.index(java.base@17.0.7/VarHandleByteArrayAsInts.java:103) at java.lang.invoke.VarHandleByteArrayAsInts$ArrayHandle.get(java.base@17.0.7/VarHandleByteArrayAsInts.java:120) at java.lang.invoke.VarHandleGuards.guard_LI_I(java.base@17.0.7/VarHandleGuards.java:163) at sun.security.provider.MD5.implCompress0(java.base@17.0.7/MD5.java:182) at sun.security.provider.MD5.implCompress(java.base@17.0.7/MD5.java:146) at sun.security.provider.DigestBase.implCompressMultiBlock0(java.base@17.0.7/DigestBase.java:150) at sun.security.provider.DigestBase.implCompressMultiBlock(java.base@17.0.7/DigestBase.java:144) at sun.security.provider.DigestBase.engineUpdate(java.base@17.0.7/DigestBase.java:131) at java.security.MessageDigest$Delegate.engineUpdate(java.base@17.0.7/MessageDigest.java:658) at java.security.MessageDigest.update(java.base@17.0.7/MessageDigest.java:349)

@tgd tgd added the review label Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants