ZipInputStream throws "ZipException: invalid entry CRC" on Arch Linux #4479

Closed
alexarchambault opened this issue Apr 8, 2022 · 8 comments

@alexarchambault

alexarchambault commented Apr 8, 2022

The exception is caused by a CRC32 miscalculation in native images on Arch Linux.

CRC32 calculations via java.util.zip.CRC32, used by ZipInputStream, are inconsistent in some cases on Arch Linux. These calculations sometimes provide different results depending on how the input file is fed to CRC32#update.

Note that this issue doesn't seem to happen on other Linux distributions (Ubuntu, Debian, Fedora), nor when running the calculations on the JVM.

Steps to reproduce the issue

See here for a detailed reproduction.

  1. git clone https://github.com/alexarchambault/arch-linux-native-image-crc32-issue && cd arch-linux-native-image-crc32-issue
  2. Follow instructions of the README

Describe GraalVM and your environment:

  • GraalVM version 22.0.0.2 (also manually reproduced with the current dev release based on 8b10078)
  • JDK major version: 17
  • OS: Arch Linux
  • Architecture: AMD64

More details

The reproduction above computes the CRC32 of an example file in two ways:

  • by calling CRC32#update only once, with the whole content of the file
  • by calling CRC32#update multiple times, on successive chunks of the file's content (the chunk sizes come from ZipInputStream, which splits the content this way while unzipping the JAR this file comes from)

When run on Arch Linux (other Linux distros seem to work fine), from a native image (it runs fine on the JVM), the two calculations give different results, with the second one apparently faulty.
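
The two calculations described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked reproduction: the class name `Crc32ChunkCheck`, the synthetic input data, and the chunk sizes are all assumptions made for the example (the real reproduction uses a specific file and the chunk sizes observed from ZipInputStream).

```java
import java.util.zip.CRC32;

// Hypothetical helper, not taken from the reproduction repository.
class Crc32ChunkCheck {

    // CRC32 of the whole array, computed with a single update call.
    static long crcWhole(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // CRC32 of the same array, fed in consecutive chunks of the given sizes.
    // Any bytes left over after the last chunk are fed in one final call.
    static long crcChunked(byte[] data, int[] chunkSizes) {
        CRC32 crc = new CRC32();
        int offset = 0;
        for (int size : chunkSizes) {
            int len = Math.min(size, data.length - offset);
            if (len <= 0) {
                break;
            }
            crc.update(data, offset, len);
            offset += len;
        }
        if (offset < data.length) {
            crc.update(data, offset, data.length - offset);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Synthetic input; the actual reproduction reads a file extracted from a JAR.
        byte[] data = new byte[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31 + 7);
        }
        // Assumed chunk sizes, chosen only to mix small and large updates.
        int[] chunkSizes = {512, 1, 4096, 13, 8192};

        long whole = crcWhole(data);
        long chunked = crcChunked(data, chunkSizes);
        System.out.printf("whole=%08x chunked=%08x match=%b%n", whole, chunked, whole == chunked);
    }
}
```

With a correct CRC32 implementation the two values always match, whatever the chunk sizes are. A mismatch is the symptom reported here.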

Output of the native image command generating the binary in the reproduction:

Executing [
/root/graalvm-ce-java17-22.2.0-dev/bin/java \
-XX:+UseParallelGC \
-XX:+UnlockExperimentalVMOptions \
-XX:+EnableJVMCI \
-Dtruffle.TrustAllTruffleRuntimeProviders=true \
-Dtruffle.TruffleRuntime=com.oracle.truffle.api.impl.DefaultTruffleRuntime \
-Dgraalvm.ForcePolyglotInvalid=true \
-Dgraalvm.locatorDisabled=true \
-Dsubstratevm.IgnoreGraalVersionCheck=true \
--add-exports=java.base/com.sun.crypto.provider=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.access.foreign=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.event=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.loader=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.logger=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.misc=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.module=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.org.objectweb.asm=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.perf=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.platform=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.ref=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED \
--add-exports=java.base/jdk.internal.vm.annotation=ALL-UNNAMED \
--add-exports=java.base/sun.invoke.util=ALL-UNNAMED \
--add-exports=java.base/sun.net=ALL-UNNAMED \
--add-exports=java.base/sun.nio.ch=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.annotation=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.factory=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.repository=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.scope=ALL-UNNAMED \
--add-exports=java.base/sun.reflect.generics.tree=ALL-UNNAMED \
--add-exports=java.base/sun.security.jca=ALL-UNNAMED \
--add-exports=java.base/sun.security.provider=ALL-UNNAMED \
--add-exports=java.base/sun.security.ssl=ALL-UNNAMED \
--add-exports=java.base/sun.security.util=ALL-UNNAMED \
--add-exports=java.base/sun.security.x509=ALL-UNNAMED \
--add-exports=java.base/sun.text.spi=ALL-UNNAMED \
--add-exports=java.base/sun.util.calendar=ALL-UNNAMED \
--add-exports=java.base/sun.util.cldr=ALL-UNNAMED \
--add-exports=java.base/sun.util.locale.provider=ALL-UNNAMED \
--add-exports=java.base/sun.util.locale=ALL-UNNAMED \
--add-exports=java.base/sun.util.resources=ALL-UNNAMED \
--add-exports=java.base/sun.util=ALL-UNNAMED \
--add-exports=java.desktop/sun.java2d.pipe=ALL-UNNAMED \
--add-exports=java.desktop/sun.java2d=ALL-UNNAMED \
--add-exports=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED \
--add-exports=java.management/sun.management=ALL-UNNAMED \
--add-exports=java.xml.crypto/org.jcp.xml.dsig.internal.dom=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.aarch64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.amd64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code.site=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code.stack=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.code=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.common=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot.aarch64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot.amd64=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.hotspot=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.meta=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.runtime=ALL-UNNAMED \
--add-exports=jdk.internal.vm.ci/jdk.vm.ci.services=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.events=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal.handlers=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal.jfc=ALL-UNNAMED \
--add-exports=jdk.jfr/jdk.jfr.internal=ALL-UNNAMED \
--add-exports=jdk.management/com.sun.management.internal=ALL-UNNAMED \
-XX:+UseJVMCINativeLibrary \
-Xss10m \
-Xms1g \
-Xmx10049857120 \
-Djava.awt.headless=true \
-Dorg.graalvm.version=22.2.0-dev \
-Dorg.graalvm.config=CE \
-Dcom.oracle.graalvm.isaot=true \
-Djava.system.class.loader=com.oracle.svm.hosted.NativeImageSystemClassLoader \
-Xshare:off \
-Djdk.internal.lambda.disableEagerInitialization=true \
-Djdk.internal.lambda.eagerlyInitialize=false \
-Djava.lang.invoke.InnerClassLambdaMetafactory.initializeLambdas=false \
-javaagent:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/svm.jar \
-cp \
/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/llvm-platform-specific-shadowed.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/javacpp-shadowed.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/svm-llvm.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/pointsto.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/objectfile.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/llvm-wrapper-shadowed.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/svm.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/builder/native-image-base.jar \
--module-path \
/root/graalvm-ce-java17-22.2.0-dev/lib/truffle/truffle-api.jar \
'com.oracle.svm.hosted.NativeImageGeneratorRunner$JDK9Plus' \
-watchpid \
3922 \
-imagecp \
/workspace/.scala-build/project_85c3b2bc45_85c3b2bc45-5e1ed52702/classes/main:/root/.cache/coursier/v1/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.8/scala-library-2.13.8.jar:/root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/lihaoyi/os-lib_2.13/0.8.1/os-lib_2.13-0.8.1.jar:/root/.cache/scalacli/local-repo/v0.1.3/org.virtuslab.scala-cli/runner_2.13/0.1.3/jars/runner_2.13.jar:/root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/lihaoyi/geny_2.13/0.7.1/geny_2.13-0.7.1.jar:/root/.cache/scalacli/local-repo/v0.1.3/org.virtuslab.scala-cli/stubs/0.1.3/jars/stubs.jar:/workspace/compute-crc32.jar:/root/graalvm-ce-java17-22.2.0-dev/lib/svm/library-support.jar \
-H:Path=/workspace \
-H:FallbackThreshold=0 \
-H:+DumpTargetInfo \
-H:Path=/workspace \
-H:Name=compute-crc32 \
-H:CLibraryPath=/root/graalvm-ce-java17-22.2.0-dev/lib/svm/clibraries/linux-amd64 \
'-H:Class@explicit main-class=ComputeCrc32_sc'
]
=======================================================================================================================
GraalVM Native Image: Generating 'compute-crc32' (executable)...
=======================================================================================================================
[1/7] Initializing...                                                                                  (18.0s @ 0.14GB)
 Version info: 'GraalVM 22.2.0-dev Java 17 CE'
 C compiler: gcc (linux, x86_64, 9.4.0)
 Garbage collector: Serial GC
 1 user-provided feature(s)
  - com.oracle.svm.polyglot.scala.ScalaFeature
# Printing compilation-target information to: /workspace/reports/target_info_20220408_093531.txt
# Printing native-library information to: /workspace/reports/native_library_info_20220408_093620.txt
[2/7] Performing analysis...  [*******]                                                                (49.2s @ 1.28GB)
   3,541 (76.23%) of  4,645 classes reachable
   4,182 (53.23%) of  7,857 fields reachable
  16,253 (38.74%) of 41,959 methods reachable
      28 classes,   113 fields, and   477 methods registered for reflection
      58 classes,    58 fields, and    52 methods registered for JNI access
[3/7] Building universe...                                                                              (3.6s @ 1.53GB)
[4/7] Parsing methods...      [**]                                                                      (2.9s @ 0.73GB)
[5/7] Inlining methods...     [****]                                                                    (3.5s @ 0.67GB)
[6/7] Compiling methods...    [******]                                                                 (35.5s @ 1.55GB)
[7/7] Creating image...                                                                                 (5.6s @ 1.88GB)
   5.36MB (35.72%) for code area:    9,417 compilation units
   8.27MB (55.09%) for image heap:   2,156 classes and 103,531 objects
   1.38MB ( 9.20%) for other data
  15.01MB in total
-----------------------------------------------------------------------------------------------------------------------
Top 10 packages in code area:                              Top 10 object types in image heap:
 693.86KB java.util                                           1.11MB byte[] for code metadata
 378.73KB java.lang.invoke                                    1.02MB byte[] for java.lang.String
 350.10KB java.lang                                           1.02MB byte[] for general heap data
 278.15KB java.text                                         972.03KB java.lang.String
 235.97KB java.util.regex                                   819.32KB java.lang.Class
 203.01KB com.oracle.svm.core.reflect                       387.89KB java.util.HashMap$Node
 201.16KB java.util.concurrent                              276.64KB com.oracle.svm.core.hub.DynamicHubCompanion
 198.52KB com.oracle.svm.jni                                184.08KB java.lang.String[]
 152.34KB scala.collection.immutable                        167.66KB java.util.HashMap$Node[]
 148.76KB java.math                                         156.61KB java.util.concurrent.ConcurrentHashMap$Node
      ... 139 additional packages                                ... 910 additional object types
                                          (use GraalVM Dashboard to see all)
-----------------------------------------------------------------------------------------------------------------------
                        4.6s (3.6% of total time) in 21 GCs | Peak RSS: 3.49GB | CPU load: 3.42
-----------------------------------------------------------------------------------------------------------------------
Produced artifacts:
 /workspace/compute-crc32 (executable)
 /workspace/compute-crc32.build_artifacts.txt
=======================================================================================================================
Finished generating 'compute-crc32' in 2m 1s.
@alexarchambault
Author

This makes iterating over the entries of some zip files with ZipInputStream crash as shown in VirtusLab/scala-cli#828 (comment) or coursier/coursier#2395.

@alexarchambault changed the title from "CRC32 miscalculation from native images on Arch Linux" to "ZipInputStream throws "ZipException: invalid entry CRC" on Arch Linux" on Apr 11, 2022
@oubidar-Abderrahim
Member

Hi, thank you for reporting this. We will take a look into it and get back to you.

@oubidar-Abderrahim self-assigned this on Apr 13, 2022
@oubidar-Abderrahim
Member

Hi, is it possible to have a reproducer without Docker? Something I can test directly on my machine.

@alexarchambault
Author

@oubidar-Abderrahim IIRC, if you download this JAR, and open it with a ZipInputStream (not a ZipFile), and iterate on its entries while fully reading their content, it triggers the issue above (from a native image on Arch Linux).
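
The access pattern described above (a ZipInputStream rather than a ZipFile, with every entry read to its end) can be sketched as below. This is only an illustration under assumptions: the class name `ZipStreamWalk` and the in-memory sample archive are made up for the example, and the JAR referred to in the comment is not reproduced here. Passing a path as the first argument reads that archive instead of the sample.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original reproduction.
class ZipStreamWalk {

    // Iterates over all entries and reads each one to its end.
    // ZipInputStream verifies an entry's CRC once its content is exhausted,
    // so a CRC mismatch surfaces here as a ZipException (wrapped below).
    static int readAllEntries(InputStream in) {
        int count = 0;
        byte[] buf = new byte[8192];
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                while (zis.read(buf) != -1) {
                    // Discard the content; only the read itself matters.
                }
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Builds a small two-entry archive in memory (stand-in for a real JAR).
    static byte[] sampleZip() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("hello".getBytes("UTF-8"));
            zos.closeEntry();

            byte[] payload = new byte[50_000];
            for (int i = 0; i < payload.length; i++) {
                payload[i] = (byte) (i % 251);
            }
            zos.putNextEntry(new ZipEntry("b.bin"));
            zos.write(payload);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(sampleZip());
        System.out.println("entries read: " + readAllEntries(in));
    }
}
```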

@mukel
Member

mukel commented May 6, 2022

Hit the same issue in Espresso, which doesn't provide intrinsics and calls the native methods directly. I replaced the native calls with a pure Java CRC32 implementation and the problem went away.
Looking at Arch's zlib, I noticed this recent patch, archlinux/svntogit-packages@87334c1, which should be a fix for madler/zlib#613.

I can reproduce the issue on GraalVM (regular JVM mode) by disabling the Graal CRC32 intrinsics, e.g. -XX:+UnlockDiagnosticVMOptions -XX:-UseCRC32Intrinsics, so the problem definitely comes from zlib.
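
For readers wondering what a pure Java CRC32 replacement might look like, here is a minimal table-driven sketch. It is not the substitution actually used in Espresso; the class name `PureJavaCrc32` and its methods are assumptions for illustration. It uses the standard reflected CRC-32 polynomial 0xEDB88320, the same checksum zlib and java.util.zip.CRC32 compute, and never calls into native code.

```java
// Hypothetical class, written for illustration only.
class PureJavaCrc32 {

    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static final int[] TABLE = new int[256];

    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    // Running state, kept in its pre-inverted form between updates.
    private int crc = 0xFFFFFFFF;

    // Feeds len bytes starting at off; can be called repeatedly with chunks.
    void update(byte[] b, int off, int len) {
        int c = crc;
        for (int i = off; i < off + len; i++) {
            c = TABLE[(c ^ b[i]) & 0xFF] ^ (c >>> 8);
        }
        crc = c;
    }

    // Returns the checksum as an unsigned 32-bit value in a long.
    long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(java.nio.charset.StandardCharsets.US_ASCII);

        PureJavaCrc32 pure = new PureJavaCrc32();
        pure.update(data, 0, 4);               // first chunk
        pure.update(data, 4, data.length - 4); // remaining bytes

        java.util.zip.CRC32 reference = new java.util.zip.CRC32();
        reference.update(data, 0, data.length);

        System.out.printf("pure=%08x reference=%08x%n", pure.getValue(), reference.getValue());
    }
}
```

Comparing such an implementation against java.util.zip.CRC32 on chunked input is one way to tell whether a discrepancy comes from the native zlib path.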

@oubidar-Abderrahim
Member

Thank you, @mukel. Could you please share the reproducer with steps?

@mukel
Member

mukel commented May 10, 2022

Any version of OpenJDK or its derivatives that dynamically links against an unpatched zlib 1.2.12 will hit this issue. A patch is already in the develop branch of zlib, and Arch has included the patch as well; it is just a matter of time until Arch pushes a rolling update.

@oubidar-Abderrahim
Member

Great, I'll close this issue for now. If there's anything to do on our side, please reopen it. Thank you.
