Skip to content

[native-image] Error loading a referenced type: java.lang.IllegalAccessError #1711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
neomatrix369 opened this issue Sep 28, 2019 · 46 comments
Closed
Assignees

Comments

@neomatrix369
Copy link
Contributor

neomatrix369 commented Sep 28, 2019

I get this error when building a native-image from an uber jar, the detailed version of the error is as follows:

Warning: RecomputeFieldValue.ArrayBaseOffset automatic substitution failed. The automatic substitution registration was attempted because a call to sun.misc.Unsafe.arrayBaseOffset(Class) was detected in the static initializer of org.bytedeco.javacpp.indexer.UnsafeRaw. Detailed failure reason(s): Could not determine the field where the value produced by the call to sun.misc.Unsafe.arrayBaseOffset(Class) for the array base offset computation is stored. The call is not directly followed by a field store or by a sign extend node followed directly by a field store. 

Error: Error loading a referenced type: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer
Detailed message:
Trace: 
	at parsing org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:88)
Call path from entry point to org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanPointer, long[], long[], boolean): 
	at org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:87)
	at org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:77)
	at org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:51)
	at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:698)
	at org.nd4j.linalg.api.buffer.BaseDataBuffer.<init>(BaseDataBuffer.java:635)
	at org.nd4j.linalg.api.buffer.LongBuffer.<init>(LongBuffer.java:45)
	at org.nd4j.linalg.cpu.nativecpu.CpuTADManager.getTADOnlyShapeInfo(CpuTADManager.java:93)
	at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:314)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.equalsWithEps(BaseNDArray.java:5209)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.equals(BaseNDArray.java:5245)
	at com.oracle.svm.jni.JNIJavaCallWrappers.jniInvoke_VA_LIST:Ljava_lang_Object_2_0002eequals_00028Ljava_lang_Object_2_00029Z(generated:0)

Is it expecting the class to be opened via the reflect-config.json configuration, which isn't clear from the error message?

The line of code that's causing the error is over here: https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/indexer/BooleanIndexer.java#L87.

Here is the source to org.bytedeco.javacpp.indexer.Raw.java

@neomatrix369
Copy link
Contributor Author

I did add the org.bytedeco.javacpp.indexer.Raw entry to the reflect-config.json but the error still persists.

@neomatrix369 neomatrix369 changed the title Error loading a referenced type: java.lang.IllegalAccessError [native-image] Error loading a referenced type: java.lang.IllegalAccessError Sep 28, 2019
@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Sep 29, 2019

I did some debugging and I think I need to do some of substitution of this form https://gist.github.com/cstancu/e9f737c5c737f0ebd6d5b56e7555e738#file-netty-svm-substitutions-nettysubstitutions-java

Although now I'm wrestling with the onlyWith directive in @RecomputeFieldValue(), this is what it looks like:

/**
 * Configuration to adapt the
 */
@TargetClass(className = "org.bytedeco.javacpp.indexer.UnsafeRaw",
             onlyWith = org.bytedeco.javacpp.indexer.BooleanIndexer.class)
final class Target_org_bytedeco_javacpp_indexer_UnsafeRaw {
    @Alias @RecomputeFieldValue(kind = Kind.ArrayBaseOffset, declClass = byte[].class, isFinal = true)
    private static long ARRAY_BYTE_BASE_OFFSET;
}

native-image builder gives me this error:

Error: Class specified as onlyWith for org.deeplearning4j.feedforward.mnist.Target_org_bytedeco_javacpp_indexer_UnsafeRaw cannot be loaded or instantiated: org.bytedeco.javacpp.indexer.BooleanIndexer
com.oracle.svm.core.util.UserError$UserException: Class specified as onlyWith for org.deeplearning4j.feedforward.mnist.Target_org_bytedeco_javacpp_indexer_UnsafeRaw cannot be loaded or instantiated: org.bytedeco.javacpp.indexer.BooleanIndexer
	at com.oracle.svm.core.util.UserError.abort(UserError.java:75)
	at com.oracle.svm.hosted.substitute.AnnotationSubstitutionProcessor.findTargetClass(AnnotationSubstitutionProcessor.java:801)
	at com.oracle.svm.hosted.substitute.AnnotationSubstitutionProcessor.handleClass(AnnotationSubstitutionProcessor.java:252)
	at com.oracle.svm.hosted.substitute.AnnotationSubstitutionProcessor.init(AnnotationSubstitutionProcessor.java:230)
	at com.oracle.svm.hosted.NativeImageGenerator.createDeclarativeSubstitutionProcessor(NativeImageGenerator.java:875)
	at com.oracle.svm.hosted.NativeImageGenerator.setupNativeImage(NativeImageGenerator.java:824)
	at com.oracle.svm.hosted.NativeImageGenerator.doRun(NativeImageGenerator.java:524)
	at com.oracle.svm.hosted.NativeImageGenerator.lambda$run$0(NativeImageGenerator.java:444)
	at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.NoSuchMethodException: org.bytedeco.javacpp.indexer.BooleanIndexer.<init>()
	at java.lang.Class.getConstructor0(Class.java:3082)
	at java.lang.Class.getDeclaredConstructor(Class.java:2178)
	at com.oracle.svm.util.ReflectionUtil.lookupConstructor(ReflectionUtil.java:69)
	at com.oracle.svm.util.ReflectionUtil.newInstance(ReflectionUtil.java:80)
	at com.oracle.svm.hosted.substitute.AnnotationSubstitutionProcessor.findTargetClass(AnnotationSubstitutionProcessor.java:799)
	... 11 more

I think a no params constructor for org.bytedeco.javacpp.indexer.BooleanIndexer is looked up but BooleanIndexer only has factory methods like create(..., the constructor is protected.

I tried to open up the class via reflect-config.json, but still get the above error:

<--snipped-->
{
  "name":"org.bytedeco.javacpp.indexer.BooleanIndexer",
  "methods":[{"name":"<init>","parameterTypes":["long", "long"]}],
  "allDeclaredMethods":true,
  "allPublicMethods":true,
  "allDeclaredConstructors":true,
  "allPublicConstructors":true
},
<--snipped-->

@peter-hofer
Copy link
Member

@neomatrix369, the onlyWith attribute takes a "predicate class" with a method that tells the image build whether the substitution should be applied, while I think you expect it to mean "only apply the substitution if this class is reachable". Here's an example for a predicate class:

    class AlwaysIncluded implements BooleanSupplier {
        @Override
        public boolean getAsBoolean() {
            return true;
        }
    }

It's possible that you don't need onlyWith at all, have you tried omitting it?

@neomatrix369
Copy link
Contributor Author

@neomatrix369, the onlyWith attribute takes a "predicate class" with a method that tells the image build whether the substitution should be applied, while I think you expect it to mean "only apply the substitution if this class is reachable". Here's an example for a predicate class:

    class AlwaysIncluded implements BooleanSupplier {
        @Override
        public boolean getAsBoolean() {
            return true;
        }
    }

It's possible that you don't need onlyWith at all, have you tried omitting it?

Yes, I initially implemented it without the onlyWith and then the compiler asks for it! When I implemented it I got the error. I'll implement the above to see what happens next. Although the problem might be something else.

@neomatrix369
Copy link
Contributor Author

@peter-hofer so your suggestion worked, but now I'm getting this:

Error: could not find target field: private static long org.deeplearning4j.feedforward.mnist.Target_org_bytedeco_javacpp_indexer_UnsafeRaw.ARRAY_BYTE_BASE_OFFSET

I got this implementation from the graal source files, Ill dig into the stack traces to find out what the source of the issue is.

@peter-hofer
Copy link
Member

peter-hofer commented Sep 30, 2019

Looking at UnsafeRaw.java, it seems like the name of the field is arrayOffset, not ARRAY_BYTE_BASE_OFFSET. The names from the substitution class (the one that is annotated @TargetClass) need to match those from the original class.

@neomatrix369
Copy link
Contributor Author

Looking at UnsafeRaw.java, it seems like the name of the field is arrayOffset, not ARRAY_BYTE_BASE_OFFSET. The names from the substitution class (the one that is annotated @TargetClass) need to match those from the original class.

I just did that so we are on the same page, cause the Target is looking for this specific source file and not the Unsafe.java.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Sep 30, 2019

So we are back to the same error as mentioned in #1711 (comment), except the warning no longer happens. I think our substitution worked and fixed the warning.

Below is the error message in short:

<--snipped-->
Error: Error loading a referenced type: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer
Detailed message:
Trace: 
	at parsing org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:88)
Call path from entry point to org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanPointer, long[], long[], boolean): 
	at org.bytedeco.javacpp.indexer.BooleanIndexer.create(BooleanIndexer.java:87)
	at com.oracle.svm.reflect.BooleanIndexer_create_fbefc57cf8c0e6f1c0b19d7394ef3dc181599717_451.invoke(Unknown Source)
<--snipped-->

I'm going to try opening up org.bytedeco.javacpp.indexer.Raw and seeing what happens - thats my current hunch.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Sep 30, 2019

So doing the above, did not have an effect, here are my new changes to the reflect-config.json:

{
  "name":"org.bytedeco.javacpp.indexer.Raw",
  "allDeclaredMethods":true,
  "allPublicMethods":true,
  "allDeclaredConstructors":true,
  "allPublicConstructors":true,
  "allDeclaredClasses":true,
  "allPublicClasses":true,
  "methods":[{"name":"getInstance","parameterTypes":[] }]
}

Even though the error message appears to be telling Raw is not accessible, making it accessible does not still resolve the error. Its possible the issue is somewhere else.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Sep 30, 2019

Debugging the graal source lead me to see that native-image builder isn't throwing/showing errors caused due to missing Logger class (incomplete classpath), in my case its the way the Slf4JLoggerFactory class is written.

Here is the piece of code I was debugging when the silent failures were occurring:

package com.oracle.graal.pointsto.infrastructure

public class WrappedConstantPool implements ConstantPool {

<--snipped--->

    public static void loadReferencedType(ConstantPool cp, int cpi, int opcode, boolean initialize) {
        ConstantPool root = cp;
        while (root instanceof WrappedConstantPool) {
            root = ((WrappedConstantPool) root).wrapped;
        }

        try {
            hsLoadReferencedType.invoke(root, cpi, opcode, initialize);
        } catch (Throwable ex) {
            Throwable cause = ex;
            if (ex instanceof InvocationTargetException && ex.getCause() != null) {
                cause = ex.getCause();
                if (cause instanceof BootstrapMethodError && cause.getCause() != null) {
                    cause = cause.getCause();
                }
            } else if (ex instanceof ExceptionInInitializerError && ex.getCause() != null) {
                cause = ex.getCause();
            }
            throw new UnresolvedElementException("Error loading a referenced type: " + cause.toString(), cause);   <=== this error is swallowed somewhere up in the stack and not reported
        }
    }
<--snipped--->
}

The error we get on the console at the end of the failed build is a side effect of another error(s).

Is this a known issue / behaviour? Any reasons for it to behave in this fashion?

I have added a substitute for the missing logger but I'm waiting to see if this impacts the execution of my program.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Sep 30, 2019

Through debugging I also found out that the "Error loading a referenced type: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer" error is caused when the Raw.getInstance() static method is invoked via:

hsLoadReferencedType.invoke(root, cpi, opcode, initialize);

   root: "<HotSpotPool: org.bytedeco.javacpp.indexer.BooleanIndexer>"
   cpi: 13
   opcode: 184
   initialize: false

I figured opcode: 184 corresponds to INVOKESTATIC. Not sure what CPI: 13 represents.

@neomatrix369
Copy link
Contributor Author

@peter-hofer any thoughts or hints on the above issues?

@peter-hofer
Copy link
Member

loadReferencedType is in the JVMCI. @dougxc , do you have any ideas what is happening here?

@dougxc
Copy link
Member

dougxc commented Oct 2, 2019

No. I'd need to see the full stack trace for the swallowed error.

@neomatrix369
Copy link
Contributor Author

@dougxc @peter-hofer I'll provide them asap! For both the silent exception and the main issue with BooleanIndexer, they are happening in the same block of code but pathways and variable values are different.

@dougxc
Copy link
Member

dougxc commented Oct 2, 2019

We should also find out where the exception is being swallowed and fix that if possible.

@neomatrix369
Copy link
Contributor Author

We should also find out where the exception is being swallowed and fix that if possible.

Initially, I thought (after reading @cstancu's post) that the native-image builder has become smarter and is handling this commonly known Logger / LoggerFactory issue and not reporting the issue. But in that case if it did that, it should print out a warning saying I have handled this for you, but let me provide the data and you can make the best conclusion out of that I think.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 2, 2019

No. I'd need to see the full stack trace for the swallowed error.

Silent exception: Slf4j
First instance, see https://gist.github.com/neomatrix369/22fa11bade05357392abd8e0f9f4f449 (values in the frame and stack trace till exception are covered) and the second instance, see https://gist.github.com/neomatrix369/a4616b0e79936baf9dc0ee0d7a3637e7

@dougxc
Copy link
Member

dougxc commented Oct 2, 2019

Thanks @neomatrix369 but neither of those (jstack?) traces provide the stack trace of the swallowed exception. The best way to see this would be something like:

diff --git a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
index a5f69e5bd6a..5fab1a2d5ee 100644
--- a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
+++ b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
@@ -96,6 +96,7 @@ public class WrappedConstantPool implements ConstantPool {
             } else if (ex instanceof ExceptionInInitializerError && ex.getCause() != null) {
                 cause = ex.getCause();
             }
+            cause.printStackTrace();
             throw new UnresolvedElementException("Error loading a referenced type: " + cause.toString(), cause);
         }
     }

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 2, 2019

Original exception: org.bytedeco.javacpp.indexer.BooleanIndexer
See stacktrace of the original exception (Error: Error loading a referenced type: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer) I have been debugging https://gist.github.com/neomatrix369/42f9c152c74a77d392741bb68d901338

@neomatrix369
Copy link
Contributor Author

Thanks @neomatrix369 but neither of those (jstack?) traces provide the stack trace of the swallowed exception. The best way to see this would be something like:

diff --git a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
index a5f69e5bd6a..5fab1a2d5ee 100644
--- a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
+++ b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
@@ -96,6 +96,7 @@ public class WrappedConstantPool implements ConstantPool {
             } else if (ex instanceof ExceptionInInitializerError && ex.getCause() != null) {
                 cause = ex.getCause();
             }
+            cause.printStackTrace();
             throw new UnresolvedElementException("Error loading a referenced type: " + cause.toString(), cause);
         }
     }

I'll try to get that, the ones I provided are from IntelliJ's stack trace options capturing all the threads.

@dougxc
Copy link
Member

dougxc commented Oct 2, 2019

This is the interesting trace:

Caused by: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer
	at jdk.vm.ci.hotspot.CompilerToVM.resolveTypeInPool(Native Method)
	at jdk.vm.ci.hotspot.HotSpotConstantPool.loadReferencedType(HotSpotConstantPool.java:728)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.oracle.graal.pointsto.infrastructure.WrappedConstantPool.loadReferencedType(WrappedConstantPool.java:88)

Since the accessing class is in the same package as the accessed class, I can only suspect some class loader problem or the fact this is called via reflection. @tkrodriguez do you have any ideas?

@neomatrix369
Copy link
Contributor Author

This is the interesting trace:

Caused by: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer
	at jdk.vm.ci.hotspot.CompilerToVM.resolveTypeInPool(Native Method)
	at jdk.vm.ci.hotspot.HotSpotConstantPool.loadReferencedType(HotSpotConstantPool.java:728)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.oracle.graal.pointsto.infrastructure.WrappedConstantPool.loadReferencedType(WrappedConstantPool.java:88)

Since the accessing class is in the same package as the accessed class, I can only suspect some class loader problem or the fact this is called via reflection. @tkrodriguez do you have any ideas?

Glad this can be used for further investigations! I suggest you add some sort of save stack trace to log files each time these kings of try...except get executed, I can do a PR with it if you like? Or some other solution so silent exceptions are not going ignored.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 2, 2019

Thanks @neomatrix369 but neither of those (jstack?) traces provide the stack trace of the swallowed exception. The best way to see this would be something like:

diff --git a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
index a5f69e5bd6a..5fab1a2d5ee 100644
--- a/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
+++ b/substratevm/src/com.oracle.graal.pointsto/src/com/oracle/graal/pointsto/infrastructure/WrappedConstantPool.java
@@ -96,6 +96,7 @@ public class WrappedConstantPool implements ConstantPool {
             } else if (ex instanceof ExceptionInInitializerError && ex.getCause() != null) {
                 cause = ex.getCause();
             }
+            cause.printStackTrace();
             throw new UnresolvedElementException("Error loading a referenced type: " + cause.toString(), cause);
         }
     }

@dougxc Please take look at the two gists on Silent exceptions again (#1711 (comment)), I have updated them with stacktraces like you wanted - from cause.printstacktrace().

@neomatrix369
Copy link
Contributor Author

This is the interesting trace:

Caused by: java.lang.IllegalAccessError: tried to access class org.bytedeco.javacpp.indexer.Raw from class org.bytedeco.javacpp.indexer.BooleanIndexer
	at jdk.vm.ci.hotspot.CompilerToVM.resolveTypeInPool(Native Method)
	at jdk.vm.ci.hotspot.HotSpotConstantPool.loadReferencedType(HotSpotConstantPool.java:728)
	at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.oracle.graal.pointsto.infrastructure.WrappedConstantPool.loadReferencedType(WrappedConstantPool.java:88)

Since the accessing class is in the same package as the accessed class, I can only suspect some class loader problem or the fact this is called via reflection. @tkrodriguez do you have any ideas?

Yes, we are allowing reflective access to Raw.getInstance(), see #1711 (comment)

@tkrodriguez
Copy link
Member

But since they are in the same package based on the name this access should always be ok. So unless there are multiple copies of these classes loaded I don't see how it could fail. Running with -verbose:class would show whether there are multiple copies.

Otherwise I'd have to assume there's some exception below all this that hiding what's really going on, though I can't really see a sequence of exceptions which could cause this. It might be interesting to try using a fastdebug JVM and run with -XX:+TraceExceptions. This would produce quite a lot of output but you'll be able to see every exception that's occurring.

Are the command to reproduce this problem relatively straightforward?

@dmlloyd
Copy link
Contributor

dmlloyd commented Oct 2, 2019

But since they are in the same package based on the name this access should always be ok.

Two classes are only in the same package if both their package name and class loader are identical.

The problem you're seeing could be explained by a JVMCI JAR trying to access a class in the built-in JVMCI module(or vice-versa) for example. Both would have the same package name, but would not be the same package. It follows that any code which compares only package names to predict access is incorrect (though I'm not sure that's the case here, it's just a good rule to keep in mind).

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 3, 2019

But since they are in the same package based on the name this access should always be ok. So unless there are multiple copies of these classes loaded I don't see how it could fail. Running with -verbose:class would show whether there are multiple copies.

Otherwise I'd have to assume there's some exception below all this that hiding what's really going on, though I can't really see a sequence of exceptions which could cause this. It might be interesting to try using a fastdebug JVM and run with -XX:+TraceExceptions. This would produce quite a lot of output but you'll be able to see every exception that's occurring.

Are the command to reproduce this problem relatively straightforward?

@tkrodriguez
Some questions, when you say run -verbose:class do you mean pass it via the native-image builder or the native-image (fall-back version) or when I run the jar itself? I'm thinking the jar, am I right?

Is there a fastdebug JVM of GraalVM? CE or EE available? Can you provide via some download? Or is it easy to build via mx?

Again the option -XX:+TraceExceptions should I run when running the native-image builder or the native-image (fall-back version) or the jar?

These are my doubts before I can produce you the stacktrace you need: end-to-end!

@dougxc
Copy link
Member

dougxc commented Oct 3, 2019

Tom means to pass -J-verbose:class to native-image.
Even better would be if you could provide the uber jar you are trying to process along with the native-image command line you are currently using.

There is no fastdebug version of GraalVM. It's something you'd have to build yourself using the fastdebug binaries at https://github.com/graalvm/openjdk8-jvmci-builder/releases/tag/jvmci-19.3-b03. If you manage to do that, then you can pass -J-XX:+TraceExceptions to the native-image command.

@tkrodriguez
Copy link
Member

Yes sorry I was unclear. You can use -J-verbose:class option with the regular GraalVM and that should tell us whether there are multiple copies of those classes loaded. That seems unlikely to me given the context but you never know. As Dougs, the next step would probably be for you to give us the files and command line to reproduce the problem.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 3, 2019

As requested here is the jar file (zipped) with the scripts I use (in the folder called native-image)
https://www.dropbox.com/s/9tng5gzix7whof5/native-image-gitIssue1711.zip?dl=0 (let me know if you have any issues with the jar). The jar file is in the folder called target. I have also bundled the src/main/java/resources folder which also has the META-INF folder with the necessary native-image config files.

Commands to reproduce the issue:

$ unzip the archive
$ cd native-image
$ ./native-image-builder.sh  --native-image-args [ANY EXTRA ARGS]  \
           --build ../target/MLPMnist-1.0.0-bin.jar

### --native-image-args is optional

Below should give you an explanation on what the script can do, options etc...:

./native-image-builder.sh  --help 

@neomatrix369
Copy link
Contributor Author

Tom means to pass -J-verbose:class to native-image.
Even better would be if you could provide the uber jar you are trying to process along with the native-image command line you are currently using.

There is no fastdebug version of GraalVM. It's something you'd have to build yourself using the fastdebug binaries at https://github.com/graalvm/openjdk8-jvmci-builder/releases/tag/jvmci-19.3-b03. If you manage to do that, then you can pass -J-XX:+TraceExceptions to the native-image command.

I'll look into the fastdebug version sometime later, sounds useful for situations like the one we ar having currently.

@neomatrix369
Copy link
Contributor Author

As requested here is the jar file (zipped) with the scripts I use (in the folder called native-image)
https://www.dropbox.com/sh/sj0khjg6whwke0o/AACf1hTvNyzIjdn7YkMb9CEda?dl=0 (let me know if you have any issues with the jar). The jar file is in the folder called target. I have also bundled the src/main/java/resources folder which also has the META-INF folder with the necessary native-image config files.

Commands to reproduce the issue:

$ unzip the archive
$ cd native-image
$ ./native-image-builder.sh  --native-image-args [ANY EXTRA ARGS]  \
           --build ../target/MLPMnist-1.0.0-bin.jar

### --native-image-args is optional

Below should give you an explanation on what the script can do, options etc...:

./native-image-builder.sh  --help 

@dougxc @tkrodriguez did what I attached above work for you? I'm checking to see if its good for you to do your debugging and investigations. Pls let me know otherwise.

@tkrodriguez
Copy link
Member

Yes I'm able to reproduce the failure from the provided jar. I'm looking into it.

@tkrodriguez
Copy link
Member

The problem is that org.bytedeco.javacpp.indexer.Raw is loaded in 'sun/misc/Launcher$AppClassLoader' and org.bytedeco.javacpp.indexer.BooleanIndexer is loaded in com/oracle/svm/hosted/NativeImageClassLoader. Looking at the -verbose:class output I think the problem is that there are 2 copies of those packages. One is in jre/lib/svm/builder/javacpp.jar and the other is in your jar MLPMnist-1.0.0-bin.jar which is why they end up in different loaders. Your copy has extra classes which is what allows it to see the other package. If they had exactly the same classes you'd just end up using the copy in Substrate.

So basically this is a collision between classes used by Substrate and the classes being packaged. @christianwimmer It might be good if Substrate has some sanity checking when loading classes to make sure they are getting loaded from the expected source to make failures of this kind more obvious. Maybe these kinds of support classes used by Substrate need to be repackaged with different names to avoid collisions with user code?

@christianwimmer
Copy link

Yes, we need to make sure that we do not use any external libraries without adding a package prefix.

@tkrodriguez tkrodriguez removed their assignment Oct 5, 2019
@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 6, 2019

Thanks @tkrodriguez for your anaylsis and @christianwimmer for chipping in!

Looks like its all under control now, does it look like there is a quick fix for this might unblock a lot of libraries/frameworks/apps.

@tkrodriguez
Copy link
Member

It's possible if you replaced the javacpp.jar in the GraalVM image with a version that matches what you need everything would work fine. I think javacpp.jar is only used by the LLVM backend which isn't in use here. Actually just deleting it would probably let you move forward.

@neomatrix369
Copy link
Contributor Author

@tkrodriguez thanks for the tips, will do that and see where it goes from here!

@neomatrix369
Copy link
Contributor Author

@tkrodriguez thanks for the tips, will do that and see where it goes from here!

The build worked perfect! Thanks!

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 9, 2019

I have managed to build and run the native-image, although I get the below exception:

Build model....
2019-10-09 04:44:51,893: o.n.l.f.Nd4jBackend - Loaded [CpuBackend] backend
Exception in thread "main" java.lang.ExceptionInInitializerError
	at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:290)
	at java.lang.Class.ensureInitialized(DynamicHub.java:467)
	at org.deeplearning4j.nn.conf.NeuralNetConfiguration$Builder.seed(NeuralNetConfiguration.java:671)
	at org.deeplearning4j.feedforward.mnist.MLPMnistSingleLayerTrain.execute(MLPMnistSingleLayerTrain.java:64)
	at org.deeplearning4j.feedforward.mnist.MLPMnistSingleLayerRunner.execute(MLPMnistSingleLayerRunner.java:101)
	at org.deeplearning4j.feedforward.mnist.MLPMnistSingleLayerRunner.main(MLPMnistSingleLayerRunner.java:72)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5900)
	at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5766)
	at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:202)
	at com.oracle.svm.core.hub.ClassInitializationInfo.invokeClassInitializer(ClassInitializationInfo.java:350)
	at com.oracle.svm.core.hub.ClassInitializationInfo.initialize(ClassInitializationInfo.java:270)
	... 5 more
Caused by: java.lang.NullPointerException
	at org.nd4j.linalg.factory.Nd4j.isSupportedPlatform(Nd4j.java:5908)
	at org.nd4j.linalg.factory.Nd4j.initWithBackend(Nd4j.java:5785)
	... 9 more

Even though I have loaded the org.nd4j.linalg.factory.Nd4jBackend service loader class via the necessary config. Also reflectively opened the org.nd4j.linalg.factory.Nd4j and org.nd4j.linalg.factory.Nd4jBackend classes.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 9, 2019

I have resolved the above error. Although native image still breaks out just as its loading - other JNI related issues are coming to the surface.

@neomatrix369
Copy link
Contributor Author

neomatrix369 commented Oct 24, 2019

The last I tested, this issue seems to be fixed starting 19.2.1, let me know if you don't agree. I'll close it, for now, we can reopen if it reoccurs.

/cc @christianwimmer @tkrodriguez @dougxc

@peter-hofer
Copy link
Member

I don't think this is fixed as long as native-image doesn't have javacpp in a shadow package. I'll keep you posted.

@peter-hofer peter-hofer reopened this Oct 29, 2019
@neomatrix369
Copy link
Contributor Author

I don't think this is fixed as long as native-image doesn't have javacpp in a shadow package. I'll keep you posted.

I was hoping it was fixed, one less thing for you guys to worry about, thanks for keeping me posted!

@peter-hofer
Copy link
Member

Fixed in 9140f26

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants