Job fails due to NoSuchMethodError exception #103

Open
kosii opened this issue Apr 10, 2017 · 8 comments

kosii commented Apr 10, 2017

I'm trying to execute the index_spark job on a Spark 1.6.2 cluster pre-built for Hadoop 2.4.0. Each time I submit the job, I get this exception:

2017-04-10T16:17:58,925 WARN [task-result-getter-0] org.apache.spark.scheduler.TaskSetManager - Lost task 2.0 in stage 0.0 (TID 2, 172.20.0.1): java.lang.NoSuchMethodError: com.google.inject.util.Types.collectionOf(Ljava/lang/reflect/Type;)Ljava/lang/reflect/ParameterizedType;
	at com.google.inject.multibindings.Multibinder.collectionOfProvidersOf(Multibinder.java:202)
	at com.google.inject.multibindings.Multibinder$RealMultibinder.<init>(Multibinder.java:283)
	at com.google.inject.multibindings.Multibinder$RealMultibinder.<init>(Multibinder.java:258)
	at com.google.inject.multibindings.Multibinder.newRealSetBinder(Multibinder.java:178)
	at com.google.inject.multibindings.Multibinder.newSetBinder(Multibinder.java:150)
	at io.druid.guice.LifecycleModule.getEagerBinder(LifecycleModule.java:130)
	at io.druid.guice.LifecycleModule.configure(LifecycleModule.java:136)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:223)
	at com.google.inject.spi.Elements.getElements(Elements.java:101)
	at com.google.inject.spi.Elements.getElements(Elements.java:92)
	at com.google.inject.util.Modules$RealOverriddenModuleBuilder$1.configure(Modules.java:152)
	at com.google.inject.AbstractModule.configure(AbstractModule.java:59)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:223)
	at com.google.inject.spi.Elements.getElements(Elements.java:101)
	at com.google.inject.spi.Elements.getElements(Elements.java:92)
	at com.google.inject.util.Modules$RealOverriddenModuleBuilder$1.configure(Modules.java:152)
	at com.google.inject.AbstractModule.configure(AbstractModule.java:59)
	at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:223)
	at com.google.inject.spi.Elements.getElements(Elements.java:101)
	at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:133)
	at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:103)
	at com.google.inject.Guice.createInjector(Guice.java:95)
	at com.google.inject.Guice.createInjector(Guice.java:72)
	at com.google.inject.Guice.createInjector(Guice.java:62)
	at io.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:366)
	at io.druid.indexer.spark.SerializedJsonStatic$.liftedTree1$1(SparkDruidIndexer.scala:438)
	at io.druid.indexer.spark.SerializedJsonStatic$.injector$lzycompute(SparkDruidIndexer.scala:437)
	at io.druid.indexer.spark.SerializedJsonStatic$.injector(SparkDruidIndexer.scala:436)
	at io.druid.indexer.spark.SerializedJsonStatic$.liftedTree2$1(SparkDruidIndexer.scala:465)
	at io.druid.indexer.spark.SerializedJsonStatic$.mapper$lzycompute(SparkDruidIndexer.scala:464)
	at io.druid.indexer.spark.SerializedJsonStatic$.mapper(SparkDruidIndexer.scala:463)
	at io.druid.indexer.spark.SerializedJson.getMap(SparkDruidIndexer.scala:520)
	at io.druid.indexer.spark.SerializedJson.readObject(SparkDruidIndexer.scala:534)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
	at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

This causes my job to fail.

I used the following command to install the package:

java -classpath "lib/*" io.druid.cli.Main tools pull-deps -c io.druid.extensions:druid-spark-batch_2.10:0.9.2.14 -h org.apache.spark:spark-core_2.10:1.6.2

I tried manually adding the Guice jar to my Spark classpath, but it didn't help. I also noticed that the job works with a local Spark master (local[*]). I read this page because I found similar errors for the index_hadoop job, but I couldn't really apply those tips to my case.

Any help would be really appreciated.

Update: I'm using Imply 2.0.0.


kosii commented Apr 11, 2017

I'm not sure if it matters, but the cluster runs in standalone mode.


kosii commented Apr 11, 2017

Also, adding "spark.executor.userClassPathFirst": "true" to the job properties resolves this problem, but creates another one:

2017-04-11T08:59:10,762 WARN [task-result-getter-1] org.apache.spark.scheduler.TaskSetManager - Lost task 1.0 in stage 0.0 (TID 1, 172.20.0.1): java.io.IOException: java.lang.ClassCastException: cannot assign instance of scala.Some to field org.apache.spark.Accumulable.name of type scala.Option in instance of org.apache.spark.Accumulator
	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1212)
	at org.apache.spark.Accumulable.readObject(Accumulators.scala:151)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
	at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:207)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: cannot assign instance of scala.Some to field org.apache.spark.Accumulable.name of type scala.Option in instance of org.apache.spark.Accumulator
	at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
	at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2237)
	at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:552)
	at org.apache.spark.Accumulable$$anonfun$readObject$1.apply$mcV$sp(Accumulators.scala:152)
	at org.apache.spark.Accumulable$$anonfun$readObject$1.apply(Accumulators.scala:151)
	at org.apache.spark.Accumulable$$anonfun$readObject$1.apply(Accumulators.scala:151)
	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1205)
	... 30 more
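
For reference, a minimal sketch of where that flag sits, assuming the same top-level properties layout as the snippet in the next comment:

"properties": {
    "spark.executor.userClassPathFirst": "true"
}

As described above, this trades the NoSuchMethodError for the ClassCastException in the trace.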


kosii commented Apr 12, 2017

For future reference, the solution was adding the right versions of Guice and Guava to the executor's classpath. I added this entry to the properties object of the ingestion spec:

"properties": {
    "spark.executor.extraClassPath": "guice-4.1.0.jar:guava-16.0.1.jar"
}

I'm not sure whether this is a bug, but I expected it to work out of the box since I'm using pretty standard versions of everything.
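
A note for anyone adapting this: the entries in spark.executor.extraClassPath are plain classpath entries on the executor JVM, so the jars need to be resolvable on every worker (bare file names resolve against the executor's working directory). A hypothetical variant with absolute paths, where the /opt/druid locations are placeholders rather than anything from this thread:

"properties": {
    "spark.executor.extraClassPath": "/opt/druid/lib/guice-4.1.0.jar:/opt/druid/lib/guava-16.0.1.jar"
}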

drcrallen (Contributor) commented
Classpath problems are very nasty and hard to track down, especially once you distribute things out to the cluster. Thank you a ton for reporting your workaround. This ticket will remain open until a more sustainable solution is available.


bendoerr commented Dec 6, 2017

Sharing my solution: we run in cluster mode and provide a "spark.executor.uri". No matter what I tried, I either ended up with the wrong version of Guice on the executor or the wrong Protobuf. I ended up building a modified Spark distribution to provide via "spark.executor.uri", with the same Guice, Guava, and Protobuf jars as Druid.
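
A rough sketch of what that repackaging might look like; every path and version below is an illustrative assumption, not something taken from this thread, and it assumes a Spark 2.x-style distribution where dependency jars live under jars/ (a 1.6 distribution ships a single assembly jar instead, so the same idea would mean rebuilding the assembly):

# Unpack a stock Spark distribution (version assumed)
tar -xzf spark-2.1.0-bin-hadoop2.7.tgz
cd spark-2.1.0-bin-hadoop2.7

# Swap the bundled Guice/Guava/Protobuf for the versions Druid ships with
# (4.1.0 and 16.0.1 match the versions mentioned earlier in this thread)
rm jars/guice-*.jar jars/guava-*.jar jars/protobuf-java-*.jar
cp /path/to/druid/lib/guice-4.1.0.jar jars/
cp /path/to/druid/lib/guava-16.0.1.jar jars/
cp /path/to/druid/lib/protobuf-java-*.jar jars/

# Repackage and host the archive somewhere the cluster can fetch it,
# then point spark.executor.uri at the new tarball
cd ..
tar -czf spark-2.1.0-bin-hadoop2.7-druid-deps.tgz spark-2.1.0-bin-hadoop2.7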

drcrallen (Contributor) commented

@bendoerr: if you're building your own Druid after 0.11.0, you can also use the spark2 profile to package the "correct" jars compatible with stock Spark: https://github.com/druid-io/druid/blob/druid-0.11.0/pom.xml#L1154
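
For anyone trying that, the build would presumably look something like the following; the profile id spark2 is taken from the comment above, and the rest are standard Maven flags rather than anything verified against that exact tag:

git clone https://github.com/druid-io/druid.git
cd druid
git checkout druid-0.11.0
mvn clean package -DskipTests -Pspark2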


ywilkof commented Dec 14, 2017

@drcrallen I built using the Spark 2.x profile, but submitting jobs to Spark still fails with the same NoSuchMethodError.

richiesgr commented

Hi,
I have exactly the same problem here and I tried the workaround above, but it's not working for me:

"spark.executor.userClassPathFirst": "true",
"spark.executor.extraClassPath": "guice-4.1.0.jar:guava-16.0.1.jar",

If I add only the Guava reference, I get the same NoSuchMethodError exception. If I add both, then I get:

java.lang.ClassCastException: cannot assign instance of scala.Some to field org.apache.spark.util.AccumulatorMetadata.name of type scala.Option in instance of org.apache.spark.util.AccumulatorMetadata

Any ideas to help?
Thanks
