Getting "java.lang.StackOverflowError" when reading avro file #47
Description
I get "java.lang.StackOverflowError" when using sqlContext.load(filePath, "com.databricks.spark.avro") to read a specific Avro file into a DataFrame.
However, I could not reproduce the issue with the sample episode.avro file you provide, so it seems to be related to the schema of our Avro file, which is considerably more complex than the schema of episode.avro.
Meanwhile, both the avro-tools.jar utility provided by Apache and the Python API can read and parse our Avro file without any issues.
Do you think there is a compatibility issue between spark-avro and the Avro schema we use? Is there any workaround that might fix it for now? Part of the error stack is below. Thanks!
15/05/11 17:00:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/05/11 17:00:20 INFO SparkUI: Started SparkUI at http://10.0.1.11:4040
15/05/11 17:00:20 INFO Executor: Starting executor ID <driver> on host localhost
15/05/11 17:00:20 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@10.0.1.11:57777/user/HeartbeatReceiver
15/05/11 17:00:20 INFO NettyBlockTransferService: Server created on 57779
15/05/11 17:00:20 INFO BlockManagerMaster: Trying to register BlockManager
15/05/11 17:00:20 INFO BlockManagerMasterActor: Registering block manager localhost:57779 with 2.9 GB RAM, BlockManagerId(<driver>, localhost, 57779)
15/05/11 17:00:20 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.StackOverflowError
at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:47)
at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:62)
at scala.collection.mutable.Buffer$.newBuilder(Buffer.scala:44)
at scala.collection.generic.GenericTraversableTemplate$class.newBuilder(GenericTraversableTemplate.scala:64)
at scala.collection.AbstractTraversable.newBuilder(Traversable.scala:105)
at scala.collection.TraversableLike$class.filter(TraversableLike.scala:262)
at scala.collection.AbstractTraversable.filter(Traversable.scala:105)
at scala.collection.TraversableLike$class.filterNot(TraversableLike.scala:274)
at scala.collection.AbstractTraversable.filterNot(Traversable.scala:105)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:72)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:58)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:58)
at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
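The trace repeats the same cycle (SchemaConverters.scala lines 50 → 58 → 74), which is consistent with the schema containing a self-referential record type (e.g. a record whose field refers back to the record itself) that the converter descends into without bound. The following is a minimal sketch, not the spark-avro source: `AvroType`, `SafeConverter`, and the registry are all hypothetical names, and the point is only to show how tracking already-visited record names makes such a conversion terminate on a recursive schema.

```scala
// Hypothetical model of an Avro-like type system (NOT the spark-avro API):
// a record whose field refers back to the record by name forms a cycle.
sealed trait AvroType
case class RecordType(name: String, fields: List[(String, AvroType)]) extends AvroType
case class NamedRef(name: String) extends AvroType // reference back to a named record
case object StringType extends AvroType

object SafeConverter {
  // Convert to a SQL-type description, resolving named refs through `registry`.
  // `seen` records which named records are already on the current path;
  // without it, a self-referential schema would recurse until StackOverflowError.
  def toSqlType(t: AvroType,
                registry: Map[String, RecordType],
                seen: Set[String] = Set.empty): String = t match {
    case StringType => "StringType"
    case NamedRef(n) =>
      if (seen(n)) s"<recursive ref to $n>" // cycle detected: stop instead of recursing
      else toSqlType(registry(n), registry, seen)
    case RecordType(n, fs) =>
      val inner = fs.map { case (f, ft) => s"$f: ${toSqlType(ft, registry, seen + n)}" }
      s"StructType(${inner.mkString(", ")})"
  }
}

// A self-referential schema, analogous to: record Node { string value; Node next; }
val node = RecordType("Node", List(("value", StringType), ("next", NamedRef("Node"))))
val sql = SafeConverter.toSqlType(node, Map("Node" -> node))
// → StructType(value: StringType, next: <recursive ref to Node>)
```

If the Avro schema here is indeed recursive, a short-term workaround may be to read the file with plain Avro APIs and flatten or truncate the recursive field before handing the data to Spark SQL, since Spark's StructType cannot represent an unbounded recursive structure.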