This repository was archived by the owner on Dec 20, 2018. It is now read-only.

Getting "java.lang.StackOverflowError" when reading avro file #47

Description

@imlk

I got "java.lang.stackOverflowError" when using sqlContext.load(filePath, "com.databricks.spark.avro") to read a specific avro file to a DataFrame.
However, I could not reproduce this issue on the sample episode.avro file that you provided; thus it seems like this is something related to the schema of the avro file. I realize that the schema of the avro file is much more complicated than the schema of episode.avo file.
But in the meanwhile, when I was using either the avro-tool.jar tool that provided by Apache or the Python API, it can read and parse our avro file without any issues.
Do you think there is a compatibility issue between the spark-avro and the avro file schema we use ? Is there any workaround that might fix it for now ? Below is part of the error stack. Thanks !
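For reference, the read is roughly the following (a minimal sketch assuming the Spark 1.3.x API with spark-avro on the classpath; the path and app name are placeholders, not our real setup):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Minimal sketch of the failing read; in spark-shell, sc and sqlContext already exist.
val sc = new SparkContext(new SparkConf().setAppName("avro-repro").setMaster("local[*]"))
val sqlContext = new SQLContext(sc)

val filePath = "/path/to/our-data.avro" // hypothetical path, not the real file
val df = sqlContext.load(filePath, "com.databricks.spark.avro") // throws java.lang.StackOverflowError
df.printSchema()
```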

15/05/11 17:00:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/05/11 17:00:20 INFO SparkUI: Started SparkUI at http://10.0.1.11:4040
15/05/11 17:00:20 INFO Executor: Starting executor ID <driver> on host localhost
15/05/11 17:00:20 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@10.0.1.11:57777/user/HeartbeatReceiver
15/05/11 17:00:20 INFO NettyBlockTransferService: Server created on 57779
15/05/11 17:00:20 INFO BlockManagerMaster: Trying to register BlockManager
15/05/11 17:00:20 INFO BlockManagerMasterActor: Registering block manager localhost:57779 with 2.9 GB RAM, BlockManagerId(<driver>, localhost, 57779)
15/05/11 17:00:20 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.StackOverflowError
    at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:47)
    at scala.collection.mutable.ArrayBuffer.<init>(ArrayBuffer.scala:62)
    at scala.collection.mutable.Buffer$.newBuilder(Buffer.scala:44)
    at scala.collection.generic.GenericTraversableTemplate$class.newBuilder(GenericTraversableTemplate.scala:64)
    at scala.collection.AbstractTraversable.newBuilder(Traversable.scala:105)
    at scala.collection.TraversableLike$class.filter(TraversableLike.scala:262)
    at scala.collection.AbstractTraversable.filter(Traversable.scala:105)
    at scala.collection.TraversableLike$class.filterNot(TraversableLike.scala:274)
    at scala.collection.AbstractTraversable.filterNot(Traversable.scala:105)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:72)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:58)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:50)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:58)
    at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:74)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:51)
    at com.databricks.spark.avro.SchemaConverters$$anonfun$1.apply(SchemaConverters.scala:50)
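
To make the suspected trigger concrete: the repeating SchemaConverters.toSqlType frames above look like the converter recursing through record fields without terminating. Below is a hypothetical sketch of the kind of self-referential record that could cause this; it is not our real schema, just an illustration:

```scala
import org.apache.avro.Schema

// Hypothetical schema, not our real one: a record whose "parent" field refers
// back to its own type ("Node"). Converting such a schema field-by-field would
// recurse indefinitely and overflow the stack.
val recursiveSchemaJson =
  """{
    |  "type": "record",
    |  "name": "Node",
    |  "fields": [
    |    {"name": "value",  "type": "string"},
    |    {"name": "parent", "type": ["null", "Node"], "default": null}
    |  ]
    |}""".stripMargin

val recursiveSchema = new Schema.Parser().parse(recursiveSchemaJson)
println(recursiveSchema.toString(true))
```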
