
Spline producer url unreachable from spark #805

Closed
3 of 6 tasks
siddharths067 opened this issue Nov 25, 2020 · 5 comments

siddharths067 commented Nov 25, 2020

Describe the bug

Spark cannot connect to the Spline gateway; the producer endpoint returns HTTP 404.

I used the TL;DR configuration and packages given at https://absaoss.github.io/spline/: the agent bundle za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.5, with 'spark.spline.producer.url' set to 'http://localhost:9090/producer'.
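For reference, the agent is wired up roughly like this (reconstructed from the TL;DR page, so the exact invocation may differ slightly):

    # register the Spline listener and point it at the gateway's /producer endpoint
    spark-shell \
      --packages za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.5 \
      --conf "spark.sql.queryExecutionListeners=za.co.absa.spline.harvester.listener.SplineQueryExecutionListener" \
      --conf "spark.spline.producer.url=http://localhost:9090/producer"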

But I get this error in my Spark job:

20/11/25 04:05:03 ERROR QueryExecutionEventHandlerFactory: Spline initialization failed! Spark Lineage tracking is DISABLED.
za.co.absa.spline.harvester.exception.SplineInitializationException: Spark Agent was not able to establish connection to Spline Gateway. HTTP Response: 404 

The docker-compose logs showed this error in the spline container:

spline_1    | 04:15:29.924 [main] INFO  z.c.a.s.p.ArangoRepoConfig$$EnhancerBySpringCGLIB$$71b73625 - Connecting to arangodb://arangodb:8529/spline
spline_1    | 04:15:30.466 [main] WARN  z.c.a.s.g.r.AppInitializer$$anon$1 - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'arangoRepoConfig': Invocation of init method failed; nested exception is java.lang.RuntimeException: Database version 0.4.0 is out of date, version 0.5.5 is required. Please execute 'java -jar admin-0.5.5.jar db-upgrade' to upgrade the database.
spline_1    | 04:15:30.482 [main] ERROR o.s.web.context.ContextLoader - Context initialization failed
spline_1    | org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'arangoRepoConfig': Invocation of init method failed; nested exception is java.lang.RuntimeException: Database version 0.4.0 is out of date, version 0.5.5 is required. Please execute 'java -jar admin-0.5.5.jar db-upgrade' to upgrade the database.

Following the logs, I ran the recommended command (java -jar admin-0.5.5.jar db-upgrade) by modifying the spline service command in docker-compose as follows:

    command: >
      bash -c "echo 'Initializing Spline DB...'
      && curl -O -s https://repo1.maven.org/maven2/za/co/absa/spline/admin/0.5.0/admin-0.5.0.jar
      && java -jar ./admin-0.5.0.jar db-init arangodb://arangodb/spline -s
      && java -jar admin-0.5.5.jar db-upgrade
      && catalina.sh run"

But I still get this error and the spline container exits:

spline_1    | Error: Unable to access jarfile admin-0.5.5.jar
spline_spline_1 exited with code 1
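Looking at the command above again, it downloads admin-0.5.0.jar but then invokes admin-0.5.5.jar, which was never fetched, hence the "Unable to access jarfile" message. A version-consistent variant would presumably look more like this (untested sketch; I am also guessing that db-upgrade takes the same arangodb://arangodb/spline connection string as db-init):

    command: >
      bash -c "echo 'Initializing Spline DB...'
      && curl -O -s https://repo1.maven.org/maven2/za/co/absa/spline/admin/0.5.5/admin-0.5.5.jar
      && java -jar ./admin-0.5.5.jar db-init arangodb://arangodb/spline -s
      && java -jar ./admin-0.5.5.jar db-upgrade arangodb://arangodb/spline
      && catalina.sh run"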

Versions

Scala 2.12
Spline 0.5
Spark 3.0.0

Components State

  • ArangoDB running without errors
  • ArangoDB spline database initialized
  • Rest Gateway running and
    • connects to ArangoDB
    • there are no errors in logs
  • Spline UI running and
    • connects to Rest Gateway consumer
    • there are no errors in logs
cerveada self-assigned this Nov 25, 2020
cerveada added a commit that referenced this issue Nov 25, 2020

cerveada commented Nov 25, 2020

Please remove the containers and try using the following docker compose
https://github.com/AbsaOSS/spline/blob/docker-compose-update-to-0.5.5/docker-compose.yml

The current one hasn't been updated; it still uses the latest image version, which has changed, and that mismatch is causing these problems.
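Something like this should do it (the raw.githubusercontent.com URL is just my guess at the raw form of the blob link above):

    # remove the old containers (and volumes, if any, so the ArangoDB database is re-created)
    docker-compose down -v
    # fetch the updated compose file and start again
    curl -O https://raw.githubusercontent.com/AbsaOSS/spline/docker-compose-update-to-0.5.5/docker-compose.yml
    docker-compose up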

Edit: the docker-compose in the documentation is fixed now.

siddharths067 (Author) commented:

@cerveada the docker-compose stack now runs successfully. But after my Spark job ends, it throws this exception:

20/11/25 10:53:58 ERROR Utils: uncaught error in thread spark-listener-group-shared, stopping SparkContext
java.lang.NoSuchMethodError: 'org.json4s.Formats org.json4s.Formats.$plus$plus(scala.collection.Traversable)'
	at za.co.absa.commons.json.format.JavaTypesSupport.formats(JavaTypesSupport.scala:23)
	at za.co.absa.commons.json.format.JavaTypesSupport.formats$(JavaTypesSupport.scala:23)
	at __wrapper$1$4fc0c0d1871142fb815f664199d41b35.__wrapper$1$4fc0c0d1871142fb815f664199d41b35$$anon$1.formats(<no source file>)
	at za.co.absa.commons.json.AbstractJsonSerDe.$init$(AbstractJsonSerDe.scala:32)
	at __wrapper$1$4fc0c0d1871142fb815f664199d41b35.__wrapper$1$4fc0c0d1871142fb815f664199d41b35$$anon$1.<init>(<no source file>)
	at __wrapper$1$4fc0c0d1871142fb815f664199d41b35.__wrapper$1$4fc0c0d1871142fb815f664199d41b35$.$anonfun$wrapper$1(<no source file>)
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<init>(HarvesterJsonSerDe.scala:37)
	at za.co.absa.spline.harvester.json.HarvesterJsonSerDe$.<clinit>(HarvesterJsonSerDe.scala)
	at za.co.absa.spline.harvester.dispatcher.HttpLineageDispatcher.send(HttpLineageDispatcher.scala:148)
	at za.co.absa.spline.harvester.QueryExecutionEventHandler.$anonfun$onSuccess$2(QueryExecutionEventHandler.scala:45)
	at za.co.absa.spline.harvester.QueryExecutionEventHandler.$anonfun$onSuccess$2$adapted(QueryExecutionEventHandler.scala:43)
	at scala.Option.foreach(Option.scala:407)
	at za.co.absa.spline.harvester.QueryExecutionEventHandler.onSuccess(QueryExecutionEventHandler.scala:43)
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.$anonfun$onSuccess$2(SplineQueryExecutionListener.scala:40)
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.$anonfun$onSuccess$2$adapted(SplineQueryExecutionListener.scala:40)
	at scala.Option.foreach(Option.scala:407)
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.$anonfun$onSuccess$1(SplineQueryExecutionListener.scala:40)
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.withErrorHandling(SplineQueryExecutionListener.scala:49)
	at za.co.absa.spline.harvester.listener.SplineQueryExecutionListener.onSuccess(SplineQueryExecutionListener.scala:40)
	at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:153)
	at org.apache.spark.sql.util.ExecutionListenerBus.doPostEvent(QueryExecutionListener.scala:127)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:115)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:99)
	at org.apache.spark.sql.util.ExecutionListenerBus.postToAll(QueryExecutionListener.scala:127)
	at org.apache.spark.sql.util.ExecutionListenerBus.onOtherEvent(QueryExecutionListener.scala:133)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:82)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:115)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:99)
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1319)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
20/11/25 10:53:58 ERROR Utils: throw uncaught fatal error in thread spark-listener-group-shared
java.lang.NoSuchMethodError: 'org.json4s.Formats org.json4s.Formats.$plus$plus(scala.collection.Traversable)'
	[... same stack trace as above ...]
Exception in thread "spark-listener-group-shared" java.lang.NoSuchMethodError: 'org.json4s.Formats org.json4s.Formats.$plus$plus(scala.collection.Traversable)'
	[... same stack trace as above ...]

cerveada (Contributor) commented:

Spark 3.0.0 is not yet supported by Spline. See the compatibility matrix.
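(The NoSuchMethodError on org.json4s.Formats is the typical symptom: the spark-2.4 bundle is built against the json4s that ships with Spark 2.4, which is presumably not binary-compatible with the one on the Spark 3.0 classpath.) Until Spark 3 support lands, the agent bundle has to match the cluster, roughly:

    # check the Spark and Scala versions of the cluster first
    spark-submit --version
    # then pick the agent bundle whose name matches them, e.g. for Spark 2.4.x with Scala 2.12:
    #   za.co.absa.spline.agent.spark:spark-2.4-spline-agent-bundle_2.12:0.5.5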


wajda commented Nov 25, 2020

Spark 3.0 support is a work in progress; see AbsaOSS/spline-spark-agent#93.

siddharths067 (Author) commented:

Yikes, thanks, I guess I should close this then.
