ADAM compatibility with Spark 2.0 #1021
Comments
Update: I've solved the logging problems by using an ADAM logger rather than the now-private Spark one.
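The replacement logger mentioned above could look something like the sketch below: a small trait that reproduces the `logInfo`-style API classes previously got from `org.apache.spark.Logging`, backed here by `java.util.logging` from the JDK. This is a hypothetical illustration, not ADAM's actual implementation; the trait and class names (`Logging`, `AlignmentLoader`) are made up for the example.

```scala
import java.util.logging.{Level, Logger}

// Hypothetical stand-in for the now-private org.apache.spark.Logging trait.
// Mixing this in gives classes the same logInfo/logWarning/logError calls
// they used before, without depending on Spark internals.
trait Logging {
  @transient private lazy val logger: Logger =
    Logger.getLogger(getClass.getName)

  // By-name parameters avoid building the message string
  // when the log level is disabled.
  protected def logInfo(msg: => String): Unit =
    if (logger.isLoggable(Level.INFO)) logger.info(msg)

  protected def logWarning(msg: => String): Unit =
    if (logger.isLoggable(Level.WARNING)) logger.warning(msg)

  protected def logError(msg: => String): Unit =
    if (logger.isLoggable(Level.SEVERE)) logger.severe(msg)
}

// Example user of the trait (illustrative class, not part of ADAM).
class AlignmentLoader extends Logging {
  def load(path: String): String = {
    logInfo(s"loading alignments from $path")
    path
  }
}
```

Classes that previously did `import org.apache.spark.Logging` would instead mix in a project-local trait like this one, so the call sites stay unchanged.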
Continuing from previous comment...
I suspect I can work around this by using the implicit which apparently replaces it. As an aside, I was surprised to see at: that in ADAM we have code that we are actually placing in a Spark package.
No, that is bad practice. It is most likely that way because we're using some private or package-private Spark API, which is sometimes the only way to get things done.
There are several places where ADAM extends Spark instrumentation code. I have a version of bdg-utils that comments out the overridden functions in these ADAM instrumentation extensions, which allows bdg-utils to compile. When ADAM is then built against that version of bdg-utils and Spark 2.0, ADAM passes all of its tests and thus seems to work with Spark 2.0. However, some bdg-utils tests still fail, such as the one below, which reports an error that comes from deep within Spark itself.
I'm interested in discussing the history and future plans for the InstrumentedRDD code in ADAM, which we shoehorn into Spark packages. It reaches deep into Spark and will have to be maintained in parallel over time, as this issue demonstrates. What use cases does ADAM have that require our own private instrumentation codebase? Why is it not part of Spark, where it could be tested and maintained, assuming such instrumentation would be useful to projects beyond ADAM? Do other projects use bdg-utils and this InstrumentedRDD?
Note as of 20 minutes ago with
Note, I believe I have a working (if hacky) solution to this. I'll highlight it for review, with the goal of replacing it with a proper equals function when I make the PR.
Here you can find branches of bdg-utils and ADAM which compile and pass tests against Spark 2.0.0-SNAPSHOT 4987f39: https://github.com/jpdna/adam/tree/spark2.0_scala_2.11 These branches are rebased on the current master of adam and utils as of 5/18/2016. Once Spark 2.0.0 is officially released, we should make a plan for how to move the Scala 2.11/Spark 2.0 long-term branches into our main bdg repos, and how to keep them in sync with the Scala 2.10/Spark 1.6 version, which I imagine we will maintain and release for some time forward as well (ideally applying new PRs and Jenkins testing to both automatically).
Closed by #1123
I'd like to start a discussion here about the strategy for building ADAM against Spark 2.0 (which I understand is now essentially the current master at https://github.com/apache/spark ).
There are Dataset API operations, like sortWithinPartitions, that don't exist in Spark 1.6. I want to experiment with them, and they will be needed to move more of ADAM's functionality to the Dataset API, so I'm motivated to help make this happen ASAP.
At the moment, the first error I see when building ADAM against the current Spark master (after changing the obvious dependencies in the POM and building/installing the latest Spark locally) is that logging was made private in 2.0, as per apache/spark@8ef3399.
So our import of `org.apache.spark.Logging` fails.
Some questions:
Is there a compelling reason to wait until Spark 2.0 is released in the coming month(s), or does it make sense to start making a version of ADAM that builds against the current Apache Spark master on GitHub now?
Has any work been done so far towards ADAM on Spark 2.0? What are the known issues - and any that we expect will require substantial effort?
@heuermh, you manage these wizardly build issues; if you have a suggestion for how I can best help here, or how we can work together, I'd like to benefit from your experience.