
Spark 2.0 Branch / Support [Enhancement] #78

Closed
javadba opened this issue Jun 8, 2016 · 10 comments

Comments

@javadba
Contributor

javadba commented Jun 8, 2016

I added a PR for Spark 2.0 that uses the SparkSession instead of the SparkContext. In addition, the libraries were moved to Scala 2.11 and Hadoop 2.7.1 to be more in line with the Spark 2.x direction. The tests were run and pass.

#77
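For anyone skimming the diff, the gist of the change is swapping the entry point. A minimal sketch of what that looks like on the caller's side; the `CaffeOnSpark.train` call is hypothetical and stands in for this project's actual API:

```scala
import org.apache.spark.sql.SparkSession

object SparkSessionExample {
  def main(args: Array[String]): Unit = {
    // Spark 2.x: SparkSession is the unified entry point; the old
    // SparkContext remains reachable as spark.sparkContext where needed.
    val spark = SparkSession.builder()
      .appName("CaffeOnSparkExample")
      .getOrCreate()

    // Hypothetical call taking the session instead of a SparkContext:
    // CaffeOnSpark.train(spark, args)

    spark.stop()
  }
}
```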

@javadba javadba changed the title Spark 2.0 Branch / Support Spark 2.0 Branch / Support [Enhancement] Jun 8, 2016
@mriduljain
Contributor

It would be great if you could make this backwards compatible.

@javadba
Contributor Author

javadba commented Jun 8, 2016

To be backwards compatible we would need to retain the SparkContext as the input parameter to all the methods. That is squarely contrary to Spark 2.x: 2.x is a breaking change, so to be consistent it is a breaking change here as well.

Now, if you really want a non-breaking change, I can add it in a separate branch. Keep in mind it will be using deprecated APIs in that case.


@javadba
Contributor Author

javadba commented Jun 8, 2016

FYI, I have created said branch and am looking to see if I can make this happen. I'll update later today.

@javadba
Contributor Author

javadba commented Jun 8, 2016

Backwards compatibility is in place. Two methods are provided: one using SparkSession and one using SparkContext. The updates are in the same branch to simplify our discussion. Feel free to take a look at the latest commit on the previously provided PR.
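For reference, the overloading pattern described above looks roughly like this sketch; the `load` method name is hypothetical, and the real signatures are in the PR:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.{DataFrame, SparkSession}

object SourceExample {
  // Spark 2.x entry point.
  def load(session: SparkSession, path: String): DataFrame =
    session.read.parquet(path)

  // Spark 1.x-style entry point, kept so existing callers compile:
  // wrap the SparkContext's configuration in a SparkSession and delegate.
  def load(sc: SparkContext, path: String): DataFrame =
    load(SparkSession.builder().config(sc.getConf).getOrCreate(), path)
}
```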

@mriduljain
Contributor

Thanks, will go through and comment soon.


@javadba
Contributor Author

javadba commented Jun 8, 2016

I should clarify: the backwards compatibility applies only to the consuming source code, which does not need to be changed.

The SparkSession class is referenced, and thus this change will not run against Spark 1.x. To truly achieve backwards compatibility we would need to add shell scripts to manipulate the source files, and I do not believe that would be worth it. Instead, maintain this in a separate Spark 2.x branch; at some point you decide to merge it into main, and the Spark 1.x branch goes into maintenance mode.

@javadba
Contributor Author

javadba commented Jun 8, 2016

A new PR has been opened that simplifies the approach and provides full backwards compatibility via a Maven profile: #79
Use

mvn -Dspark2  <actions>

to build against Spark 2.x. The actual versions of Spark, Scala, and Hadoop are specified in that profile.
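A property-activated profile of this sort looks roughly like the sketch below; the profile id and version numbers here are illustrative, and the authoritative definition is the pom.xml in #79.

```xml
<profile>
  <id>spark2</id>
  <activation>
    <!-- Activated whenever -Dspark2 is passed on the command line. -->
    <property>
      <name>spark2</name>
    </property>
  </activation>
  <properties>
    <!-- Illustrative versions; see PR #79 for the real ones. -->
    <spark.version>2.0.0</spark.version>
    <scala.version>2.11.8</scala.version>
    <hadoop.version>2.7.1</hadoop.version>
  </properties>
</profile>
```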

@javadba
Contributor Author

javadba commented Jun 9, 2016

I found a bug in #79 and am looking into it.

java.lang.NoSuchMethodError: org.apache.spark.sql.SQLContext.createDataFrame(Lorg/apache/spark/rdd/RDD;Lorg/apache/spark/sql/types/StructType;)Lorg/apache/spark/sql/DataFrame;
at com.yahoo.ml.caffe.DataFrameTest$$anonfun$1.apply$mcV$sp(DataFrameTest.scala:51)

@javadba
Contributor Author

javadba commented Jun 9, 2016

False alarm! The reason is:

If you first build against Spark 1.x via

mvn package

and then try Spark 2.x via

mvn -Dspark2 package

it WILL fail.

We need to run clean so everything gets recompiled. (In Spark 2.x, DataFrame became a type alias for Dataset[Row], so methods such as SQLContext.createDataFrame have a different return type at the bytecode level; test classes compiled against 1.x then fail at runtime with the NoSuchMethodError above.) So the following is required:

mvn -Dspark2  clean package

The tests are passing both under Spark 1.x / Scala 2.10 and Spark 2.x / Scala 2.11.

@mriduljain
Contributor

I guess this pull request has been merged. Closing this. Thanks so much!
