[ZEPPELIN-605] Add support for Scala 2.11 #747

Conversation
  <dependency>
    <groupId>com.typesafe.akka</groupId>
-   <artifactId>akka-actor_${flink.scala.binary.version}</artifactId>
+   <artifactId>akka-actor_${scala.binary.version}</artifactId>
There might still be a need for interpreters to individually require different versions of Scala? I think for now it might make sense to keep separate flink.scala, ignite.scala and so on.
What would be the use case for having interpreter-specific Scala versions? Right now it seems that everything works with the latest Scala 2.10. I lean towards simplifying the build now, and making it more complex only when actually needed. Do we have a concrete case where this is needed now?
If we are talking here about Scala 2.10 versus 2.11, I plan to handle that with profiles/modules.
Yes, I was referring to flink.scala.binary.version and ignite.scala.binary.version.
Maybe Ignite doesn't support Scala 2.11? https://github.com/apache/ignite/blob/master/pom.xml#L453
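For illustration, keeping per-interpreter Scala versions would amount to declaring separate properties in the parent pom. The snippet below is a hypothetical sketch of what that could look like (the property names mirror the ones discussed above, and the values are assumptions), not Zeppelin's actual pom.xml:

```xml
<properties>
  <!-- shared default, used by interpreters that follow the main Scala version -->
  <scala.binary.version>2.10</scala.binary.version>
  <!-- hypothetical per-interpreter overrides; e.g. Ignite could be pinned to 2.10
       while Flink simply tracks the shared version -->
  <flink.scala.binary.version>${scala.binary.version}</flink.scala.binary.version>
  <ignite.scala.binary.version>2.10</ignite.scala.binary.version>
</properties>
```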
The current status of this PR is that it works with both Scala 2.10 and Scala 2.11. I still need to try to have a single source tree for both Scala versions, and also to rebase to pick up the most recent projects added to trunk.
Hi @lresende, I rebased the PR and built a distribution with:

But I still don't have access to a SparkContext instance:

Am I missing anything? Thanks.
@adeandrade this seems to be caused by a breaking change in Spark 2.0, which is also being fixed in #868
@adeandrade is your Spark binary compiled with Scala 2.11? If not, it will continue to use the REPL classes from Scala 2.10 (https://issues.apache.org/jira/browse/SPARK-1812) and you will hit the exception.
@karup1990 Yes, it is. Did you rebase with master? I think it works if you don't. @felixcheung suggests the problem is being fixed in #868. I'm waiting for that PR to be merged to try again. If it doesn't work then I'll fall back to your suggestion. Thanks.
Sorry for not being so responsive here, I was busy with conferences, etc., but I should be able to devote some time to this again. This should work if you don't rebase; otherwise we need to update the new modules, such as the R extensions, to properly work on Scala 2.11. Note that one way or another, you probably need to build Spark and Flink with Scala 2.11 and make sure you use Maven "install" and not only "package".
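For reference, building Spark itself against Scala 2.11 follows the steps from the Spark build documentation; the commands below are a sketch for a Spark 1.6.x source tree, and the exact profiles and versions will vary with your environment:

```shell
# In the Spark source tree: point the build at Scala 2.11...
./dev/change-scala-version.sh 2.11
# ...then build and *install* into the local Maven repository (not just package),
# so that Zeppelin's build can resolve the 2.11 artifacts
mvn -Dscala-2.11 -DskipTests clean install
```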
@lresende Awesome! Tested and it looks good to me!
    return invokeMethod(o, name, new Class[]{}, new Object[]{});
  }

  private Object invokeMethod(Object o, String name, Class[] argTypes, Object[] params) {
It's awesome, you did a lot of work to get both Scala 2.10 and 2.11 supported!
Probably a nit, but what do you think: shall these methods be pulled up to some common ancestor to avoid duplication between DepInterpreter and SparkInterpreter? Or maybe just extracted and re-used?
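For illustration, the duplicated helpers could be extracted into a small shared utility class along these lines. This is only a sketch: the class name `ReflectionUtils` and the exact signatures are assumptions, not the refactoring that actually landed in the PR.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/**
 * Hypothetical shared home for the reflection helpers duplicated between
 * DepInterpreter and SparkInterpreter (a sketch, not the actual class).
 */
public class ReflectionUtils {

  /** Invoke a public no-argument method by name. */
  public static Object invokeMethod(Object o, String name)
      throws NoSuchMethodException, IllegalAccessException, InvocationTargetException {
    return invokeMethod(o, name, new Class<?>[]{}, new Object[]{});
  }

  /** Invoke a public method by name with the given argument types and values. */
  public static Object invokeMethod(Object o, String name, Class<?>[] argTypes, Object[] params)
      throws NoSuchMethodException, IllegalAccessException, InvocationTargetException {
    Method method = o.getClass().getMethod(name, argTypes);
    return method.invoke(o, params);
  }
}
```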
Looks 👍 💯 to me
@bzz and @Leemoonsoo I have updated the README to match the CI configuration, and also refactored the reflection utility methods into a separate class.
Looks like there was a network problem on the last CI build. Could you trigger CI once again and see if it goes green?
@Leemoonsoo all back to green
Looks good to me!
Looks great to me! Thank you @lresende. Let's merge if there is no further discussion.
Merging it into master and branch-0.6
@Leemoonsoo could you look into the non-trivial merge conflicts that happen on merging to
@bzz Sure
### What is this PR for?
Add new interpreter to the Python group: `%python.sql` for SQL over DataFrame support

### What type of PR is it?
Improvement

### TODOs
* [x] add new interpreter `%python.sql`
* [x] add test
* [x] make Python-dependent tests excluded from CI
  * PythonInterpreterWithPythonInstalledTest
  * PythonPandasSqlInterpreterTest
  * run manually by `mvn -Dpython.test.exclude='' test -pl python -am`
* [x] add docs for `%python.sql`
* [x] make `%python.sql` fail gracefully in case there is no Pandas or PandaSQL installed
* [x] after #747 is merged, rebase and remove `-Dpython.test.exclude=''` from both profiles

### What is the Jira issue?
[ZEPPELIN-1115](https://issues.apache.org/jira/browse/ZEPPELIN-1115)

### How should this be tested?
`mvn -Dpython.test.exclude='' test -pl python -am` should pass, or run manually:

- Given a DataFrame, i.e.

```
%python
import pandas as pd
rates = pd.read_csv("bank.csv", sep=";")
```

- SQL-query it like

```
%python.sql
SELECT * FROM rates LIMIT 10
```

### Questions:
* Does the licenses files need update? No, no dependencies were included in source or binary release
* Is there breaking changes for older versions? No
* Does this need documentation? Yes

Author: Alexander Bezzubov <bzz@apache.org>

Closes #1164 from bzz/ZEPPELIN-1115/python/add-sql-for-dataframes and squashes the following commits:

0f2f852 [Alexander Bezzubov] Fail SQL gracefully if no python dependencies installed
aca2bdf [Alexander Bezzubov] Fix typos in docs ⚡
158ba6a [Alexander Bezzubov] Remove third-party dependant test from CI
5fe46fc [Alexander Bezzubov] Update Python Matplotlib notebook example
72884c8 [Alexander Bezzubov] Add docs for %python.sql feature
e931dc4 [Alexander Bezzubov] Make test for PythonPandasSqlInterpreter usable
76bbb44 [Alexander Bezzubov] Complete implementation of the PythonPandasSqlInterpreter
f6ca1eb [Alexander Bezzubov] Add %python.sql to interpreter menue
11ba490 [Alexander Bezzubov] Add draft implementation of %python.sql for DataFrames
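Under the hood, pandasql loads the DataFrame into an in-memory SQLite database and runs the query there. Outside Zeppelin, the same round trip can be sketched with pandas and the standard library; the DataFrame contents and column names below are made up for illustration:

```python
import sqlite3

import pandas as pd

# A small DataFrame standing in for the notebook's `rates` table
rates = pd.DataFrame({"age": [30, 41, 25], "job": ["admin.", "technician", "student"]})

# Load it into an in-memory SQLite database and query it with plain SQL,
# which is essentially what pandasql does behind %python.sql
con = sqlite3.connect(":memory:")
rates.to_sql("rates", con, index=False)
result = pd.read_sql("SELECT job, age FROM rates WHERE age > 28 ORDER BY age", con)
print(result)
```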
Enable Zeppelin to be built with both Scala 2.10 and Scala 2.11, mostly to start supporting interpreters that are moving to Scala 2.11 only, such as Spark. Before testing this PR, one would need to [build Spark 1.6.1 for example with Scala 2.11](http://spark.apache.org/docs/latest/building-spark.html#building-for-scala-211) and [build Flink 1.0 with Scala 2.11](https://ci.apache.org/projects/flink/flink-docs-master/setup/building.html#scala-versions).

Author: Luciano Resende <lresende@apache.org>
Author: Lee moon soo <moon@apache.org>

Closes #747 from lresende/scala-210-211 and squashes the following commits:

b9bdf86 [Luciano Resende] Properly invoke createTempDir from spark utils
c208e69 [Luciano Resende] Fix class reference
87f46de [Luciano Resende] Force build
6e5e5ad [Luciano Resende] Refactor utility methods to helper class
4e2237a [Luciano Resende] Update readme to use profile to build scala 2.11 and match CI
dd79443 [Luciano Resende] Minor formatting change to force build
de4fc10 [Luciano Resende] Minor change to force build
9194218 [Lee moon soo] initialize imain
cbf84c7 [Luciano Resende] Force Scala 2.11 profile to be called
98790a6 [Luciano Resende] Remove obsolete/commented config
6e4f7b0 [Luciano Resende] Force scala-library dependency version based on scala
a3d0525 [Luciano Resende] Fix new code to support both scala versions
e068593 [Luciano Resende] Fix pom.xml merge conflict
736d055 [Lee moon soo] make binary built with scala 2.11 work with spark_2.10 binary
74d8a62 [Luciano Resende] Force close
9f5d2a2 [Lee moon soo] Remove unused methods
fc9e8a0 [Lee moon soo] Update ignite interpreter
6d3e7e2 [Lee moon soo] Update FlinkInterpreter
6b9ff1d [Lee moon soo] SparkContext sharing seems not working in scala 2.11, disable the test
9424769 [Lee moon soo] style
2ec51a3 [Lee moon soo] Fix reflection
c999a2d [Lee moon soo] fix style
dfe6e83 [Lee moon soo] Fix reflection around HttpServer and createTempDir
222e4e7 [Lee moon soo] Fix reflection on creating SparkCommandLine
112ae7d [Lee moon soo] Fix some reflections
b9e0e1e [Lee moon soo] scala 2.11 support for spark interpreter
c88348d [Lee moon soo] Initial scala-210, 211 support in the single binary
5c47d9a [Luciano Resende] [ZEPPELIN-605] Rewrite Spark interpreter based on Scala 2.11 support
a73b68d [Luciano Resende] [ZEPPELIN-605] Enable Scala 2.11 REPL support for Spark Interpreter
175be7a [Luciano Resende] [ZEPPELIN-605] Add Scala 2.11 build profile
82eaefa [Luciano Resende] [ZEPPELIN-605] Add support for Scala 2.11

(cherry picked from commit bd714c2)
Signed-off-by: Lee moon soo <moon@apache.org>
### What is this PR for?
This PR implements Spark 2.0 support based on #747, taking the approach from #980 of reimplementing code in Scala. You can try building this branch:

```
mvn clean package -Dscala-2.11 -Pspark-2.0 -Dspark.version=2.0.0-preview -Ppyspark -Psparkr -Pyarn -Phadoop-2.6 -DskipTests
```

### What type of PR is it?
Improvements

### Todos
* [x] Spark 2.0 support
* [x] Rebase after #747 merge
* [x] Update LICENSE file
* [x] Update related document (build)

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-759

### How should this be tested?
Build with the command above and try it out.

### Questions:
* Does the licenses files need update? Yes
* Is there breaking changes for older versions? No
* Does this need documentation? Yes

Author: Lee moon soo <moon@apache.org>

Closes #1195 from Leemoonsoo/spark-20 and squashes the following commits:

d78b322 [Lee moon soo] trigger ci
8017e8b [Lee moon soo] Remove unnecessary spark.version property
e3141bd [Lee moon soo] restart sparkcluster before sparkr test
1493b2c [Lee moon soo] print spark standalone cluster log when ci test fails
a208cd0 [Lee moon soo] Debug sparkRTest
31369c6 [Lee moon soo] Update license
293896a [Lee moon soo] Update build instruction
862ff6c [Lee moon soo] Make ZeppelinSparkClusterTest.java work with spark 2
839912a [Lee moon soo] Update SPARK_HOME directory detection pattern for 2.0.0-preview in the test
3413707 [Lee moon soo] Update .travis.yml
02bcd5d [Lee moon soo] Update SparkSqlInterpreterTest
f06a2fa [Lee moon soo] Spark 2.0 support

(cherry picked from commit 8546666)
Signed-off-by: Lee moon soo <moon@apache.org>