Conversation

@zjffdu (Contributor) commented Aug 2, 2016

What is this PR for?

PySparkInterpreter doesn't work in Spark 2.0 because pyspark and py4j are not distributed to the executors. This PR extracts the pyspark interpreter setup code into a setupConfForPySpark method and uses it for both Spark 1 and Spark 2. This is only a short-term solution, though: I think this should be handled by Spark rather than Zeppelin, since here Zeppelin duplicates part of Spark's work. In the long term, I'd like to resolve it in ZEPPELIN-1263.
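The fix boils down to pointing Spark at the pyspark and py4j archives that ship under `$SPARK_HOME/python/lib`, so that YARN distributes them to executors and they end up on the Python path. A minimal Python sketch of that idea (the real setupConfForPySpark is Java code in Zeppelin's spark module; the function names, the exact conf keys used, and the py4j version below are illustrative assumptions, not the actual implementation):

```python
import os

def pyspark_archives(spark_home, py4j_version="0.10.1"):
    """Locate the archives Spark 2.0 ships under python/lib.
    The py4j version is an assumption; the real code would scan the directory."""
    lib = os.path.join(spark_home, "python", "lib")
    return [os.path.join(lib, "pyspark.zip"),
            os.path.join(lib, "py4j-%s-src.zip" % py4j_version)]

def setup_conf_for_pyspark(spark_home):
    """Sketch of what the setup step has to arrange: ship the archives
    to the executors and make them importable there."""
    archives = pyspark_archives(spark_home)
    return {
        # ask YARN to distribute the archives to executor containers
        "spark.yarn.dist.files": ",".join(archives),
        # make the shipped archives importable by executor Python workers
        "spark.executorEnv.PYTHONPATH":
            ":".join(os.path.basename(a) for a in archives),
    }
```

Without some step like this, executors in yarn-client mode have no pyspark/py4j on their Python path, which matches the failure described above.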

What type of PR is it?

[Bug Fix]

Todos

  • https://issues.apache.org/jira/browse/ZEPPELIN-1263

What is the Jira issue?

  • https://issues.apache.org/jira/browse/ZEPPELIN-1267

How should this be tested?

Verify it manually.

Screenshots (if appropriate)

![2016-08-02_1749](https://cloud.githubusercontent.com/assets/164491/17324523/7d349c60-58d9-11e6-9d3e-5072e1505575.png)

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@bzz (Member) commented Aug 2, 2016

\cc @jongyoul for review

@Leemoonsoo (Member)

Looks good to me.

@jongyoul (Member) commented Aug 3, 2016

I've tested it. LGTM.

@bzz (Member) commented Aug 4, 2016

Let's merge if there is no further discussion!

@minahlee (Member) commented Aug 4, 2016

@zjffdu I tried the current master branch with master set to local[*], Spark standalone, and yarn-client, both with and without SPARK_HOME set, but somehow I was able to run the pyspark interpreter without this patch. I don't know what I missed; could you tell me which environment I should try to make the pyspark interpreter fail on the master branch?

@zjffdu (Contributor, author) commented Aug 4, 2016

I also used the latest master and built Zeppelin with this command:
`mvn clean package -Pspark-2.0 -Ppyspark -Psparkr -DskipTests`, then exported SPARK_HOME to point at the Spark 2.0 installation and ran the pyspark interpreter in yarn-client mode.

@minahlee (Member) commented Aug 4, 2016

@zjffdu Thank you for the quick response. I was able to reproduce the issue and verified that this patch fixes it!
Merging if there is no further discussion.

asfgit closed this in 161dd0e on Aug 4, 2016
asfgit pushed a commit that referenced this pull request Aug 4, 2016
### What is this PR for?
PySparkInterpreter doesn't work in Spark 2.0 because pyspark and py4j are not distributed to the executors. This PR extracts the pyspark interpreter setup code into a setupConfForPySpark method and uses it for both Spark 1 and Spark 2. This is only a short-term solution, though: I think this should be handled by Spark rather than Zeppelin, since here Zeppelin duplicates part of Spark's work. In the long term, I'd like to resolve it in `ZEPPELIN-1263`.

### What type of PR is it?
[Bug Fix]

### Todos
* https://issues.apache.org/jira/browse/ZEPPELIN-1263

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1267

### How should this be tested?
Verify it manually.

### Screenshots (if appropriate)
![2016-08-02_1749](https://cloud.githubusercontent.com/assets/164491/17324523/7d349c60-58d9-11e6-9d3e-5072e1505575.png)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Jeff Zhang <zjffdu@apache.org>

Closes #1260 from zjffdu/ZEPPELIN-1267 and squashes the following commits:

81d1d56 [Jeff Zhang] ZEPPELIN-1267. PySparkInterpreter doesn't work in spark 2.0

(cherry picked from commit 161dd0e)
Signed-off-by: Mina Lee <minalee@apache.org>
PhilippGrulich pushed a commit to SWC-SENSE/zeppelin that referenced this pull request Aug 8, 2016
