-
Notifications
You must be signed in to change notification settings - Fork 2.8k
[ZEPPELIN-1665] Z.run with external note executable and access resource for zeppelin in each interpreter #1637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZEPPELIN-1665] Z.run with external note executable and access resource for zeppelin in each interpreter #1637
Conversation
# Conflicts: # spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java
Leemoonsoo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cloverhearts for the contribution. I left some comments. Please take a look.
| * get Remote Zeppelin Server Controller interface | ||
| */ | ||
| @ZeppelinApi | ||
| public abstract RemoteWorksController getRemoteZeppelinServerController(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better pass RemoteWorksController through InterpreterContext from here, and avoid defining new methods in Interpreter class.
| private int port = -1; | ||
| private final String interpreterDir; | ||
| private final String localRepoDir; | ||
| private RemoteWorksController remoteWorksController; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to avoid RemoteInterpreterManagedProcess holds reference to RemoteWorksController, because the object is not used in RemoteInterpreterManagedProcess.
Which can be done by adding new method in the RemoteInterpreterProcessListener interface.
| notebook.addNotebookEventListener(notebookWsServer.getNotebookInformationListener()); | ||
|
|
||
| remoteWorksManager = new RemoteWorksManager(notebook); | ||
| replFactory.setRemoteController(remoteWorksManager.getInstance()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remoteWorksManager.getInstance() is invoked here only, not anywhere else.
That means we don't need singleton pattern here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted this class file.
| return RemoteWorksManager.instance; | ||
| } | ||
|
|
||
| private class NotebookJobManager implements RemoteWorksController { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is purpose of having NotebookJobManager private class and RemoteWorksController interface?
I think all the methods in the NotebookJobManager can be moved to RemoteWorksManager, and RemoteWorksController interface can be removed.
I think that's simple and easier to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a sign of my impatience.
I expected RemoteWorksController to exist in various cases in the future.
But this is not what we need now.
Your opinion is correct.
I will modify this.
| OUTPUT_UPDATE = 9, | ||
| ANGULAR_REGISTRY_PUSH = 10, | ||
| APP_STATUS_UPDATE = 11, | ||
| REMOTE_ZEPPELIN_SERVER_CONTROL = 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use more specific name for this. such as PARAGRAPH_RUN_CONTEXT.
| struct RemoteZeppelinServerController { | ||
| 1: RemoteZeppelinServerControlEvent type, | ||
| 2: string eventOwnerKey | ||
| 3: string msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically, we don't need enum RemoteZeppelinServerControlEvent, enum RemoteZeppelinServerResourceType, struct ZeppelinServerResourceParagraphRunner, struct ZeppelinServerResource, struct RemoteZeppelinServerController.
Please try use generic message data structure RemoteInterpreterEvent for sending event from interpreter process to zeppelin server process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you,
modified done. :)
| RemoteApplicationResult unloadApplication(1: string applicationInstanceId); | ||
| RemoteApplicationResult runApplication(1: string applicationInstanceId); | ||
|
|
||
| void remoteZeppelinServerControlFeedback(1: RemoteZeppelinServerController response); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try use primitive types instead of RemoteZeppelinServerController. You can json serialize complex data into string like other rpc methods above.
And more specific name, such as 'sendRunContexts()`
# Conflicts: # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterEventPoller.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteInterpreterEventType.java
# Conflicts: # scio/src/main/scala/org/apache/zeppelin/scio/ScioInterpreter.scala
# Conflicts: # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java # zeppelin-interpreter/src/main/thrift/RemoteInterpreterService.thrift # zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteAngularObjectTest.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterOutputTestStream.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterTest.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/resource/DistributedResourcePoolTest.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/scheduler/RemoteSchedulerTest.java # zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
|
@moon Thank you for your kind and good feedback. |
|
Change structure and remove remoteWorksManager |
|
CI Error. This is irrelevant to my edits. |
|
All work is done. |
Leemoonsoo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cloverhearts for addressing comments.
I left few more comments.
About some unnecessary changes around import blocks in many interpreter modules and for more consistency in thrift idl.
| import org.apache.zeppelin.interpreter.InterpreterContext; | ||
| import org.apache.zeppelin.interpreter.InterpreterPropertyBuilder; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult; | ||
| import org.apache.zeppelin.interpreter.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to keep import block unchanged here (and the in the other interpreters)?
| import org.apache.zeppelin.interpreter.InterpreterContext; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult; | ||
| import org.apache.zeppelin.interpreter.InterpreterResult.Code; | ||
| import org.apache.zeppelin.interpreter.RemoteWorksController; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you keep PigInterpreter.java unchanged?
| private SparkDependencyResolver dep; | ||
| private String sparkUrl; | ||
|
|
||
| private RemoteWorksController remoteWorksController; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you keep SparkInterpreter.java unchanged?
| struct RemoteZeppelinServerController { | ||
| 2: string eventOwnerKey | ||
| 3: string msg | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every message RemoteInterpreter send to ZeppelinServer using RemoteInterpreterEventClient is RemoteInterpreterEvent type. Shell we avoid defining new type - ZeppelinServerResourceParagraphRunner and RemoteZeppelinServerController - for the payload ?
Because all other event are sending their payload using RemoteInterpreterEvent instead of defining their own struct in thrift idl. So let's do it in consistent way.
| RemoteApplicationResult unloadApplication(1: string applicationInstanceId); | ||
| RemoteApplicationResult runApplication(1: string applicationInstanceId); | ||
|
|
||
| void onReceivedResourceParagraphRunners(1: RemoteInterpreterEvent response); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is message from ZeppelinServer to Interpreter process. Like other methods definitions, shell we not use RemoteInterpreterEvent and use primitive types instead? If you have a complex type payload, you can json serialize and send it as a string. (see angularObjectAdd at line 118)
|
@Leemoonsoo |
|
Could you keep following files unchanged and resolve conflict with master? bigquery/src/main/java/org/apache/zeppelin/bigquery/BigQueryInterpreter.java Except for that, LGTM |
# Conflicts: # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/InterpreterCompletion.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteApplicationResult.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteInterpreterContext.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteInterpreterEvent.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteInterpreterResult.java # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/thrift/RemoteInterpreterService.java
|
One test case is experiencing a problem. |
|
Okay, fixed all problem. |
|
I still can see some unnecessary changes on the import code block in many files. |
|
ci retry |
|
@Leemoonsoo Sorry, I think I misunderstood your answer. All edits are complete. |
|
@cloverhearts Thanks for the great contribution. |

What is this PR for?
Currently, the z.run command is restricted.
Only paragraphs in a single note can be executed.
I have modified this to allow you to freely execute paragraphs of other notes.
This PR provides the basis for the freeful use of Zeppelin's resources at each Interpreter implementation.
What type of PR is it?
Improvement, Feature
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1665
How should this be tested?
Currently under development.
run paragraph in same note
run paragraph with external note
all note run
Screenshots (if appropriate)
paragraph run

noterun

Questions: