ZEPPELIN-1692: Ability to access Spark jobs UI from the paragraph #1663
|
Awesome! |
| Date dateUpdated; | ||
| private Map<String, Object> config; // paragraph configs like isOpen, colWidth, etc | ||
| public final GUI settings; // form and parameter settings | ||
| private Map<String, Set<String>> runtimeInfos; |
I don't want to persist this info in JSON.
When I add the transient modifier, the serialization to JSON (for broadcasting to the client browser) does not happen.
I can add an exclusion strategy for all the storage handlers, but I wanted to check whether there is an easier way to achieve this. @Leemoonsoo @cloverhearts
Hello,
I think there is no problem with using transient.
If you think there is a problem with using transient, can you explain the reason?
Actually, my English is not good.
I am sorry if I have been rude to you.
Thank you.
@cloverhearts Your English is not bad 👍
If I add the transient modifier, then when Gson.toJson(Paragraph) is called to send a message about the note/paragraph, runtimeInfos won't be serialized and won't be received by the browser (although runtimeInfos also won't get persisted in note.json).
Whereas if I don't add the modifier (current implementation), it will get persisted in JSON. But we don't want that, since it is runtime information and we might not want to persist it in note.json.
Is there a way to serialize this info (for sending to the client browser) without persisting it in JSON?
PS: The runtimeInfos object holds the paragraph URL information.
2fc4dcf to
ad50955
Compare
|
Ready for review @Leemoonsoo |
|
Let me review this PR and then give you some feedback. Thanks @karup1990!
|
|
Thanks @1ambda for looking into this PR and for the suggestions. |
|
@karup1990 Sorry for the misspelling. It was "clear output". |
| Method take; | ||
| String jobGroup = "zeppelin-" + interpreterContext.getParagraphId(); | ||
| String jobGroup = "zeppelin-" + interpreterContext.getNoteId() + "-" | ||
| + interpreterContext.getParagraphId(); |
I see several places that construct jobGroupId from noteId/paragraphId and extract noteId/paragraphId from jobGroupId. It would be better to put them together so that others can understand the logic easily and other places can reuse it.
I have added Utils.buildJobGroupId and am using it throughout.
| String id = interpreterGroup.getId(); | ||
| int indexOfColon = id.indexOf(":"); | ||
| return id.substring(0, indexOfColon); | ||
| } |
This assumes the interpreterGroup id format, which seems fragile.
From what I see, the interpreter setting id is the first prefix, and further ids are appended based on the interpreter mode. Please correct me if I am wrong.
I think it is better to put getInterpreterSettingId into Utils as well because it is about how we parse interpreterGroup
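To illustrate the fragility concern, here is a minimal sketch of a more defensive parser. The class name `InterpreterGroupIds`, the method `settingIdOf`, and the sample ids in the usage note are hypothetical; the only thing taken from the diff is the assumption that the interpreter setting id is the segment before the first `:`. The snippet in the diff would throw when the id contains no `:`, because `substring(0, -1)` is invalid.

```java
final class InterpreterGroupIds {
    private InterpreterGroupIds() {}

    /**
     * Returns the part of the interpreter group id before the first ':'.
     * Assumption: the interpreter setting id is the first ':'-separated segment.
     * If there is no ':' at all, the whole id is returned instead of throwing.
     */
    static String settingIdOf(String interpreterGroupId) {
        if (interpreterGroupId == null || interpreterGroupId.isEmpty()) {
            throw new IllegalArgumentException("interpreterGroupId must not be empty");
        }
        int indexOfColon = interpreterGroupId.indexOf(':');
        return indexOfColon < 0
            ? interpreterGroupId
            : interpreterGroupId.substring(0, indexOfColon);
    }
}
```

For example, `InterpreterGroupIds.settingIdOf("settingA:shared_process")` yields `settingA`, and an id without a colon is returned unchanged.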
| SparkInterpreter sparkInterpreter = getSparkInterpreter(); | ||
| sparkInterpreter.populateSparkWebUrl(interpreterContext); | ||
|
|
||
| String jobGroup = sparkInterpreter.getJobGroup(interpreterContext); |
Would it be better to rename it to sparkInterpreter.buildJobGroupId, since it creates a new jobGroupId each time, IIUC?
|
I left some comments inline. Could you also add a test for this feature? |
| <a href="{{paragraph.runtimeInfos.jobUrl[0]}}" target="_blank"><span class="fa fa-tasks"></span> Spark job </a> | ||
| </span> | ||
| <span class="dropdown" ng-show="paragraph.runtimeInfos.jobUrl.length > 1"> | ||
| <span class="fa fa-tasks" style="cursor:pointer;color:#3071A9" tooltip-placement="top" tooltip="Run this paragraph (Shift+Enter)" |
I think tooltip-placement="top" tooltip="Run this paragraph (Shift+Enter)" isn't needed here :)
|
Hi @karup1990, really nice and useful feature indeed.
|
| ng-click="runParagraph(getEditorValue())" | ||
| ng-show="paragraph.status!='RUNNING' && paragraph.status!='PENDING' && paragraph.config.enabled"></span> | ||
| <span ng-show="paragraph.runtimeInfos.jobUrl.length == 1"> | ||
| <a href="{{paragraph.runtimeInfos.jobUrl[0]}}" target="_blank"><span class="fa fa-tasks"></span> Spark job </a> |
Maybe it's a nitpick, but let's capitalize "Spark job" the same way "Spark Jobs" below does :)
And how about we generalize this feature a little bit more, such as
runtimeInfos = {
label: "Spark Job",
jobUrl: [],
}
so that other interpreters can also leverage this feature?
@1ambda Clearing the links when we clear the output 👍
@AhyoungRyu Fixed font issue
@Leemoonsoo Added a ParagraphRuntimeInfos class (https://github.com/apache/zeppelin/pull/1663/files#diff-a2ac97dfb3f07b2a3fd1533af701295e) to provide generic support for adding more properties. Let me know how it looks.
@AhyoungRyu @Leemoonsoo @1ambda Fixed the alignment and capitalized the text (to access the jobs)

|
Thanks for the feedback. I will handle it some time over the weekend. |
|
The failure seems unrelated. |
4fcc24e to
d2d633e
Compare
|
Updated the PR with a test @zjffdu |
e8fd1e0 to
9612dd0
Compare
|
Ready for review |
| int secondIndex = jobgroupId.indexOf("-", indexOf + 1); | ||
| return jobgroupId.substring(secondIndex + 1, jobgroupId.length()); | ||
| } | ||
| }; |
Would it be better to put getNoteId and getParagraphId in Utils, so that the constructing and parsing logic are in the same class?
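A minimal sketch of what keeping the constructing and parsing logic in one class could look like. The class name `JobGroupIds` and its method names are hypothetical and are not the PR's actual `Utils` code; the format `"zeppelin-" + noteId + "-" + paragraphId` is taken from the diff above. The sketch assumes the note id never contains a `-`, while the paragraph id may.

```java
final class JobGroupIds {
    private static final String PREFIX = "zeppelin-";

    private JobGroupIds() {}

    /** Builds "zeppelin-<noteId>-<paragraphId>". */
    static String build(String noteId, String paragraphId) {
        return PREFIX + noteId + "-" + paragraphId;
    }

    /** Extracts the note id: the text between the prefix and the next '-'. */
    static String noteIdOf(String jobGroupId) {
        return jobGroupId.substring(PREFIX.length(), separatorIndex(jobGroupId));
    }

    /** Extracts the paragraph id: everything after the note id separator. */
    static String paragraphIdOf(String jobGroupId) {
        return jobGroupId.substring(separatorIndex(jobGroupId) + 1);
    }

    // Index of the '-' that separates the note id from the paragraph id.
    private static int separatorIndex(String jobGroupId) {
        if (jobGroupId == null || !jobGroupId.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a Zeppelin job group id: " + jobGroupId);
        }
        int index = jobGroupId.indexOf('-', PREFIX.length());
        if (index < 0) {
            throw new IllegalArgumentException("Missing paragraph id: " + jobGroupId);
        }
        return index;
    }
}
```

With the build and parse steps side by side, a change to the id format only has to be made in one place, and a round trip (build, then parse) is easy to cover in a unit test.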
|
|
||
| public void addValue(String value) { | ||
| if (values == null) { | ||
| values = new ArrayList<>(); |
Should we create values in the constructor? Otherwise this method is not thread-safe (two threads might each create their own values list).
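One way to address this, shown as a sketch only: initialize the list eagerly and use a concurrent list implementation, so that no lazy null check is needed. The class name `RuntimeInfoValues` is hypothetical and stands in for whatever class holds `values` in the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class RuntimeInfoValues {
    // Created once at construction time, so two threads can never
    // race to create two different lists.
    private final List<String> values = new CopyOnWriteArrayList<>();

    void addValue(String value) {
        values.add(value); // CopyOnWriteArrayList makes concurrent adds safe
    }

    /** Returns a read-only snapshot of the values added so far. */
    List<String> getValues() {
        return Collections.unmodifiableList(new ArrayList<>(values));
    }
}
```

`CopyOnWriteArrayList` suits this case if writes are rare (a few job URLs per paragraph run) and reads are frequent; a synchronized `ArrayList` would be an equally valid choice.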
| Date dateUpdated; | ||
| private Map<String, Object> config; // paragraph configs like isOpen, colWidth, etc | ||
| public final GUI settings; // form and parameter settings | ||
| private Map<String, ParagraphRuntimeInfos> runtimeInfos; |
Should we rename ParagraphRuntimeInfos to ParagraphRuntimeInfo? It is a little confusing that runtimeInfos is also plural.
done
| if (p.getDateFinished() != null && lastUpdatedDate.before(p.getDateFinished())) { | ||
| lastUpdatedDate = p.getDateFinished(); | ||
| } | ||
| p.clearRuntimeInfo(); |
Why do we need to clear RuntimeInfo when we load a note if the runtimeInfo is not stored?
I will need to remove this line.
The runtime info is currently persisted, but I don't want to persist it.
If I add the transient modifier to runtimeInfos, then when Gson.toJson(Paragraph) is called to send a message about the note/paragraph, runtimeInfos won't be serialized and won't be received by the browser (although runtimeInfos also won't get persisted in note.json).
Whereas if I don't add the modifier (current implementation), it will get persisted in JSON. But we don't want that, since it is runtime information and we might not want to persist it in note.json.
Is there a way to serialize this info (for sending to the client browser) without persisting it in JSON?
PS: The runtimeInfos object holds the paragraph URL information.
|
|
||
| String propertyName; | ||
| String label; | ||
| String group; |
I think the label and group are redundant here. I don't see label being used in the frontend. Another approach I can think of is to always use the interpreter group name as the prefix of propertyName by convention (e.g. spark.jobUrl), so that we don't need label or group. @Leemoonsoo
The label is used here
| {{paragraph.runtimeInfos.jobUrl.label}} |
I added group so that two interpreters with the same property name can be rendered differently in the UI.
| Map<String, String> infos = new java.util.HashMap<>(); | ||
| if (sparkUrl != null) { | ||
| infos.put("url", sparkUrl); | ||
| logger.info("Sending metainfos to Zeppelin server: {}", infos.toString()); |
Should we put the logging inside the if block? And what about the else block? Can that case happen?
done
| } | ||
| } | ||
|
|
||
| public void clearParagraphRuntimeInfo(InterpreterSetting setting) { |
I am afraid this method may cause some issues, because the binding between InterpreterSetting and Note/Paragraph may change. For example, at the beginning I use Spark in p1, but later I use Flink in p1; the method will then clear all the runtime info of p1 when I restart the Spark interpreter.
Right.. I have handled this now.
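The thread does not show how this was handled. One possible approach, sketched here under that caveat, is to record which interpreter setting produced each runtime info entry and to clear only the entries owned by the setting being restarted. The class `ParagraphRuntimeInfoStore`, its methods, and the setting ids in the usage note are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ParagraphRuntimeInfoStore {
    // One runtime info value, tagged with the interpreter setting that produced it.
    private static final class Entry {
        final String interpreterSettingId;
        final String value;

        Entry(String interpreterSettingId, String value) {
            this.interpreterSettingId = interpreterSettingId;
            this.value = value;
        }
    }

    // Keyed by property name, e.g. "jobUrl".
    private final Map<String, Entry> infos = new ConcurrentHashMap<>();

    void put(String propertyName, String interpreterSettingId, String value) {
        infos.put(propertyName, new Entry(interpreterSettingId, value));
    }

    /** Removes only the entries that were produced by the given interpreter setting. */
    void clearFor(String interpreterSettingId) {
        infos.values().removeIf(e -> e.interpreterSettingId.equals(interpreterSettingId));
    }

    boolean contains(String propertyName) {
        return infos.containsKey(propertyName);
    }
}
```

In the scenario above, if p1 last ran with a Flink setting, calling `clearFor("spark-setting")` on a Spark restart leaves p1's Flink entries untouched.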
f85bfcc to
d978c48
Compare
Signed-off-by: karuppayya <karuppayya1990@gmail.com>
Signed-off-by: Karup <karuppayya@outlook.com>
Signed-off-by: Karup <karuppayya@outlook.com>
Signed-off-by: Karup <karuppayya@outlook.com>
Signed-off-by: karuppayya <karuppayya1990@gmail.com>
Signed-off-by: karuppayya <karuppayyar@qubole.com>
Signed-off-by: Karup <karuppayya@outlook.com>
Signed-off-by: Karup <karuppayya@outlook.com>
27ca32c to
8e2cd85
Compare
|
Thanks @Leemoonsoo for the suggestion and the pointer. |
|
@karuppayya Great work! LGTM. Will merge to master if there is no further discussion. |
|
Just upgraded to yesterday's master snapshot. When I click on any of these links, the link leads to http://hostname.domain.com:8088/proxy/application_1488384993892_0001/jobs/job/ and the page shows HTTP ERROR 400. Page reads
Thanks. |
|
@Tagar To access a specific job, the id has to be part of the URL. |
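As a rough illustration of that point, the sketch below appends a job id to a Spark UI base URL. The class `SparkJobUrls` and the host in the usage note are hypothetical. The `/jobs/job/` path comes from the URL reported above, while the `id` query parameter name is an assumption about the Spark UI and may differ across Spark versions or deployment modes.

```java
final class SparkJobUrls {
    private SparkJobUrls() {}

    /**
     * Builds a link to a single job page.
     * Assumption: the job page lives at "<base>/jobs/job/" and takes the
     * job id as an "id" query parameter.
     */
    static String jobUrl(String sparkUiBaseUrl, int jobId) {
        // Drop a trailing slash so the result never contains "//jobs".
        String base = sparkUiBaseUrl.endsWith("/")
            ? sparkUiBaseUrl.substring(0, sparkUiBaseUrl.length() - 1)
            : sparkUiBaseUrl;
        return base + "/jobs/job/?id=" + jobId;
    }
}
```

For instance, `SparkJobUrls.jobUrl("http://localhost:4040", 3)` produces `http://localhost:4040/jobs/job/?id=3`, whereas a link that stops at `/jobs/job/` carries no job id at all.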
|
@karuppayya thank you for the follow up. A little bit more information - the link on the paragraph leads to a Spark Driver UI.
To your question on consistency - yes, it happens in 100% of cases; I have never seen this new feature work for us. |
|
According to apache/spark#5947 URL format is different in YARN and non-YARN modes? Was PR-1663 for ZEPPELIN-1692 tested on both of these modes? Not sure what else might break those links. We do use Spark on YARN. Created https://issues.apache.org/jira/browse/ZEPPELIN-2221 to move discussion there. Thank you. |
A paragraph execution may result in spark job(s). Adding ability to access the spark job UI(corresponding to the job generated by the paragraph run), directly from the paragraph.
Improvement
* [x] Write tests
ZEPPELIN-1692
Run paragraphs with spark code(scala, pyspark, sql, R). The paragraph will display a button on the top right corner, which on click will open up the corresponding job UI
* Does the licenses files need update? NA
* Is there breaking changes for older versions? NA
* Does this needs documentation? NA
Author: Karup <karuppayya@outlook.com>
Author: karuppayya <karuppayya1990@gmail.com>
Author: karuppayya <karuppayyar@qubole.com>
Closes apache#1663 from karuppayya/ZEPPELIN-1692 and squashes the following commits:
4253d0b [Karup] Fix bad rebase
d7eb3b6 [Karup] Fix paragraph.js
8e2cd85 [Karup] tryout: fix selenium tests based on moons suggstion
732b0a4 [karuppayya] Fix test
890107d [Karup] Fix test - tryout
ed4685c [Karup] Fix tooltip
d27221d [Karup] Adding license header
87214a7 [Karup] Fix incorrect rebase
19513a6 [Karup] Send para runtimeinfos via websocker, but dont persist in json
09fc0e2 [Karup] Fix compilation
fc44d9b [Karup] Address review comments
b837c6c [karuppayya] Fix incorrect variable used
42d92ac [karuppayya] Fix test
d4e54e8 [karuppayya] Address review feedbacks
1a45284 [Karup] Fix test
717eedf [Karup] Add tests , refactor
25379aa [Karup] Clear job urls when we clear output
7383c0a [Karup] Address review comments
e2cd4db [karuppayya] Fix NPE in tests
3d9a573 [karuppayya] Fix NPE and some refactoring
9b3a3e2 [karuppayya] Fix checkstyle
f16422f [karuppayya] Ability to view spark job urls in each paragraph
(cherry picked from commit e9caebc)
|
Does this feature work for anyone who is using Spark on YARN? It seems to be broken by https://issues.apache.org/jira/browse/SPARK-20772 I've updated https://issues.apache.org/jira/browse/ZEPPELIN-2221 |






What is this PR for?
A paragraph execution may result in Spark job(s).
This PR adds the ability to access the Spark job UI (corresponding to the job generated by the paragraph run) directly from the paragraph.
What type of PR is it?
Improvement
Todos
* [x] Write tests
What is the Jira issue?
ZEPPELIN-1692
How should this be tested?
Run paragraphs with Spark code (Scala, PySpark, SQL, R).
The paragraph will display a button in the top right corner which, when clicked, opens the corresponding job UI.
Screenshots (if appropriate)
Questions:
* Does the licenses files need update? NA
* Is there breaking changes for older versions? NA
* Does this needs documentation? NA