-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Buffer append output results + fix extra incorrect results #1283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * could be called in PENDING state as well | ||
| */ | ||
| if ($scope.paragraph.id === data.paragraphId && | ||
| ($scope.paragraph.status === 'RUNNING' || $scope.paragraph.status === 'PENDING')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the Comment above, why would we accept PENDING state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I observed was that in between the pending and the running state, some (I observed 1) append-output events were called. If we miss those, then while the paragraph is running, we would see that some intial-result line/s is/are missing.
Also, when the para execution is finished, the complete note is sent again (with results), so we would see the correct result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the comment was a bit misleading, as it seems it was saying that it could be errorneously call es as well in PENDING state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, when I re-read it sounds that way.. Re-phrased it.
| + " and send paragraph append data"); | ||
| thread = new Thread(new AppendOutputRunner()); | ||
| thread.start(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not thread-safe. You should use synchronized or something else
|
I think you'd better use |
|
There're some |
|
It is easier to handle the twice output on the javascript side, so I have added the check in file |
1. Synchronize on AppendOutputRunner creation 2. Use ScheduledExecutorService instead of while loop 3. Remove Thread.sleep() from tests
|
@jongyoul: I have incorported all feedback:
|
| prepareInvocationCounts(listener); | ||
| AppendOutputRunner.setListener(listener); | ||
| CheckAppendOutputRunner.startRunnerForUnitTests(); | ||
| while(numInvocations != numTimes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest you add Circuit Breaker with few seconds to avoid running infinitely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.. Added a check for to fail the tests after 2 seconds.. Would have to add CircuitBreaker in dependencies, so just using System times.
|
My tests needed some fixes to work with other tests. I have fixed them. But the build still has some errors I think not related to my change. https://travis-ci.org/apache/zeppelin/builds/150876902 . |
|
@jongyoul @corneadoug I have incorporated all feedback, and added comments on queries. This PR is ready for review. |
|
I've a few comment on your comments. Please check it. |
|
Tested a few cases to check that the front-end still handle most of the cases, and its good |
| private static final Long SAFE_PROCESSING_STRING_SIZE = new Long(100000); | ||
|
|
||
| private static final BlockingQueue<AppendOutputBuffer> QUEUE = | ||
| new LinkedBlockingQueue<AppendOutputBuffer>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dynamic data structure does not feel like a constant, so accouding to the project styleguide#s5.2.4-constant-names) better be rather named queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done..
|
@corneadoug thanks for verifying. |
|
@corneadoug @jongyoul @bzz this is ready for review. |
| public void run() { | ||
|
|
||
| Map<String, Map<String, StringBuilder> > noteMap = | ||
| new HashMap<String, Map<String, StringBuilder> >(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a nitpick, but 'diamond operator' should come handy
Map<String, Map<String, StringBuilder> > noteMap = new HashMap<>();There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done..
|
@corneadoug @jongyoul @bzz this is ready for review. |
|
@beriaanirudh You need to change more where you don't use 'diamond operator'. Can you fix them? |
|
@jongyoul done. Used diamond operator at all places except for 1 where compiler wouldn't allow (AppendOutputRunner.java:78) |
|
@beriaanirudh thank you for prompt fixes! |
|
+1 Looking forward to this |
|
LGTM |
|
@corneadoug @bzz this is ready for review... |
|
The behaviour is still the same with: |
|
@corneadoug yes, that streams output every one seconds if i try the master branch. However, I did some more experiments and found Output streaming : master branch (X), this pullrequest (X) Output streaming : master branch (O), this pullrequest (X) @corneadoug @beriaanirudh Could you verify these cases, too? |
|
I tested both cases: However the first time I tried this branch and commented, the Case 2 was fine in this Branch |
|
According to the test results, %sh interpreter output streaming might be other issue. @beriaanirudh Could you take care of it? |
|
Hey, I think SparkInterpreter could have been broken which got fixed today in |
|
Don't know why sometimes we get some failure in that branch, Case1: master (X), this branch (X) |
|
Thanks for the verification @corneadoug . Even I am not able to reproduce it. My best guess is that it had to do something with the fix I mentioned above. |
|
@corneadoug @Leemoonsoo please let me know if any action-item on my side. |
|
Tested again and got Case2 works fine on this branch. Thanks @beriaanirudh for the contribution! |
There are 2 issues and their proposed fixes: 1. On a paragraph run, for every line of output, there is a broadcast of the new line from zeppelin. In case of thousands of lines of output, the browser/s would hang because of the volume of these append-output events. 2. In the above case, besides the browser-hang, another bug observed is that result data is will repeated twice (coming from append-output calls + finish-event calls). The proposed solution for apache#1 is: - Buffer the append-output event into a queue instead of sending the event immediately. - In a separate thread, read from the queue periodically and send the append-output event. Solution for apache#2 is: - Donot append output to result if the paragraph is not runnig. Improvement + Bug Fix https://issues.apache.org/jira/browse/ZEPPELIN-1292 The test could be to run a simple paragraph with large result. Eg: ``` %sh for i in {1..10000} do echo $i done ``` PS: One will need to clear browser cache between running with and without this code patch since there are javascript changes as well. * Does the licenses files need update? No * Is there breaking changes for older versions? No * Does this needs documentation? It could need for the design. Otherwise I have added code comments explaining behaviour. Author: Beria <beria@qubole.com> Closes apache#1283 from beriaanirudh/ZEPPELIN-1292 and squashes the following commits: 17f0524 [Beria] Use diamond operator 7852368 [Beria] nit 4b68c86 [Beria] fix checkstyle d168614 [Beria] Remove un-necessary class CheckAppendOutputRunner 2eae38e [Beria] Make AppendOutputRunner non-static 72c316d [Beria] Scheduler service to replace while loop in AppendOutputRunner 599281f [Beria] fix unit tests that run after dd24816 [Beria] Add license in test file 3984ef8 [Beria] fix tests when ran with other tests 1c893c0 [Beria] Add licensing 1bdd669 [Beria] fix javadoc comment 27790e4 [Beria] Avoid infinite loop in tests 5057bb3 [Beria] Incorporate feedback 1. Synchronize on AppendOutputRunner creation 2. Use ScheduledExecutorService instead of while loop 3. Remove Thread.sleep() from tests 82e9c4a [Beria] Fix comment 7020f0c [Beria] Buffer append output results + fix extra incorrect results (cherry picked from commit 11becde)


What is this PR for?
There are 2 issues and their proposed fixes:
The proposed solution for #1 is:
Solution for #2 is:
What type of PR is it?
Improvement + Bug Fix
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1292
How should this be tested?
The test could be to run a simple paragraph with large result. Eg:
PS: One will need to clear browser cache between running with and without this code patch since there are javascript changes as well.
Screenshots (if appropriate)
Questions:
No
No
It could need for the design. Otherwise I have added code comments explaining behaviour.