@shahidki31
Contributor

@shahidki31 shahidki31 commented Oct 10, 2018

What changes were proposed in this pull request?

When event log compression is enabled with the 'zstd' codec, we are unable to open the web UI of a running application from the history server page.
The reason is that the replay listener cannot read the zstd-compressed event log because the last zstd frame is not finished yet. This causes a "truncated" error while reading the event log.

So, when we try to open the web UI from the history server page, it throws the "truncated" error, and we are never able to open a running application in the web UI while zstd compression is enabled.

In this PR, when the IOException happens and the application is still running, we log the error
"Failed to read Spark event log: eventLogDirAppName.inprogress" instead of throwing the exception.

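A minimal sketch of the behavior described above, written in Python rather than Spark's Scala so that it stays self-contained. The names `TruncatedStream`, `replay`, and `maybe_truncated` are hypothetical and are not taken from the patch; the sketch only illustrates "log a warning and keep what was read" for a running application versus "rethrow" for a finished one.

```python
import json
import logging

logger = logging.getLogger("replay_sketch")


class TruncatedStream:
    """Hypothetical stand-in for a compressed event log whose last frame is unfinished."""

    def __init__(self, complete_lines):
        self._lines = list(complete_lines)

    def __iter__(self):
        # Yield every line that came from a finished frame.
        yield from self._lines
        # Mimic the decompressor failing once it reaches the unfinished frame.
        raise OSError("truncated source")


def replay(lines, source_name, maybe_truncated):
    """Parse one JSON event per line and return the events read so far.

    maybe_truncated is assumed to be True only for logs of running
    applications (for example, files ending in ".inprogress").
    """
    events = []
    try:
        for line in lines:
            events.append(json.loads(line))
    except OSError:
        if not maybe_truncated:
            raise  # a finished log is expected to be fully readable
        logger.warning("Failed to read Spark event log: %s", source_name)
    return events
```

With `maybe_truncated=True`, the call returns the events parsed before the failure, which is what allows the UI to be built from the readable part of the log; with `maybe_truncated=False`, the error still propagates.
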
How was this patch tested?

Test steps:
1) Set spark.eventLog.compress = true
2) Set spark.io.compression.codec = zstd
3) Restart the history server
4) Launch bin/spark-shell
5) Run some queries
6) Open the history server page
7) Click on the application

Before fix:
screenshot from 2018-10-10 23-52-12

screenshot from 2018-10-10 23-52-28

After fix:

screenshot from 2018-10-10 23-43-49

screenshot from 2018-10-10 23-44-05

@shahidki31 shahidki31 changed the title from "[SPARK-25697][CORE]When zstd compression enabled in progress application is throwing Error is throwing in the history webui…" to "[SPARK-25697][CORE]When zstd compression enabled, InProgress application is throwing Error in the history webui" on Oct 10, 2018
@shahidki31
Contributor Author

cc @vanzin @srowen . Kindly review.

@srowen
Member

srowen commented Oct 10, 2018

Should the Event Log be available for running apps? Or if it's not going to work, disable it where it can't be shown, but I suppose that could be difficult. This just silently sends you back to the jobs page?

@shahidki31
Contributor Author

shahidki31 commented Oct 10, 2018

Hi @srowen. Yes, event logs are available for running apps, but with the extension ".inprogress".
We can open the web UI from the history server page for both running and finished applications.
This already works for the other compression codecs (lz4, snappy, lzf) and also for uncompressed event logs.

@srowen
Member

srowen commented Oct 10, 2018

I guess that doing nothing is better than an error screen. Is it possible to just skip reading incomplete files here? I don't know this code well. That sounds better.

@shahidki31
Contributor Author

shahidki31 commented Oct 11, 2018

@srowen. Yes, we should read only from the finished zstd frames. When the listener tries to read from an unfinished frame, the zstd input reader throws an exception (unless we set continuous to true).

Currently, it reads the finished frames, but after that it tries to read the unfinished frame and throws an exception while loading the web UI. So the solution should be to not parse the unfinished frame and to load the UI based only on the finished frames.
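
To illustrate the idea of reading only what is complete, here is a small analogy using `zlib` from the Python standard library. It is a stand-in, not zstd: sync-flushed chunks play the role of finished frames, and the missing final flush plays the role of the unfinished frame of a running application. The helper names `write_in_progress_log` and `read_available` are hypothetical.

```python
import zlib


def write_in_progress_log(events):
    """Compress events the way a still-running writer might: each event is
    flushed so that it is readable, but the stream is never finished."""
    compressor = zlib.compressobj()
    stream = b""
    for event in events:
        stream += compressor.compress(event)
        stream += compressor.flush(zlib.Z_SYNC_FLUSH)
    return stream  # no Z_FINISH: the stream is still open


def read_available(stream):
    """Return the bytes that can be decoded and whether the stream is complete."""
    decompressor = zlib.decompressobj()
    data = decompressor.decompress(stream)
    return data, decompressor.eof


events = [b'{"Event": "JobStart"}\n', b'{"Event": "JobEnd"}\n']
stream = write_in_progress_log(events)

# Streaming read: everything flushed so far is recovered, and eof stays False.
data, complete = read_available(stream)

# One-shot read: the unfinished stream is rejected as truncated.
try:
    zlib.decompress(stream)
    one_shot_failed = False
except zlib.error:
    one_shot_failed = True
```

The streaming reader recovers both events and reports that the stream is not complete, while the one-shot call fails outright, which loosely mirrors the "truncated" error discussed above.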

Hi @vanzin, could you please give your input?

Thanks

@felixcheung
Member

ok to test

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97233 has finished for PR 22689 at commit c309f34.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shahidki31
Contributor Author

retest this please.

@SparkQA

SparkQA commented Oct 11, 2018

Test build #97247 has finished for PR 22689 at commit 0924a0a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

OK, so it does read all the data it can. That seems fine

@shahidki31
Contributor Author

Yes. Thank you @srowen .

asfgit pushed a commit that referenced this pull request Oct 12, 2018
…tion is throwing Error in the history webui

Closes #22689 from shahidki31/SPARK-25697.

Authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 8e039a7)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
@srowen
Member

srowen commented Oct 12, 2018

Merged to master/2.4

@asfgit asfgit closed this in 8e039a7 Oct 12, 2018
@shahidki31
Contributor Author

Thanks a lot @srowen

@shahidki31 shahidki31 deleted the SPARK-25697 branch October 12, 2018 19:21
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…tion is throwing Error in the history webui

Closes apache#22689 from shahidki31/SPARK-25697.

Authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>