
Commit 3331d4c

kuwii authored and srowen committed
[SPARK-39620][WEB UI] Use same condition in history server page and API to filter applications
### What changes were proposed in this pull request?

Updated the REST API `/api/v1/applications` to use the same condition as the history server page when filtering completed/incomplete applications.

### Why are the changes needed?

When opening the summary page, the history server follows this logic:
- If there is a completed/incomplete application, the page adds a script to the response that uses AJAX to call the REST API and fetch the filtered list.
- If there is no such application, the page only returns a message saying nothing was found.

The issue is that the page and the REST API use different conditions to filter applications. In `HistoryPage`, an application is considered completed as long as its last attempt is completed. But in `ApplicationListResource`, all attempts must be completed. This inconsistency causes a problem in a corner case.

In the driver, event queues have a bounded capacity to protect memory. When there are too many events, some are dropped and the event log file ends up incomplete. For an application with multiple attempts, it is therefore possible that the last attempt is completed while a previous attempt is considered incomplete because its application-end event was lost. For this kind of application, the page thinks it is completed, but the API thinks it is still running. When opening the summary page:
- When viewing completed applications, the page calls the script, but the API returns nothing.
- When viewing incomplete applications, the page returns nothing.

So the user cannot see this application in the history server at all.

### Does this PR introduce _any_ user-facing change?

Yes, there is a change to the `/api/v1/applications` API and to the history server summary page. For the application described above, the API previously reported it as running; after this change it is reported as completed, so the same filter can return a different result. This should be acceptable: attempts are executed sequentially and incrementally, so if an attempt with a larger ID is completed, the previous attempts can be considered completed as well. On the summary page, the user previously could not see the application at all; now it appears among the completed applications.

### How was this patch tested?

Added a new unit test suite, `HistoryServerPageSuite`, which checks that `HistoryPage` behaves the same as `ApplicationListResource` when filtering applications. To implement the test, `HistoryPage` was changed slightly to expose a method, `shouldDisplayApplications`, that tells whether the summary page will display applications. The test verifies that:
- If no completed/incomplete application is found, `HistoryPage` does not display applications and the API returns an empty list.
- Otherwise, `HistoryPage` displays applications and the API returns a non-empty list.

Two scenarios are currently covered:
- An application whose last attempt is completed but whose previous attempt is incomplete.
- An application whose last attempt is incomplete but whose previous attempt is completed.

Closes #37008 from kuwii/kuwii/hs-fix.

Authored-by: kuwii <kuwii.someone@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
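To make the corner case concrete, here is a minimal Scala sketch. The `Attempt` case class is an illustrative stand-in, not the real `ApplicationAttemptInfo`, and it assumes attempts are listed with the latest attempt first, as the changed code below implies.

```scala
// Illustrative stand-in for an application attempt; not the real Spark API type.
case class Attempt(attemptId: String, completed: Boolean)

// Corner case: the latest attempt finished normally, but an earlier attempt
// lost its application-end event, so its event log looks incomplete.
val attempts = Seq(
  Attempt("2", completed = true),   // latest attempt, listed first
  Attempt("1", completed = false)   // earlier attempt with a truncated log
)

// HistoryPage: the app counts as completed if its latest attempt is completed.
val pageSaysCompleted = attempts.head.completed        // true

// Old ApplicationListResource: the app counts as running if any attempt is incomplete.
val oldApiSaysRunning = attempts.exists(!_.completed)  // true

// The page lists the app under "completed" and asks the API for completed apps,
// but the API classifies it as running, so the app is shown nowhere.
```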
1 parent d4cb2ea commit 3331d4c

File tree

8 files changed: +148, -3 lines changed


core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala

Lines changed: 5 additions & 2 deletions
@@ -30,8 +30,7 @@ private[history] class HistoryPage(parent: HistoryServer) extends WebUIPage("")
     val requestedIncomplete = Option(request.getParameter("showIncomplete"))
       .getOrElse("false").toBoolean
 
-    val displayApplications = parent.getApplicationList()
-      .exists(isApplicationCompleted(_) != requestedIncomplete)
+    val displayApplications = shouldDisplayApplications(requestedIncomplete)
     val eventLogsUnderProcessCount = parent.getEventLogsUnderProcess()
     val lastUpdatedTime = parent.getLastUpdatedTime()
     val providerConfig = parent.getProviderConfig()
@@ -91,6 +90,10 @@ private[history] class HistoryPage(parent: HistoryServer) extends WebUIPage("")
     UIUtils.basicSparkPage(request, content, "History Server", true)
   }
 
+  def shouldDisplayApplications(requestedIncomplete: Boolean): Boolean = {
+    parent.getApplicationList().exists(isApplicationCompleted(_) != requestedIncomplete)
+  }
+
   private def makePageLink(request: HttpServletRequest, showIncomplete: Boolean): String = {
     UIUtils.prependBaseUri(request, "/?" + "showIncomplete=" + showIncomplete)
   }
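Exposing `shouldDisplayApplications` keeps the page's condition in one place and lets a test ask the page directly, without rendering HTML. A rough usage sketch, assuming (as in the new suite below) the caller lives in package `org.apache.spark.deploy.history` and has a started `HistoryServer`:

```scala
// Sketch only: HistoryPage is private[history], so this must sit in the same package.
def pageDecision(server: HistoryServer): (Boolean, Boolean) = {
  val page = new HistoryPage(server)
  // The same question the summary page answers before rendering the application list.
  (page.shouldDisplayApplications(requestedIncomplete = false),
    page.shouldDisplayApplications(requestedIncomplete = true))
}
```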

core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ private[v1] class ApplicationListResource extends ApiRequestContext {
     val includeRunning = status.isEmpty || status.contains(ApplicationStatus.RUNNING)
 
     uiRoot.getApplicationInfoList.filter { app =>
-      val anyRunning = app.attempts.exists(!_.completed)
+      val anyRunning = app.attempts.isEmpty || !app.attempts.head.completed
       // if any attempt is still running, we consider the app to also still be running;
       // keep the app if *any* attempts fall in the right time window
       ((!anyRunning && includeCompleted) || (anyRunning && includeRunning)) &&
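The new condition keys only off the most recent attempt (assuming, as the code implies, that `attempts` is ordered latest-first), which matches how `HistoryPage` decides completeness. A small sketch of the behavioral difference, again with an illustrative stand-in type:

```scala
case class Attempt(completed: Boolean)  // illustrative stand-in, not ApplicationAttemptInfo

def anyRunningOld(attempts: Seq[Attempt]): Boolean =
  attempts.exists(!_.completed)

def anyRunningNew(attempts: Seq[Attempt]): Boolean =
  attempts.isEmpty || !attempts.head.completed

// Corner case: latest attempt completed, earlier attempt's log truncated.
val cornerCase = Seq(Attempt(completed = true), Attempt(completed = false))
anyRunningOld(cornerCase)   // true  -> the app was filtered out of the "completed" list
anyRunningNew(cornerCase)   // false -> the app is now reported as completed, like the page
```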

core/src/test/resources/spark-events-broken/last-attempt-incomplete/application_1656321732247_0006_1

Lines changed: 10 additions & 0 deletions
Large diffs are not rendered by default.

core/src/test/resources/spark-events-broken/last-attempt-incomplete/application_1656321732247_0006_2

Lines changed: 9 additions & 0 deletions
Large diffs are not rendered by default.

core/src/test/resources/spark-events-broken/previous-attempt-incomplete/application_1656321732247_0006_1

Lines changed: 9 additions & 0 deletions
Large diffs are not rendered by default.

core/src/test/resources/spark-events-broken/previous-attempt-incomplete/application_1656321732247_0006_2

Lines changed: 10 additions & 0 deletions
Large diffs are not rendered by default.
core/src/test/scala/org/apache/spark/deploy/history/HistoryServerPageSuite.scala

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.deploy.history

import java.net.URL
import javax.servlet.http.HttpServletResponse

import org.json4s.DefaultFormats
import org.json4s.JsonAST._
import org.json4s.jackson.JsonMethods.parse
import org.scalatest.BeforeAndAfter

import org.apache.spark.{SparkConf, SparkFunSuite}
import org.apache.spark.internal.config.History._
import org.apache.spark.internal.config.Tests._
import org.apache.spark.status.api.v1.ApplicationStatus
import org.apache.spark.util.Utils

class HistoryServerPageSuite extends SparkFunSuite with BeforeAndAfter {
  private implicit val format: DefaultFormats.type = DefaultFormats

  private val logDirs = Seq(
    getTestResourcePath("spark-events-broken/previous-attempt-incomplete"),
    getTestResourcePath("spark-events-broken/last-attempt-incomplete")
  )

  private var server: Option[HistoryServer] = None
  private val localhost: String = Utils.localHostNameForURI()
  private var port: Int = -1

  private def startHistoryServer(logDir: String): Unit = {
    assert(server.isEmpty)
    val conf = new SparkConf()
      .set(HISTORY_LOG_DIR, logDir)
      .set(UPDATE_INTERVAL_S.key, "0")
      .set(IS_TESTING, true)
    val provider = new FsHistoryProvider(conf)
    provider.checkForLogs()
    val securityManager = HistoryServer.createSecurityManager(conf)
    val _server = new HistoryServer(conf, provider, securityManager, 18080)
    _server.bind()
    provider.start()
    server = Some(_server)
    port = _server.boundPort
  }

  private def stopHistoryServer(): Unit = {
    server.foreach(_.stop())
    server = None
  }

  private def callApplicationsAPI(requestedIncomplete: Boolean): Seq[JObject] = {
    val param = if (requestedIncomplete) {
      ApplicationStatus.RUNNING.toString.toLowerCase()
    } else {
      ApplicationStatus.COMPLETED.toString.toLowerCase()
    }
    val (code, jsonOpt, errOpt) = HistoryServerSuite.getContentAndCode(
      new URL(s"http://$localhost:$port/api/v1/applications?status=$param")
    )
    assert(code == HttpServletResponse.SC_OK)
    assert(jsonOpt.isDefined)
    assert(errOpt.isEmpty)
    val json = parse(jsonOpt.get).extract[List[JObject]]
    json
  }

  override def afterEach(): Unit = {
    super.afterEach()
    stopHistoryServer()
  }

  test("SPARK-39620: should behaves the same as REST API when filtering applications") {
    logDirs.foreach { logDir =>
      startHistoryServer(logDir)
      val page = new HistoryPage(server.get)
      Seq(true, false).foreach { requestedIncomplete =>
        val apiResponse = callApplicationsAPI(requestedIncomplete)
        if (page.shouldDisplayApplications(requestedIncomplete)) {
          assert(apiResponse.nonEmpty)
        } else {
          assert(apiResponse.isEmpty)
        }
      }
      stopHistoryServer()
    }
  }
}
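To run the new suite locally, the usual Spark sbt invocation should work; this is a sketch and the exact command may differ in your environment:

```
./build/sbt "core/testOnly org.apache.spark.deploy.history.HistoryServerPageSuite"
```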

dev/.rat-excludes

Lines changed: 1 addition & 0 deletions
@@ -138,3 +138,4 @@ over10k
 exported_table/*
 ansible-for-test-node/*
 node_modules
+spark-events-broken/*

0 commit comments
