Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
3bacabd
[SPARK-32446][SHS] Add percentile distribution REST API of peak memor…
AngersZhuuuu Nov 23, 2021
fa685bf
Add UI stelp 1
AngersZhuuuu Nov 24, 2021
a51b4e8
Merge branch 'master' into SPARK-32446
AngersZhuuuu Dec 7, 2021
27ab858
Update executorspage.js
AngersZhuuuu Dec 7, 2021
b01750e
Update executorspage.js
AngersZhuuuu Dec 7, 2021
d74f909
Update executorspage.js
AngersZhuuuu Dec 7, 2021
b2a4289
Update executorspage.js
AngersZhuuuu Dec 7, 2021
5e12e0c
Fix JS format
AngersZhuuuu Dec 8, 2021
415981f
Merge branch 'master' into SPARK-32446
AngersZhuuuu Dec 8, 2021
aa207af
update
AngersZhuuuu Dec 16, 2021
0ffdbb8
Update executorspage.js
AngersZhuuuu Dec 16, 2021
b08a77d
Update executor_peak_memory_metrics_distributions_expectation.json
AngersZhuuuu Dec 16, 2021
718c7d3
Update executorspage.js
AngersZhuuuu Dec 16, 2021
52c920c
Update executorspage.js
AngersZhuuuu Dec 17, 2021
b2d6e50
Merge branch 'master' into SPARK-32446
AngersZhuuuu Dec 17, 2021
34e1bf7
Update executorspage.js
AngersZhuuuu Dec 21, 2021
247985b
Merge branch 'master' into SPARK-32446
AngersZhuuuu Jan 4, 2022
8ca2156
Update executorspage.js
AngersZhuuuu Jan 7, 2022
1f276df
Update executorspage.js
AngersZhuuuu Jan 7, 2022
343e178
Update executorspage.js
AngersZhuuuu Jan 7, 2022
c7b5ad8
Update executorspage.js
AngersZhuuuu Jan 7, 2022
13d0981
trigger
AngersZhuuuu Jan 7, 2022
c567c57
Merge branch 'master' into SPARK-32446
AngersZhuuuu Jan 25, 2022
edec1a5
Update executorspage.js
AngersZhuuuu Jan 25, 2022
98c1ec7
Update executorspage.js
AngersZhuuuu Jan 25, 2022
2ef92fd
Update executorspage.js
AngersZhuuuu Jan 26, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,22 @@ <h4 class="title-table">Summary</h4>
</tbody>
</table>
</div>
<h4 class="title-table">Executors</h4>
<h4 id="executorSummaryMetricsTitle" class="title-table"></h4>
<div class="container-fluid">
<table id="summary-executor-metrics-table" class="table table-striped compact table-dataTable cell-border">
<thead>
<th>Metric</th>
<th>Min</th>
<th>25th percentile</th>
<th>Median</th>
<th>75th percentile</th>
<th>Max</th>
</thead>
<tbody>
</tbody>
</table>
</div>
<h4 id="executorsTitle" class="title-table">Executors</h4>
<div class="container-fluid">
<table id="active-executors-table" class="table table-striped compact cell-border" style="width: 100%">
<thead>
Expand Down
318 changes: 307 additions & 11 deletions core/src/main/resources/org/apache/spark/ui/static/executorspage.js

Large diffs are not rendered by default.

23 changes: 23 additions & 0 deletions core/src/main/resources/org/apache/spark/ui/static/utils.js
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,29 @@ function createRESTEndPointForExecutorsPage(appId) {
return uiRoot + "/api/v1/applications/" + appId + "/allexecutors";
}

function createRESTEndPointForExecutorsSummaries(appId) {
var words = getBaseURI().split('/');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like most of the logic here is duplicated with createRESTEndPointForExecutorsPage except for the REST call. Can we refactor it a bit and avoid/reduce the redundant code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like most of the logic here is duplicated with createRESTEndPointForExecutorsPage except for the REST call. Can we refactor it a bit and avoid/reduce the redundant code?

It belongs to different api, allexecutors return the list of all executor's summaries. It's hard to combine /allexecutors and /executorPeakMemoryMetricsDistribution

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about current.

var ind = words.indexOf("proxy");
var newBaseURI;
if (ind > 0) {
appId = words[ind + 1];
newBaseURI = words.slice(0, ind + 2).join('/');
return newBaseURI + "/api/v1/applications/" + appId + "/executorPeakMemoryMetricsDistribution";
}
ind = words.indexOf("history");
if (ind > 0) {
appId = words[ind + 1];
var attemptId = words[ind + 2];
newBaseURI = words.slice(0, ind).join('/');
if (isNaN(attemptId)) {
return newBaseURI + "/api/v1/applications/" + appId + "/executorPeakMemoryMetricsDistribution";
} else {
return newBaseURI + "/api/v1/applications/" + appId + "/" + attemptId + "/executorPeakMemoryMetricsDistribution";
}
}
return uiRoot + "/api/v1/applications/" + appId + "/executorPeakMemoryMetricsDistribution";
}

function createRESTEndPointForMiscellaneousProcess(appId) {
var words = getBaseURI().split('/');
var ind = words.indexOf("proxy");
Expand Down
12 changes: 12 additions & 0 deletions core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
Original file line number Diff line number Diff line change
Expand Up @@ -452,6 +452,18 @@ private[spark] class AppStatusStore(
Some(computedQuantiles)
}

/**
* Calculates a summary of the executor metrics for executors, returning the
* requested quantiles for the recorded metrics.
*/
def executorMetricSummary(
activeOnly: Boolean,
unsortedQuantiles: Array[Double]): Option[v1.ExecutorPeakMetricsDistributions] = {
val quantiles = unsortedQuantiles.sorted
val executors = executorList(activeOnly).flatMap(_.peakMemoryMetrics).toIndexedSeq
Some(new v1.ExecutorPeakMetricsDistributions(quantiles, executors))
}

/**
* Whether to cache information about a specific metric quantile. We cache quantiles at every 0.05
* step, which covers the default values used both in the API and in the stages page.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,25 @@ private[v1] class AbstractApplicationResource extends BaseAppResource {
@Path("executors")
def executorList(): Seq[ExecutorSummary] = withUI(_.store.executorList(true))

@GET
@Path("executorPeakMemoryMetricsDistribution")
def executorSummary(
@QueryParam("activeOnly") @DefaultValue("true") activeOnly: Boolean,
@DefaultValue("0.05,0.25,0.5,0.75,0.95") @QueryParam("quantiles") quantileString: String)
: ExecutorPeakMetricsDistributions = withUI { ui =>
val quantiles = quantileString.split(",").map { s =>
try {
s.toDouble
} catch {
case _: NumberFormatException =>
throw new BadParameterException("quantiles", "double", s)
}
}

ui.store.executorMetricSummary(activeOnly, quantiles).getOrElse(
throw new NotFoundException(s"No executor reported metrics yet."))
}

@GET
@Path("executors/{executorId}/threads")
def threadDump(@PathParam("executorId") execId: String): Array[ThreadStackTrace] = withUI { ui =>
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
{
"JVMHeapMemory" : [ 2.09883992E8, 4.6213568E8, 7.5947948E8, 9.8473656E8, 9.8473656E8 ],
"JVMOffHeapMemory" : [ 6.0829472E7, 6.1343616E7, 6.271752E7, 9.1926448E7, 9.1926448E7 ],
"OnHeapExecutionMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"OffHeapExecutionMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"OnHeapStorageMemory" : [ 7023.0, 12537.0, 19560.0, 19560.0, 19560.0 ],
"OffHeapStorageMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"OnHeapUnifiedMemory" : [ 7023.0, 12537.0, 19560.0, 19560.0, 19560.0 ],
"OffHeapUnifiedMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"DirectPoolMemory" : [ 10742.0, 10865.0, 12781.0, 157182.0, 157182.0 ],
"MappedPoolMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"ProcessTreeJVMVMemory" : [ 8.296026112E9, 9.678606336E9, 9.684373504E9, 9.691553792E9, 9.691553792E9 ],
"ProcessTreeJVMRSSMemory" : [ 5.26491648E8, 7.03639552E8, 9.64222976E8, 1.210867712E9, 1.210867712E9 ],
"ProcessTreePythonVMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"ProcessTreePythonRSSMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"ProcessTreeOtherVMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"ProcessTreeOtherRSSMemory" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ],
"MinorGCCount" : [ 7.0, 15.0, 24.0, 27.0, 27.0 ],
"MinorGCTime" : [ 55.0, 106.0, 140.0, 145.0, 145.0 ],
"MajorGCCount" : [ 2.0, 2.0, 2.0, 3.0, 3.0 ],
"MajorGCTime" : [ 60.0, 63.0, 75.0, 144.0, 144.0 ],
"TotalGCTime" : [ 0.0, 0.0, 0.0, 0.0, 0.0 ]
}
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,8 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
"executor node excludeOnFailure unexcluding" ->
"applications/app-20161115172038-0000/executors",
"executor memory usage" -> "applications/app-20161116163331-0000/executors",
"executor peak memory metrics distributions" ->
"applications/application_1553914137147_0018/executorPeakMemoryMetricsDistribution",
"executor resource information" -> "applications/application_1555004656427_0144/executors",
"multiple resource profiles" -> "applications/application_1578436911597_0052/environment",
"stage list with peak metrics" -> "applications/app-20200706201101-0003/stages",
Expand Down
9 changes: 9 additions & 0 deletions docs/monitoring.md
Original file line number Diff line number Diff line change
Expand Up @@ -545,6 +545,15 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
<td><code>/applications/[app-id]/allexecutors</code></td>
<td>A list of all(active and dead) executors for the given application.</td>
</tr>
<tr>
<td><code>/applications/[app-id]/executorPeakMemoryMetricsDistribution</code></td>
<td>
Distributions of peak memory metrics for executors.
<br><code>?activeOnly=[true (default) | false]</code> lists only active executors
<br><code>?quantiles</code> summarize the metrics with the given quantiles.
<br>Example: <code>?activeOnly=false&quantiles=0.01,0.5,0.99</code>
</td>
</tr>
<tr>
<td><code>/applications/[app-id]/storage/rdd</code></td>
<td>A list of stored RDDs for the given application.</td>
Expand Down