[Feature][Metrics] Add resource download related metrics for workers (#10749)

* [Feature][Metrics] Add resource download related metrics for workers (#9324)

* [Feature][Metrics] Fix bugs and add grafana demos for worker resource download metrics (#9324)

* [Feature][Metrics] Add docs to resource related metrics (#9324)

* [Feature][Metrics] Use tags to indicate status in metrics (#9324)

* [Feature][Metrics] Fix demos, docs and remove redundant code (#9324)

* [Feature][Metrics] Remove .pnpm-debug.log (#9324)

* [Feature][Metrics] Fix style check (#9324)

* [Feature][Metrics] Replace KB with bytes for the unit of resource file size in metrics (#9324)

* [Feature][Metrics] Make code neat (#9324)
EricGao888 authored Jul 12, 2022
1 parent 56fe11e commit 2f7281c
Showing 5 changed files with 460 additions and 14 deletions.
5 changes: 4 additions & 1 deletion docs/docs/en/guide/metrics/metrics.md
@@ -74,7 +74,7 @@ For example, you can get the master metrics by `curl http://localhost:5679/actua
- ds.task.execution.count.by.type: (counter) the number of task executions grouped by tag `task_type`
- ds.task.running: (gauge) the number of running tasks
- ds.task.prepared: (gauge) the number of tasks prepared for task queue
-- ds.task.execution.count: (histogram) the number of executed tasks
+- ds.task.execution.count: (counter) the number of executed tasks
- ds.task.execution.duration: (histogram) duration of task executions


@@ -103,6 +103,9 @@ For example, you can get the master metrics by `curl http://localhost:5679/actua

- ds.worker.overload.count: (counter) the number of times the worker overloaded
- ds.worker.full.submit.queue.count: (counter) the number of times the worker's submit queue being full
+- ds.worker.resource.download.count: (counter) the number of downloaded resource files on workers, sliced by tag `status`
+- ds.worker.resource.download.duration: (histogram) the time cost of resource download on workers
+- ds.worker.resource.download.size: (histogram) the sizes of downloaded resource files on workers (bytes)
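
As a rough illustration of how the three new worker metrics might be consumed, the Python sketch below parses a Prometheus-style text scrape, sums the download counter per `status` tag, and derives a mean file size from a histogram's sum and count. The exported metric names (dots replaced by underscores, a `_total` suffix on the counter, a `_bytes` suffix on the size metric), the `status` values `success` and `fail`, and the sample numbers are all assumptions made for the example; check the worker's real scrape output before relying on any of them.

```python
import re
from collections import defaultdict

# Hypothetical scrape output. Metric names, tag values, and numbers are
# assumptions for illustration, not copied from a real worker.
SAMPLE = """\
# TYPE ds_worker_resource_download_count_total counter
ds_worker_resource_download_count_total{status="success"} 12.0
ds_worker_resource_download_count_total{status="fail"} 3.0
ds_worker_resource_download_size_bytes_count 12.0
ds_worker_resource_download_size_bytes_sum 6144.0
"""

# One sample line: a metric name, an optional {label="value",...} set, a value.
LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def downloads_by_status(text, metric="ds_worker_resource_download_count_total"):
    """Sum the download counter per value of the `status` tag."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == metric:
            totals[labels.get("status", "unknown")] += value
    return dict(totals)


def mean_download_size(text, base="ds_worker_resource_download_size_bytes"):
    """Mean file size in bytes from the histogram's _sum and _count series."""
    count = total = 0.0
    for name, _labels, value in parse_samples(text):
        if name == base + "_count":
            count += value
        elif name == base + "_sum":
            total += value
    return total / count if count else None


if __name__ == "__main__":
    print(downloads_by_status(SAMPLE))
    print(mean_download_size(SAMPLE))
```

Keeping the status as a tag on a single counter, rather than as separate success and failure metrics, lets a consumer compute a failure ratio from one series name.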

### Api Server Metrics

3 changes: 3 additions & 0 deletions docs/docs/zh/guide/metrics/metrics.md
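@@ -104,6 +104,9 @@ The metrics exporter port `server.port` is defined in application.yaml: master: `

- ds.worker.overload.count: (counter) the number of times the worker was overloaded
- ds.worker.full.submit.queue.count: (counter) the number of times the worker's submit queue was full
+- ds.worker.resource.download.count: (counter) the number of resource file downloads on workers, which can be sliced by the `status` tag
+- ds.worker.resource.download.duration: (histogram) the distribution of time spent downloading resource files on workers
+- ds.worker.resource.download.size: (histogram) the distribution of downloaded resource file sizes on workers (bytes)

### Api Server Metrics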
