[DSIP-8][Metrics] Improve DolphinScheduler Monitoring #9324
Comments
Hi:
I think it would be better to include the number of threads related to worker and master execution in the monitoring.
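To make the thread-count suggestion concrete, here is a minimal, dependency-free sketch of sampling execution-thread statistics from a `java.util.concurrent.ThreadPoolExecutor`. The class name `ExecThreadStats` and the metric keys are hypothetical and not taken from DolphinScheduler. In a real setup these values would presumably be registered as gauges through the existing metrics module rather than returned as a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of DolphinScheduler): takes a point-in-time
// snapshot of an executor's thread usage so it could be exposed as gauges.
final class ExecThreadStats {

    // Key names are illustrative assumptions, not existing metric names.
    static Map<String, Integer> snapshot(ThreadPoolExecutor pool) {
        Map<String, Integer> stats = new LinkedHashMap<>();
        stats.put("exec_threads_active", pool.getActiveCount());   // approximate by contract
        stats.put("exec_threads_pool_size", pool.getPoolSize());   // threads currently alive
        stats.put("exec_threads_max", pool.getMaximumPoolSize());  // configured upper bound
        stats.put("exec_queue_size", pool.getQueue().size());      // tasks waiting for a thread
        return stats;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        // Keep two tasks busy so the snapshot shows non-zero usage.
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        started.await();
        System.out.println(snapshot(pool));
        release.countDown();
        pool.shutdown();
    }
}
```

Sampling on demand like this keeps the executor untouched. The trade-off is that `getActiveCount()` is only an approximation, so short spikes between samples can be missed.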
I just updated the Google doc in the … Another thing I propose we think about is the granularity of metrics. I find the current metrics are general statistics. Statistics of tasks and workflows are separate. We may need some metric like
Besides, we need some descriptions for existing metrics in the official docs. #9441
@EricGao888 Hi, I closed #5255, since there is already a module, dolphinscheduler-meter, that can expose the metrics, and I will take part in this work to provide some common methods.
I think this issue is worth |
@devosend Hello, may I ask whether it is possible to include the three PRs of stage I in |
Agreed, we should add a DSIP for this.
@EricGao888 Could you follow the https://dolphinscheduler.apache.org/en-us/community/DSIP.html guide to turn this into a DSIP?
Oh, I remember you already discussed the monitoring by e-mail in https://lists.apache.org/thread/6sogjh6k7f2hv954mhn24c94l2mzwgsz. Maybe you should append a few words there and tell users we want to convert it to a DSIP now.
…e size in metrics (apache#9324)
… download metrics (apache#9324)
…10749)
* [Feature][Metrics] Add resource download related metrics for workers (#9324)
* [Feature][Metrics] Fix bugs and add grafana demos for worker resource download metrics (#9324)
* [Feature][Metrics] Add docs to resource related metrics (#9324)
* [Feature][Metrics] Use tags to indicate status in metrics (#9324)
* [Feature][Metrics] Fix demos, docs and remove redundant code (#9324)
* [Feature][Metrics] Remove .pnpm-debug.log (#9324)
* [Feature][Metrics] Fix style check (#9324)
* [Feature][Metrics] Replace KB with bytes for the unit of resource file size in metrics (#9324)
* [Feature][Metrics] Make code neat (#9324)
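The commit titles above mention using tags to indicate status instead of keeping one metric per status. As a rough, dependency-free illustration of that idea, the sketch below keeps a single counter name and separates values by a `status` tag. The class `TaggedCounter` and the metric name `resource_download_count` are assumptions made for this example, not the names used in the actual PR. With Micrometer, the same idea would normally be expressed as a `Counter` registered with a tag.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration: one metric name, with the status carried as a
// tag value, instead of separate "..._success" and "..._fail" metrics.
final class TaggedCounter {
    private final String name;
    private final Map<String, LongAdder> byStatus = new ConcurrentHashMap<>();

    TaggedCounter(String name) {
        this.name = name;
    }

    // Adds `amount` to the series identified by the given status tag value.
    void increment(String status, long amount) {
        byStatus.computeIfAbsent(status, k -> new LongAdder()).add(amount);
    }

    // Current value for one status; 0 if that status was never recorded.
    long value(String status) {
        LongAdder adder = byStatus.get(status);
        return adder == null ? 0L : adder.sum();
    }

    // Renders one line per tag value, sorted by status, in a
    // Prometheus-like text form: name{status="x"} value
    String render() {
        StringBuilder sb = new StringBuilder();
        new TreeMap<>(byStatus).forEach((status, adder) ->
                sb.append(name)
                  .append("{status=\"").append(status).append("\"} ")
                  .append(adder.sum())
                  .append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        TaggedCounter downloads = new TaggedCounter("resource_download_count");
        downloads.increment("success", 2);
        downloads.increment("fail", 1);
        System.out.print(downloads.render());
    }
}
```

Keeping the status in a tag means a dashboard can sum across statuses or filter on one of them without the metric name changing.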
Looks like some PRs related to metrics have not been cherry-picked to 3.0.0-prepare. How about picking them once #10867 is merged? @ruanwenjun @caishunfeng @zhongjiajie Thx~
I think it's better to put them into the next version. We are about to release 3.0.0-release, and during this time we only want to cherry-pick bugfix PRs.
Sure, makes sense to me. Thx~
Search before asking
Description
Choose good tools, Back home early. Use Right Scheduler, Sleep Tight.
We need richer metrics to increase monitoring ability and give our users a better experience using DolphinScheduler, especially in production environments.
Use case
… Description section happen, we could take three steps:
Action Items
Stage I
Stage II
… Micrometer … besides Prometheus, such as CloudWatch, Datadog, StatsD, Influx, JMX, Elastic, etc. For a full list, visit the Micrometer Setup section. In addition, to provide users with a smooth experience, we should add Docker YAML files for each exporter for demo purposes.
Stage III
Related issues
related: #5255
Are you willing to submit a PR?
Code of Conduct