-
Notifications
You must be signed in to change notification settings - Fork 514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11316. Improve Create Key and Chunk IO Dashboards #7075
Conversation
Change-Id: I187a3f0b61ebd3e6ce7c36464052d3dd2e2a2d8b
Change-Id: I352b08bd43a79416a38d8070fcbd206608ec8ef6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @kerneltime . LGTM
@@ -15,7 +15,7 @@ | |||
"type": "grafana", | |||
"id": "grafana", | |||
"name": "Grafana", | |||
"version": "10.4.2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive question: Would other grafana servers be able to import dashboard generated from newer grafana.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Even with the local setup the docker imaged pulled down is the latest. I used the older version 10.4.2
of the docker image and the dashboards were rendered without errors. This will be a hard thing for us to keep track off across versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried on an older version Version 7.3.6 (commit: NA, branch: master)
and a lot of the dashboards don't really work.. but I think it is ok for us to continue with the latest version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the patch @kerneltime, tested the patch locally ✅.
Change-Id: Icdd524ac0479890b50bdf0e256089ae9768deaed
@tanvipenumudy I pused one more change, can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redeployed the changes.
- The Chunk IO Dashboard LGTM.
- Please find a very small nit on the Create Keys Dashboard.
Thanks!
hadoop-ozone/dist/src/main/compose/common/grafana/dashboards/Ozone - CreateKey Metrics.json
Outdated
Show resolved
Hide resolved
hadoop-ozone/dist/src/main/compose/common/grafana/dashboards/Ozone - CreateKey Metrics.json
Outdated
Show resolved
Hide resolved
…zone - CreateKey Metrics.json
* master: HDDS-11081. Use thread-local instance of FileSystem in Freon tests (#7091) HDDS-11333. Avoid hard-coded current version in upgrade/xcompat tests (#7089) Mark TestPipelineManagerMXBean#testPipelineInfo as flaky Mark TestAddRemoveOzoneManager#testForceBootstrap as flaky HDDS-11352. HDDS-11353. Mark TestOzoneManagerHAWithStoppedNodes as flaky HDDS-11354. Mark TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl as flaky HDDS-11355. Mark TestMultiBlockWritesWithDnFailures#testMultiBlockWritesWithIntermittentDnFailures as flaky HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. (#7081) HDDS-11316. Improve Create Key and Chunk IO Dashboards (#7075) HDDS-11239. Fix KeyOutputStream's exception handling when calling hsync concurrently (#7047)
…an-on-error * HDDS-10239-container-reconciliation: (428 commits) HDDS-11081. Use thread-local instance of FileSystem in Freon tests (apache#7091) HDDS-11333. Avoid hard-coded current version in upgrade/xcompat tests (apache#7089) Mark TestPipelineManagerMXBean#testPipelineInfo as flaky Mark TestAddRemoveOzoneManager#testForceBootstrap as flaky HDDS-11352. HDDS-11353. Mark TestOzoneManagerHAWithStoppedNodes as flaky HDDS-11354. Mark TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl as flaky HDDS-11355. Mark TestMultiBlockWritesWithDnFailures#testMultiBlockWritesWithIntermittentDnFailures as flaky HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. (apache#7081) HDDS-11316. Improve Create Key and Chunk IO Dashboards (apache#7075) HDDS-11239. Fix KeyOutputStream's exception handling when calling hsync concurrently (apache#7047) HDDS-11254. Reconcile commands should be handled by datanode ReplicationSupervisor (apache#7076) HDDS-11331. Fix Datanode unable to report for a long time (apache#7090) HDDS-11346. FS CLI gives incorrect recursive volume deletion prompt (apache#7102) HDDS-11349. Add NullPointer handling when volume/bucket tables are not initialized (apache#7103) HDDS-11209. Avoid insufficient EC pipelines in the container pipeline cache (apache#6974) HDDS-11284. refactor quota repair non-blocking while upgrade (apache#7035) HDDS-9790. Add tests for Overview page (apache#6983) HDDS-10904. [hsync] Enable PutBlock piggybacking and incremental chunk list by default (apache#7074) HDDS-11322. [hsync] Block ECKeyOutputStream from calling hsync and hflush (apache#7098) HDDS-11325. Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7099) ... Conflicts: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
…rrupt-files * HDDS-10239-container-reconciliation: (61 commits) HDDS-11081. Use thread-local instance of FileSystem in Freon tests (apache#7091) HDDS-11333. Avoid hard-coded current version in upgrade/xcompat tests (apache#7089) Mark TestPipelineManagerMXBean#testPipelineInfo as flaky Mark TestAddRemoveOzoneManager#testForceBootstrap as flaky HDDS-11352. HDDS-11353. Mark TestOzoneManagerHAWithStoppedNodes as flaky HDDS-11354. Mark TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl as flaky HDDS-11355. Mark TestMultiBlockWritesWithDnFailures#testMultiBlockWritesWithIntermittentDnFailures as flaky HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. (apache#7081) HDDS-11316. Improve Create Key and Chunk IO Dashboards (apache#7075) HDDS-11239. Fix KeyOutputStream's exception handling when calling hsync concurrently (apache#7047) HDDS-11254. Reconcile commands should be handled by datanode ReplicationSupervisor (apache#7076) HDDS-11331. Fix Datanode unable to report for a long time (apache#7090) HDDS-11346. FS CLI gives incorrect recursive volume deletion prompt (apache#7102) HDDS-11349. Add NullPointer handling when volume/bucket tables are not initialized (apache#7103) HDDS-11209. Avoid insufficient EC pipelines in the container pipeline cache (apache#6974) HDDS-11284. refactor quota repair non-blocking while upgrade (apache#7035) HDDS-9790. Add tests for Overview page (apache#6983) HDDS-10904. [hsync] Enable PutBlock piggybacking and incremental chunk list by default (apache#7074) HDDS-11322. [hsync] Block ECKeyOutputStream from calling hsync and hflush (apache#7098) HDDS-11325. Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7099) ... Conflicts: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java
Change-Id: I187a3f0b61ebd3e6ce7c36464052d3dd2e2a2d8b
Add additional metrics to create key dashboard to track total data committed, number of key commits executed.
Minor fixes to legends for Chunk Dashboard. Workaround issue grafana/grafana#44251
Also, default to prometheus configured instead of parameterizing the source which gives errors on import.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11316
How was this patch tested?
Deployed locally.