
HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info #6724

Merged
merged 20 commits into from
Aug 7, 2024

Conversation

smitajoshi12
Copy link
Contributor

@smitajoshi12 smitajoshi12 commented May 24, 2024

What changes were proposed in this pull request?

A UI component has been added, with drill-down for a DN that is under decommissioning.

Backend PR for reference:
https://github.com/apache/ozone/pull/6376

What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10517

How was this patch tested?

Manually

New enhancement tested with cluster data locally:

  1. Added a new card with the DataNodes decommission count. (screenshot attached)
  2. Hover on the UUID column. (screenshot attached)
  3. Tooltip shown before hover. (screenshot attached)
  4. Once the status becomes Decommissioned, decommission details are no longer shown. (screenshot attached)
  5. If the backend returns an empty response from the decommission API, no decommission details are shown.
  6. Handled all negative test cases for empty and null data; if one API fails, the other APIs keep working. (screenshot attached)
  7. With multiple datanodes in decommission. (screenshots attached)
  8. After an error on the card. (screenshot attached)
  9. Vertical layout. (screenshot attached)

@devmadhuu
Contributor

@smitajoshi12 thanks for working on this patch. It looks like on the datanodes page you wait for the /datanodes API to report whether any datanode is decommissioning, and only then call the decommission info API for that UUID. This is inconsistent behavior: on the overview summary page you show the count of datanodes that are decommissioning, but when the user clicks the summary tile and lands on the datanodes page, all nodes are still shown as healthy, which is wrong. Kindly correct the behavior and make it consistent.

@smitajoshi12
Contributor Author

@smitajoshi12 thanks for working on this patch. Looks like on datanodes page, you wait for /datanodes API to provide the info if any datanode is in decommissioning or not, then you call the decommission info API for that respective UUID, which seems inconsistent behavior, because on overview summary page, you are showing the count of datanodes which are in decommissioning , but when user clicks on summary tile, and taken to datanodes page, you still show all nodes healthy which is wrong. Kindly correct the behavior and make it consistent.

@devmadhuu It takes time for the status to be reflected on the /datanodes page. The /datanodes API is called every 60 seconds by default. Will think about how we can meet this condition.

@smitajoshi12
Contributor Author

smitajoshi12 commented Jun 5, 2024

@devmadhuu
Addressed the synchronization issue in the latest commit and attached screenshots.

@dombizita dombizita changed the title HDDS-10517. Recon - Add a UI component for showing DN decommissioning… HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info Jun 10, 2024
Contributor

@ArafatKhan2198 ArafatKhan2198 left a comment


Thanks for working on this, Smita!
I have posted a few comments and also have a few doubts about the additions you have made; can you please take a look?
Also, please resolve the conflicts.

@devmadhuu
Contributor

@smitajoshi12 we shouldn't show the initial tooltip on the datanodes page ("It is going to decommission soon!!!"). Rather, we should call the decommission info API, match which datanodes are decommissioning, and update their status in the UI. Behavior should be consistent across all UI pages for the same entity. Also, you need to call the decommission info API once, not multiple times, when the user lands on the datanodes page.

@smitajoshi12
Contributor Author

@smitajoshi12 we shouldn't show the initial tool tip from datanodes page "It is going to decommission soon!!!". Rather we should call decommission info API and match with what all datanodes are in decommissioning and update their status on UI. Behavior should be consistent across all UI pages for same entity. And also you need to call the decommission info API once , not multiple times as user land up on datanodes page.

@devmadhuu
The issue with calling the decommission API once is that the datanode API is called every minute; if we call the decommission API only once, we will not get updated values, so checking UUIDs from the decommission API and comparing would give incorrect results, as I verified with cluster data.
The "It is going to decommission soon" message is updated after calling the decommission API and the datanode API; as we discussed previously, the datanode API takes time to update records.

@devmadhuu
Contributor

@smitajoshi12 we shouldn't show the initial tool tip from datanodes page "It is going to decommission soon!!!". Rather we should call decommission info API and match with what all datanodes are in decommissioning and update their status on UI. Behavior should be consistent across all UI pages for same entity. And also you need to call the decommission info API once , not multiple times as user land up on datanodes page.

@devmadhuu Issue with Calling decommision API once is datanode api is called after 1 min if we call decommission API only once then we will not get updated values so checking UUIDs from decommission API and comparision will give incorrect results as i checked with Cluster data. It is going to decommmission soon msg is updated after calling decommision API and datanode API as we discussed previously datanode api is taking time to update records.

@smitajoshi12 As discussed, please change the operational state column value based on the decommission info API on the datanodes page. Showing a tooltip is wrong because datanode decommissioning has already started.

@smitajoshi12
Contributor Author

smitajoshi12 commented Jun 18, 2024

@smitajoshi12 we shouldn't show the initial tool tip from datanodes page "It is going to decommission soon!!!". Rather we should call decommission info API and match with what all datanodes are in decommissioning and update their status on UI. Behavior should be consistent across all UI pages for same entity. And also you need to call the decommission info API once , not multiple times as user land up on datanodes page.

@devmadhuu Issue with Calling decommision API once is datanode api is called after 1 min if we call decommission API only once then we will not get updated values so checking UUIDs from decommission API and comparision will give incorrect results as i checked with Cluster data. It is going to decommmission soon msg is updated after calling decommision API and datanode API as we discussed previously datanode api is taking time to update records.

@smitajoshi12 As discussed, pls change the operational state column value based on decommission info API on datanodes page. Showing tooltip is wrong because datanode decommissioning has already started.

@devmadhuu
I have made the changes you suggested: the operational state column is now updated by comparing UUIDs from the decommission API. If one API fails it does not block the others; screenshots attached. Both APIs are handled sequentially, with one called after the other completes.
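The merge-by-UUID approach described above could be sketched roughly like this. This is an illustrative sketch only; the `DatanodeRow` type and `mergeDecommissionStatus` name are assumptions for the example, not the actual Recon UI code:

```typescript
// Illustrative sketch -- types and names are assumptions, not the
// actual Recon implementation.
interface DatanodeRow {
  uuid: string;
  operationalState: string;
}

// Merge decommission info into the datanode table rows by UUID.
// If the decommission API call failed (decommissioningUuids is
// undefined), the original rows are returned unchanged, so one
// API failure does not block the other API's data.
function mergeDecommissionStatus(
  rows: DatanodeRow[],
  decommissioningUuids?: string[]
): DatanodeRow[] {
  if (!decommissioningUuids) return rows;
  const inDecommission = new Set(decommissioningUuids);
  return rows.map(row =>
    inDecommission.has(row.uuid)
      ? { ...row, operationalState: 'DECOMMISSIONING' }
      : row
  );
}
```

Under this shape, the page could fetch /datanodes first, then the decommission endpoint, and apply the merge once both have settled, which matches the sequential handling described in the comment.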

@devmadhuu
Contributor

@smitajoshi12 kindly resolve the conflicts in your code.

@smitajoshi12
Contributor Author

@smitajoshi12 kindly resolve the conflicts in your code.

@devmadhuu
Resolved conflicts and added a new screenshot of the Overview page, as the tile alignment has changed.

@devmadhuu
Contributor

@smitajoshi12 Please resolve your conflicts.

@smitajoshi12
Contributor Author

@smitajoshi12 kindly resolve the conflicts in your code.

@devmadhuu
Hi Devesh, resolved conflicts, merged master with all changes, tested, and pushed the changes in the latest commit.

@smitajoshi12
Contributor Author

@dombizita @devabhishekpal
Hi Zita and Abhishek, could you review this PR?

@smitajoshi12
Contributor Author

@smitajoshi12 Pls resolve your conflicts.

Resolved merge conflicts.

@ArafatKhan2198
Contributor

Can you finish all the pending changes on this patch, @smitajoshi12?
Make sure you go through all of them.

@smitajoshi12
Contributor Author

smitajoshi12 commented Jul 9, 2024

Can you finish all the pending changes on this patch @smitajoshi12 Make sure you go through all of them

@ArafatKhan2198
Resolved all merge conflicts after the ECharts PR and tested with cluster data.

@dombizita dombizita self-requested a review July 11, 2024 09:27
@devmadhuu
Contributor

@smitajoshi12 Thanks for improving the patch. Changes LGTM +1

@devabhishekpal
Contributor

Hi @smitajoshi12, I was suggesting to change the component layout as below: (design mock-up attached)

This would make sure we are not covering too much information for the DN with the popup.
Also, could you let me know what No. of Under-Replicated Containers, No. of Unclosed Pipelines, and No. of Unclosed Containers are used for?
If this information is not very relevant to the user, I guess we can omit it.
@devmadhuu @ArafatKhan2198 what do you think?

@devmadhuu
Contributor

@devabhishekpal we need that minimal information for a node which is in decommission stage.

@devabhishekpal
Contributor

Thanks for letting me know @devmadhuu.
Then I guess we can go with a more vertical layout as I attached in the design file @smitajoshi12.

@smitajoshi12
Contributor Author

@devabhishekpal we need that minimal information for a node which is in decommission stage.

@devabhishekpal
I have referred to https://issues.apache.org/jira/browse/HDDS-10514 for details. Will check the vertical layout, as it is collapsible when there is less information.

@smitajoshi12
Contributor Author

smitajoshi12 commented Jul 31, 2024

Hi @smitajoshi12 , so I was suggesting to change the component layout as below: Frame 1

This would make sure we are not covering too much information for the DN with the popup Also could you let me know what No. of Under-Replicated Containers, No. of Unclosed Pipelines, No. of Unclosed Containers are used for? If it is not very relevant to the user to get this information, we can omit this info I guess. @devmadhuu @ArafatKhan2198 what do you think?

@devabhishekpal

Done the changes in the latest commit 8b82426; the remaining CSS changes will be raised as an improvement in another JIRA.


Contributor

@devabhishekpal devabhishekpal left a comment


Thanks for the patch @smitajoshi12 , the changes look good to me.
Just a few minor comments for better code quality

this.setState({
  isLoading: true
});
if (this.state.record && this.state.summaryData !== 'null' && this.state.summaryData !== 'undefined') {
Contributor


I think this can be simplified to:

Suggested change
if (this.state.record && this.state.summaryData !== 'null' && this.state.summaryData !== 'undefined') {
if (this.state?.record && this.state?.summaryData) {
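For context on why the suggested simplification is also a correctness fix: comparing against the strings 'null' and 'undefined' only excludes those literal string values, while an actual null or undefined value still passes the check; a plain truthiness test (or optional chaining) covers the real cases. A minimal standalone sketch (the helper names are illustrative, not from the PR):

```typescript
// Sketch: why comparing against the strings 'null'/'undefined' is a trap.
function isUsableStringCompare(value: unknown): boolean {
  // Only excludes the literal strings "null" and "undefined" --
  // a real null or undefined value still passes this check.
  return value !== 'null' && value !== 'undefined';
}

function isUsableTruthy(value: unknown): boolean {
  // Excludes null, undefined, '', 0, NaN, and false.
  return Boolean(value);
}
```

For example, `isUsableStringCompare(null)` returns true (the bug), while `isUsableTruthy(null)` returns false, which is the behavior the suggested `this.state?.record && this.state?.summaryData` check relies on.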

Contributor Author

@smitajoshi12 smitajoshi12 Aug 5, 2024


Done the changes in the latest commit 3e08fb7 and tested with local cluster data.

</>
}
{
containers !== null && containers !== undefined && Object.keys(containers).length !== 0 &&
Contributor


Since containers is an object I think this can also be simplified to:

Suggested change
containers !== null && containers !== undefined && Object.keys(containers).length !== 0 &&
containers && Object.keys(containers).length !== 0 &&
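As a quick illustration of why the short-circuit matters here (a standalone sketch, not the component code; `hasEntries` is a hypothetical helper): the truthy check on `containers` runs before `Object.keys`, which would otherwise throw a TypeError on null or undefined.

```typescript
// Sketch: safe non-empty-object check. Short-circuiting on a falsy
// value prevents Object.keys(null) from throwing a TypeError.
function hasEntries(obj: Record<string, unknown> | null | undefined): boolean {
  return Boolean(obj && Object.keys(obj).length !== 0);
}
```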

Contributor Author


@devabhishekpal
Done the changes in the latest commit 3e08fb7 and tested with local cluster data.

// Need to check summarydata is not empty
return (
<>
{ (summaryData !== 'null' && summaryData !== 'undefined' && summaryData && summaryData.length !== 0) ?
Contributor


Suggested change
{ (summaryData !== 'null' && summaryData !== 'undefined' && summaryData && summaryData.length !== 0) ?
{ (summaryData && summaryData.length !== 0) ?

Contributor Author


Done the changes in the latest commit 3e08fb7 and tested with local cluster data.

Contributor

@ArafatKhan2198 ArafatKhan2198 left a comment


@smitajoshi12, please review the final comments from @devabhishekpal. I have tested the patch, and both the layout and functionality are working well. The patch looks good to me.

@devmadhuu
Contributor

Thanks @smitajoshi12 for improving the patch. Now it LGTM +1

Contributor

@devabhishekpal devabhishekpal left a comment


Thanks for the changes @smitajoshi12.
These changes look good to me, +1

@ArafatKhan2198
Contributor

Thanks for the patch @smitajoshi12 & thanks @devabhishekpal @devmadhuu for the review.

@ArafatKhan2198 ArafatKhan2198 merged commit b891f7c into apache:master Aug 7, 2024
33 checks passed
errose28 added a commit to errose28/ozone that referenced this pull request Aug 7, 2024
* master: (181 commits)
  HDDS-11289. Bump docker-maven-plugin to 0.45.0 (apache#7024)
  HDDS-11287. Code cleanup in XceiverClientSpi (apache#7043)
  HDDS-11283. Refactor KeyValueStreamDataChannel to avoid spurious IDE build issues (apache#7040)
  HDDS-11257. Ozone write does not work when http proxy is set for the JVM. (apache#7036)
  HDDS-11249. Bump ozone-runner to 20240729-jdk17-1 (apache#7003)
  HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info (apache#6724)
  HDDS-11270. [hsync] Add DN layout version (HBASE_SUPPORT/version 8) upgrade test. (apache#7021)
  HDDS-11272. Statistics some node status information (apache#7025)
  HDDS-11278. Move code out of Hadoop util package (apache#7028)
  HDDS-11274. (addendum) Replace Hadoop annotations/configs with Ozone-specific ones
  HDDS-11274. Replace Hadoop annotations/configs with Ozone-specific ones (apache#7026)
  HDDS-11280. Add Synchronize in AbstractCommitWatcher.addAckDataLength (apache#7032)
  HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)
  HDDS-11273. Bump commons-compress to 1.26.2 (apache#7023)
  HDDS-11225. Increase ipc.server.read.threadpool.size (apache#7007)
  HDDS-11224. Increase hdds.datanode.handler.count (apache#7011)
  HDDS-11259. [hsync] DataNode should verify HBASE_SUPPORT layout version for every PutBlock. (apache#7012)
  HDDS-11214. Added config to set rocksDB's max log file size and num of log files (apache#7014)
  HDDS-11226. Make ExponentialBackoffPolicy maxRetries configurable (apache#6985)
  HDDS-11260. [hsync] Add Ozone Manager protocol version (apache#7015)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/audit/DNAction.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/interface-client/src/main/proto/DatanodeClientProtocol.proto
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerReportHandler.java
errose28 added a commit to errose28/ozone that referenced this pull request Aug 13, 2024
…p-supervisor

Merge conflicts are resolved but the change does not yet build.

* HDDS-10239-container-reconciliation: (183 commits)
  HDDS-10376. Add a Datanode API to supply a merkle tree for a given container. (apache#6945)
  HDDS-11289. Bump docker-maven-plugin to 0.45.0 (apache#7024)
  HDDS-11287. Code cleanup in XceiverClientSpi (apache#7043)
  HDDS-11283. Refactor KeyValueStreamDataChannel to avoid spurious IDE build issues (apache#7040)
  HDDS-11257. Ozone write does not work when http proxy is set for the JVM. (apache#7036)
  HDDS-11249. Bump ozone-runner to 20240729-jdk17-1 (apache#7003)
  HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info (apache#6724)
  HDDS-10926. Block deletion should update container merkle tree. (apache#6875)
  HDDS-11270. [hsync] Add DN layout version (HBASE_SUPPORT/version 8) upgrade test. (apache#7021)
  HDDS-11272. Statistics some node status information (apache#7025)
  HDDS-11278. Move code out of Hadoop util package (apache#7028)
  HDDS-11274. (addendum) Replace Hadoop annotations/configs with Ozone-specific ones
  HDDS-11274. Replace Hadoop annotations/configs with Ozone-specific ones (apache#7026)
  HDDS-11280. Add Synchronize in AbstractCommitWatcher.addAckDataLength (apache#7032)
  HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)
  HDDS-11273. Bump commons-compress to 1.26.2 (apache#7023)
  HDDS-11225. Increase ipc.server.read.threadpool.size (apache#7007)
  HDDS-11224. Increase hdds.datanode.handler.count (apache#7011)
  HDDS-11259. [hsync] DataNode should verify HBASE_SUPPORT layout version for every PutBlock. (apache#7012)
  HDDS-11214. Added config to set rocksDB's max log file size and num of log files (apache#7014)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
errose28 added a commit to errose28/ozone that referenced this pull request Aug 16, 2024
…rrupt-files

* HDDS-10239-container-reconciliation: (183 commits)
  HDDS-10376. Add a Datanode API to supply a merkle tree for a given container. (apache#6945)
  HDDS-11289. Bump docker-maven-plugin to 0.45.0 (apache#7024)
  HDDS-11287. Code cleanup in XceiverClientSpi (apache#7043)
  HDDS-11283. Refactor KeyValueStreamDataChannel to avoid spurious IDE build issues (apache#7040)
  HDDS-11257. Ozone write does not work when http proxy is set for the JVM. (apache#7036)
  HDDS-11249. Bump ozone-runner to 20240729-jdk17-1 (apache#7003)
  HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info (apache#6724)
  HDDS-10926. Block deletion should update container merkle tree. (apache#6875)
  HDDS-11270. [hsync] Add DN layout version (HBASE_SUPPORT/version 8) upgrade test. (apache#7021)
  HDDS-11272. Statistics some node status information (apache#7025)
  HDDS-11278. Move code out of Hadoop util package (apache#7028)
  HDDS-11274. (addendum) Replace Hadoop annotations/configs with Ozone-specific ones
  HDDS-11274. Replace Hadoop annotations/configs with Ozone-specific ones (apache#7026)
  HDDS-11280. Add Synchronize in AbstractCommitWatcher.addAckDataLength (apache#7032)
  HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)
  HDDS-11273. Bump commons-compress to 1.26.2 (apache#7023)
  HDDS-11225. Increase ipc.server.read.threadpool.size (apache#7007)
  HDDS-11224. Increase hdds.datanode.handler.count (apache#7011)
  HDDS-11259. [hsync] DataNode should verify HBASE_SUPPORT layout version for every PutBlock. (apache#7012)
  HDDS-11214. Added config to set rocksDB's max log file size and num of log files (apache#7014)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/checksum/TestContainerChecksumTreeManager.java