HDDS-9626. [Recon] Disk Usage page with high number of key/bucket/volume #6535
@dombizita @devmadhuu Can you please take a look at this patch?
Thanks for working on this @smitajoshi12
While testing this patch locally I noticed a few discrepancies when setting the Display Limit:
- I currently have 56 keys in my cluster, all of which are inside buckettest.
1. When I set the display limit to 5, the 5 largest objects are displayed and the remaining objects are grouped into Other Objects.
2. For 20 I get the correct result as well.
3. But when I set the limit to 30, I do not see the Other Objects slot anywhere, even though there are 56 keys in total, so the remaining 26 keys should be grouped into Other Objects.
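The expected Display Limit behaviour described in this comment can be sketched as follows. This is a minimal illustration, not the code in diskUsage.tsx: the `DuEntry` type and the `applyDisplayLimit` function are hypothetical names, and the sketch assumes that whenever there are more entries than the limit, everything past the limit is summed into a single Other Objects slot.

```typescript
// Hypothetical shape of one disk usage entry (not the actual Recon type).
interface DuEntry {
  path: string;
  size: number; // size in bytes
}

// Keep the `limit` largest entries and fold the rest into "Other Objects".
function applyDisplayLimit(entries: DuEntry[], limit: number): DuEntry[] {
  // Sort a copy in descending order of size so the input is not mutated.
  const sorted = [...entries].sort((a, b) => b.size - a.size);
  if (sorted.length <= limit) {
    // Nothing is left over, so no "Other Objects" slot is needed.
    return sorted;
  }
  const top = sorted.slice(0, limit);
  const otherSize = sorted
    .slice(limit)
    .reduce((sum, entry) => sum + entry.size, 0);
  return [...top, { path: "Other Objects", size: otherSize }];
}
```

Under these assumptions, 56 keys with a limit of 30 yield 31 slots: the 30 largest keys plus one Other Objects slot covering the remaining 26.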
@ArafatKhan2198
Thanks for working on this @smitajoshi12. To use the improvements in the namespace endpoint that @ArafatKhan2198 introduced in #6318, you need to change the endpoint that you call here:
Line 132 in 1cbee60
const duEndpoint = `/api/v1/namespace/du?path=${path}&files=true`;
The sortSubPaths query parameter needs to be set to true.
ozone/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/NSSummaryEndpoint.java
Line 115 in 21fa62f
@DefaultValue("true") @QueryParam("sortSubPaths") boolean sortSubpaths)
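One way to build the request URL with the sorting parameter made explicit is sketched below. The `buildDuEndpoint` helper is a hypothetical name, not part of the patch; only the `/api/v1/namespace/du` path and the `path`, `files`, and `sortSubPaths` query parameters come from the snippets quoted above. Encoding the path with `encodeURIComponent` is an added assumption, since the original line interpolates it directly.

```typescript
// Hypothetical helper: builds the disk usage URL with sortSubPaths set explicitly.
function buildDuEndpoint(path: string, sortSubPaths: boolean = true): string {
  // Assumption: the path is URL-encoded; the original code interpolates it as-is.
  const encodedPath = encodeURIComponent(path);
  return `/api/v1/namespace/du?path=${encodedPath}&files=true&sortSubPaths=${sortSubPaths}`;
}

// Example call for a bucket path.
const duEndpointExample = buildDuEndpoint("/vol1/bucket1");
```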
@dombizita @ArafatKhan2198
Thanks for updating the patch, @smitajoshi12. We are now using the correct API parameters for sorting the subpaths, but there is still an issue from the UI perspective. Let's say we have three files:
file1 -> Size -> 1 KB
file2 -> Size -> 10 KB
file3 -> Size -> 1 GB
The API endpoint would return a response in descending order of size. However, the problem is that the UI representation becomes skewed, as shown in the image below:
Here, we have three directories with sizes 1 KB, 10 KB, and 1 GB. I believe each slice of the pie chart is sized in proportion to the file size, which creates a poor user experience when the sizes differ this much. We need to address this issue to improve the user interface.
Could you please take care of this?
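To make the skew concrete, here is a small sketch of how strictly proportional slices work out for the three sizes in this comment. The `sliceFractions` function is a hypothetical name used for illustration only, and the sketch assumes 1 KB = 1024 bytes and that each slice is simply its size divided by the total.

```typescript
// Hypothetical helper: fraction of the pie each entry gets when slices
// are strictly proportional to size.
function sliceFractions(sizes: number[]): number[] {
  const total = sizes.reduce((sum, size) => sum + size, 0);
  if (total === 0) {
    // Avoid dividing by zero when every entry is empty.
    return sizes.map(() => 0);
  }
  return sizes.map((size) => size / total);
}

// 1 KB, 10 KB and 1 GB, expressed in bytes.
const exampleFractions = sliceFractions([1024, 10 * 1024, 1024 ** 3]);
```

With these inputs the 1 GB entry takes more than 99.99% of the chart, so the other two slices are effectively invisible, which matches the problem reported here.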
@ArafatKhan2198
Thanks, @smitajoshi12 for working on this.
LGTM
Thanks @smitajoshi12 for working on this. LGTM +1
…ume Review Comments
Thanks for updating your patch @smitajoshi12! Please take a look at my comments!
Thanks @smitajoshi12 for working on this patch. Thanks @dombizita, @ArafatKhan2198 for reviewing the patch.
What changes were proposed in this pull request?
When the number of keys, volumes, or buckets is huge, the current disk usage UI doesn't make much sense.
This pull request introduces enhancements to the Recon disk usage endpoint to significantly improve usability and performance when dealing with large datasets:
- Top Entities Focus: The endpoint has been updated to efficiently sort and display only the top entities by size. This targeted approach helps users easily identify the most significant space consumers, addressing the impracticality of visualizing thousands of records in a single view.
- Efficient Sorting with Parallel Streams: To manage and sort vast numbers of records effectively, we've implemented parallel stream processing.
Key advantages of using parallel streams include:
- Better Utilization of Multi-core Processors: Enables concurrent sorting operations across multiple cores, cutting down processing times for large datasets.
- Optimized for Large Datasets: The overhead of parallelism is amortized over a large number of elements, making it particularly suited to our use case.
Backend PR for reference: #6318
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-9626
How was this patch tested?
Manually
Before this PR
After this PR
Tested with Cluster Data