Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-10608. Recon can't get full key when using Recon API. #6492

Merged
merged 22 commits into from
May 9, 2024

Conversation

ArafatKhan2198
Copy link
Contributor

@ArafatKhan2198 ArafatKhan2198 commented Apr 8, 2024

What changes were proposed in this pull request?

We've made some updates to improve how we manage and display complete file paths in recon. This includes a new way to build the full paths of files, especially useful for our FSO keys, ensuring that file paths are shown correctly no matter how they're stored.

Key Changes:

  1. NSSummary Adjustments: To support the new path-building functionality, we've made modifications to the Namespace Summary (NSSummary). This includes the introduction of a parentID attribute for directories, enabling the recursive construction of paths from any given file or directory back to its root.
  2. Path Construction Utility : Constructs the full path for a given key based on its OmKeyInfo and the hierarchical structure of parent directories. This method begins with the key name from the provided OmKeyInfo object and recursively prepends the names of parent directories by fetching NSSummary objects from the namespace summary manager. This process continues until a parent ID of -1 is encountered, indicating the root of the directory structure. Finally, the volume and bucket names are prepended to construct the complete path.

To ensure simplicity in addressing the backward compatibility of NSSummary objects during deserialization, we've implemented a straightforward solution. When deserializing old data, if the parentId field is not present or is set to zero, we assign it a value of -1. This indicates that the parentId was not originally set, effectively differentiating old data from new without needing complex versioning systems. This approach is neatly integrated into our deserialization method, keeping the process clean and efficient.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10608

How was this patch tested?

Unit Testing :-

Wrote multiple unit test to test out the complete correct path and also to check if the parent ID is getting populated properly.

Manual Testing :-

The implementation of these changes has been thoroughly tested across various bucket layouts, including FSO, Legacy buckets with FsEnabled, and Object Storage Bucket (OBS) layouts.

FSO Bucket Path:

{
  "totalCount": 1,
  "keys": [
    {
      "Volume": "volume1",
      "Bucket": "fso-bucket",
      "Key": "key1-fso"
      "CompletePath": "volume1/fso-bucket/dir1/dir2/dir3/key1-fso",
      "DataSize": 1000000
    }
  ]
}

Legacy Bucket with FsEnabled:

{
  "totalCount": 4,
  "keys": [
    {
      "Volume": "s3v",
      "Bucket": "legacy-bucket",
      "Key": "dir1/dir2/dir3/key1-legacy"
      "CompletePath": "s3v/legacy-bucket/dir1/dir2/dir3/key1-legacy",
      "DataSize": 1000000
    }
  ]
}

OBS Bucket Path:

{
  "totalCount": 1,
  "keys": [
    {
      "Volume": "volume1",
      "Bucket": "obs-bucket",
      "Key": "key1-obs"
      "CompletePath": "volume1/obs-bucket/key1-obs",
      "DataSize": 17289
    }
  ]
}

@adoroszlai adoroszlai requested a review from devmadhuu April 8, 2024 05:47
Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for working on the patch. Changes LGTM +1

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArafatKhan2198 Thanks for working over this, given few comments.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArafatKhan2198 backward compatibility is still not resolved, please check comments reply

@ArafatKhan2198
Copy link
Contributor Author

@ArafatKhan2198 backward compatibility is still not resolved, please check comments reply

Thanks for the find @sumitagrawl
You are right if we try to read beyond the available data in the DataInputStream, it will indeed throw an EOFException

Key Changes Made:

  1. Availability Check: Added if (in.available() >= Long.BYTES) to ensure there is enough data left in the stream before attempting to read the parentId. This prevents an EOFException for data streams that are from an older format and do not contain the parentId field.
  2. Default Parent ID: If the data for parentId is not available, the code sets the parentId to -1 to indicate that the data is from an older format or the parentId was not set.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArafatKhan2198 Thanks for working, have few more comments

@ArafatKhan2198 ArafatKhan2198 requested a review from sumitagrawl May 3, 2024 07:45
Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArafatKhan2198 given few comments, plz check

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArafatKhan2198 Plz find few minor comments, other LGTM

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ArafatKhan2198
Copy link
Contributor Author

@sumitagrawl sumitagrawl merged commit 9074b8d into apache:master May 9, 2024
39 checks passed
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants