Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC-0561: List metadata reuse #561

Merged
merged 15 commits into from
Aug 25, 2022
Merged

Conversation

ClSlaid
Copy link
Contributor

@ClSlaid ClSlaid commented Aug 23, 2022

Signed-off-by: ClSlaid cailue@bupt.edu.cn

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Open an RFC for adding metadata to DirEntry

related: #558

thanks to: @sandflee

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid added documentation Improvements or additions to documentation enhancement New feature or request labels Aug 23, 2022
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-with-metadata.md Outdated Show resolved Hide resolved
@ClSlaid ClSlaid changed the title rfc: add metadata to DirEntry RFC-0561: List with metadata Aug 23, 2022
@Xuanwo
Copy link
Member

Xuanwo commented Aug 23, 2022

And for this proposal's naming, list_with_metadata doesn't reflect the thing we want to do. How about reuse_metadata_during_list?

1. fix up format of document

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid changed the title RFC-0561: List with metadata RFC-0561: List metadata reuse Aug 23, 2022
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid
Copy link
Contributor Author

ClSlaid commented Aug 24, 2022

Still not ready for review

docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Show resolved Hide resolved
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid requested a review from Xuanwo August 24, 2022 11:28
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid
Copy link
Contributor Author

ClSlaid commented Aug 24, 2022

There are two ways of implementation:

  1. add a lightweight option structure storing metadata to DirEntry
struct MetaLite {
    pub content_length: u64,  // size of file
    pub last_modified: OffsetDateTime,
    pub created: OffsetDateTime,    // time created
}

pub struct DirEntry {
    acc: Arc<dyn Accessor>,
    
    mode: ObjectMode,
    path: String,
    
    // newly add metadata struct
    metadata: Option<MetaLite>,
}

impl DirEntry {
    // get size of file
    pub fn content_length(&self) -> Option<u64> {
        self.metadata.as_ref().map(|m| m.content_length)
    }
    // get the last modified time
    pub fn last_modified(&self) -> Option<OffsetDateTime> {
        self.metadata.as_ref().map(|m| m.last_modified)
    }
    // get the create time
    pub fn created(&self) -> Option<OffsetDateTime> {
        self.metadata.as_ref().map(|m| m.created)
    }
}
  1. extend DirEntry with multiple optional metadata fields
pub struct DirEntry {
    acc: Arc<dyn Accessor>,

    mode: ObjectMode,
    path: String,

    // newly add metadata fields
    content_length: Option<u64>,  // size of file
    last_modified: Option<u64>,
    created: Option<u64>,    // time created
}

impl DirEntry {
    // get size of file
    pub fn content_length(&self) -> Option<u64> {
        self.content_length
    }
    // get the last modified time
    pub fn last_modified(&self) -> Option<OffsetDateTime> {
        self.last_modified
    }
    // get the create time
    pub fn created(&self) -> Option<OffsetDateTime> {
        self.created
    }
}

I prefer the first one because it provides a compacter memory layout. The first one saves 8 bytes of space.

Implementing this RFC will increase the size of DirEntry from 40 bytes to 80 bytes, 88 bytes in the second approach.

The second approach may be a little faster because no intermediate function calls are involved. But it's worth sacrificing an infrequently used feature's performance to save memory.

Plan A: simply extend `DirEntry`

Plan B: define a new MetaLite struct and embed it into `DirEntry`
    @ClSlaid prefered.

Plan C: embed `ObjectMetadata` into `DirEntry`
    @Xuanwo prefered

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid requested a review from Xuanwo August 25, 2022 06:50
@ClSlaid ClSlaid marked this pull request as ready for review August 25, 2022 06:57
Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM!

This RFC is really great.

docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
docs/rfcs/0561-list-metadata-reuse.md Outdated Show resolved Hide resolved
Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@Xuanwo
Copy link
Member

Xuanwo commented Aug 25, 2022

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
…fs-list-with-metadata

Signed-off-by: ClSlaid <cailue@bupt.edu.cn>
@ClSlaid ClSlaid requested a review from Xuanwo August 25, 2022 07:18
Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Congrats on your first RFC!

@Xuanwo Xuanwo merged commit ce5e6b8 into apache:main Aug 25, 2022
This was referenced Aug 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants