Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for object attributes in Iter call #63

Merged
merged 5 commits into from
Nov 5, 2024

Conversation

fpetkovski
Copy link
Contributor

@fpetkovski fpetkovski commented Jul 14, 2023

Add support for IterWithAttributes

This commit adds support for an IterWithAttributes on the bucket client.
The method allows iterating through objects and getting multiple attributes
into the callback function, removing the need to do an Iter followed by Attrs.

For now, we only support getting the last updated time as an attribute, but the
implementation allows adding more in the future.

Not all buckets support this method. The client can check whether the bucket has
support by calling the SupportedIterOptions method on the client.

Co-authored-by: Ashwanth Goli iamashwanth@gmail.com
Co-authored-by: Filip Petkovski filip.petkovsky@gmail.com

Signed-off-by: Filip Petkovski filip.petkovsky@gmail.com

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

  • Add support for IterWithAttributes.

Verification

  • Unit tests.

Copy link
Member

@brancz brancz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change a lot!

return p.bkt.Iter(ctx, pdir, func(s string) error {
return f(strings.TrimPrefix(s, p.prefix+DirDelim))
return p.bkt.Iter(ctx, pdir, func(s string, _ ObjectAttributes) error {
return f(strings.TrimPrefix(s, p.prefix+DirDelim), EmptyObjectAttributes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why wouldn't it be passed through here?

objstore.go Outdated
// Entries are passed to function in sorted order.
Iter(ctx context.Context, dir string, f func(string) error, options ...IterOption) error
Iter(ctx context.Context, dir string, f func(name string, attrs ObjectAttributes) error, options ...IterOption) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're already breaking this signature, I think we should instead make the whole thing pass only the ObjectAttributes struct and add the Name field to it.

Copy link
Contributor Author

@fpetkovski fpetkovski Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, and would allow us to extend supported attributes in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than breaking the API, could we add a new API?

@fpetkovski
Copy link
Contributor Author

Yeah I will look into adding a new APi. I am still not sure how to handle cases where a provider does not support returning attributes.

@pracucci
Copy link
Contributor

pracucci commented Aug 9, 2023

Yeah I will look into adding a new APi. I am still not sure how to handle cases where a provider does not support returning attributes.

An idea may be having a function on the objstore client to check if the feature is supported. Then the Iter() contract is that if the feature is supported, attributes will be fetched, otherwise if the feature is not supported it will error out. It's caller responsability to check if the feature is supported by the actual objstore client and take an informed decision about requesting attributes or not.

@ashwanthgoli
Copy link
Contributor

@fpetkovski are there plans to revive this pr? :)

Loki project is planning to switch to thanos clients. Having this change will reduce the fanout of listing objects along with metadata which now requires Iter call followed by fetching attributes for all objects

@fpetkovski fpetkovski force-pushed the iter-object-attributes branch 3 times, most recently from a85daeb to 6ede0f1 Compare October 29, 2024 15:19
This commit adds support for an IterWithAttributes on the bucket client.
The method allows iterating through objects and getting multiple attributes
into the callback function, removing the need to do an Iter followed by Attrs.

For now, we only support getting the last updated time as an attribute, but the
implementation allows adding more in the future.

Not all buckets support this method. The client can check whether the bucket has
support by calling the SupportedIterOptions method on the client.

Co-authored-by: Ashwanth Goli <iamashwanth@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
@fpetkovski
Copy link
Contributor Author

@ashwanthgoli Please take a look again, your changes should be integrated now!

Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as I can see lgtm

objstore.go Outdated
// WithUpdatedAt is an option that can be applied to Iter() to
// include the last modified time in the attributes.
// NB: Prefixes may not report last modified time.
// This option is currently supported for the azure, aws, bos, gcs and filesystem providers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just to match the provider name

Suggested change
// This option is currently supported for the azure, aws, bos, gcs and filesystem providers.
// This option is currently supported for the azure, s3, bos, gcs and filesystem providers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks.

Comment on lines +252 to +254
func (i *IterObjectAttributes) SetLastModified(t time.Time) {
i.lastModified = t
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious question: why did we make assignments to lastModified come through a setter but not name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wanted to make it explicit that lastModified might not always be set, made it a private field so the getter contract can return time.Time, bool

delimiter = ""
}

query := &storage.Query{
it := b.bkt.Objects(ctx, &storage.Query{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fpetkovski i didn't realise this was only fetching Name attribute before.
I think I removed changes from a previous commit of yours, we need to add it back :)

Copy link
Contributor Author

@fpetkovski fpetkovski Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the name attribute in the default case.

const op = OpIter
b.metrics.ops.WithLabelValues(op).Inc()

err := b.bkt.IterWithAttributes(ctx, dir, f, options...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updates to b.metrics.opsDuration metric is missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be there now. PTAL again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ashwanthgoli
Copy link
Contributor

thank you! @fpetkovski
i left a couple of comments, but the rest lgtm

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Copy link
Contributor

@JoaoBraveCoding JoaoBraveCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
@fpetkovski fpetkovski merged commit b598dce into thanos-io:main Nov 5, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants