-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core/services-azblob): support user defined metadata #5274
Conversation
let mut meta = parse_into_metadata(path, headers)?; | ||
|
||
let user_meta = parse_prefixed_headers(&headers, X_MS_META_NAME_PREFIX); | ||
if !user_meta.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is also done in the s3 backend.
Should we always call meta.with_user_metadata(user_meta);
instead of checking if the user_meta
is empty?
Or can we for example, add that checking inside meta.with_user_metadata(user_meta);
and do an early return if the user_meta
hashmap is empty?
opendal/core/src/types/metadata.rs
Line 524 in 1927932
pub fn with_user_metadata(&mut self, data: HashMap<String, String>) -> &mut Self { |
I find that checking in each backend if the hashmap is empty is error-prone boilerplate...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to have such a check. OpenDAL tends to have more readable code because we have many services, and that code is rarely modified.
@@ -243,12 +245,19 @@ impl AzblobCore { | |||
|
|||
let mut req = Request::put(&url); | |||
|
|||
if let Some(user_metadata) = args.user_metadata() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm setting the metadata headers only for the azblob_put_blob_request
.
Should this also be done for block
requests, for example, in azblob_put_block_request
? Or the metadata will be only supported for blobs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that in OSS
the metadata headers is also set for the append_object_request
. Should the azure append blob requests also send these headers? azblob_append_blob_request
, azblob_init_appendable_blob_request
, etc..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also be done for
block
requests, for example, inazblob_put_block_request
? Or the metadata will be only supported forblobs
?
Not sure for now; I will need some time to delve into the documentation.
I see that in
OSS
the metadata headers is also set for theappend_object_request
.
I remember that only the first append request can include user metadata, but I'm not sure about that. Feel free to do more research on this.
@@ -462,7 +463,7 @@ impl S3Core { | |||
// Set user metadata headers. | |||
if let Some(user_metadata) = args.user_metadata() { | |||
for (key, value) in user_metadata { | |||
req = req.header(format!("{}{}", constants::X_AMZ_META_PREFIX, key), value) | |||
req = req.header(format!("{X_AMZ_META_PREFIX}{key}"), value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this in order to be consistent with what I did in the azure part. It is more idiomatic to use string interpolation
core/src/services/azblob/backend.rs
Outdated
let headers = resp.headers(); | ||
let mut meta = parse_into_metadata(path, headers)?; | ||
|
||
let user_meta = parse_prefixed_headers(&headers, X_MS_META_NAME_PREFIX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the metadata parsing be done in two steps (first call parse_into_metadata
and then parse_prefixed_headers
) or should we add a new prefix: Optional<&str>
parameter to the parse_into_metadata
function and the prefixed headers parsing will be done there (and also the meta.with_user_metadata
call, as the parse_into_metadata
returns a Metadata
struct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I tend to keep parse_into_metadata
simple because not all services handle metadata and user metadata in the same way.
As I mentioned before, OpenDAL tends to have more readable code. It's acceptable for us to have some duplicated code here and there. We can revisit this part when we have more similar code, rather than just two or three instances.
7bac4cd
to
4d6057b
Compare
user_metadata_prefix: &str, | ||
headers: &HeaderMap, | ||
) -> Result<Metadata> { | ||
pub fn parse_metadata(&self, path: &str, headers: &HeaderMap) -> Result<Metadata> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the azblob
and s3
services, this code is done in the backend
module
opendal/core/src/services/azblob/backend.rs
Line 552 in 4d6057b
let mut meta = parse_into_metadata(path, headers)?; |
But in the oss
it is done in this parse_metadata
function in the core
module and called from the backend
module
opendal/core/src/services/oss/backend.rs
Line 492 in 4d6057b
let mut meta = self.core.parse_metadata(path, resp.headers())?; |
I think we should unify this in order to be consistent.
In my opinion, maybe we can create a new pub fn parse_prefixed_metadata(path:&str, headers: &HeaderMap, prefix: &str)
in the crate::raw::http_util::header
module that contains the code from this parse_metadata
funtion, and then call that parse_prefixed_metadata
in the backend
module of those three services
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should unify this in order to be consistent.
Hi, it's better not to do this since every service has its own functions. You'll see if you delve into the implementation of different services.
The parse_metadata
provided by raw::http
will only parse the HTTP standards. Services may implement their own logic to meet their specific needs.
core/src/services/oss/core.rs
Outdated
.collect(); | ||
if !data.is_empty() { | ||
m.with_user_metadata(data); | ||
let user_meta = parse_prefixed_headers(&headers, X_OSS_META_PREFIX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactored this in the same way of azblob
and s3
services, as it was a lot of duplicated code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jorgehermo9 a lot for this great PR!
Relates to #4842
Inspired from #5030
Left some doubts in comments