-
Notifications
You must be signed in to change notification settings - Fork 415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
documentation: concurrent writes for non-S3 backends #2556
Comments
@inigohidalgo feel free to open a PR to make a change to the docs, contributions are always welcome! :) |
Sounds good. Is it safe to assume that all the backends listed here other than AWS support this by default? I will probably just reword the |
Yup. Though IIRC, Minio, Cloudflare, and other S3-compatible stores will have the same issue, even though we actually could enable it for some of them. |
@wjones127 Cloudflare R2 actually supports copy if not exists with custom headers, which we are able to pass through. But that's the only exception for an S3 implementation afaik |
Ah cool. I know R2 and Minio support using custom headers, but didn't know we had already implemented the proper pass through for R2. Do we have support for Minio as well then? |
Hmm I didn't know minio supported custom headers, then I guess it should work as well (but can't confirm) |
- It was unclear to me that concurrent writing was available by default for non-S3 backends, so I am making the language clearer. - I have also added an extra section showing that R2 and maybe MinIO can enable concurrent writing - Fixed a couple of unrelated formatting issues in the page I edited closes #2556 #2069 also had the same confusion
@wjones127 Do you know if there is any header that could be used for minio, for something similar to cf-copy-destination-if-none-match of R2 ? for reference : the full list of headers supported by minio is here https://github.com/minio/minio/blob/cf371da346199ba39cbbdfbce37d7cfa0b594b85/internal/http/headers.go#L87 |
This one looks to me what you are looking for |
@ion-elgreco 'x-amz-copy-source-if-none-match' is supported by amazon s3 as well . If that could be used , then probably we do not need a locking provider for amazon s3 as well. So I am assuming that 'x-amz-copy-source-if-none-match' could not be used for getting around the locking requirement as for R2 , R2 doc (https://developers.cloudflare.com/r2/api/s3/extensions/) reads
Looking at minio code (https://github.com/minio/minio/blob/cf371da346199ba39cbbdfbce37d7cfa0b594b85/cmd/object-handlers-common.go), the copy object handler seems to be supporting only the headers supported by s3 However put object handler (whose code is in the same file linked above) seems to be supporting 'If-None-Match' . I was wondering if that can be used to get around the locking requirements of minio |
Currently the documentation has a section that details a locking mechanism is needed for S3 to enable concurrent writing.
https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/
There are various mentions of concurrency throughout the docs, but having read through the docs a while ago I was left with the impression that "concurrent writing is only supported on S3, and you need a DynamoDB locking mechanism" when, in reality, concurrent writing is supported by default on (at least one) backends #2069 without needing that locking provider.
This is probably just my own lack of understanding of the delta protocol, but I think it would be good to make this clearer in the documentation, that concurrency is supported by default, and only S3 needs the locking mechanism.
The text was updated successfully, but these errors were encountered: