feat: Introduce Google Cloud Storage Implementation #4344
Signed-off-by: Xuanwo <github@xuanwo.io>
Hi @etolbakov, what are your thoughts on the current plan? Should we first merge the OpenDAL storage and then add GCS storage? For GCS, integration with configurations and settings is necessary. Once GCS is operational, we can incorporate metrics and other utilities similar to those provided by S3 or FS.
Hey @Xuanwo
Could you please elaborate a bit on what you mean by "working first"? Is it a simple integration test that we can draft? Feel free to reach out on Discord if that's fine with you.
Great, moved to https://discord.com/channels/908281611840282624/908281611840282627/1192630218482012180
Signed-off-by: Xuanwo <github@xuanwo.io>
…to add-gcs-support
Hi @etolbakov, could you share your Discord ID? I'd like to invite you to our discussion thread: https://discord.com/channels/908281611840282624/1192630218482012180
@Xuanwo I like the approach: using a generic opendal storage, and adding a factory for googlefs.
quickwit/quickwit-storage/src/opendal_storage/google_cloud_storage.rs
quickwit/Cargo.toml
@@ -118,6 +118,7 @@ num_cpus = "1"
numfmt = "1.1.1"
once_cell = "1"
oneshot = "0.1.5"
opendal = { version ="0.44", default-features = false }
Can we feature-gate this to make development a tiny bit lighter?
(You can mimic what is done for azure.)
I've added a new feature called google. Is that what you were looking for?
yes
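For readers unfamiliar with Cargo feature gating, here is a minimal sketch of the idea discussed above. It assumes a feature named google, as in the comment; the module name, the scheme strings, and the supported_schemes helper are hypothetical and are not taken from the Quickwit code base.

```rust
// Sketch only: compiled into the build solely when the (assumed) `google`
// Cargo feature is enabled, so default builds skip it and its dependencies.
#[cfg(feature = "google")]
mod google_cloud_storage {
    /// Hypothetical helper returning the URI scheme handled by this backend.
    pub fn scheme() -> &'static str {
        "gs"
    }
}

/// Hypothetical helper listing the URI schemes available in this build.
fn supported_schemes() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut schemes = vec!["file", "s3"];
    // This statement is removed at compile time when the feature is off.
    #[cfg(feature = "google")]
    schemes.push(google_cloud_storage::scheme());
    schemes
}
```

With the feature disabled, the gated module and the push statement are not compiled at all, which is what keeps a default development build lighter.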
quickwit/quickwit-storage/src/opendal_storage/google_cloud_storage.rs
Signed-off-by: Xuanwo <github@xuanwo.io>
Do you know a way to mock gfs for our CI? Also, have you seen the storage test suite?
Signed-off-by: Xuanwo <github@xuanwo.io>
I have added fake-gcs-server in CI and docker compose, but I don't know how to get it started.
OK, I found the right place. Let me add them.
Signed-off-by: Xuanwo <github@xuanwo.io>
I have added integration tests for gcs:
test google_cloud_storage_test_suite ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.19s
The mock gcs server doesn't properly support bulk deletion, so I've disabled that test. However, I'm confident it functions correctly on the actual gcs.
quickwit/quickwit-storage/src/lib.rs
.context("write_and_bulk_delete")?;
// Fake GCS Server doesn't support bulk delete correctly.
// ref: <https://github.com/fsouza/fake-gcs-server/issues/1443>
#[cfg(not(feature = "google"))]
Let's not do the check using the compiler flag, but instead dynamically inspect the storage.
As is, your code is removing this test for azure too in our CI.
(These tests run in the confusingly named "coverage CI" that is triggered after each merge to main. That build takes a bit more time than the regular PR CI and includes azure etc.)
You can have the Storage trait implement Any.
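A minimal sketch of the suggestion above, inspecting the storage at runtime through Any instead of a compile-time flag. The trait here is a pared-down stand-in for the real Storage trait, and LocalStorage, FakeGcsStorage, as_any, and should_run_bulk_delete_check are hypothetical names chosen for illustration; the change actually made in the PR may look different.

```rust
use std::any::Any;

/// Hypothetical, pared-down stand-in for the real `Storage` trait.
trait Storage: Any {
    /// Exposes the concrete type so callers can inspect it at runtime.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical backends used only for this illustration.
struct LocalStorage;
struct FakeGcsStorage;

impl Storage for LocalStorage {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Storage for FakeGcsStorage {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Decides at runtime whether the bulk-delete check should run.
/// Only the backend known to mishandle bulk deletion is skipped, so
/// other backends keep their coverage regardless of enabled features.
fn should_run_bulk_delete_check(storage: &dyn Storage) -> bool {
    !storage.as_any().is::<FakeGcsStorage>()
}
```

Because the decision depends on the concrete storage passed to the test suite rather than on which features were compiled in, a build with several backends enabled would still run the bulk-delete check for every backend except the one that is skipped.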
Addressed, please review again.
quickwit/quickwit-storage/src/opendal_storage/google_cloud_storage.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
…rage.rs Co-authored-by: Paul Masurel <paul@quickwit.io>
Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo looks wonderful! I don't have anything to suggest!
@@ -70,11 +74,19 @@ azure = [
"azure_storage/enable_reqwest_rustls",
"azure_storage_blobs/enable_reqwest_rustls",
]
google = [
Can we call the feature gfs instead of google?
Sure, I will update this place tomorrow. 😋
I'll do it, actually... I spotted a bunch of other problems.
Thanks a lot!
Thank you @Xuanwo, that's awesome!
Description
Part of #4236
This PR implements:
There are many TODOs to finish, but I feel it's better to get it working first.
How was this PR tested?
Added google_cloud_storage_test_suite