Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port object_store integration tests, use github actions #2148

Merged
merged 15 commits into from
Jul 25, 2022

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jul 23, 2022

Which issue does this PR close?

re #2030

Rationale for this change

#2081 brought in the object_store_rs code, but did not include the test scripts for running object store tests against s3/azure/gcp simulators

What changes are included in this PR?

Adds github action based tests for running object store integration tests
Adds scripts + small object store configuration changes required to run them

It is based on my PR influxdata/object_store_rs#51 which I used to prototype the changes

I ported this using these commands:

$ git fetch git@github.com:alamb/object_store_rs.git alamb/test_gh_actions
$ git cherry-pick 25f4896
# and manually fixed some things

Are there any user-facing changes?

Not really

@alamb alamb changed the title Run object_store_integration tests Port object_store integration tests to arrow-rs / github actions: Jul 23, 2022
@alamb alamb changed the title Port object_store integration tests to arrow-rs / github actions: Port object_store integration tests to arrow-rs / github actions Jul 23, 2022
- object_store/**
- .github/**

jobs:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -524,7 +553,27 @@ pub fn new_azure(

let (is_emulator, storage_account_client) = if use_emulator {
check_if_emulator_works()?;
(true, StorageAccountClient::new_emulator_default())
// Allow overriding defaults. Values taken from
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required to specify a URL different than 127.0.0.1 -- the github actions "services" run with some actual dns name rather than on the local loopback (e.g. you need to specify http://auzure:10000 rather than http://127.0.0.1:10000)

@codecov-commenter
Copy link

codecov-commenter commented Jul 23, 2022

Codecov Report

Merging #2148 (427107d) into master (73153fe) will decrease coverage by 0.00%.
The diff coverage is 40.00%.

@@            Coverage Diff             @@
##           master    #2148      +/-   ##
==========================================
- Coverage   82.82%   82.82%   -0.01%     
==========================================
  Files         237      237              
  Lines       61425    61428       +3     
==========================================
+ Hits        50874    50875       +1     
- Misses      10551    10553       +2     
Impacted Files Coverage Δ
object_store/src/azure.rs 0.00% <ø> (ø)
object_store/src/gcp.rs 0.00% <0.00%> (ø)
object_store/src/lib.rs 84.42% <100.00%> (ø)
...row/src/array/builder/string_dictionary_builder.rs 90.64% <0.00%> (-0.72%) ⬇️
parquet/src/encodings/encoding.rs 93.43% <0.00%> (-0.20%) ⬇️
arrow/src/datatypes/datatype.rs 63.00% <0.00%> (+0.31%) ⬆️
parquet_derive/src/parquet_field.rs 66.21% <0.00%> (+0.45%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us.

@@ -575,6 +583,18 @@ mod test {
service_account: String,
}

impl GoogleCloudConfig {
fn build(self) -> Result<GoogleCloudStorage> {
// ignore HTTPS errors in tests so we can use fake-gcs server
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is only done in the tests, whereas the original approach allowed it in production as well)

@@ -90,5 +90,24 @@ $ cargo test --features azure

### GCP

We don't have a good story yet for testing the GCP integration locally. You will need to create a GCS bucket, a
service account that has access to it, and use this to run the tests.
To test the GCS integration, we use [Fake GCS Server](https://github.com/fsouza/fake-gcs-server)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added documentation on how to run these integration tests

@alamb
Copy link
Contributor Author

alamb commented Jul 23, 2022

Here is an example of the new test running: https://github.com/apache/arrow-rs/runs/7483480192?check_suite_focus=true

@alamb alamb marked this pull request as ready for review July 23, 2022 21:20
@alamb
Copy link
Contributor Author

alamb commented Jul 23, 2022

cc @tustvold @wjones127

@alamb alamb changed the title Port object_store integration tests to arrow-rs / github actions Port object_store integration tests, use github actions Jul 23, 2022
object_store/src/gcp.rs Outdated Show resolved Hide resolved
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
@github-actions github-actions bot added the object-store Object Store Interface label Jul 25, 2022
@@ -695,8 +695,8 @@ mod tests {
#[tokio::test]
async fn test_list_lifetimes() {
let store = memory::InMemory::new();
let stream = list_store(&store, "path").await.unwrap();
assert_eq!(stream.count().await, 0);
let mut stream = list_store(&store, "path").await.unwrap();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why, but locally the stream.count() construction started erroring like this for me:

cd /Users/alamb/Software/arrow-rs2 && RUST_LOG=trace RUST_BACKTRACE=1 CARGO_TARGET_DIR=/Users/alamb/Software/df-target2 cargo test -p object_store --features=aws,azure,azure_test,gcp
   Compiling object_store v0.3.0 (/Users/alamb/Software/arrow-rs2/object_store)
error[E0599]: `Pin<Box<dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send>>` is not an iterator
   --> object_store/src/lib.rs:699:27
    |
699 |         assert_eq!(stream.count().await, 0);
    |                           ^^^^^ `Pin<Box<dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send>>` is not an iterator
    |
   ::: /Users/alamb/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/pin.rs:408:1
    |
408 | pub struct Pin<P> {
    | ----------------- doesn't satisfy `_: Iterator`
    |
   ::: /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-core-0.3.17/src/stream.rs:27:1
    |
27  | pub trait Stream {
    | ---------------- doesn't satisfy `_: Iterator`
    |
    = note: the following trait bounds were not satisfied:
            `Pin<Box<dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send>>: Iterator`
            which is required by `&mut Pin<Box<dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send>>: Iterator`
            `dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send: Iterator`
            which is required by `&mut dyn futures::Stream<Item = std::result::Result<ObjectMeta, Error>> + std::marker::Send: Iterator`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like the StreamExt is not in scope for some reason

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I tried a few ways to get that to work, but rustc gave me the same error and told me that StreamExt was unused ...

@alamb
Copy link
Contributor Author

alamb commented Jul 25, 2022

🤔 now the test https://github.com/apache/arrow-rs/runs/7499943986?check_suite_focus=true is failing with:

Run cargo check -p parquet_derive
    Checking syn v1.0.98
error: failed to write /github/home/target/debug/deps/rmetapeosRm/lib.rmeta: No space left on device (os error 28)
error: could not compile `syn` due to previous error
Error: Process completed with exit code [10](https://github.com/apache/arrow-rs/runs/7499943986?check_suite_focus=true#step:14:11)1.

I guess I will continue on the path laid out in #2149

@alamb
Copy link
Contributor Author

alamb commented Jul 25, 2022

However I don't see any reason to block merging this PR due to an unrelated issue, so merging

@alamb alamb merged commit e68852c into apache:master Jul 25, 2022
@ursabot
Copy link

ursabot commented Jul 25, 2022

Benchmark runs are scheduled for baseline = bcaa0b6 and contender = e68852c. e68852c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb alamb deleted the alamb/port_object_store_integration_test branch July 25, 2022 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants