Consider adopting IOx ObjectStore abstraction #2489

wjones127 · 2022-05-08T22:18:17Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

In another issue @alamb and @tustvold suggested we might want to use the IOx ObjectStore implementation.

A few nice points I'll mention about the IOx one:

They have some nice path utilities, including a CloudPath struct. That seems nicer than the current one with &str paths.
Has implementations for S3, GCS, Azure Blob Storage included in the repo. There is no HDFS support yet.
Has implementations of put() for writing. There doesn't seem to be streaming write support (multi-part upload).

There are a few differences in the API:

Current API: https://github.com/apache/arrow-datafusion/blob/dfdeb42d7d646cffcf3cff26beefcecffc6cbe62/data-access/src/object_store/mod.rs#L77

IOx API: https://github.com/influxdata/influxdb_iox/blob/94e9ac610acfb94870154d976f66a4d4111b5668/object_store/src/lib.rs#L74

The IOx list() implementation evaluates prefixes on path segments: "Prefixes are evaluated on a path segment basis, i.e. foo/bar/ is a prefix of foo/bar/x but not of foo/bar_baz/x."
IOx doesn't have a synchronous read implementation.
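
To illustrate the segment-based prefix rule quoted above, here is a minimal sketch using only the standard library. It is not the IOx implementation; `is_segment_prefix` is a hypothetical helper written for this example.

```rust
/// Returns true if every segment of `prefix` matches the corresponding
/// leading segment of `path`. Empty segments (e.g. from a trailing '/')
/// are ignored. Hypothetical helper, not part of the IOx API.
fn is_segment_prefix(prefix: &str, path: &str) -> bool {
    let mut path_segments = path.split('/').filter(|s| !s.is_empty());
    prefix
        .split('/')
        .filter(|s| !s.is_empty())
        // Each prefix segment must equal the next path segment exactly.
        .all(|p| path_segments.next() == Some(p))
}
```

With this rule, `foo/bar/` matches `foo/bar/x` but not `foo/bar_baz/x`, whereas a plain string `starts_with` check on `foo/bar` would accept both.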

There of course exist other repos that this has implications for:

From what I've seen, it seems like we could reasonably shift to simply use the IOx ObjectStore. But if there's a good reason, we could also reuse useful parts of the implementation to keep the existing API.

cc @matthewmturner @kyotoYaho @roeap


matthewmturner · 2022-05-09T04:02:33Z

Indeed the IOx ObjectStore implementation looks both robust and feature rich - and has the added benefit of being actively used. While datafusion-objectstore-s3 has been released on crates.io, I am not aware of any production workloads leveraging it.

I also prefer the more generic API interface in the IOx implementation. I had actually planned on proposing something similar on #2445.

From an s3 perspective the only things that come to mind are:

Rusoto is used rather than the official AWS Rust SDK. If that's good enough for IOx then I'm sure it's fine for now, but I think it would be good to move to the AWS Rust SDK sooner rather than later.
We can just double check that our existing test suite in datafusion-objectstore-s3 works without issue. In particular for connecting to services such as MinIO (which is S3 compatible storage) - I have spoken to them and know they were watching the project.

As for the actual implementation, given the IOx implementation already has AWS, GCP, and Azure functionality behind features, could we just create datafusion-objectstore in datafusion-contrib and let people choose what they want by enabling the relevant feature?

alamb · 2022-05-09T10:25:03Z

I believe @tustvold is actively working on preparing the iox code for crates.io release in https://github.com/influxdata/influxdb_iox/pull/4534

tustvold · 2022-05-09T10:38:34Z

Yeah as alluded to by @alamb, my plan is to get the iox code released to crates.io so that DataFusion could use it.

There would then be a couple of potential courses of action for DataFusion:

1. Do nothing 😄
2. Migrate to using the object_store crate to fetch parquet files to local disk. This would potentially fetch more bytes from object storage, but as described in RFC: Spill-To-Disk Object Storage Download #2205 this may actually be faster than the current approach. It would also be temporary pending Push-Based Parquet Reader arrow-rs#1605
3. Wait for Push-Based Parquet Reader arrow-rs#1605 and then migrate to using the object_store crate

tustvold · 2022-05-13T17:01:54Z

I've released the crate to crates.io: https://crates.io/crates/object_store. I'm going to take a stab at integrating this into DataFusion over this weekend. Hopefully I'll get something up as a workable draft.

thinkharderdev · 2022-05-16T10:14:21Z

> Yeah as alluded to by @alamb, my plan is to get the iox code released to crates.io so that DataFusion could use it.
>
> There would then be a couple of potential courses of action for DataFusion:
>
> 1. Do nothing 😄
> 2. Migrate to using the object_store crate to fetch parquet files to local disk. This would potentially fetch more bytes from object storage, but as described in RFC: Spill-To-Disk Object Storage Download #2205 this may actually be faster than the current approach. It would also be temporary pending Push-Based Parquet Reader arrow-rs#1605
> 3. Wait for Push-Based Parquet Reader arrow-rs#1605 and then migrate to using the object_store crate

Wrt fetching to local disk, we have an implementation of (datafusion) ObjectStore in our project which adopts the S3A approach to minimize the number of small range requests. Basically, we set a minimum chunk size for S3 reads (usually 64K). If a read of less than 64K is requested, we go ahead and fetch 64K and buffer it in memory. Subsequent reads that fall within that buffer are returned from the in-memory buffer. This minimizes the overhead of small range requests from the PageIterator while still avoiding reads of columns not required for the query.
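
The buffering scheme described above could look roughly like the following. This is only a sketch under stated assumptions, not the implementation from that project: `PrefetchReader`, its fields, and the `fetch` closure (standing in for an object-store range request) are hypothetical names, and the requested range is assumed to lie within the object.

```rust
use std::ops::Range;

/// Minimum number of bytes fetched per request (64K, as in the comment above).
const MIN_CHUNK: usize = 64 * 1024;

/// Hypothetical reader that enlarges small reads to MIN_CHUNK and serves
/// later reads from the in-memory buffer when they fall inside it.
struct PrefetchReader<F: FnMut(Range<usize>) -> Vec<u8>> {
    fetch: F,         // stands in for a range GET against object storage
    len: usize,       // total object size in bytes
    buf_start: usize, // object offset of the first buffered byte
    buf: Vec<u8>,     // most recently fetched chunk
    requests: usize,  // number of fetches issued, for illustration
}

impl<F: FnMut(Range<usize>) -> Vec<u8>> PrefetchReader<F> {
    fn new(len: usize, fetch: F) -> Self {
        Self { fetch, len, buf_start: 0, buf: Vec::new(), requests: 0 }
    }

    /// Reads `range` (assumed to lie within `0..len`).
    fn read(&mut self, range: Range<usize>) -> Vec<u8> {
        let buf_end = self.buf_start + self.buf.len();
        if range.start < self.buf_start || range.end > buf_end {
            // Buffer miss: fetch at least MIN_CHUNK bytes, clamped to the object size.
            let end = range.end.max(range.start + MIN_CHUNK).min(self.len);
            self.buf = (self.fetch)(range.start..end);
            self.buf_start = range.start;
            self.requests += 1;
        }
        // Serve the requested bytes from the buffer.
        let off = range.start - self.buf_start;
        self.buf[off..off + range.len()].to_vec()
    }
}
```

Two small reads that fall inside the same 64K window result in a single fetch; a read outside the window triggers a new one.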

tustvold · 2022-05-16T10:59:27Z

Yeah, buffered prefetch is one way to mitigate the small read problem. However, it does not allow for coalescing adjacent reads - i.e. you will still likely end up with one request per column chunk unless you have tiny columns.

TBC my preference is for 3, which mirrors the new vectored API of S3A, but I'm currently working on 2 first to ensure there aren't any fundamental integration issues.
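
For readers unfamiliar with the term, coalescing adjacent reads means merging nearby byte ranges into a single request before fetching. A minimal sketch follows, assuming a hypothetical `coalesce_ranges` helper and a caller-chosen `max_gap`; it is not taken from object_store or S3A.

```rust
use std::ops::Range;

/// Merges byte ranges whose gap is at most `max_gap` bytes, so that each
/// merged range can be fetched with one request. Hypothetical helper.
fn coalesce_ranges(mut ranges: Vec<Range<usize>>, max_gap: usize) -> Vec<Range<usize>> {
    // Sort by start offset so that neighbours end up next to each other.
    ranges.sort_by_key(|r| r.start);
    let mut merged: Vec<Range<usize>> = Vec::new();
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end.saturating_add(max_gap) {
                // Close enough to the previous range: extend it.
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}
```

With `max_gap = 5`, the ranges `0..10`, `10..20`, `25..30` and `100..110` collapse into two requests, `0..30` and `100..110`. A buffered prefetch alone cannot do this, because it only sees one read at a time.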

alamb · 2022-05-16T11:23:46Z

I believe @tustvold is working on this issue

thinkharderdev · 2022-05-16T12:55:59Z

> Yeah, buffered prefetch is one way to mitigate the small read problem. However, it does not allow for coalescing adjacent reads - i.e. you will still likely end up with one request per column chunk unless you have tiny columns.
>
> TBC my preference is for 3, which mirrors the new vectored API of S3A, but I'm currently working on 2 first to ensure there aren't any fundamental integration issues.

Cool. In our case the buffered prefetch helps marginally (but we also have a lot of sparse columns so it is a slightly special case which does a reasonable job at coalescing adjacent reads).

apache/arrow-rs#1605 looks like a really good idea. We're also working on trying to optimize S3 reads at the moment so if there's any way I can help please let me know!

jychen7 · 2022-05-28T13:23:19Z

> Has implementations for S3, GCS, Azure Blob Storage included in the repo

That's great. At one point I wanted to create datafusion-objectstore-gcs, similar to the S3/Azure ones, since I mostly work with GCP. I also want to try out how DataFusion and Ballista can query GCS.

* Switch to object_store crate (#2489)
* Test fixes
* Update to object_store 0.2.0
* More windows pacification
* Fix windows test
* Fix windows test_prefix_path
* More windows fixes
* Simplify ListingTableUrl::strip_prefix
* Review feedback
* Update to latest arrow-rs
* Use ParquetRecordBatchStream
* Simplify predicate pruning
* Add host to ObjectStoreRegistry

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

wjones127 added the enhancement New feature or request label May 8, 2022

tustvold mentioned this issue May 10, 2022

[EPIC]: Morsel-Driven Scheduler IO #2504

Closed

tustvold mentioned this issue May 14, 2022

Reduce duplication in file scan tests #2533

Merged

alamb assigned tustvold May 16, 2022

This was referenced May 17, 2022

File URI Scheme Interpretation #2562

Closed

Decouple FileFormat from datafusion_data_access #2572

Merged

Extract Listing URI logic into ListingTableUri structure #2578

Merged

wjones127 mentioned this issue May 20, 2022

Adopt influxdata/object_store_rs? delta-io/delta-rs#610

Closed

tustvold mentioned this issue May 31, 2022

Remove ObjectStore from FileScanConfig and ListingTableConfig #2668

Merged

tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Jun 1, 2022

Switch to object_store crate (apache#2489)

9e05086

This was referenced Jun 1, 2022

Switch to object_store crate (#2489) #2677

Merged

Use ParquetRecordBatchStream #2711

Closed

alamb mentioned this issue Jun 28, 2022

Propose donating object_store_rs to Apache Arrow project influxdata/object_store_rs#41

Closed

alamb closed this as completed in #2677 Jul 4, 2022

yahoNanJing mentioned this issue Jul 7, 2022

Implement based on the new object store abstraction datafusion-contrib/datafusion-objectstore-hdfs#5

Closed

alamb mentioned this issue Jul 8, 2022

Incorporate object_store into arrow-rs repository apache/arrow-rs#2030

Closed

14 tasks

tustvold mentioned this issue Jul 17, 2022

bug: new ObjectStore breaks backward compatibility with contrib plugins #2931

Closed

alamb mentioned this issue Sep 13, 2022

[EPIC] Parquet filter pushdown into scan #3462

Open

27 tasks
