Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vortex Layouts #1676

Open
1 of 17 tasks
gatesn opened this issue Dec 13, 2024 · 0 comments
Open
1 of 17 tasks

Vortex Layouts #1676

gatesn opened this issue Dec 13, 2024 · 0 comments
Assignees
Labels
epic wire-break Includes a break to the serialized IPC or file format

Comments

@gatesn
Copy link
Contributor

gatesn commented Dec 13, 2024

[edit] I've hijacked the "Refactor VortexFileWriter" issue to cover the broader work towards vortex layouts.

  • Add better Debug impl for LayoutData that shows the full tree. Requires a visitor.
  • Expression splitting (so we can push-down expressions to struct fields and recombine the results)
  • Layout statistics for a given row mask, including whether the results are exact / inexact.
  • Change the segment read/write APIs to use aligned ByteBuffer. Have ArrayParts choose a reasonable alignment.
  • Configure the writer to include/exclude padding
  • Make it easy to run segment reader against memory mapped file
  • Cache segments
  • Coalesce segments
  • Add option to dedupe segments at write time
  • Add option to LZ4 compress flatbuffers

The description here is WIP

Vortex Layouts

Vortex Layouts should be seen as symmetrical to Vortex Arrays, except that the data is stored externally and they can be lazily queried with push-down pruning. They are fully independent of the storage system, requiring only a key(u32)/value(bytes) API.

A secondary benefit of separating the layout logic from storage, is that we remove all I/O from this code path allowing us to have much better control over when, where and how we interact with I/O (as well as removing all traces of async Rust from the layout logic).

Built-in Layouts

Our initial prototype ships with three layouts:

  • Flat - an atomic array (of any DType). The serialized form does not include the dtype (to prevent duplication).
  • Chunked - row-based partitioning of an array.
  • Struct - col-based partitioning of a struct array.

Additional layouts for the future:

  • Dictionary - pull out a shared dictionary across another child layout, e.g. store values as flat layout, store codes in a chunked layout to use one dictionary across all chunks of a column.
  • FlatStruct - same as Struct, except with flattened nested fields.

As with everything in Vortex, these will be extensible by consumers of the Rust library.

Layout Strategies

The power of layouts is in how we choose them. I think by default Vortex will ship with a StructOfChunks layout strategy that first splits into columns, then each column is independently chunked based on a target byte size (rather than row-count). This can be pretty useful when storing columns with large values. It means the returned batches are sized based on the data that's scanned, and doesn't suffer the write-time skew that e.g. Parquet might introduce with its "ChunksOfStruct" (row-groups of columns) strategy.

A LayoutStrategy is fed a series of ArrayData batches that represent a row-wise chunk of an array. Each strategy may manipulate, buffer, split, this batch however it likes and pass it to some child strategies, eventually culminating in a FlatStrategy that simply serializes the data. At the end of this process, the layout strategy returns the LayoutData (remember the analogy to ArrayData?).

Layout Scan

Scanning a layout is sort of analogous to an array compute function. It's probably the only "compute function" we support on layouts for a while (maybe we support sum/min/max etc?).

From a LayoutData, we construct a LayoutScan, and from this we construct a RangeScan per horizontal "split" of the file that we wish to read. This two-level approach is so for a single scan operator layouts can share some state across ranges, e.g. DictLayout would want to share the canonicalized values array.


The old issue:

Tracking a bunch of breaking changes to the format and refactoring of the writer

  • VortexFileWriter to pass chunks into an abstract LayoutWriter.
    • We ship a default LayoutWriter that is struct-of-chunked.
  • Impl LayoutWriter for struct, chunked, flat, and later dict layouts.
  • Layout flat buffer to hold BlockIds instead of offset/length. This allows Layout to be abstract over the storage format and not assume linear bytes (e.g. can store in an arbitrary block device).
  • Postscript to point to DType + FileLayout and not assume contiguous bytes.
  • Ensure we don't write extra padding / framing, e.g. we sometimes use IPCMessage and MessageWriter for writing to the file, even though this adds framing for a streaming format.
  • Allow configurable padding in the file to support mem-mapping with zero-copy.
  • Allow configurable compression for buffers.
gatesn added a commit that referenced this issue Dec 16, 2024
@gatesn gatesn mentioned this issue Dec 16, 2024
@gatesn gatesn added wire-break Includes a break to the serialized IPC or file format epic labels Jan 2, 2025
@gatesn gatesn self-assigned this Jan 2, 2025
@gatesn gatesn changed the title Refactor VortexFileWriter Vortex Layouts Jan 2, 2025
gatesn added a commit that referenced this issue Jan 3, 2025
Initial implementation of the new structure of vortex layouts per #1676 

* Only flat layout works.
* I'm not 100% sure on the trait APIs, these will evolve as we pad out
the implementation.
* StructLayout will be worked on as part of #1782 so will probably come
last.

Up next:
* Implementation of ChunkedLayout

Open Questions:
* What is the API that e.g. Python users have to precisely configure
layout strategies? Can I override a layout writer for a specific field?
* Similarly, how can we configure the layout scanners? Can I configure a
level 0 chunked layout differently from level 2 in a
chunk-of-struct-of-chunk world?
gatesn added a commit that referenced this issue Jan 13, 2025
Introduces the idea of a `driver` as a pluggable way of orchestrating
I/O and CPU based work.

An `ExecDriver` takes a `Stream<Future<T>>` and returns a `Stream<T>`.
The abstraction means we can implement drivers that use the current
runtime thread (Inline), spawn the work onto an explicit Runtime, or
spawn the work onto a thread pool.

The `IoDriver` takes a stream of segment requests and has to resolve
them. For now, we have a FileIoDriver that assumes the bytes are laid
out on disk (and therefore coalescing is desirable), so it pulls
requests off the queue, coalesces them, then launches concurrent I/O
requests to resolve them.

The VortexFile remains generic over the I/O driver, meaning the
send-ness of the resulting ArrayStream is inferred based on the
send-ness of the I/O driver (in the case of FileDriver, that is the
send-ness of the VortexReadAt).

Part of #1676
gatesn added a commit that referenced this issue Jan 13, 2025
gatesn added a commit that referenced this issue Jan 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
epic wire-break Includes a break to the serialized IPC or file format
Projects
None yet
Development

No branches or pull requests

1 participant