Generify ColumnReaderImpl and RecordReader (#1040) #1041

Merged (20 commits, Jan 11, 2022)

Conversation

@tustvold (Contributor) commented Dec 13, 2021

This is highly experimental; I want to further flesh out #171 and #1037 before settling on this. In particular, I want to get some numbers on performance. However, I wanted to give some visibility into what I'm doing.

Builds on top of #1021

This introduces some limited generics into RecordReader and ColumnReaderImpl to allow for optimisations such as #1054 and #1082. Having implemented initial cuts of these, I am happy that this interface is sufficiently flexible for implementing various arrow-related optimisations.

Which issue does this PR close?

Closes #1040.

Rationale for this change

See ticket

What changes are included in this PR?

See ticket

Are there any user-facing changes?

No 😁

@github-actions bot added the parquet (Changes to the parquet crate) label Dec 13, 2021
@codecov-commenter commented Dec 13, 2021

Codecov Report

Merging #1041 (28228b2) into master (07660c6) will increase coverage by 0.01%.
The diff coverage is 81.32%.


@@            Coverage Diff             @@
##           master    #1041      +/-   ##
==========================================
+ Coverage   82.30%   82.31%   +0.01%     
==========================================
  Files         168      172       +4     
  Lines       49026    50082    +1056     
==========================================
+ Hits        40351    41227     +876     
- Misses       8675     8855     +180     
Impacted Files Coverage Δ
parquet/src/arrow/array_reader.rs 76.72% <ø> (-0.03%) ⬇️
parquet/src/column/reader.rs 69.88% <75.94%> (-2.45%) ⬇️
parquet/src/column/reader/decoder.rs 76.27% <76.27%> (ø)
parquet/src/arrow/record_reader/buffer.rs 85.10% <85.10%> (ø)
parquet/src/arrow/record_reader.rs 94.00% <87.17%> (+1.23%) ⬆️
...rquet/src/arrow/record_reader/definition_levels.rs 90.32% <90.32%> (ø)
parquet/src/util/memory.rs 91.12% <100.00%> (+0.08%) ⬆️
arrow/src/datatypes/native.rs 66.66% <0.00%> (-6.25%) ⬇️
arrow/src/compute/kernels/comparison.rs 89.75% <0.00%> (-3.48%) ⬇️
arrow/src/csv/reader.rs 88.10% <0.00%> (-2.48%) ⬇️
... and 37 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 07660c6...28228b2.

@tustvold (Contributor, Author) commented Dec 14, 2021

Running benchmarks on my local machine, I get somewhat erratic results, from which I conclude this has no major impact on performance:

arrow_array_reader/read Int32Array, plain encoded, mandatory, no NULLs - old                                                                             
                        time:   [3.7939 us 3.8031 us 3.8114 us]
                        change: [-3.6579% -3.4154% -3.1951%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read Int32Array, plain encoded, mandatory, no NULLs - new                                                                             
                        time:   [2.3030 us 2.3048 us 2.3073 us]
                        change: [+2.5908% +2.7441% +2.9142%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, plain encoded, optional, no NULLs - old                                                                            
                        time:   [59.193 us 59.275 us 59.363 us]
                        change: [-4.2623% -4.1285% -4.0009%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read Int32Array, plain encoded, optional, no NULLs - new                                                                             
                        time:   [23.209 us 23.221 us 23.236 us]
                        change: [+32.531% +32.663% +32.835%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, plain encoded, optional, half NULLs - old                                                                            
                        time:   [142.37 us 142.41 us 142.44 us]
                        change: [+5.5942% +6.6789% +7.7376%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, plain encoded, optional, half NULLs - new                                                                            
                        time:   [139.07 us 139.89 us 140.59 us]
                        change: [+0.4422% +0.9960% +1.6028%] (p = 0.00 < 0.05)
                        Change within noise threshold.
arrow_array_reader/read Int32Array, dictionary encoded, mandatory, no NULLs - old                                                                             
                        time:   [21.919 us 21.923 us 21.927 us]
                        change: [+1.3392% +1.7681% +2.0113%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, dictionary encoded, mandatory, no NULLs - new                                                                            
                        time:   [99.347 us 101.00 us 102.37 us]
                        change: [+5.5715% +6.7636% +8.2107%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, dictionary encoded, optional, no NULLs - old                                                                            
                        time:   [75.648 us 75.663 us 75.681 us]
                        change: [-1.5816% -1.5384% -1.4963%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read Int32Array, dictionary encoded, optional, no NULLs - new                                                                            
                        time:   [112.52 us 113.33 us 114.36 us]
                        change: [+5.2751% +7.2166% +9.0108%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read Int32Array, dictionary encoded, optional, half NULLs - old                                                                            
                        time:   [144.77 us 144.80 us 144.83 us]
                        change: [-11.013% -10.318% -9.6258%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read Int32Array, dictionary encoded, optional, half NULLs - new                                                                            
                        time:   [191.06 us 191.12 us 191.18 us]
                        change: [+3.4773% +3.5370% +3.5957%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, plain encoded, mandatory, no NULLs - old                                                                            
                        time:   [800.06 us 800.19 us 800.32 us]
                        change: [-1.6826% -1.6388% -1.5967%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read StringArray, plain encoded, mandatory, no NULLs - new                                                                            
                        time:   [124.84 us 124.86 us 124.88 us]
                        change: [+4.1077% +4.1575% +4.2088%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, plain encoded, optional, no NULLs - old                                                                            
                        time:   [846.35 us 846.59 us 846.87 us]
                        change: [+0.8637% +0.9228% +0.9834%] (p = 0.00 < 0.05)
                        Change within noise threshold.
arrow_array_reader/read StringArray, plain encoded, optional, no NULLs - new                                                                            
                        time:   [143.25 us 143.30 us 143.35 us]
                        change: [+2.6977% +2.7794% +2.8847%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, plain encoded, optional, half NULLs - old                                                                            
                        time:   [773.74 us 776.61 us 779.87 us]
                        change: [+3.2218% +3.4681% +3.7063%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, plain encoded, optional, half NULLs - new                                                                            
                        time:   [264.22 us 264.80 us 265.57 us]
                        change: [-1.3401% -1.1712% -0.9903%] (p = 0.00 < 0.05)
                        Change within noise threshold.
arrow_array_reader/read StringArray, dictionary encoded, mandatory, no NULLs - old                                                                            
                        time:   [726.17 us 726.74 us 727.44 us]
                        change: [+1.2812% +1.3725% +1.4618%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, dictionary encoded, mandatory, no NULLs - new                                                                            
                        time:   [116.83 us 116.91 us 116.99 us]
                        change: [-3.2217% -3.0893% -2.9282%] (p = 0.00 < 0.05)
                        Performance has improved.
arrow_array_reader/read StringArray, dictionary encoded, optional, no NULLs - old                                                                            
                        time:   [802.16 us 803.89 us 805.57 us]
                        change: [-0.4055% -0.2549% -0.1073%] (p = 0.00 < 0.05)
                        Change within noise threshold.
arrow_array_reader/read StringArray, dictionary encoded, optional, no NULLs - new                                                                            
                        time:   [134.39 us 134.43 us 134.48 us]
                        change: [+0.0304% +0.2086% +0.3678%] (p = 0.02 < 0.05)
                        Change within noise threshold.
arrow_array_reader/read StringArray, dictionary encoded, optional, half NULLs - old                                                                            
                        time:   [742.00 us 742.57 us 743.00 us]
                        change: [+3.4464% +3.6453% +3.8440%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/read StringArray, dictionary encoded, optional, half NULLs - new                                                                            
                        time:   [236.67 us 237.14 us 238.07 us]
                        change: [+1.7094% +1.9629% +2.5264%] (p = 0.00 < 0.05)
                        Performance has regressed.

What is strange to me is that this seems to have a consistent ~5% impact on the "new" ArrowArrayReader, despite this change touching none of the code used by it. I suspect we're at the whims of LLVM here, which I'm not sure it makes sense to optimise for at this stage - there's a lot of lower-hanging fruit. It's also worth noting that ArrowArrayReader is not used for anything bar strings at this stage, and I intend to introduce an optimised StringArrayReader that should be significantly faster.

My takeaway: no major cause for concern at this stage.

.current_encoding
.expect("current_encoding should be set");

let current_decoder = self
@yordan-pavlov (Contributor) commented Dec 23, 2021:

why not set a current_decoder field in the set_data method (where the decoder has to be selected anyway to call set_data on it), so that it doesn't have to be looked up on every call of read here? It should perform better (no lookup) and simplify this read method as well.

@tustvold (Contributor, Author) replied:

I didn't write this logic, just moved it, but my guess is this is a way to placate the borrow checker. Decoder::get requires a mutable reference, and we wish for decoders, in particular the dictionary decoder, to be usable across multiple set_data calls.

In order to have a current_decoder construct we would either need to perform a convoluted dance moving data in and out of the decoder map, or use Rc<RefCell>. This is simpler, if possibly a little less performant. FWIW I'd wager that the overheads of a hashmap keyed on a low-cardinality enumeration are pretty low.
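As a rough illustration, here is a minimal sketch of the pattern being discussed (hypothetical, simplified types; not the PR's actual code): decoders are cached in a map keyed by encoding so they survive across pages, and each read call looks up the current one:

```rust
use std::collections::HashMap;

#[derive(PartialEq, Eq, Hash, Clone, Copy)]
enum Encoding {
    Plain,
    RleDictionary,
}

// Stand-in for the real decoder trait; `get` requires `&mut self`.
trait Decoder {
    fn get(&mut self, out: &mut [i32]) -> usize;
}

struct ColumnValueDecoder {
    current_encoding: Option<Encoding>,
    // Owned here so that e.g. the dictionary decoder can be reused across
    // multiple set_data calls without a move dance or Rc<RefCell>.
    decoders: HashMap<Encoding, Box<dyn Decoder>>,
}

impl ColumnValueDecoder {
    fn read(&mut self, out: &mut [i32]) -> usize {
        let encoding = self
            .current_encoding
            .expect("current_encoding should be set");
        // A per-call lookup in a map keyed on a low-cardinality enum is
        // cheap relative to decoding a whole batch of values.
        let decoder = self
            .decoders
            .get_mut(&encoding)
            .expect("decoder registered for encoding");
        decoder.get(out)
    }
}
```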


/// An implementation of [`ColumnLevelDecoder`] for `[i16]`
pub struct ColumnLevelDecoderImpl {
inner: LevelDecoderInner,
Contributor:

I wonder if the inner level decoder can be a generic parameter instead - wouldn't that remove the need to match &mut self.inner in the read method?

Contributor Author:

This would require introducing some type-level representation of the encoding. That would be a fair bit of additional code/complexity that I don't think would lead to a meaningful performance uplift. Assuming ColumnLevelDecoderImpl::read is called with a reasonable batch size of ~1024, the overheads of a jump table are likely to be irrelevant.
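For illustration, a minimal sketch (hypothetical names, heavily simplified) of the enum dispatch being defended: the single match per read call is amortised over the whole batch, whereas a generic parameter would need a type-level representation of the encoding:

```rust
enum LevelDecoderInner {
    Rle,       // stand-in for RLE decoder state
    BitPacked, // stand-in for bit-packed decoder state
}

pub struct ColumnLevelDecoderImpl {
    inner: LevelDecoderInner,
}

impl ColumnLevelDecoderImpl {
    /// Decode up to `out.len()` levels, returning how many were written.
    fn read(&mut self, out: &mut [i16]) -> usize {
        // One jump-table dispatch per batch of ~1024 levels is negligible.
        match &mut self.inner {
            LevelDecoderInner::Rle => {
                /* decode RLE-encoded levels into `out` */
                out.len()
            }
            LevelDecoderInner::BitPacked => {
                /* decode bit-packed levels into `out` */
                out.len()
            }
        }
    }
}
```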

) -> impl Iterator<Item = usize> + '_ {
let max_def_level = self.max_level;
let slice = self.buffer.as_slice();
range.rev().filter(move |x| slice[*x] == max_def_level)
Contributor:

it might be more efficient to calculate a boolean array for the null bitmap using arrow::compute::eq_scalar as used in ArrowArrayReader here: https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/arrow_array_reader.rs#L570, because it can use SIMD (if enabled)

Contributor Author:

Currently BooleanBufferBuilder doesn't have a story for appending other BooleanBuffers - #1039 adds this but I'd rather not make this PR depend on it.

Additionally the cost of the memory allocation and copy may outweigh the gains from SIMD.

Given this I'm going to leave this as is, especially as #1054 will remove this code from the decode path for files without nested nullability.

) {
let slice = self.as_slice_mut();

for (value_pos, level_pos) in range.rev().zip(rev_position_iter) {
@yordan-pavlov (Contributor) commented Dec 26, 2021:
it might be more efficient to insert null values using arrow::compute::SlicesIterator as used in ArrowArrayReader here: https://github.com/apache/arrow-rs/blob/master/parquet/src/arrow/arrow_array_reader.rs#L606, since it works with sequences rather than single values

Contributor Author:

This is a cool suggestion, I was not aware of this component. Unfortunately it does not appear to support reverse iteration, which is required here, so I will leave this as a potential future optimization.
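For anyone following along, a minimal sketch (hypothetical, heavily simplified) of the reverse back-fill the loop above performs: decoded values sit packed at the front of the buffer, and walking both the packed values and the valid slots right-to-left spreads them out without overwriting values that have not yet been moved:

```rust
/// Spread the packed values out to the slots whose definition level equals
/// `max_def_level`, leaving the remaining slots to be marked null.
fn pad_nulls(values: &mut [i32], def_levels: &[i16], max_def_level: i16) {
    let num_valid = def_levels
        .iter()
        .filter(|&&level| level == max_def_level)
        .count();
    // Reverse iterator over the final positions of non-null values.
    let rev_valid_positions = (0..def_levels.len())
        .rev()
        .filter(|&i| def_levels[i] == max_def_level);
    for (value_pos, level_pos) in (0..num_valid).rev().zip(rev_valid_positions) {
        // level_pos >= value_pos always holds, so moving right-to-left
        // never clobbers a packed value that is still to be read.
        values[level_pos] = values[value_pos];
    }
}
```

This is also why reverse iteration is required here, and why a forward-only SlicesIterator cannot be dropped in directly.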

@@ -200,7 +200,6 @@ pub struct PrimitiveArrayReader<T: DataType> {
rep_levels_buffer: Option<Buffer>,
column_desc: ColumnDescPtr,
record_reader: RecordReader<T>,
_type_marker: PhantomData<T>,
Contributor Author:

This seemed to be an orphan so I just removed it

}

#[inline]
fn configure_dictionary(&mut self, page: Page) -> Result<bool> {
Contributor Author:

This logic is moved into ColumnValueDecoder

@@ -392,38 +419,6 @@ impl<T: DataType> ColumnReaderImpl<T> {
Ok(true)
}

/// Resolves and updates encoding and set decoder for the current page
Contributor Author:

This logic is also moved into ColumnValueDecoder

@tustvold (Contributor, Author) commented Jan 1, 2022

I've renamed a number of the methods and traits based on the great feedback, and also added a load of doc comments. In particular I took inspiration from std::Vec, specifically Vec::spare_capacity_mut and Vec::set_len, which are effectively an unsafe version of what is going on here.

I'm happy that this interface is sufficiently flexible for the optimisations I have in mind, many of which I already have draft PRs with initial cuts of, and so I'm marking this ready for review.

I am aware this is a relatively complex change to an already complex part of the codebase, so if anything isn't clear please let me know.

Edit: I have tested this code change with #1053 and the tests are green (with ArrowArrayReader replaced with ComplexObjectArrayReader to workaround #1111)
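For reference, a small sketch (not the PR's code; assumes a toolchain where Vec::spare_capacity_mut is available) of the std::Vec pattern being mirrored here:

```rust
fn main() {
    let mut v: Vec<i32> = Vec::with_capacity(4);
    // Expose the uninitialized tail capacity as &mut [MaybeUninit<i32>].
    let spare = v.spare_capacity_mut();
    spare[0].write(7);
    // SAFETY: exactly one element was initialized above. This is the unsafe
    // step the buffer traits in this PR avoid, because their backing storage
    // is always fully initialized (if not yet set to anything meaningful).
    unsafe { v.set_len(1) };
    assert_eq!(v, [7]);
}
```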

@tustvold marked this pull request as ready for review January 1, 2022 20:01
@alamb (Contributor) left a comment:

Thank you for the comments @tustvold

I went through this code pretty carefully -- and other than the places I noted it looks like a really nice job to me. I think the additional testing such as #1110 gives me extra confidence that this is working as designed

To other reviewers, I would summarize this change as "pulls out common and redundant logic from some of the RecordReader impls into a set of common structures and traits".

parquet/src/arrow/record_reader.rs (resolved)
self.buffer.resize(num_bytes, 0);
self.len -= len;

std::mem::replace(&mut self.buffer, remaining).into()
Contributor:

TIL: std::mem::replace

///
/// # Panics
///
/// Implementations must panic if `len` is beyond the initialized length
Contributor:

I don't understand the must panic bit here -- how would implementations know what the initialized length (data written to the location returned by spare_capacity_mut) is? Or is this referring to the capacity?

Contributor Author:

I was trying to distinguish this from Vec::set_len, which is unsafe because it doesn't know how much is initialized. In the case of RecordBuffer the entire capacity is initialized, just possibly not set to anything useful. The result may not be desirable, but it isn't UB and therefore isn't unsafe.

Comment on lines +144 to +145
self.buffer
.resize((self.len + batch_size) * std::mem::size_of::<T>(), 0);
Contributor:

Is it ok to initialize everything to 0? I am wondering if 0 isn't a valid representation for some type T? Perhaps this should be T::default() instead?

Contributor Author:

Sadly this is not possible with MutableBuffer: the second parameter to resize is a u8. IMO MutableBuffer is pretty unfortunate and should really be typed based on what it contains, but changing this would be a major breaking change to a lot of arrow...
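To illustrate the constraint (hypothetical, simplified types; not arrow's actual definitions): the backing storage is untyped bytes, so growing it can only splat a u8 fill value, and zeroed bytes are a valid T only for plain-old-data types:

```rust
use std::marker::PhantomData;

// Untyped byte storage, loosely modelled on arrow's MutableBuffer.
struct MutableBuffer {
    data: Vec<u8>,
}

impl MutableBuffer {
    fn resize(&mut self, num_bytes: usize, fill: u8) {
        self.data.resize(num_bytes, fill);
    }
}

struct TypedBuffer<T> {
    buffer: MutableBuffer,
    len: usize,
    _phantom: PhantomData<T>,
}

impl<T> TypedBuffer<T> {
    fn reserve(&mut self, batch_size: usize) {
        // The fill value is a byte, not a T, so T::default() cannot be
        // expressed here - the crux of the question above.
        self.buffer
            .resize((self.len + batch_size) * std::mem::size_of::<T>(), 0);
    }
}
```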

Contributor:

I believe it is called arrow2 :trollface:

Contributor Author:

Indeed arrow2 could definitely serve as inspiration for such a change. I have some ideas on how to make such a change without major churn, but nothing fully formed just yet 😁

Member:

arrow2 no longer uses MutableBuffer<T: NativeType>: it recently migrated to std::Vec<T: NativeType> for ease of use (and maintenance).

Contributor Author:

it recently migrated to std::Vec<T: NativeType>

Is there some way to force Vec to use stricter alignment than needed by T? i.e. for SIMD stuffs?

Member:

You mean e.g. using 128-byte alignment instead of the minimum required by T's layout? I do not think that is possible on the stable channel, no.

parquet/src/arrow/record_reader/buffer.rs (outdated; resolved)
/// A [`BufferQueue`] capable of storing column values
pub trait ValuesBuffer: BufferQueue {
/// Iterate through the indexes in `range` in reverse order, moving the value at each
/// index to the next index returned by `rev_valid_position_iter`
Contributor:

the code also seems to assume that rev_valid_position_iter is sorted

num_decoded_values: u32,

// Cache of decoders for existing encodings
decoders: HashMap<Encoding, Box<dyn Decoder<T>>>,
Contributor:

For anyone else following along, the cache is moved into ColumnValueDecoderImpl below

use crate::util::bit_util::BitReader;

/// A slice of levels buffer data that is written to by a [`ColumnLevelDecoder`]
pub trait LevelsBufferSlice {
Contributor:

I think I missed it somewhere along the line -- what is the point of Generisizing (sp?) levels, rather than just using [i16]? Can definition or repetition levels ever be something other than i16?

Contributor Author:

Yes - #1054

Contributor:

Ah -- got it

num_buffered_values: u32,
encoding: Encoding,
buf: ByteBufferPtr,
) -> Result<ByteBufferPtr> {
Contributor:

Is there a reason to replicate the logic in LevelDecoder::v1(enc, max_level) here? Could that level decoder simply be reused? Especially since it already has tests, etc.

Contributor Author:

The short answer is that I found the interface of LevelDecoder incredibly confusing, and this code isn't actually interested in the decoder, just in working out how many bytes of level data there are...

I can change if you feel strongly

Contributor:

No, I was just curious

@alamb (Contributor) commented Jan 4, 2022

cc @nevi-me @sunchao and @jorgecarleitao

Please let us know if anyone else is interested in reviewing this PR. If not I'll plan to merge it in soon


#[inline]
pub fn as_slice(&self) -> &[T] {
let (prefix, buf, suffix) = unsafe { self.buffer.as_slice().align_to::<T>() };
Member:

Thanks @alamb for the ping. I haven't looked into this PR's semantics in detail because I am not familiar with this code base.

I think that this line is sound iff T is plain old data (in the sense that it fulfills the invariants of Pod).

However, bool, which is not Pod, implements ParquetValueType, and we pass T: DataType::T to TypedBuffer here.

Note that, like bool, Int96 contains Option<[u32; 3]>, which is also not plain old data, and also implements ParquetValueType.

Maybe restrict T to TypedBuffer<T: PrimitiveType> or something, so that we do not allow non-plain-old-data types to be passed here?
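As a small illustration of the soundness argument (not the PR's code): reinterpreting bytes via align_to is fine exactly when every byte pattern is a valid T:

```rust
fn main() {
    let ints: [u32; 2] = [1, 2];
    // Viewing u32s as bytes is always sound: u8 has no invalid bit patterns.
    let (_, bytes, _) = unsafe { ints.align_to::<u8>() };
    // Reinterpreting bytes as u32 is sound only because u32 is plain old
    // data: any properly aligned four bytes form a valid value.
    let (_, back, _) = unsafe { bytes.align_to::<u32>() };
    assert_eq!(back, &ints[..]);
    // The same cast with T = bool would be UB for any byte other than 0 or 1,
    // and Int96's Option<[u32; 3]> has non-value (niche) representations -
    // hence the suggestion to restrict T to plain-old-data types.
}
```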

@tustvold (Contributor, Author) commented Jan 4, 2022:

Yeah, the typing here is a bit unfortunate; there is a kludge in PrimitiveArrayReader to handle bools and prevent Int96, but I'm not going to argue it isn't pretty gross 😅

It's no worse than before, but it certainly isn't ideal... I'll have a think about how to improve this without breaking the APIs 🤔

Contributor:

Maybe we could at least document it (or mark it as unsafe to force the callsites to acknowledge they aren't using bool)?

Contributor Author:

Going to mark this as a draft whilst I fix #1132 which should in turn fix this

Contributor Author:

#1155 contains the fix

@tustvold marked this pull request as draft January 10, 2022 22:26
@tustvold (Contributor, Author) commented:

Unfortunately the code I added in #1155 didn't quite carry across as I had hoped: parquet doesn't have an Int16Type, but definition and repetition levels are parsed as i16. This required some more finagling, but the general concept of restricting the valid types remains unchanged.

@@ -1033,21 +1032,6 @@ pub(crate) mod private {
self
}
}

/// A marker trait for [`DataType`] with a [scalar] physical type
Contributor Author:

This was added in #1155 but unfortunately didn't work as anticipated because of the lack of Int16Type which is needed for decoding levels data

Contributor:

Can we impl ScalarDataType for i16?

Contributor:

If you need to remove this code, then we should probably reopen the original ticket #1132

Contributor Author:

impl ScalarDataType for i16

In short, no... DataType is tightly coupled with what it means to be a physical parquet type, which i16 is not

If you need to remove this code, then we should probably reopen the original ticket

It is an alternative way of fixing that ticket. Rather than constraining T: DataType we constrain T::T. The two approaches are equivalent, but the latter allows implementing the marker trait for types that don't have a corresponding DataType
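A minimal sketch (hypothetical names) of the distinction: bounding the associated native type T::T, rather than the DataType itself, lets i16 participate even though no Int16Type exists:

```rust
/// Marker for native value types with a scalar physical representation.
trait ScalarValue {}

impl ScalarValue for i16 {} // levels decode to i16 despite having no DataType
impl ScalarValue for i32 {}
impl ScalarValue for i64 {}

/// Stand-in for parquet's DataType: a physical type plus its native type T.
trait DataType {
    type T;
}

// Constraining T::T instead of T: ScalarDataType admits both real physical
// types and bare natives like i16.
fn decode_scalars<T: DataType>()
where
    T::T: ScalarValue,
{
    // generic decoding over scalar natives goes here
}
```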

@tustvold marked this pull request as ready for review January 11, 2022 15:01