-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Vec instead of Slice in ColumnReader #5177
Comments
|
|
Hello together, maintainer of Don't have any measurements around this, but hurts me a little. Just some feedback. Keep up the good work. |
I should add I only have to pessimize insertion and also only if the parquet and ODBC type are binary identical for required columns. So its not the most important thing in the world. Yet it was nice that in this case no additional copy had been needed. |
If it makes you feel any better, this would likely be subtly incorrect for data with repetition levels 😄
I would be interested in any numbers where this represents a regression, mainly because #5193 represented a non-trivial performance uplift in many cases. I'd be willing to consider alternative suggestions to fixing #5150 if this represents a major regression for your workloads. |
Only had flat, table like data, so I might have been lucky in the past.
I have little reason to doubt you. Some tests are still failing, but I will just update. I would not have the time to set a benchmark up. Even if I would, there are so many variables (type of column, number of rows etc.) that its outcome unless mindbogglingly significant would not allow any conclusions. Nice to hear that you had some nice speedups though. |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently ColumnValueDecoderImpl and by extension ColumnReader accepts slices of
[T::T]
whereT: DataType
.This was preserved by #1041 which extracted generics to allow using owned buffer constructions instead for the arrow read path, whilst preserving the existing API for non-arrow readers.
However, preserving this API has a couple of fairly substantial drawbacks:
ColumnValueDecoder
Describe the solution you'd like
I would like to update
ColumnValueDecoderImpl
to acceptVec<T>
instead of[T::T]
. This would not only simplifyRecordReader
, and improve its performance for nested data, but would eliminate issues like #5150Describe alternatives you've considered
Additional context
The text was updated successfully, but these errors were encountered: