Can I read only the DataColumnStatistics of a column before reading the entire column data? #252
Comments
No, not now, but it should be easy to implement.
@mirosuav You should be able to access
Thanks, @aloneguid. I already have a working solution for that, will PR it once I'm done testing.
@mirosuav just wondering, is this in any way related to GeoParquet?
@aloneguid no, I don't know GeoParquet :) We're doing our own research on comparing different storage formats for big real-time data.
Hi
Is there a way to read only the DataColumnStatistics before actually loading the entire column data into memory?
Essentially, I have a method that checks whether the search value can exist in the data column by comparing it against the column's Min and Max values,
and if the value falls outside that range I can skip the entire column without loading its values into memory. However, I've noticed that
ParquetRowGroupReader.ReadColumnAsync
loads the entire column data into memory.
How can I load only the column statistics and optionally load the column data on demand?
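
To make the intent concrete, here is a minimal, library-neutral sketch of the min/max check described above, written in Python for brevity. It does not use the Parquet.Net API: `ColumnStats`, `might_contain`, `find_in_column`, and the `load_values` callback are hypothetical names made up for this illustration, and the column is assumed to hold integers.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical stand-in for per-column min/max statistics."""
    min_value: Optional[int]
    max_value: Optional[int]


def might_contain(stats: ColumnStats, value: int) -> bool:
    """Return False only when the statistics prove the value is absent."""
    if stats.min_value is None or stats.max_value is None:
        # Missing statistics cannot rule anything out.
        return True
    return stats.min_value <= value <= stats.max_value


def find_in_column(
    stats: ColumnStats,
    load_values: Callable[[], Sequence[int]],
    value: int,
) -> bool:
    """Check the statistics first; load the column only if needed."""
    if not might_contain(stats, value):
        # Value is outside [min, max]: skip without loading any data.
        return False
    # Value is within range, so the column has to be read to be sure.
    return value in load_values()
```

The point is that `load_values` is only invoked when the statistics cannot exclude the value, which is the behaviour I would like to get from the reader: statistics first, column data on demand.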