This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
When reading a parquet file with a string column that contains many NAs, both the `parquet_read` and `parquet_record_read` examples skip a few rows.
To reproduce, I run the following script, which creates a data frame with an NA every 20 rows:
Running the script with 100,000 rows:
In `parquet_read` I changed the code to print the number of rows:
This prints an array of length 99992:
I also tried changing the loop in `parquet_record_read` to double-check for NAs:
When I run it, 7 values are missing (there should be 95,000 non-null values):
There are no problems if the generator outputs a column of say Int64.
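As a sanity check on the expected counts, here is a minimal sketch in plain Python. It is not the reproduction script from this report: the helper names `make_string_column` and `count_non_null` are hypothetical, and it assumes the NA falls on every row whose index is a multiple of 20. It only builds the column in memory; writing it to a parquet file would need a separate library and is not shown.

```python
def make_string_column(n_rows, na_every=20):
    """Build a list of strings with None (NA) at every `na_every`-th row.

    Assumption: the NA sits at indices 0, 20, 40, ...; the original script
    may have placed it at a different offset.
    """
    return [None if i % na_every == 0 else f"value_{i}" for i in range(n_rows)]


def count_non_null(column):
    """Count the entries that are not NA."""
    return sum(1 for value in column if value is not None)


column = make_string_column(100_000)

# A correct reader should report all 100,000 rows,
# of which 95,000 are non-null and 5,000 are NA.
print(len(column), count_non_null(column))  # prints: 100000 95000
```

With these numbers, an array of length 99992 is 8 rows short of the full column.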
I am running latest master:
> git rev-parse HEAD
dbb7b8a69a990a1f37c81b2d8dfeadaf3fba48a8