import polars as pl

df = pl.scan_parquet('*.parquet')
# ComputeError: error while reading ethereum_contracts__v1_1_0__16800030_to_16800039.parquet: External format error: File out of specification: Repetition level must be defined for a primitive type

df = pl.scan_parquet('ethereum_contracts__v1_1_0__16800030_to_16800039.parquet')
# ArrowErrorException: ExternalFormat("File out of specification: Repetition level must be defined for a primitive type")
It appears that when a batch produces no results, no schema is written to its parquet file. A single such file then makes the whole set impossible to load via a glob.
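Until the writer emits a schema for empty batches, one possible reader-side workaround is to load the files one at a time and skip any that fail to parse. The sketch below is only an illustration under that assumption: `load_readable` is a hypothetical helper, not part of polars, and it catches every exception because the error type differed between the two calls above.

```python
from typing import Any, Callable, Iterable, List, Tuple


def load_readable(
    paths: Iterable[str],
    reader: Callable[[str], Any],
) -> Tuple[List[Any], List[Tuple[str, Exception]]]:
    """Apply `reader` to each path, collecting results and failures separately.

    Hypothetical helper: `reader` is any callable that raises when a file
    cannot be parsed, for example `pl.read_parquet`.
    """
    loaded: List[Any] = []
    skipped: List[Tuple[str, Exception]] = []
    for path in paths:
        try:
            loaded.append(reader(path))
        except Exception as exc:  # broad on purpose: error types vary by reader
            skipped.append((path, exc))
    return loaded, skipped


# Intended use with polars (left commented so the sketch runs without it):
#
#   import glob
#   import polars as pl
#
#   frames, skipped = load_readable(sorted(glob.glob("*.parquet")), pl.read_parquet)
#   for path, exc in skipped:
#       print(f"skipped {path}: {exc}")
#   df = pl.concat(frames)  # requires at least one readable file
```

This reads eagerly rather than lazily, so it gives up the benefits of `scan_parquet`, and it hides the unreadable files rather than fixing them. Printing the skipped paths at least shows which batches were affected.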
banteg changed the title from "small batches with no results make parquets not loadable" to "small batches with no results make parquets not loadable via glob" on Aug 5, 2023
how to replicate