Remove all use of parquet's validate_schema #110

Merged: 1 commit, Mar 6, 2023

ianthomas23
Member

Fixes #109.

Test suite passes using latest pyarrow == 11.0.0.

The fix is not quite as simple as removing the final use of the validate_schema keyword argument. When identifying which columns to read from the parquet file, it was also necessary to check which fields are classified as columns rather than indexes. I have also simplified the code a little, as it no longer needs a separate load of the metadata before creating the ParquetDataset.
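To illustrate the column-versus-index check described above, here is a minimal sketch, not the code from this PR. It assumes the parquet file carries the pandas metadata that pyarrow writes, where `index_columns` holds either field names (strings) or RangeIndex descriptors (dicts that have no stored field). The helper name `data_column_names` and the sample metadata are hypothetical.

```python
from typing import Any, Dict, List, Optional


def data_column_names(
    field_names: List[str],
    pandas_metadata: Optional[Dict[str, Any]],
) -> List[str]:
    """Return the schema fields that are data columns, excluding index fields.

    Hypothetical helper for illustration only; it is not part of this PR.
    """
    # No pandas metadata at all: treat every field as a data column.
    if not pandas_metadata:
        return list(field_names)

    # String entries name fields stored in the file as index levels.
    # Dict entries describe a RangeIndex, which has no stored field to skip.
    index_fields = {
        entry
        for entry in pandas_metadata.get("index_columns", [])
        if isinstance(entry, str)
    }
    return [name for name in field_names if name not in index_fields]


# Sample metadata (assumed shape) with one stored index level.
stored_index_metadata = {"index_columns": ["__index_level_0__"]}

# Sample metadata (assumed shape) with a RangeIndex descriptor only.
range_index_metadata = {
    "index_columns": [
        {"kind": "range", "name": None, "start": 0, "stop": 3, "step": 1}
    ]
}

fields = ["x", "y", "__index_level_0__"]
columns_to_read = data_column_names(fields, stored_index_metadata)
```

With pyarrow, the inputs would typically come from the dataset schema, for example `schema.names` and `schema.pandas_metadata` on the `schema` of a `ParquetDataset`, and the result could be passed as the `columns` argument when reading. Whether that matches the exact calls used in this PR is an assumption.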

This fix works for pyarrow >= 5 (July 2021). I will try a separate PR to support earlier pyarrow, but those changes will be wider-ranging, as a number of places in the code already did not support pyarrow < 5 even before this PR.

@ianthomas23 ianthomas23 merged commit d102aa0 into holoviz:main Mar 6, 2023
@ianthomas23 ianthomas23 deleted the 109_remove_validate_schema branch March 6, 2023 09:18

Successfully merging this pull request may close these issues.

ValueError: Keyword 'validate_schema' is not yet supported with the new Dataset API