What happened:
With fastparquet 0.6.3 or below, we can't read data containing null values. Reading throws the following error:
IndexError: Out of bounds on buffer access (axis 0)
What you expected to happen:
fastparquet should be able to read data containing null values.
Minimal Complete Verifiable Example:
In [1]: import glob
In [2]: import pyspark
In [3]: from fastparquet import ParquetFile
In [4]: spark = pyspark.sql.SparkSession.builder.getOrCreate()
In [5]: sdf = spark.createDataFrame([[None, 1, None], ['a', None, None]], schema='a string, b int, c string')
In [6]: sdf.show()
+----+----+----+
|   a|   b|   c|
+----+----+----+
|null|   1|null|
|   a|null|null|
+----+----+----+
In [7]: path = 'spark_null'
In [8]: sdf.write.format('parquet').save(path)
In [10]: file_list = glob.glob(f'{path}/*.parquet')
In [11]: pdf = ParquetFile(file_list).to_pandas()
.../lib/python3.7/site-packages/fastparquet/speedups.pyx in fastparquet.speedups.unpack_byte_array()
IndexError: Out of bounds on buffer access (axis 0)
Anything else we need to know?:
Environment:
It works fine in version 0.7.0, but we cannot upgrade because of issue #646.
It would be nice if we could have a hotfix in 0.6.x. Otherwise, compatibility with Spark is completely broken.
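For anyone who has to stay on 0.6.x for now, a small guard can at least flag the affected versions before a read fails. This is only a sketch: `parse_version` and `is_affected` are hypothetical helper names, and the 0.7.0 cutoff is taken solely from the observation above that 0.7.0 works while 0.6.3 and below do not.

```python
def parse_version(version):
    """Turn a release string such as '0.6.3' into a tuple of ints, e.g. (0, 6, 3)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or '+local'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad short strings like '0.7' to three components
    return tuple(parts)


def is_affected(version):
    """Assumption from this thread: releases before 0.7.0 hit the IndexError on nulls."""
    return parse_version(version) < (0, 7, 0)


if __name__ == "__main__":
    try:
        import fastparquet
    except ImportError:
        print("fastparquet is not installed")
    else:
        if is_affected(fastparquet.__version__):
            print("fastparquet", fastparquet.__version__,
                  "may fail on Spark-written files containing nulls")
```

The check only compares the numeric release components, so pre-release suffixes are ignored.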
If you agree that my comments in the other issue are sufficient, please close this issue.
Closing this issue, as we have a workaround in #646.