NSFS | ChunkFS | Inconsistent data corruption when copying using Spark #8574
Comments
@romayalon @shirady Step 2: structuredDF.write.parquet("s3a://buck1/p")
OK, I found the issue. To fix it, the split operation needs to handle lines with fewer than 2 parts.
I am able to create the parquet file using the above code. Thanks.
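For anyone hitting the same thing, here is a minimal sketch of a split that tolerates short lines. It is plain Python rather than the actual Spark job, and the `parse_line` helper, the comma separator, and the sample lines are all assumptions made for illustration, not taken from the original code.

```python
def parse_line(line, sep=","):
    """Split a line into (key, value); return None if it has fewer than 2 parts.

    The separator and the two-field layout are assumptions for this sketch.
    """
    parts = line.rstrip("\n").split(sep, 1)
    if len(parts) < 2:
        # Malformed or empty line: skip it instead of indexing parts[1].
        return None
    return parts[0], parts[1]


# Hypothetical input containing one malformed line and one empty line.
lines = ["a,1", "malformed", "b,2", ""]
rows = [row for row in map(parse_line, lines) if row is not None]
print(rows)
```

The same guard can be applied in a Spark job by filtering out rows where the split result has fewer than 2 elements before building the structured DataFrame.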
Environment info
Actual behavior
a. See the parquet file having PAR1PAR1 at the end
b. See the parquet file having PAR11 at the end
Additional details of Spark internal "COPY" work -
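The two endings described above can be told apart with a quick check of the file tail. A Parquet file ends with a 4-byte little-endian footer length followed by the 4-byte magic PAR1, so the sketch below reads the last 8 bytes and classifies them. The `classify_tail` helper and its return labels are hypothetical, and the duplicated-magic test is a heuristic (a valid file would only match it with an implausibly large footer).

```python
import os

PARQUET_MAGIC = b"PAR1"


def classify_tail(path):
    """Classify the last bytes of a file that should be Parquet.

    Returns "ok", "duplicated-magic" (ends with PAR1PAR1),
    or "bad-magic" (does not end with PAR1, e.g. ends with PAR11).
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Read at most the final 8 bytes: footer length + magic.
        f.seek(max(0, size - 8))
        tail = f.read()
    if not tail.endswith(PARQUET_MAGIC):
        return "bad-magic"
    if tail == PARQUET_MAGIC * 2:
        # The 4 bytes where the footer length belongs are also "PAR1".
        return "duplicated-magic"
    return "ok"
```

Running this over the objects written under the bucket path after each Spark run would show which of the two corruption patterns occurred.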
Expected behavior
Steps to reproduce
journalctl -u noobaa -f > noobaa_during_spark_run.log
to get the logs of noobaa during the reproduction
More information - Screenshots / Logs / Other output