manifest is not a Parquet file. expected magic number #736
Comments
Hey @michael-j-thomas - what was the solution here? Edit: I just read #706 - we should expect the metastore entries in Hive to work with Presto, Athena, etc., but not with Spark.
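To make that distinction concrete, here is a minimal sketch of how Spark is expected to read the table: natively from its storage path, not through the manifest-backed catalog entry. The S3 path and session configs below are illustrative assumptions, not taken from this thread.

```python
# Sketch: read the Delta table natively in Spark instead of through the
# manifest-backed Glue entry. The S3 path is a placeholder.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("read-delta-natively")
    # Delta Lake session configs (needed outside Databricks runtimes)
    .config("spark.sql.extensions",
            "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Do NOT point Spark at _symlink_format_manifest/ -- that directory holds
# text manifests meant for Presto/Athena, not Parquet data, which is
# exactly what produces the "expected magic number" error reported here.
df = spark.read.format("delta").load("s3://my-bucket/path/to/delta-table")
df.show()
```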
@m-credera @michael-j-thomas Did either of you find a solution for this? I am also trying to use the Glue Catalog (to be able to query those tables using Spark SQL), but I'm experiencing the same issue since switching to Delta/Parquet.
Does this issue still exist? We are facing the same problem: we are unable to read an Athena table created on a Delta Lake file using Spark 3.4.
Are you creating the Delta table entry in the Glue Catalog with the right table type? Reading a Delta table via manifests requires setting the table up in a very specific way that works only via the manifest, not as a native Delta table in engines that understand Delta natively. See more details here - https://docs.delta.io/latest/presto-integration.html#step-2-configure-presto-trino-or-athena-to-read-the-generated-manifests (and the sketch below)
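As a rough illustration of the workflow described in the linked docs, here is a minimal sketch; the table path, database, and column names are placeholders:

```python
# Sketch of the symlink-manifest workflow (placeholder paths/names).
from delta.tables import DeltaTable

delta_table = DeltaTable.forPath(spark, "s3://my-bucket/path/to/delta-table")

# Writes _symlink_format_manifest/ under the table root. Presto/Athena
# read these text manifests instead of the Delta transaction log.
delta_table.generate("symlink_format_manifest")

# The Athena/Glue table must then be defined over the manifest directory
# with SymlinkTextInputFormat (run this DDL in Athena, not in Spark):
ATHENA_DDL = """
CREATE EXTERNAL TABLE my_db.my_table (col1 STRING, col2 BIGINT)
PARTITIONED BY (ds STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my-bucket/path/to/delta-table/_symlink_format_manifest/'
"""
```

A table defined this way is queryable from Presto/Trino/Athena, but Spark pointed at the same catalog entry will see only the text manifests, which is consistent with the error in this issue.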
Hi, I have a Delta table integrated with the AWS Glue Data Catalog.
When running a query against the table via Spark, I get the following error:

```
FileReadException: Error while reading file s3://.. _symlink_format_manifest/manifest
Caused by: RuntimeException: .../manifest is not a Parquet file. expected magic number at tail
```

And the integrated Redshift Spectrum table raises:

```
S3ServiceException:The specified key does not exist.,Status 404,Error NoSuchKey
```

Furthermore, after generating the manifests, not all partitions are added, even though those partitions contain files.
Related issue: #365
I ran the Spark SQL on a Databricks cluster with runtime 8.1 (Apache Spark 3.1.1) and Delta 1.0.0.
Please let me know if I can supply more details.
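For the missing-partitions symptom, one approach from the Delta docs is to regenerate the manifests, keep them in sync automatically, and then re-register partitions in the metastore. A sketch, assuming a path-based table; the path and table name are placeholders:

```python
# Regenerate the manifest explicitly (equivalent to DeltaTable.generate):
spark.sql(
    "GENERATE symlink_format_manifest "
    "FOR TABLE delta.`s3://my-bucket/path/to/delta-table`"
)

# Keep manifests in sync automatically on every write to the table:
spark.sql("""
    ALTER TABLE delta.`s3://my-bucket/path/to/delta-table`
    SET TBLPROPERTIES (delta.compatibility.symlinkFormatManifest.enabled = true)
""")

# New partitions still have to be registered in the metastore; in
# Athena/Hive that is typically done with:
#   MSCK REPAIR TABLE my_db.my_table;
```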