manifest is not a Parquet file. expected magic number #706

Closed
faycal-merouane opened this issue Jun 23, 2021 · 2 comments

faycal-merouane commented Jun 23, 2021

Hello,
While trying to read a Delta table using Hive, I got the following error:

Caused by: java.lang.RuntimeException: hdfs:/.../_symlink_format_manifest/manifest is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [117, 101, 116, 10].
While searching for the cause, I found that the issue was fixed in Spark 3 and Delta 0.7.0 (see #365).
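For reference, the byte values in the error message can be decoded as ASCII. This is a minimal sketch in plain Python. It only interprets the numbers printed in the exception and does not call Hive, Spark, or Delta:

```python
# Byte values copied from the error message above.
expected = bytes([80, 65, 82, 49])
found = bytes([117, 101, 116, 10])

print(expected)  # b'PAR1'
print(found)     # b'uet\n'
```

`PAR1` is the magic number a Parquet reader expects at the end of a file. `uet\n` looks like the end of a text line ending in `.parquet` followed by a newline, which suggests that the reader was handed a plain-text file rather than a Parquet data file.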

I'm using the following commands to create the external table in Hive:

DeltaTable.forPath(spark, deltaLakeSilverPath).generate("symlink_format_manifest")

CREATE EXTERNAL TABLE IF NOT EXISTS $table_name ($schema)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '$symlink_format_manifest_Path'

I'm using Spark version 3.1.2 and Delta 1.0.0.

Contributor

tdas commented Jun 23, 2021

Hive does not actually support reading non-text tables using "org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat". This feature was originally designed in Hive to work with text-formatted tables and was never extended to non-text formats. Other systems, such as Presto, extended it to work with Parquet and other formats.
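The difference can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: it is not Hive's or Presto's implementation, the file names are hypothetical, and the stand-in "data file" is not valid Parquet beyond carrying the `PAR1` marker at both ends. It contrasts checking the manifest itself for the Parquet magic number with first reading the manifest as text and then checking the files it lists:

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file


def tail_magic(path):
    """Return the last four bytes of a file."""
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)
        return f.read(4)


def resolve_manifest(manifest_path):
    """Read the manifest as text and return the data file paths it lists."""
    with open(manifest_path, "r") as f:
        return [line.strip() for line in f if line.strip()]


def make_demo(tmpdir):
    """Create a fake data file and a manifest that points at it."""
    # Hypothetical stand-in for a data file: only the magic bytes are realistic.
    data_path = os.path.join(tmpdir, "part-00000.parquet")
    with open(data_path, "wb") as f:
        f.write(PARQUET_MAGIC + b"fake-body" + PARQUET_MAGIC)
    # The manifest is plain text: one data file path per line.
    manifest_path = os.path.join(tmpdir, "manifest")
    with open(manifest_path, "w", newline="\n") as f:
        f.write(data_path + "\n")
    return manifest_path


with tempfile.TemporaryDirectory() as tmpdir:
    manifest = make_demo(tmpdir)
    # Treating the manifest itself as Parquet fails the magic-number check.
    manifest_ok = tail_magic(manifest) == PARQUET_MAGIC
    # Resolving the manifest first and checking its targets succeeds.
    targets_ok = [tail_magic(p) == PARQUET_MAGIC for p in resolve_manifest(manifest)]
    print(manifest_ok, targets_ok)  # False [True]
```

In this model, the manifest's last four bytes are `uet\n`, the same bytes reported in the error message.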

@faycal-merouane
Author

@tdas Thanks a lot for the clarification. I just deployed Presto and tested it, and it works like a charm.
