Load profile.json exception #455

Open
cic1988 opened this issue Feb 5, 2024 · 1 comment

Comments

cic1988 commented Feb 5, 2024

Hello experts,

I followed the protocol example to build a reference server. The server generates a presigned URL when the table/query endpoint is called.

Assume my table_url is profile.json#share.schema.table.

Calling df = delta_sharing.load_as_pandas(table_url, limit=3) loads the data correctly, but load_as_spark fails.

I used the following code:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Delta Share Demo") \
    .config('spark.jars', 'packages/haddop-azure-3.3.6.jar,packages/delta-sharing-spark_2.12-0.6.4.jar') \
    .getOrCreate()

...

import delta_sharing
df = delta_sharing.load_as_spark(table_url)
df.limit(2).select("path").show()

The error is:

java.lang.RuntimeException: delta-sharing:/profile.json%23share.schema.table/123/25169076 is not a Parquet file. Expected magic number at tail, but found [0, 20, 14, 55]

Have you seen the error before?
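
For context on the message above: an unencrypted Parquet file starts and ends with the 4-byte magic number PAR1, and the reader is reporting that the last four bytes it saw were something else. One way to narrow this down is to download a single file through a presigned URL returned by the table/query endpoint and inspect its first and last bytes locally. The following is only a minimal sketch of such a check; the function names and the local file name are hypothetical and not part of delta_sharing, and passing the check does not by itself explain why load_as_spark fails.

```python
import os

# Magic number that opens and closes an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def parquet_magic_report(path):
    """Return (head, tail): the first and last 4 bytes of the file at `path`."""
    size = os.path.getsize(path)
    if size < 8:
        raise ValueError(f"{path} is too small ({size} bytes) to be a Parquet file")
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes of the file
        tail = f.read(4)
    return head, tail


def looks_like_parquet(path):
    """True if the file both starts and ends with the Parquet magic number."""
    head, tail = parquet_magic_report(path)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, after saving one presigned URL's content to a local file (here assumed to be named part.parquet), looks_like_parquet("part.parquet") returning False would suggest the server is handing out something other than a complete Parquet file, while True would point toward the Spark-side read path instead.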

andyl-db reopened this Feb 28, 2024

linzhou-db (Collaborator) commented

@cic1988 Sorry, I haven't seen this before.
Is this still happening?
Do you have a full stack trace?
