ibnt1
I'm trying to read data from Azure Blob Storage using Spark in Databricks.
Code:
RELEASE = '2024-09-18.0'
theme = 'base'
type = 'water'
subtype = 'water'
classes = ['water', 'tidal']
classes_str = "('{}')".format("','".join(classes))
query = """SELECT * FROM parquet.`https://overturemapswestus2.blob.core.windows.net/release/{}/theme={}/type={}/` WHERE subType='{}' AND class IN {};""".format(RELEASE, theme, type, subtype, classes_str)
result_df = spark.sql(query)
I'm getting this error:
[[DELTA_INVALID_FORMAT](https://docs.microsoft.com/azure/databricks/error-messages/error-classes#delta_invalid_format)] Incompatible format detected.
A transaction log for Delta was found at https://overturemapswestus2.blob.core.windows.net/release/2024-09-18.0/theme=base/type=water/part-00000-284b06bd-9385-4936-a4bb-71a4a6df08ac-c000.zstd.parquet/_delta_log, but you are trying to read from https://overturemapswestus2.blob.core.windows.net/release/2024-09-18.0/theme=base/type=water/part-00000-284b06bd-9385-4936-a4bb-71a4a6df08ac-c000.zstd.parquet using format("parquet"). You must use 'format("delta")' when reading and writing to a delta table.
Querying the same data from Amazon S3 storage works:
RELEASE = '2024-09-18.0'
theme = 'base'
type = 'water'
subtype = 'water'
classes = ['water', 'tidal']
classes_str = "('{}')".format("','".join(classes))
query = """SELECT * FROM parquet.`https://overturemapswestus2.blob.core.windows.net/{}/theme={}/type={}/` WHERE subType='{}' AND class IN {};""".format(RELEASE, theme, type, subtype, classes_str)
result_df = spark.sql(query)
display(result_df)
What is the right way to query this data from Azure Blob Storage using Spark?
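One variation that may be worth trying, offered as an untested sketch and not a confirmed fix: point Spark at the container through a `wasbs://` URI instead of the `https://` URL. The container name `release` is an assumption taken from the path in the error message, and the helper names `overture_wasbs_path` and `build_query` are hypothetical. The sketch only builds the path and the SQL string, so it runs without a cluster.

```python
def overture_wasbs_path(release, theme, feature_type,
                        account="overturemapswestus2", container="release"):
    """Build a wasbs:// URI for one theme/type partition.

    Assumption: the data lives in a container named `release`, as the
    https path in the error message suggests.
    """
    return ("wasbs://{}@{}.blob.core.windows.net/{}/theme={}/type={}/"
            .format(container, account, release, theme, feature_type))


def build_query(path, subtype, classes):
    """Build a Spark SQL query over a Parquet path (backtick-quoted)."""
    classes_str = ", ".join("'{}'".format(c) for c in classes)
    return ("SELECT * FROM parquet.`{}` WHERE subtype = '{}' AND class IN ({})"
            .format(path, subtype, classes_str))


path = overture_wasbs_path('2024-09-18.0', 'base', 'water')
query = build_query(path, 'water', ['water', 'tidal'])
print(query)

# In a Databricks notebook, where `spark` and `display` are predefined:
# result_df = spark.sql(query)
# display(result_df)
```

Whether this avoids the DELTA_INVALID_FORMAT check is exactly what I'd like to confirm.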