Feat!: Add new table_format property alongside storage_format #3175
Context: #3154 (comment)
Currently, the `storage_format` property is a bit overloaded. It's defined as setting the on-disk file format (e.g. `parquet`, `orc`), which makes sense, but it's also been overloaded in various places to include table formats such as `iceberg`.

This PR introduces a `table_format` property to clearly separate table formats from storage formats.

The problem with the current setup of a single property is that it makes it difficult to describe concepts like "Iceberg + ORC", "Iceberg + Parquet", "Hive + Ion", etc.
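
To make the limitation concrete, here is a minimal Python sketch. It is not SQLMesh code: `TableProperties` and `describe_table` are hypothetical names used only for illustration. It shows that one overloaded slot cannot hold both a table format and a file format, while two slots can.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableProperties:
    # Hypothetical container for illustration; not a SQLMesh class.
    storage_format: Optional[str] = None  # on-disk file format, e.g. "parquet", "orc"
    table_format: Optional[str] = None    # table format, e.g. "hive", "iceberg"


def describe_table(props: TableProperties) -> str:
    """Render a 'table format + file format' label for the given properties."""
    table = props.table_format or "unspecified table format"
    storage = props.storage_format or "unspecified file format"
    return f"{table} + {storage}"


# Single overloaded property: "iceberg" occupies the only slot,
# so there is nowhere left to say that the files are ORC.
overloaded = TableProperties(storage_format="iceberg")

# Two properties: each concept has its own slot.
iceberg_orc = TableProperties(table_format="iceberg", storage_format="orc")
hive_parquet = TableProperties(table_format="hive", storage_format="parquet")

for props in (overloaded, iceberg_orc, hive_parquet):
    print(describe_table(props))
```

With the overloaded form, the label degrades to "unspecified table format + iceberg", which is exactly the ambiguity this PR is trying to remove.
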
Since `orc` and `parquet` are already valid values for `storage_format`, `table_format` is a new property to hold the table format (e.g. `hive` or `iceberg`). To keep backwards compatibility, it's opt-in, so engine adapters need to explicitly take advantage of it.

Implementing this is a precursor for compatibility with models created using dbt-athena, which cleanly separates these concepts.
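
The opt-in behaviour could look roughly like the sketch below. This is an assumption about the general shape, not the actual SQLMesh adapter API: `BaseAdapter`, `TableFormatAwareAdapter`, and `table_properties` are hypothetical names. The point is only that an adapter which does not opt in keeps behaving as before, even when `table_format` is supplied.

```python
from typing import Dict, Optional


class BaseAdapter:
    """Hypothetical base adapter: accepts table_format but ignores it."""

    def table_properties(
        self,
        storage_format: Optional[str] = None,
        table_format: Optional[str] = None,
    ) -> Dict[str, str]:
        # Existing behaviour: only the file format is passed through.
        props: Dict[str, str] = {}
        if storage_format:
            props["storage_format"] = storage_format
        return props


class TableFormatAwareAdapter(BaseAdapter):
    """Hypothetical adapter that explicitly opts in to table_format."""

    def table_properties(
        self,
        storage_format: Optional[str] = None,
        table_format: Optional[str] = None,
    ) -> Dict[str, str]:
        props: Dict[str, str] = {}
        if table_format:
            props["table_format"] = table_format
        # Reuse the base behaviour for the file format.
        props.update(super().table_properties(storage_format=storage_format))
        return props


legacy = BaseAdapter().table_properties("orc", "iceberg")
opted_in = TableFormatAwareAdapter().table_properties("orc", "iceberg")
print(legacy)
print(opted_in)
```

Keeping the default in the base class means existing adapters need no changes, and each engine can adopt the new property when its DDL actually supports it.
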
It also means that any users who may have gone "all in" on a format like Iceberg/ORC can use SQLMesh to create tables that are compatible with their existing downstream consumers. Currently, SQLMesh assumes Parquet when `storage_format=iceberg`