Fix!: bump sqlglot to v25.29.0, fix info schema view handling in bigquery #3332

georgesittas · 2024-11-05T20:00:12Z

This PR enables SQLMesh to correctly handle references to BigQuery information schema views. Until now, the main problem was that a fully-qualified reference to such a view comprises 4 identifiers:

project.dataset_or_region.INFORMATION_SCHEMA.SOME_VIEW

This means that we'd end up with Table references of mixed nesting depth, because model names, for example, comprise only 3 identifiers:

project.dataset.model_name

Mixing multiple nesting levels in table references is prohibited by SQLGlot's schema module [1, 2], in order to avoid ambiguity issues. The workaround chosen here is to represent information schema views using 3 identifiers at parse time, for BigQuery only. Based on my investigation, other engines don't allow more than 3 identifiers in their table references.
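To make the workaround concrete, here is a minimal sketch of the idea of folding `INFORMATION_SCHEMA` and the view name into one part so that the reference has 3 parts instead of 4. This is not SQLGlot's actual parser code: the helper name `split_bigquery_reference` is hypothetical, and it assumes plain dot-separated, unquoted names.

```python
from typing import List

INFO_SCHEMA = "INFORMATION_SCHEMA"


def split_bigquery_reference(name: str) -> List[str]:
    """Split a dotted table name into parts (hypothetical helper).

    INFORMATION_SCHEMA and the part that follows it are merged into a
    single part, so a 4-part information schema reference becomes 3 parts.
    Assumes unquoted names with no dots inside identifiers.
    """
    parts = name.split(".")
    merged: List[str] = []
    i = 0
    while i < len(parts):
        part = parts[i]
        if part.upper() == INFO_SCHEMA and i + 1 < len(parts):
            # Fold "INFORMATION_SCHEMA" and the view name into one part.
            merged.append(f"{part}.{parts[i + 1]}")
            i += 2
        else:
            merged.append(part)
            i += 1
    return merged


view_parts = split_bigquery_reference(
    "project.dataset_or_region.INFORMATION_SCHEMA.SOME_VIEW"
)
model_parts = split_bigquery_reference("project.dataset.model_name")

# Both references now have the same nesting depth.
print(view_parts, len(view_parts))
print(model_parts, len(model_parts))
```

With this representation, the information schema view and a regular model name both resolve to 3 parts, which is the property the schema module needs.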

I went with this approach because making the schema module more lenient, i.e. allowing multiple nesting depths, turned out to be quite complex. Several places rely on the invariant that all table references have the same depth, and the scope of the current approach seemed much smaller in comparison.
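As a rough illustration of the depth invariant mentioned above, the sketch below models a schema that rejects table references whose number of parts differs from the first one registered. The class `UniformDepthSchema` is hypothetical and only mimics the constraint; it is not SQLGlot's schema module.

```python
from typing import Dict, Optional, Sequence, Tuple


class UniformDepthSchema:
    """Toy schema that requires every table reference to have the same depth."""

    def __init__(self) -> None:
        self._depth: Optional[int] = None
        self._tables: Dict[Tuple[str, ...], Dict[str, str]] = {}

    def add_table(self, parts: Sequence[str], columns: Dict[str, str]) -> None:
        if self._depth is None:
            # The first table registered fixes the nesting depth.
            self._depth = len(parts)
        elif len(parts) != self._depth:
            raise ValueError(
                f"Expected {self._depth} parts, got {len(parts)}: {'.'.join(parts)}"
            )
        self._tables[tuple(parts)] = dict(columns)

    def __len__(self) -> int:
        return len(self._tables)


schema = UniformDepthSchema()
schema.add_table(["project", "dataset", "model_name"], {"id": "INT64"})

try:
    # A 4-part information schema reference violates the invariant.
    schema.add_table(
        ["project", "dataset", "INFORMATION_SCHEMA", "SOME_VIEW"], {"x": "STRING"}
    )
    mixed_depth_rejected = False
except ValueError:
    mixed_depth_rejected = True

# The same view expressed with 3 parts is accepted.
schema.add_table(
    ["project", "dataset", "INFORMATION_SCHEMA.SOME_VIEW"], {"x": "STRING"}
)
```

Relaxing a check like this would mean revisiting every lookup that assumes a fixed depth, which is why normalizing the reference at parse time is the smaller change.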

One thing we'll need to be careful about is that parsing BigQuery's information schema views without specifying the dialect can result in an incorrect AST representation. In that case, the 4-identifier example above is represented using a Dot, instead of the last two parts being merged into a single Identifier as is done in BigQuery's parser.

For additional context, please refer to:

erindru

I'm.. amazed this works

georgesittas requested review from tobymao and a team November 5, 2024 20:00

erindru approved these changes Nov 5, 2024

izeigerman approved these changes Nov 5, 2024

Fix!: bump sqlglot to v25.29.0, fix info schema view handling in bigquery

a215ac0

georgesittas force-pushed the jo/bump_sqlglot_to_v25_29_0 branch from 01be9d6 to a215ac0 Compare November 6, 2024 07:46

georgesittas merged commit 6af38f6 into main Nov 6, 2024
23 checks passed

georgesittas deleted the jo/bump_sqlglot_to_v25_29_0 branch November 6, 2024 09:05
