Support bq legacySQL queries, to access partition metadata #2552

Closed
jtcohen6 opened this issue Jun 16, 2020 · 2 comments · Fixed by #2596
Labels
bigquery enhancement New feature or request

Comments

@jtcohen6
Contributor

jtcohen6 commented Jun 16, 2020

Describe the feature

Allow dbt to run Legacy SQL queries to access older BigQuery features that are not yet supported in Standard SQL.

The required change is the ability to dynamically override job_params in raw_execute. Ideally, run_query/statement would accept an additional argument, such as:

  • use_legacy_sql: true
  • job_params: {"use_legacy_sql": true}
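
A minimal sketch of what such an override could look like, assuming the adapter keeps a default set of job parameters with Standard SQL enabled. The names `DEFAULT_JOB_PARAMS` and `resolve_job_params` are hypothetical and are not taken from dbt's code; they only illustrate layering a per-query override on top of adapter defaults.

```python
from typing import Any, Dict, Optional

# Assumption: the adapter's default is Standard SQL (use_legacy_sql = False).
DEFAULT_JOB_PARAMS: Dict[str, Any] = {"use_legacy_sql": False}


def resolve_job_params(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the default job params with any per-query overrides applied on top."""
    params = dict(DEFAULT_JOB_PARAMS)  # copy so the shared defaults are never mutated
    params.update(overrides or {})
    return params


# A caller that wants Legacy SQL for one query passes only that key.
legacy_params = resolve_job_params({"use_legacy_sql": True})
standard_params = resolve_job_params()
```

The resulting dict would then presumably be handed to whatever builds the BigQuery job configuration inside raw_execute; the google-cloud-bigquery `QueryJobConfig` class does expose a `use_legacy_sql` property for this purpose.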

Specific use case

There is a compelling and free (zero-byte) way to access partition metadata, but it's only available in Legacy SQL:

#legacySQL
select * from [bigquery-public-data:wikipedia.pageviews_2020$__PARTITIONS_SUMMARY__]

This offers substantial savings over the Standard SQL query to get the latest partition value—which is a big chunk of the overhead in the dynamic insert_overwrite incremental strategy.

Describe alternatives you've considered

Wait for BigQuery to release a zero-cost way of accessing partition metadata from Standard SQL. The signs of that happening soon aren't promising, but I'm not a huge fan of adding new support for legacy functionality.

Who will this benefit?

BigQuery users with large partitioned tables

@jtcohen6 jtcohen6 added enhancement New feature or request bigquery labels Jun 16, 2020
@ran-eh
Contributor

ran-eh commented Jun 16, 2020

@jtcohen6 It may be overkill to add a full-blown feature for this: changing the signatures of run_query/statement, handling the new parameter as an error for non-BQ adapters, etc. It may be quicker (and dirtier) to have the BQ adapter look for #legacySQL at the top of the query text.
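
For illustration, the prefix check suggested here could be sketched roughly as follows. The function name `uses_legacy_sql` is hypothetical, and the matching rules (case-insensitive, leading whitespace allowed) are an assumption modeled on how BigQuery itself treats the `#legacySQL` directive.

```python
import re

# Matches a "#legacySQL" directive at the very start of the query text,
# ignoring leading whitespace and letter case.
_LEGACY_PREFIX = re.compile(r"^\s*#legacysql\b", re.IGNORECASE)


def uses_legacy_sql(sql: str) -> bool:
    """Return True if the query text opens with a #legacySQL directive."""
    return _LEGACY_PREFIX.match(sql) is not None
```

The adapter could then set `use_legacy_sql` from this result without changing the signature of run_query/statement.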

@drewbanin
Contributor

@jtcohen6 my instinct here is that we should not add new general-purpose support for legacy sql. If we want to make some helper adapter function that gets the zero-cost partitions for a table, we can certainly do that, but the more we can firewall this functionality from the rest of the plugin, the better IMO!
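
As a rough sketch of the narrower helper described here, the Legacy SQL query from the issue description could be generated by one small function that is the only place the legacy syntax appears. The name `partitions_summary_query` and its arguments are hypothetical; this is not the implementation that was eventually merged.

```python
def partitions_summary_query(project: str, dataset: str, table: str) -> str:
    """Build the Legacy SQL query that reads a table's partition summary."""
    # Legacy SQL addresses tables as [project:dataset.table]; the
    # $__PARTITIONS_SUMMARY__ suffix selects the partition metadata.
    return (
        "#legacySQL\n"
        f"select * from [{project}:{dataset}.{table}$__PARTITIONS_SUMMARY__]"
    )


query = partitions_summary_query("bigquery-public-data", "wikipedia", "pageviews_2020")
```

Keeping the query construction in one helper means the rest of the plugin never needs to know that Legacy SQL is involved.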

@jtcohen6 jtcohen6 linked a pull request Jul 7, 2020 that will close this issue

3 participants