[CT-1322] [Feature] Enable setting execution_project in the dbt_project.yml #343

Closed
3 tasks done
matt-winkler opened this issue Oct 10, 2022 · 4 comments

Comments

@matt-winkler
Contributor

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Conceptually similar to how we let end users override virtual warehouses at the folder or model level for Snowflake, would it be possible to set the execution_project for BigQuery in dbt_project.yml?

Seems related to #67

Describe alternatives you've considered

Multiple target profiles in dbt-core, multiple projects in dbt Cloud.

Who will this benefit?

BigQuery users who want more fine-grained control over the billing resources associated with runs.

Are you interested in contributing this feature?

Yes

Anything else?

No response

@github-actions github-actions bot changed the title [Feature] Enable setting execution_project in the dbt_project.yml [CT-1322] [Feature] Enable setting execution_project in the dbt_project.yml Oct 10, 2022
@dbeatty10
Contributor

@matt-winkler thanks for proposing this feature! Can definitely see the value in fine-grained control over billing resources💰

TL;DR

I'm not aware of a simple way we could implement the proposed feature.

Details

From what I can tell, the current execution_project configuration within profiles.yml takes effect only at connection time and applies to all queries that use that connection. The value is ultimately passed as a parameter to google.cloud.bigquery.Client() when creating a client connection (see the BigQuery docs).
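As a rough illustration of that connection-time behavior, the sketch below resolves a single billing project for a profile target and passes it to `google.cloud.bigquery.Client(project=...)`. The helper names (`resolve_execution_project`, `open_client`) and the dict-shaped target are hypothetical, not dbt-bigquery's actual code, and the fallback to `project` when `execution_project` is unset is an assumption.

```python
from typing import Any, Mapping


def resolve_execution_project(target: Mapping[str, Any]) -> str:
    """Pick the project that queries are billed to for one profile target.

    Assumption: fall back to the target's `project` when
    `execution_project` is not set.
    """
    return target.get("execution_project") or target["project"]


def open_client(target: Mapping[str, Any]):
    """Create a BigQuery client bound to the resolved project.

    The project is chosen here, at connection time, so queries sent
    through the returned client use it by default.
    """
    # Imported lazily so the resolution logic above can be exercised
    # without the google-cloud-bigquery package installed.
    from google.cloud import bigquery

    return bigquery.Client(project=resolve_execution_project(target))


# Hypothetical target, shaped like one entry in profiles.yml.
target = {"project": "my_data_project", "execution_project": "my_billing_project"}
print(resolve_execution_project(target))
```

Nothing in this flow sees an individual model, which is why a model-level setting has no obvious place to hook in.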

For dbt-snowflake, the warehouse can likewise be specified at connection time, via the warehouse set in profiles.yml.

In contrast, though, it can also be customized on a per-model basis (using the snowflake_warehouse config). It will execute a query like this as a pre-hook of each model:

use warehouse my_purple_tshirt
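A simplified sketch of that pattern, assuming a hypothetical `statements_for_model` helper (this is not dbt-snowflake's implementation, and it ignores switching back to the original warehouse afterwards):

```python
from typing import List, Optional


def statements_for_model(
    model_sql: str, snowflake_warehouse: Optional[str]
) -> List[str]:
    """Return the statements to run for one model, in order."""
    statements = []
    if snowflake_warehouse:
        # Switch warehouses on the existing session; no reconnect needed.
        statements.append(f"use warehouse {snowflake_warehouse}")
    statements.append(model_sql)
    return statements


print(statements_for_model("create table t as select 1", "my_purple_tshirt"))
```

The point is that the override is just another SQL statement on the same connection.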

To take the same approach in dbt-bigquery, we'd need an analogous query to execute. Do you know of anything like the following in BigQuery?

use project my_billing_project

If not, I don't know a direct approach to implement this.

@dbeatty10
Contributor

Maybe something like this?

SET @@dataset_project_id = 'MyProject';

@jtcohen6
Contributor

This is a similar ask to the one made in databricks/dbt-databricks#59. We don't really* expose node-level information when creating / reusing connections. We'd need that in order to pull an execution_project config off the node, and use it to instantiate a different connection.

*We sorta do this, in order to support query comments/headers containing node-specific info, but that code is pretty gnarly.

@dbeatty10 As you mention, Snowflake lets us cheat around this via the use warehouse meta-query, without actually needing to alter the connection itself.
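To make the connection-per-node idea concrete, here is a minimal sketch of what "pull an execution_project config off the node, and use it to instantiate a different connection" could look like. `ClientPool`, `client_for`, and the dict-shaped node config are all hypothetical names for illustration; dbt's connection manager does not work this way today, and the factory is a stand-in so the example runs without BigQuery credentials.

```python
from typing import Any, Callable, Dict, Mapping, Optional


class ClientPool:
    """Hand out one client per execution project, creating them on demand."""

    def __init__(self, default_project: str, factory: Callable[[str], Any]):
        self.default_project = default_project
        # In real use this might be: lambda p: bigquery.Client(project=p)
        self._factory = factory
        self._clients: Dict[str, Any] = {}

    def client_for(self, node_config: Optional[Mapping[str, Any]] = None) -> Any:
        # A node-level override wins; otherwise use the profile-level default.
        project = (node_config or {}).get("execution_project") or self.default_project
        if project not in self._clients:
            self._clients[project] = self._factory(project)
        return self._clients[project]


# Stand-in factory that returns a plain dict instead of a real client.
pool = ClientPool("my_data_project", factory=lambda p: {"project": p})
default_client = pool.client_for()
billing_client = pool.client_for({"execution_project": "my_billing_project"})
print(default_client, billing_client)
```

The hard part is not the cache itself but getting the node's config to the place where connections are opened.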

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
