BigQuery check if schema already exists before trying to create it #220

Merged
dataders merged 2 commits into dbt-labs:main on Aug 30, 2023

Conversation

Contributor

@thomas-vl thomas-vl commented Jun 21, 2023

Description & motivation

Before trying to create a new dataset, we should check whether it already exists without using the CREATE statement.
The dbt service account (SA) or dbt user that runs the command may not have permission to create a dataset.
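
The idea above can be sketched in Python as follows. This is only an illustration of the check-then-create flow, not the macro code from this PR: `schema_exists_sql`, `ensure_schema`, and the `run_query` callable are hypothetical names, and the sketch assumes the caller can read `INFORMATION_SCHEMA.SCHEMATA` in the target project (a region qualifier may be needed depending on where the dataset lives).

```python
from typing import Callable, List, Tuple

# Hypothetical type: executes one SQL string and returns rows as tuples.
RunQuery = Callable[[str], List[Tuple]]


def schema_exists_sql(project: str, schema: str) -> str:
    """Build a read-only lookup against INFORMATION_SCHEMA.SCHEMATA."""
    return (
        "select count(*) as n "
        f"from `{project}`.INFORMATION_SCHEMA.SCHEMATA "
        f"where schema_name = '{schema}'"
    )


def ensure_schema(project: str, schema: str, run_query: RunQuery) -> bool:
    """Issue CREATE SCHEMA only when the lookup reports the dataset missing.

    Returns True if a CREATE statement was sent, False if it was skipped.
    """
    rows = run_query(schema_exists_sql(project, schema))
    if rows and rows[0][0] > 0:
        # Dataset already exists: no create permission is needed.
        return False
    run_query(f"create schema if not exists `{project}`.`{schema}`")
    return True
```

With this shape, a user who can create tables but not datasets only hits the CREATE statement when the dataset is really missing, which is the case where a permission error is the correct outcome.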

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

@thomas-vl thomas-vl requested a review from jeremyyeo as a code owner June 21, 2023 19:41
@JCZuurmond
Contributor

@thomas-vl: Do we have a test that covers this scenario, where a schema exists and dbt does not try to create it again?

@JCZuurmond
Contributor

@dataders : could you merge this PR?

I discussed testing this with @thomas-vl; however, we would need a CI user that has permission to create tables but not schemas. This is the situation Thomas is facing, though it is less likely in a testing setup. The change cannot be covered with an additional (unit) test.

@Fleid

Fleid commented Jul 24, 2023

@dataders brought this up at our latest triage review, and it looks good to go! :)

@dataders dataders self-assigned this Jul 24, 2023
@jackwelty

I would add that not all users necessarily have access to INFORMATION_SCHEMA.SCHEMATA, especially if they don't have permission to create schemas. Is there a way to handle this without that access?

@thomas-vl
Contributor Author

> I would add that not all users necessarily have access to INFORMATION_SCHEMA.SCHEMATA, especially if they don't have permission to create schemas. Is there a way to handle this without that access?

I think that everyone in security will have no problem granting read access to INFORMATION_SCHEMA rather than giving out create access for datasets.

Collaborator

@dataders dataders left a comment

I'm approving this as it's been in use in the field for quite some time.

@dataders dataders merged commit 9a4d4da into dbt-labs:main Aug 30, 2023
jarno-r pushed a commit to hurtigruten/dbt-external-tables that referenced this pull request Jan 2, 2024
…bt-labs#220)

* fix check schema

* fix when we do not need to update the schema

---------

Co-authored-by: Thomas van Latum <tvanlatum@sligro.nl>

5 participants