Make schema creation optional (at least on BigQuery) #304

Closed
nicolas-gaillard opened this issue Jun 10, 2024 · 1 comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request

Comments

@nicolas-gaillard

Describe the feature

I think the main objective of this package is to create external tables, so it should probably not handle schema creation (or should at least make it optional). It is common practice to manage databases and schemas with an infrastructure-as-code tool such as Terraform.

On BigQuery, this is really painful right now: as a first step, the package queries the INFORMATION_SCHEMA.SCHEMATA table, which requires permissions on the whole GCP project. I think that goes against the principle of least privilege.

Describe alternatives you've considered

Maybe we could add a parameter that controls schema creation. It would default to True so that the current behavior does not change.
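To make the idea concrete, here is a rough sketch in plain Python of how such a flag could gate the schema step. It is not the package's actual code: `ExternalSource`, `plan_statements`, and the `create_schema` parameter are hypothetical names, and the SQL strings are illustrative placeholders only.

```python
from dataclasses import dataclass


@dataclass
class ExternalSource:
    """Hypothetical stand-in for one external table definition."""
    schema: str
    table: str


def plan_statements(sources, create_schema=True):
    """Return the statements a run would issue, in order.

    create_schema is a hypothetical flag; it defaults to True so the
    existing behavior is kept unless a user opts out.
    """
    statements = []
    if create_schema:
        # Schema step: one statement per distinct target schema.
        for schema in sorted({s.schema for s in sources}):
            statements.append(f"create schema if not exists {schema}")
    for s in sources:
        # Placeholder for the external table DDL (illustrative only).
        statements.append(f"create external table {s.schema}.{s.table}")
    return statements
```

With `create_schema=False`, the plan contains only the external table statements, so nothing in the run would need to look up or create schemas.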

Additional context

I don't think this is a database-specific feature, but to be honest, I don't know whether databases other than BigQuery have the same problem. Is this the kind of feature we can develop for only one provider?

Who will this benefit?

It would improve security and bring the package's behavior more in line with its README.

@nicolas-gaillard nicolas-gaillard added enhancement New feature or request triage labels Jun 10, 2024
@jeremyyeo jeremyyeo added duplicate This issue or pull request already exists and removed triage labels Jun 11, 2024
@jeremyyeo
Collaborator

This is an interesting one. If you want some context on how this package ended up being able to create schemas, see issue #100, where there was quite a lot of support for being able to create schemas on the fly.

I think in hindsight, one would perhaps have added this feature and made it toggleable, e.g. create_schemas_if_not_exist: True.

At the same time, there is this other related issue: #222, and I think this comment in particular: #222 (comment). It's worth keeping the discussion in that issue imho.
