support clustering on BigQuery #918

Closed
drewbanin opened this issue Aug 10, 2018 · 3 comments
Labels: bigquery, enhancement

Comments

@drewbanin
Contributor

Feature

Feature description

BigQuery recently announced support for table clustering. With this new feature, a list of up to four "clustering columns" can be supplied to a create table as (...) statement. BigQuery uses the clustering columns to colocate similar data. This should speed up filters and aggregations: BigQuery can skip scanning whole blocks of data and can avoid shuffling data between nodes for certain aggregations. Docs: https://cloud.google.com/bigquery/docs/creating-clustered-tables

Who will this benefit?

This feature will benefit users of BigQuery. It will work in concert with table partitioning.

Proposed Implementation

This feature can be implemented in the BigQuery implementation of create_table_as(). Its implementation will look similar to the partition_by implementation.
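
To make the proposal concrete, here is a rough Python sketch of how the clustering clause might be rendered next to the partitioning clause. It is only an illustration: the real create_table_as() is a dbt macro, and the function name, parameters, and relation names below are assumptions made for this example, not the actual implementation.

```python
def create_table_as_sql(relation, select_sql, partition_by=None, cluster_by=None):
    """Render a BigQuery `create table ... as` statement (hypothetical helper).

    partition_by: a partitioning expression as a string, e.g. "date(created_at)".
    cluster_by:   a list of column names to cluster on.
    """
    # The feature description above mentions a limit of four clustering columns.
    if cluster_by is not None and len(cluster_by) > 4:
        raise ValueError("BigQuery accepts at most four clustering columns")

    parts = [f"create or replace table {relation}"]
    if partition_by:
        parts.append(f"partition by {partition_by}")
    if cluster_by:
        # The cluster by clause follows partition by and precedes the select.
        parts.append("cluster by " + ", ".join(cluster_by))
    parts.append(f"as (\n{select_sql}\n)")
    return "\n".join(parts)


sql = create_table_as_sql(
    "analytics.events",              # hypothetical dataset.table
    "select * from raw.events",      # hypothetical model SQL
    partition_by="date(created_at)",
    cluster_by=["user_id", "event_type"],
)
print(sql)
```

The clustering clause is emitted only when cluster_by is set, so existing models that configure just partition_by would render the same statement as before.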

@beckjake beckjake self-assigned this Sep 6, 2018
@beckjake
Contributor

beckjake commented Sep 6, 2018

cluster_by can't be used without partition_by. Should we enforce that in the macro, or let the database return the error when a user doesn't specify both?
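
For reference, enforcing this on the dbt side might look roughly like the Python sketch below. The function name and error message are hypothetical, and the actual check would live in the macro rather than in Python.

```python
def validate_clustering_config(partition_by=None, cluster_by=None):
    """Reject a config that sets cluster_by without partition_by (hypothetical check)."""
    if cluster_by and not partition_by:
        raise ValueError("cluster_by requires partition_by to be set")


# Passes silently: both options are provided.
validate_clustering_config(partition_by="date(created_at)", cluster_by=["user_id"])
```

The alternative is to skip this check and surface whatever error BigQuery returns when the statement runs.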

@drewbanin
Contributor Author

Yeah, I'm fine with letting BQ return an error in that scenario

beckjake added a commit that referenced this issue Sep 11, 2018
@beckjake
Contributor

Fixed in #978

@drewbanin drewbanin added this to the 0.11.1 - Lucretia Mott milestone Sep 19, 2018