Added support for BigQuery relation renaming #2521

Merged
merged 6 commits into from
Jun 10, 2020

Conversation

Contributor

@azhard azhard commented Jun 9, 2020

resolves #2520

Description

Implements functionality to rename BigQuery relations using dbt.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jun 9, 2020
Contributor

@beckjake beckjake left a comment

Hey @azhard, thanks for contributing! This looks great, I have some minor feedback on ordering and some extra tests but this seems good.

Also, could you please add an integration test of some sort? The 054_adapter_methods test folder is probably appropriate. I'd hate to have this regress.

    raise dbt.exceptions.NotImplementedException(
        '`rename_relation` is not implemented for this adapter!'
    )
    self.cache_renamed(from_relation, to_relation)
Contributor

Can you move this down to just before the copy/delete calls? I think we want to avoid updating the cache until we're just about to do it, if possible.

        from_relation.schema,
        from_relation.identifier,
        conn)
    from_table = client.get_table(from_table_ref)
Contributor

Should this also ensure that from_relation.type and to_relation.type are the same type? And that to_relation.type is not RelationType.View?
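
The two review suggestions above (validate the relation types, and update the cache only just before the copy/delete calls) could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `Relation`, `FakeClient`, and this `rename_relation` are hypothetical stand-ins, and the real adapter works with dbt's relation objects and the BigQuery client instead.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    TABLE = "table"
    VIEW = "view"


@dataclass(frozen=True)
class Relation:
    # Hypothetical, simplified relation; not dbt's actual class.
    schema: str
    identifier: str
    type: RelationType


class FakeClient:
    """Hypothetical in-memory stand-in for a warehouse client."""

    def __init__(self, tables):
        self.tables = set(tables)

    def copy_table(self, src, dst):
        self.tables.add(dst)

    def delete_table(self, name):
        self.tables.discard(name)


def rename_relation(client, cache, from_relation, to_relation):
    # Validate first, so nothing is touched on bad input.
    if from_relation.type != to_relation.type:
        raise ValueError("from_relation and to_relation must have the same type")
    if to_relation.type == RelationType.VIEW:
        raise ValueError("renaming views is not supported")

    src = f"{from_relation.schema}.{from_relation.identifier}"
    dst = f"{to_relation.schema}.{to_relation.identifier}"

    # Update the cache only just before the copy/delete calls.
    cache.discard(src)
    cache.add(dst)
    client.copy_table(src, dst)
    client.delete_table(src)
```

Because the checks run before anything else, a rejected rename leaves both the cache and the warehouse untouched.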

@azhard
Contributor Author

azhard commented Jun 9, 2020

> Hey @azhard, thanks for contributing! This looks great, I have some minor feedback on ordering and some extra tests but this seems good.
>
> Also, could you please add an integration test of some sort? The 054_adapter_methods test folder is probably appropriate. I'd hate to have this regress.

Thanks for the feedback @beckjake. For the integration test, can you give me some more guidance on what exactly I should be testing and how? For example, am I testing that the rename functionality works across all adapters, and if so, how do I "confirm" that a rename took place?

Or, alternatively, am I testing the BigQuery-specific rename (e.g. copy and delete happen, doesn't work on views, etc.)? If that's the case, I see similar testing done in the test_bigquery_adapter.py file, where I can use the mock connection, so that is a little clearer to me.

@beckjake
Contributor

beckjake commented Jun 9, 2020

> So for example am I testing that the rename functionality works across all adapters and if that's correct, how do I "confirm" that a rename took place?

You can just test bigquery for this, since that's all you've added! Here's a test I imagined:

  • a macro (rename_named_relation or whatever) that builds a from_relation/to_relation based on some text parameters and then calls {{ rename_relation(...) }} with those relations
  • a seed (call it my_seed) with anything at all
  • a source that refers to a table named something different (renamed_seed or whatever)
  • a model that does select * from {{ source('my_source', 'renamed_seed') }}
  • a test that runs dbt seed, dbt run-operation rename_named_relation --args '{from_name: my_seed, to_name: renamed_seed}', dbt run. None of those should error.

Does that seem reasonable?
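
The command sequence in the last bullet could be sketched like this. It is only an illustration: `rename_test_commands` is a hypothetical helper, and the actual integration test would run these steps through dbt's own test harness rather than by printing shell commands.

```python
import json
import shlex


def rename_test_commands(from_name, to_name):
    """Return the dbt invocations for the proposed test, in order."""
    # JSON is a subset of YAML, so json.dumps yields a valid --args value.
    args = json.dumps({"from_name": from_name, "to_name": to_name})
    return [
        ["dbt", "seed"],
        ["dbt", "run-operation", "rename_named_relation", "--args", args],
        ["dbt", "run"],
    ]


for command in rename_test_commands("my_seed", "renamed_seed"):
    # shlex.join quotes the --args payload so it survives a shell.
    print(shlex.join(command))
```

Building the `--args` value programmatically avoids hand-written nested quotes, which are easy to get wrong on the command line.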

@beckjake
Contributor

beckjake commented Jun 9, 2020

Basically, I don't care how relations are renamed, but in my ideal world if bigquery offered "alter table rename ..." and we switched to using that, we'd probably want the test to still work.

@azhard
Contributor Author

azhard commented Jun 9, 2020

Yeah that makes a ton of sense to me, I'll look into adding that

Comment on lines 3 to 4
{%- set from_relation = adapter.get_relation(database=target.database, schema=target.schema, identifier=from_name) -%}
{%- set to_relation = adapter.get_relation(database=target.database, schema=target.schema, identifier=to_name) -%}
Contributor

@beckjake beckjake Jun 9, 2020

I think this needs to use api.Relation.create(database=target.database, schema=target.schema, identifier=from_name, type=RelationType.View) rather than adapter.get_relation(), because if the relation doesn't exist yet, the value will be None.

Contributor Author

Oh good call, except I'm guessing you meant type = Table instead of View as the view rename isn't currently supported

Contributor

Oh, yes, that's exactly what I meant! I think it's actually got to be just the string 'table' - RelationType isn't available in macros.
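
The difference between looking a relation up and constructing a reference to it can be illustrated with a small stand-alone sketch. Everything here is hypothetical (`Relation`, `EXISTING`, `get_relation`, `create_relation`); it only mimics the behavior described in this thread and is not dbt's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    database: str
    schema: str
    identifier: str
    type: str  # a plain string such as 'table', as in the macro context


# Hypothetical catalog of relations that currently exist in the warehouse.
EXISTING = {("db", "analytics", "my_seed")}


def get_relation(database, schema, identifier) -> Optional[Relation]:
    """Lookup: returns None when the relation does not exist yet."""
    if (database, schema, identifier) in EXISTING:
        return Relation(database, schema, identifier, "table")
    return None


def create_relation(database, schema, identifier, type) -> Relation:
    """Construction: builds a reference without checking the warehouse."""
    return Relation(database, schema, identifier, type)


# The rename target does not exist yet, so a lookup yields nothing usable.
missing = get_relation("db", "analytics", "renamed_seed")
# Constructing the reference works regardless of what exists.
target = create_relation("db", "analytics", "renamed_seed", "table")
print(missing)
print(target.identifier)
```

For a rename, the destination by definition does not exist beforehand, which is why a lookup is the wrong tool for building it.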

@beckjake
Contributor

beckjake commented Jun 9, 2020

Oh, I didn't understand what was going wrong with the postgres tests before, but now I realize. I think you'll need to put this new model/schema.yml into a new folder (maybe bigquery-models?), and override the models property on your new test class:

    @property
    def models(self):
        return 'bigquery-models'

@azhard
Contributor Author

azhard commented Jun 9, 2020

@beckjake updated and 🤞 hopefully it should work now. I also updated the Makefile (found a missing comma) and the BigQuery fields in test.env.sample, as I realized that those values aren't used anywhere. The bigquery_profile function only looks for the BIGQUERY_SERVICE_ACCOUNT_JSON and BIGQUERY_TEST_ALT_DATABASE inside certain tests.

Let me know if you'd rather I didn't make these changes or if there's a reason they're written like that and I can revert them.

@beckjake
Contributor

beckjake commented Jun 9, 2020

That all sounds great to me, thank you. I've kicked off the tests and we'll get this merged in for 0.18.0 once the tests pass. I don't really ever use the Makefile, so thank you for fixing that especially!

@azhard
Contributor Author

azhard commented Jun 10, 2020

Any idea why the CI is failing? I don't thiiink the error message is related to any of my changes

@beckjake
Contributor

It looks like azure pipelines has added an undocumented postgresql install that prevents us installing it 🙄
installed here: https://github.com/actions/virtual-environments/blob/a0b45fba7a6b8da1181ae23a6e593dd09c8a84e7/images/win/Windows2016-Azure.json#L323-L326
not documented here: https://github.com/actions/virtual-environments/blob/a0b45fba7a6b8da1181ae23a6e593dd09c8a84e7/images/win/Windows2016-Readme.md
There is a PR here that says it updates the docs, but I don't see where it actually ends up: actions/runner-images#993

I'm going to merge your PR anyway and I guess today is now a CI day.

@beckjake
Contributor

Thanks for all your hard work on this @azhard! I'm going to merge this and it'll ship in 0.18.0.

@beckjake beckjake merged commit c9b3468 into dbt-labs:dev/marian-anderson Jun 10, 2020
@azhard azhard deleted the bigquery-rename-relation branch June 10, 2020 15:49
Successfully merging this pull request may close these issues.

Implement BigQuery Rename Relation functionality
2 participants