Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not dropping table for incremental full refresh with delta #287

Merged
merged 9 commits into from
Jul 5, 2022
Merged

Not dropping table for incremental full refresh with delta #287

merged 9 commits into from
Jul 5, 2022

Conversation

grindheim
Copy link
Contributor

resolves 286

Description

Since the spark__create_table_as implementation already runs create or replace table if the file format is delta, this PR disables dropping the table first in this case. The advantage being that it will preserve grants on the table, in addition to it simply being unnecessary.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-spark next" section.

@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Sindre Grindheim.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

1 similar comment
@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Sindre Grindheim.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

The cla-bot has been summoned, and re-checked this pull request!

@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Sindre Grindheim.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

The cla-bot has been summoned, and re-checked this pull request!

@cla-bot cla-bot bot added the cla:yes label Feb 11, 2022
@cla-bot
Copy link

cla-bot bot commented Feb 11, 2022

The cla-bot has been summoned, and re-checked this pull request!

@McKnight-42 McKnight-42 self-requested a review May 4, 2022 15:17
@McKnight-42
Copy link
Contributor

Hi @grindheim sorry for the delay in response, gathering context now , and was wondering if you wouldn't mind updating your code to be in parity with main and re-push up so we can start to get some hanging tests running in the meantime.

@grindheim
Copy link
Contributor Author

@McKnight-42 Updated the code. Let me know if there's anything else you'd like me to do.

@McKnight-42
Copy link
Contributor

@grindheim for some reason the unit test didn't kick off and looks like we might need to now also update to main to take get access to our newly added black formatter and pre-commit functionality in-order for the code-quality check to pass. i'm hoping doing this will trigger both tests if not I'll keep looking into why the unit test didn't trigger. sorry for any inconvenience.

@jtcohen6 jtcohen6 added the triage:ready-for-review Externally contributed PR has functional approval, ready for code review from Core engineering label Jun 15, 2022
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay @grindheim! Hoping we can get this merged soon, it'll definitely help out folks who are running into this issue. One small comment from me on the specific code change.

Comment on lines 32 to 34
{% if existing_relation.is_view or file_format != 'delta' %}
{% do adapter.drop_relation(existing_relation) %}
{% endif %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two quick thoughts here:

  • I think this conditional logic can be simplified to avoid the nested if statements
  • We need to check both that the table already exists in Delta format, and will be created in Delta format

So further up:

{% set is_delta = (file_format == 'delta' and existing_relation.is_delta) %}

And then:

  {% elif existing_relation.is_view or (full_refresh_mode and not is_delta) %}
    {% do adapter.drop_relation(existing_relation) %}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thoughts and suggestions, and sorry for my delay. I've implemented and pushed your suggested changes.

@grindheim grindheim requested a review from jtcohen6 June 28, 2022 10:45
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grindheim Noticed one thing in the logic that doesn't look quite right. If you could take a look in the next few days, we can try to get this in for v1.2!

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
@jtcohen6
Copy link
Contributor

jtcohen6 commented Jul 4, 2022

Just going to push up an empty commit commit an update to the changelog, and to trigger CircleCI tests

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @grindheim! Glad to finally get this in :)

ueshin added a commit to databricks/dbt-databricks that referenced this pull request Jul 5, 2022
### Description

Ports the upstream change: Not dropping table for incremental full refresh with delta (dbt-labs/dbt-spark#287)
francescomucio pushed a commit to francescomucio/dbt-spark that referenced this pull request Jul 26, 2022
)

* Not dropping table for incremental full refresh with delta

* Updated changelog

* Simplified conditional logic according to suggestion

* Updated changelog

* Only drop table if not delta table

Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>

* Update changelog, trigger CircleCI tests

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jtcohen6@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla:yes triage:ready-for-review Externally contributed PR has functional approval, ready for code review from Core engineering
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants