
Downtime for incremental models on full-refresh runs (Snowflake) #1387

Closed
clrcrl opened this issue Apr 3, 2019 · 3 comments

clrcrl commented Apr 3, 2019

Full refresh runs on incremental models on Snowflake result in either:

  • The table not existing while it is being rebuilt (destructive runs)
  • The table existing but having no records in it while it is being rebuilt (non-destructive runs)

This is because Snowflake doesn't run DDL inside transactions (each DDL statement autocommits), so we can't do a clean swap within one.
Can we instead do some trickery around creating a "my_table__dbt_tmp"-style materialization?
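For context, the downtime window comes from the current order of operations on a destructive full refresh, which is roughly the following (a simplified sketch; the table names are illustrative, not dbt's actual generated SQL):

```sql
-- Destructive full refresh today, simplified: the target table does not
-- exist between the DROP and the end of the CREATE, so queries against
-- it fail during the rebuild.
drop table if exists my_table;

create table my_table as
    select * from my_source;  -- my_table is missing until this completes
```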


clrcrl commented Apr 3, 2019

cc: @bastienboutonnet


bastienboutonnet commented Apr 6, 2019

Just a follow-up on this. I was looking into the code (for the first time, so I may have gotten this completely wrong), and it looks like a non-destructive run is implemented as follows:
https://github.com/fishtown-analytics/dbt/blob/dev/wilt-chamberlain/core/dbt/include/global_project/macros/materializations/table/table.sql#L49-L62

  1. backup old
  2. truncate old
  3. create new (intermediary)
  4. insert new into truncated old.
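In Snowflake SQL, that flow would look roughly like this (a sketch only; the __dbt_backup/__dbt_tmp names are illustrative, not the actual SQL dbt generates):

```sql
create table my_table__dbt_backup as select * from my_table;  -- 1. backup old
truncate table my_table;                                      -- 2. truncate old
create table my_table__dbt_tmp as select * from my_source;    -- 3. create new (intermediary)
insert into my_table select * from my_table__dbt_tmp;         -- 4. insert new into truncated old
```

Note that between steps 2 and 4, my_table exists but is empty, which is exactly the downtime described above.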

But I would assume truncating or dropping could be avoided altogether by leveraging Snowflake's swap feature (https://docs.snowflake.net/manuals/sql-reference/sql/alter-table.html): alter table <old_table> swap with <new_table>. The flow would be something like this:

  1. backup
  2. generate new
  3. swap new with old
  4. drop backup and "new" which is now old
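A sketch of what that could look like in Snowflake SQL (names are illustrative; the backup here uses a zero-copy clone, which is one option, not necessarily what dbt would do):

```sql
create table my_table__dbt_backup clone my_table;           -- 1. backup (zero-copy clone)
create table my_table__dbt_tmp as select * from my_source;  -- 2. generate new
alter table my_table swap with my_table__dbt_tmp;           -- 3. swap new with old (atomic name swap)
drop table my_table__dbt_tmp;                               -- 4. drop "new", which now holds the old data,
drop table my_table__dbt_backup;                            --    and drop the backup
```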

But maybe I'm missing a few subtleties.
Oh, also: I know the issue was originally created about incremental models, but it's valid for both table and incremental materializations. I assume a full refresh of an incremental model behaves the same as a regular table model, but again I might be wrong.

If the snowflake logic I outlined seems to make sense I'd be happy to help implement it.

drewbanin commented

This is a dupe of #525, closing.

@bastienboutonnet we intend to eventually remove the --non-destructive flag altogether. That functionality was built to work around cascading drops (which only affect postgres/redshift, not Snowflake). Really, there's limited merit to doing a truncate on a database like Snowflake IMO, but happy to discuss if you have any thoughts.

The root cause here is that we drop the table before re-building it in full-refresh mode. While alter table ... swap will probably be helpful here on Snowflake, we'll also need to generally change the order of operations in the incremental materialization.
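Speculatively, the reordered flow inside the materialization might look something like this (a rough Jinja/SQL sketch, not dbt's actual macro code; {{ tmp_relation }} and {{ target_relation }} are assumed names):

```sql
{% if flags.FULL_REFRESH %}
    {# Build the replacement first, then promote it, so there is no
       window in which the target is missing or empty. Sketch only. #}
    create or replace table {{ tmp_relation }} as (
        {{ sql }}
    );
    alter table {{ target_relation }} swap with {{ tmp_relation }};
    drop table if exists {{ tmp_relation }};
{% endif %}
```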
