Disable most transactional logic on Snowflake #3480
We had a chance to chat with some folks from Snowflake, who provided the following guidance:
Given the above, Snowflake strongly recommends three clear best practices:
Resolved by #3510
Describe the feature
Today, dbt treats Snowflake as it would any transactional database (e.g. Postgres, Redshift): `begin` + `commit` sandwich all DDL/DML for materializations, tests, snapshot-freshness queries, etc. This has the effect of running more queries than strictly necessary (#2748). More recently, we've come to understand that transactional logic weighs heavily in the Cloud Services layer.

The simplest solution here would be:
1. Disable transactional logic on Snowflake by default
2. Use `begin` + `commit` in only the places where it's needed, i.e. multiple DDL/DML statements that must be executed within a single transaction

Step 1 could be as simple as replicating the code in dbt-bigquery:
https://github.com/fishtown-analytics/dbt/blob/eb46bfc3d6fd70ee5352fa758470da03dc3700d1/plugins/bigquery/dbt/adapters/bigquery/connections.py#L147-L148
https://github.com/fishtown-analytics/dbt/blob/eb46bfc3d6fd70ee5352fa758470da03dc3700d1/plugins/bigquery/dbt/adapters/bigquery/connections.py#L196-L200
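As a rough illustration of what step 1 amounts to, the sketch below overrides `begin` and `commit` with no-ops so that no transaction statements reach the warehouse. The class and function names (`RecordingConnection`, `TransactionalManager`, `NoTransactionManager`, `statements_sent`) are hypothetical stand-ins, not dbt's actual classes; the real change would live in the Snowflake adapter's connection manager.

```python
class RecordingConnection:
    """Hypothetical stand-in for a warehouse connection; records the SQL it is sent."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


class TransactionalManager:
    """Wraps every statement in begin/commit, as described in this issue."""

    def __init__(self, connection):
        self.connection = connection

    def begin(self):
        self.connection.execute("begin")

    def commit(self):
        self.connection.execute("commit")

    def run(self, sql):
        self.begin()
        self.connection.execute(sql)
        self.commit()


class NoTransactionManager(TransactionalManager):
    """Assumed step-1 behavior: begin and commit become no-ops."""

    def begin(self):
        pass

    def commit(self):
        pass


def statements_sent(manager_cls, sql):
    """Run one statement through a manager and return everything that was sent."""
    conn = RecordingConnection()
    manager_cls(conn).run(sql)
    return conn.statements
```

In this sketch a single statement goes from three queries to one. With `AUTOCOMMIT` on, Snowflake commits a statement run outside an explicit transaction on its own, which is why the wrapper can be dropped for the single-statement case.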
My sense is that the code changes for this are actually quite easy. The trickiness comes in considering and testing all possible permutations / edge cases, and validating that Snowflake's empirical behavior in each case matches its documented behavior. To that end, it's worth really digging into how `autocommit` works on Snowflake (docs), and understanding exactly how dbt interacts with each note below:

Here's what sticks out to me:
- `AUTOCOMMIT` is on by default, but it could be off. Should dbt demand to work with `AUTOCOMMIT` on? Should we add this as an option in `profiles.yml`?
- [...] `AUTOCOMMIT` is off.
- The `delete+insert` incremental strategy requires running two DML statements within the same transaction. We'll need explicit `begin` + `commit` logic there.

Another trade-off here: `auto_begin` (for statements) and `inside_transaction` (for pre- and post-hooks) would no longer work. But it's not clear to me that they're working well today, either. We could advise users to explicitly specify `begin` + `commit` within their statements or hook definitions, rather than relying on dbt's built-in methods to open up or close out transactions on their behalf.

Describe alternatives you've considered
- Keeping `begin` + `commit` in place, to the tune of much unnecessary credit expenditure

Who will this benefit?
dbt + Snowflake users
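
For reference, the `delete+insert` case discussed above, where two DML statements must share one transaction, could be sketched roughly as follows. This is only an illustration under stated assumptions: `RecordingConnection`, `explicit_transaction`, and `delete_insert` are hypothetical names, and the SQL is a simplified version of what an incremental materialization might run, not dbt's actual macro output.

```python
from contextlib import contextmanager


class RecordingConnection:
    """Hypothetical stand-in for a warehouse connection; records the SQL it is sent."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


@contextmanager
def explicit_transaction(connection):
    """Open a transaction only around the block that needs one."""
    connection.execute("begin")
    try:
        yield connection
    except Exception:
        # Undo both statements if either one fails.
        connection.execute("rollback")
        raise
    else:
        connection.execute("commit")


def delete_insert(connection, target, staging, unique_key):
    """Simplified delete+insert: both DML statements run in one transaction."""
    with explicit_transaction(connection) as conn:
        conn.execute(
            f"delete from {target} where {unique_key} in "
            f"(select {unique_key} from {staging})"
        )
        conn.execute(f"insert into {target} select * from {staging}")
```

Calling `delete_insert(conn, "analytics.orders", "orders__tmp", "order_id")` on a `RecordingConnection` would record four statements: `begin`, the delete, the insert, and `commit`. Every other materialization would send no transaction statements at all.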