POC: Incremental Predicates #3294

dm03514

@dm03514 dm03514 commented Apr 25, 2021

refs #3293

Description

This PR updates the Snowflake delete+insert incremental load strategy so that the model config can specify incremental predicates:

{{
    config(
      materialized='incremental',
      incremental_strategy='delete+insert',
      incremental_predicates=[
        "collector_hour >= dateadd('day', -7, CONVERT_TIMEZONE('UTC', current_timestamp()))"
      ],
      unique_key='unique_id'
    )
}}

I believe that this is distinct from the predicates argument that is already present in some of the functions. It looks like predicates applies directly to the merge column, whereas incremental_predicates are applied globally to the top-level DML.
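
As a rough illustration of "applied to the top-level DML", the config above could render to something like the statements below for delete+insert. This is a sketch under assumptions, not the SQL this PR generates: the table names `analytics.events` and `analytics.events__dbt_tmp` and the `payload` column are hypothetical, and only `unique_id`, `collector_hour`, and the predicate text come from the config.

```sql
-- Hypothetical target: analytics.events
-- Hypothetical staging table holding the new batch: analytics.events__dbt_tmp

-- Delete rows that are about to be replaced. The incremental predicate is
-- appended as an extra filter so only the last 7 days of the target are scanned.
delete from analytics.events
where unique_id in (select unique_id from analytics.events__dbt_tmp)
  and collector_hour >= dateadd('day', -7, convert_timezone('UTC', current_timestamp()));

-- Insert the new batch.
insert into analytics.events (unique_id, collector_hour, payload)
select unique_id, collector_hour, payload
from analytics.events__dbt_tmp;
```

One consequence worth keeping in mind: a row in the target that is older than the 7-day window is not deleted, so a late-arriving row with the same unique_id would be inserted alongside it.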

TODO

  • default incremental - add param
  • default incremental upsert - add param
  • common/merge - add param
  • snowflake merge
  • snowflake incremental
  • bigquery incremental
  • bigquery merge
  • postgres incremental
  • postgres merge
  • redshift incremental
  • redshift merge

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot

cla-bot bot commented Apr 25, 2021

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, don't hesitate to ping @drewbanin.

CLA has not been signed by users: @dm03514

Contributor

@jtcohen6 jtcohen6 left a comment

I believe that this is distinct from the predicates argument that is already present in some of the functions. It looks like predicates applies directly to the merge column, whereas incremental_predicates are applied globally to the top-level DML.

Is that so? I've definitely been thinking about these as the same thing: a way to filter the target table, and limit the amount of data scanned. In delete+insert (upsert) operations, that just looks like a filter on the bottom, but in merge operations, I think that filter rightly goes on the join/match criteria. I may be thinking about this wrong, so let me know if you disagree.
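
For the merge case described here, the same predicate would sit in the join/match criteria rather than in a trailing filter. The statement below is a minimal sketch assuming hypothetical table names (`analytics.events`, `analytics.events__dbt_tmp`), a hypothetical `payload` column, and illustrative aliases; it is not taken from this PR's diff.

```sql
-- Sketch only: table names, the payload column, and the aliases are assumptions.
merge into analytics.events as DBT_INTERNAL_DEST
using analytics.events__dbt_tmp as DBT_INTERNAL_SOURCE
  -- The predicate limits how much of the target is scanned for matches.
  on DBT_INTERNAL_DEST.collector_hour >= dateadd('day', -7, convert_timezone('UTC', current_timestamp()))
  and DBT_INTERNAL_SOURCE.unique_id = DBT_INTERNAL_DEST.unique_id
when matched then update set
  collector_hour = DBT_INTERNAL_SOURCE.collector_hour,
  payload = DBT_INTERNAL_SOURCE.payload
when not matched then insert (unique_id, collector_hour, payload)
  values (DBT_INTERNAL_SOURCE.unique_id, DBT_INTERNAL_SOURCE.collector_hour, DBT_INTERNAL_SOURCE.payload);
```

Because the predicate is part of the match condition, a source row whose counterpart in the target falls outside the 7-day window counts as "not matched" and is inserted rather than updated.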

@@ -1,25 +1,26 @@


{% macro get_merge_sql(target, source, unique_key, dest_columns, predicates=none) -%}
{{ adapter.dispatch('get_merge_sql')(target, source, unique_key, dest_columns, predicates) }}
{% macro get_merge_sql(target, source, unique_key, dest_columns, predicates=none, incrementnal_predicates=none) -%}
Contributor

Suggested change
{% macro get_merge_sql(target, source, unique_key, dest_columns, predicates=none, incrementnal_predicates=none) -%}
{% macro get_merge_sql(target, source, unique_key, dest_columns, predicates=none, incremental_predicates=none) -%}

{%- endmacro %}


{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates) -%}
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates, incremental_predictates=none) -%}
Contributor

Suggested change
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates, incremental_predictates=none) -%}
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates, incremental_predicates=none) -%}

Author

@jtcohen6 sorry, I def think you're right re: the predicates. We've adopted both strategies internally, and we are using the predicates directly on the merge statement, as you mentioned. I was thinking about it wrong :p
