Feature: support for arbitrary predicates for incremental models #4546

Closed
dave-connors-3 wants to merge 2 commits

dave-connors-3 commented Jan 3, 2022

resolves #3293

Description

This PR adds an optional config for incremental models that lets the user supply arbitrary filter expressions for the merge/delete statements, reducing the amount of data scanned during those operations.

V1 of this logic lets the user pass a list of dictionaries to the new incremental_predicates config. Each dictionary supplies a column (source_col) and an expression to evaluate against that column at runtime. The dictionary format allows the column to be properly aliased to the target table in these statements.

{{
    config(
        materialized='incremental',
        unique_key='customerorderid',
        incremental_predicates=[{
            'source_col': 'customerorderid',
            'expression': '> 1'
        }]
    )
}}
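
For illustration only, here is a rough Python stand-in for how the loop in this PR's macro appears to turn each dictionary into a filter clause: one "and <target>.<source_col> <expression>" fragment per entry. The function name render_predicates and the target name orders are made up for this example; the actual logic lives in the Jinja macro, not in Python.

```python
def render_predicates(target_name, user_predicates):
    """Build one 'and <target>.<col> <expr>' clause per predicate dictionary.

    Hypothetical helper that mimics the Jinja for-loop in the macro;
    it is not part of dbt.
    """
    return " ".join(
        f"and {target_name}.{p['source_col']} {p['expression']}"
        for p in user_predicates
    )


# Same predicate list as in the config example; 'orders' is an assumed target name.
predicates = [{"source_col": "customerorderid", "expression": "> 1"}]
fragment = render_predicates("orders", predicates)
print(fragment)  # and orders.customerorderid > 1
```

With an empty list the helper returns an empty string, so the statement would be left without extra filters.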

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change

cla-bot added the cla:yes label Jan 3, 2022
jtcohen6 added the Team:Adapters label (issues designated for the adapter area of the code) Jan 4, 2022
dave-connors-3 changed the title from "postgres support for predicates" to "Feature: support for arbitrary predicates for incremental models" Jan 5, 2022
#}
{%- if user_predicates -%}
{%- set predicates %}
{%- for condition in user_predicates -%} and {{ target_relation.name }}.{{ condition.source_col }} {{ condition.expression }} {% endfor -%}

Is there a reason to call the parameter source_col?

I'd expect something like target_col or target_column, to make clear that the predicate is applied to the target table during an incremental load into a target model (src or source is sometimes used as an alias in a merge statement, so the current name could be misread).

{#

This behavior should only be observed when dbt calls the default
`get_delete_insert_merge_sql` strategy in dbt-core

Does this mean incremental predicates will only work for the delete+insert strategy? Is this just to get the feature started with the least amount of work, or is there a blocker to implementing it for the merge strategy?

I don't have an exact use case to share, so I'm just curious and thinking generally about the pros and cons of moving from the merge strategy to delete+insert.

@dave-connors-3

@jtcohen6, I think it's fair to say this is a bit outdated. I had a new branch going most recently that I can use to consolidate with @NiallRees's open PR, so we can close this one!

jtcohen6 closed this Aug 18, 2022
Successfully merging this pull request may close these issues.

Incremental Load Predicates To Bound unique_id scans