Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New components for incremental strategies #1761

Closed
1 task done
Tracked by #9290
jtcohen6 opened this issue Jul 21, 2022 · 2 comments
Closed
1 task done
Tracked by #9290

New components for incremental strategies #1761

jtcohen6 opened this issue Jul 21, 2022 · 2 comments
Assignees
Labels
adapters content Improvements or additions to content improvement Use this when an area of the docs needs improvement as it's currently unclear incremental models priority: high Technical inaccuracy, missing/incorrect information, or broken links. Negatively affects workflows size: large This change will more than a week to address and might require more than one person

Comments

@jtcohen6
Copy link
Collaborator

jtcohen6 commented Jul 21, 2022

Contributions

  • I have read the contribution docs, and understand what's expected of me.

Link to the page on docs.getdbt.com requiring updates

What part(s) of the page would you like to see updated?

New pattern

  • incremental_strategy is a real node config
  • If incremental_strategy is set to XYZ, dbt will look for a macro named get_incremental_XYZ_sql, to return the SQL (DML) it needs to run in the incremental materialization
  • Default value of incremental_strategy config is None, in which case dbt will look for a macro named get_incremental_default_sql
  • The default strategy for the base adapter is append, i.e. insert into (no upsert/merge, no unique_key / partition support). This is a change worth calling out: It used to be delete+insert, implicitly. Now it is append explicitly.

Built-in strategies and their corresponding macros

incremental_strategy Corresponding macro
append get_incremental_append_sql
delete+insert get_incremental_delete_insert_sql
merge get_incremental_merge_sql
insert_overwrite get_incremental_insert_overwrite_sql

For adapter maintainers

  • Adapters declare which incremental strategy macro to use by default, by overriding get_incremental_default_sql, and passing in the appropriate macro (get_incremental_delete_insert_sql, get_incremental_merge_sql, etc)
  • Adapters explicitly declare which incremental strategies are supported, via a valid_incremental_strategies method (Python) that returns a list of strategy names. They can also override any of the get_incremental_x_sql macros with a more database-friendly implementation.
  • Adapter maintainers can add new strategies that are both "valid" + "builtin"
  • For the current built-in strategies (append, delete+insert, merge, insert_overwrite), dbt will validate that the user has selected one that their adapter explicitly supports

For end users (covered by #4716)

  • Users can also define their own "custom" strategies, by defining a macro get_incremental_silly_sql, and then configuring incremental_strategy: silly.
  • dbt won't validate user-defined strategies, it will just look for the macro by that name, and raise an error if it can't find one.

The benefit to end users is that there's a lot less code involved than if creating an entirely new materialization

Toggle to show silly example

Here's an example of an incremental model that just appends (i.e., identical behavior as the append strategy).

Add a file named macros/whatever_file_name_you_want.sql:

{% macro get_incremental_silly_sql(arg_dict) %}

  {% do return(some_custom_macro_with_sql(arg_dict["target_relation"], arg_dict["temp_relation"], arg_dict["unique_key"], arg_dict["dest_columns"], arg_dict["incremental_predicates"])) %}

{% endmacro %}


{% macro some_custom_macro_with_sql(target_relation, temp_relation, unique_key, dest_columns, incremental_predicates) %}

    {%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}

    insert into {{ target_relation }} ({{ dest_cols_csv }})
    (
        select {{ dest_cols_csv }}
        from {{ temp_relation }}
    )

{% endmacro %}

Define a model models/my_model.sql:

{{ config(
    materialized="incremental",
    incremental_strategy="silly",
    unique_key="id",
) }}

select * from {{ ref("some_model") }}

Outstanding questions

  • How can users or organization share their custom incremental strategies or overrides? Can they share in a dbt package on the dbt Package Hub for example?
  • What is the resolution order when a project or a package has the same macro name as that for a built-in incremental strategy?

See dbt-core #8769 for a related conversation.

Additional information

Changes in dbt-core: dbt-labs/dbt-core#5359

Refactor in dbt-snowflake, to use core changes: dbt-labs/dbt-snowflake#196

Base compatibility (for now), without actually taking advantage of full core changes:

@jtcohen6 jtcohen6 added content Improvements or additions to content improvement Use this when an area of the docs needs improvement as it's currently unclear dbt-core v1.3 Docs impact for the v1.3 release (Oct 2022) adapters labels Jul 21, 2022
@jtcohen6 jtcohen6 added the idea Proposes an idea for new content label Sep 21, 2022
@jtcohen6 jtcohen6 self-assigned this Nov 28, 2022
dataders added a commit that referenced this issue Jun 14, 2023
…ud supported adapters (#3530)

## What are you changing in this pull request and why?
<!---
Describe your changes and why you're making them. If linked to an open
issue or a pull request on dbt Core, then link to them here! 

To learn more about the writing conventions used in the dbt Labs docs,
see the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md).
-->

This section of our docs page was under-documented. The entire page
needs TLC, and will receive it next sprint (see #1761). This is a
monkeypatch to enable @jiezhen-chen to contribute a docs PR for
dbt-labs/dbt-redshift#490.

## Checklist
- [ ] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
and [About
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
so my content adheres to these guidelines.
- [ ] Add a checklist item for anything that needs to happen before this
PR is merged, such as "needs technical review" or "change base branch."
@runleonarun runleonarun removed the dbt-core v1.3 Docs impact for the v1.3 release (Oct 2022) label Oct 12, 2023
@dbeatty10 dbeatty10 added priority: high Technical inaccuracy, missing/incorrect information, or broken links. Negatively affects workflows size: large This change will more than a week to address and might require more than one person labels Jan 4, 2024
mirnawong1 added a commit that referenced this issue Jan 8, 2024
[Preview](https://docs-getdbt-com-git-dbeatty-custom-incremental-d92d96-dbt-labs.vercel.app/docs/build/incremental-models#custom-strategies)

## What are you changing in this pull request and why?

This addresses the "**For end users**" portion of
#1761.

The feature request in dbt-labs/dbt-core#5245
describes the value proposition as well as the previous and new
behavior:

#### Functional Requirement
- Advanced users that wish to specify a custom incremental strategy must
be able to do so.

#### Previous behavior
- Advanced dbt users who wished to specify a custom incremental strategy
must override the same boilerplate Jinja macro by copy pasting it into
their dbt project.

#### New behavior
- Advanced dbt users who wish to specify a custom incremental strategy
will only need to create a macro that conforms to the naming convention
`get_incremental_NAME_sql` that produces the correct SQL for the target
warehouse.

## Also

To address the questions raised in
dbt-labs/dbt-core#8769, we also want to
document how to utilize custom incremental macros that come from a
package.

For example, to use the `merge_null_safe` custom incremental strategy
from the `example` package, first [install the
package](/build/packages#how-do-i-add-a-package-to-my-project), then add
this macro to your project:

```sql
{% macro get_incremental_merge_null_safe_sql(arg_dict) %}
    {% do return(example.get_incremental_merge_null_safe_sql(arg_dict)) %}
{% endmacro %}
```

## 🎩 

<img width="503" alt="image"
src="https://github.com/dbt-labs/docs.getdbt.com/assets/44704949/51c3266e-e3fb-49bd-9428-7c43920a5412">

## Checklist
- [x] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [x] For [docs
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning),
review how to [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
@runleonarun runleonarun removed the idea Proposes an idea for new content label Mar 13, 2024
@mirnawong1
Copy link
Contributor

Matt will review and verify if this was handled. See docs here

@mirnawong1
Copy link
Contributor

hey @jtcohen6 , i believe the docs covers this now and is addressed (example here). please let us know if not though and we'll be happy to reopen!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
adapters content Improvements or additions to content improvement Use this when an area of the docs needs improvement as it's currently unclear incremental models priority: high Technical inaccuracy, missing/incorrect information, or broken links. Negatively affects workflows size: large This change will more than a week to address and might require more than one person
Projects
None yet
Development

No branches or pull requests

6 participants