Commit

Documentation for snapshotting hard deleted records.
joelluijmes committed Sep 18, 2020
1 parent 57cb993 commit b4e7188
Showing 4 changed files with 95 additions and 12 deletions.
35 changes: 34 additions & 1 deletion website/docs/docs/building-a-dbt-project/snapshots.md
@@ -233,6 +233,39 @@ The `check` snapshot strategy can be configured to track changes to _all_ column

</File>


### Hard deletes (opt-in)
By default, rows that are deleted from the source query are not invalidated. With the `invalidate_hard_deletes` config option enabled, dbt can track rows that no longer exist. It does this by left joining the snapshot table to the source table and selecting the rows that are still valid in the snapshot but can no longer be found in the source. For those rows, `dbt_valid_to` is set to the current snapshot time.

This configuration is not a separate strategy like those described above; it is an additional opt-in feature. It is disabled by default because it changes the previous behavior.

For this configuration to work, the configured `updated_at` column must be of a timestamp type. Otherwise, queries will fail due to mixing data types.
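
The left join used to detect hard deletes can be sketched roughly as follows. This is a minimal illustration only, not the SQL that dbt actually generates: the table names `snapshots.orders_snapshot` and `jaffle_shop.orders`, the key column `id`, and the use of `current_timestamp` as a stand-in for the snapshot time are all assumptions made for the example.

```sql
-- Illustrative sketch only; not the query dbt generates.
-- Assumes a snapshot table snapshots.orders_snapshot and a source table
-- jaffle_shop.orders, both keyed on id (hypothetical names).
select
    snap.id,
    current_timestamp as dbt_valid_to  -- stand-in for the current snapshot time
from snapshots.orders_snapshot as snap
left join jaffle_shop.orders as src
    on snap.id = src.id
where snap.dbt_valid_to is null  -- row is still marked as valid in the snapshot
  and src.id is null             -- but no matching row exists in the source
```

Each row returned by a query of this shape is a snapshot record whose source row has disappeared, which is the set of records that would have `dbt_valid_to` filled in.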

**Example Usage**

<File name='snapshots/hard_delete_example.sql'>

```sql
{% snapshot orders_snapshot_hard_delete %}

{{
    config(
      target_schema='snapshots',
      strategy='timestamp',
      unique_key='id',
      updated_at='updated_at',
      invalidate_hard_deletes=True,
    )
}}

select * from {{ source('jaffle_shop', 'orders') }}

{% endsnapshot %}
```

</File>


## Configuring snapshots
### Snapshot configurations
There are a number of snapshot-specific configurations:
@@ -245,6 +278,7 @@ There are a number of snapshot-specific configurations:
| [unique_key](unique_key) | A primary key column or expression for the record | Yes | order_id |
| [check_cols](check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] |
| [updated_at](updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at |
| [invalidate_hard_deletes](invalidate_hard_deletes) | Find hard-deleted records in the source, and set `dbt_valid_to` to the current time for records that no longer exist | No | True |

A number of other configurations are also supported (e.g. `tags` and `post-hook`); check out the full list [here](snapshot-configs).

@@ -362,7 +396,6 @@ Snapshot results:

## FAQs
<FAQ src="run-one-snapshot" />
<FAQ src="snapshot-hard-deletes" />
<FAQ src="snapshot-frequency" />
<FAQ src="snapshot-schema-changes" />
<FAQ src="snapshot-hooks" />
11 changes: 0 additions & 11 deletions website/docs/faqs/snapshot-hard-deletes.md

This file was deleted.

60 changes: 60 additions & 0 deletions website/docs/reference/resource-configs/invalidate_hard_deletes.md
@@ -0,0 +1,60 @@
---
resource_types: [snapshots]
datatype: boolean
---
<File name='snapshots/<filename>.sql'>

```jinja2
{{
  config(
    strategy="timestamp",
    invalidate_hard_deletes=True
  )
}}
```

</File>

<File name='dbt_project.yml'>

```yml
snapshots:
  [<resource-path>](resource-path):
    +strategy: timestamp
    +invalidate_hard_deletes: true

```

</File>

## Description
An opt-in feature that invalidates hard-deleted records when snapshotting the query.


## Default
By default, this feature is disabled.

## Example

<File name='snapshots/orders.sql'>

```sql
{% snapshot orders_snapshot %}

{{
    config(
      target_schema='snapshots',
      strategy='timestamp',
      unique_key='id',
      updated_at='updated_at',
      invalidate_hard_deletes=True,
    )
}}

select * from {{ source('jaffle_shop', 'orders') }}

{% endsnapshot %}
```

</File>
1 change: 1 addition & 0 deletions website/sidebars.js
@@ -277,6 +277,7 @@ module.exports = {
"reference/resource-configs/target_schema",
"reference/resource-configs/unique_key",
"reference/resource-configs/updated_at",
"reference/resource-configs/invalidate_hard_deletes",
],
},
"reference/resource-configs/bigquery-configs",