add dbt_valid_to_current #6308

Open
wants to merge 26 commits into
base: current
Changes from 17 commits
a2d5c4c
add page
mirnawong1 Oct 17, 2024
3d54fee
add custom future date
mirnawong1 Oct 17, 2024
2093b8b
Merge branch 'current' into add-custom-date
mirnawong1 Oct 17, 2024
e57fbc2
Merge branch 'current' into add-custom-date
mirnawong1 Oct 18, 2024
d1d9aeb
Merge branch 'current' into add-custom-date
mirnawong1 Oct 21, 2024
614e368
Merge branch 'current' into add-custom-date
mirnawong1 Oct 22, 2024
7b846e7
Merge branch 'current' into add-custom-date
mirnawong1 Oct 28, 2024
99cfb62
Merge branch 'current' into add-custom-date
mirnawong1 Oct 30, 2024
0d788e4
Merge branch 'current' into add-custom-date
mirnawong1 Nov 4, 2024
44b6e14
Update website/docs/reference/resource-configs/dbt_valid_to_current.md
mirnawong1 Nov 4, 2024
7729456
Merge branch 'current' into add-custom-date
mirnawong1 Nov 4, 2024
50cdb0a
Merge branch 'current' into add-custom-date
mirnawong1 Nov 4, 2024
1db6875
Merge branch 'current' into add-custom-date
mirnawong1 Nov 7, 2024
82864c0
Merge branch 'current' into add-custom-date
mirnawong1 Nov 12, 2024
bea946d
Merge branch 'current' into add-custom-date
mirnawong1 Nov 13, 2024
33483f0
fold in doug's feedback
mirnawong1 Nov 13, 2024
9be457a
Update release-notes.md
mirnawong1 Nov 13, 2024
40eac92
Update website/docs/reference/resource-configs/dbt_valid_to_current.md
mirnawong1 Nov 13, 2024
bf40b21
Update dbt_valid_to_current.md
mirnawong1 Nov 13, 2024
c1eb3f1
Merge branch 'current' into add-custom-date
mirnawong1 Nov 13, 2024
15e71c9
Update website/docs/reference/resource-configs/dbt_valid_to_current.md
mirnawong1 Nov 15, 2024
3dd2a3d
Merge branch 'current' into add-custom-date
mirnawong1 Nov 15, 2024
704879f
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 15, 2024
3fb5435
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 15, 2024
50c0f35
Merge branch 'current' into add-custom-date
mirnawong1 Nov 15, 2024
edd6df8
Merge branch 'current' into add-custom-date
mirnawong1 Nov 15, 2024
37 changes: 23 additions & 14 deletions website/docs/docs/build/snapshots.md
@@ -36,12 +36,6 @@

## Configuring snapshots

:::info Previewing or compiling snapshots in IDE not supported

It is not possible to "preview data" or "compile sql" for snapshots in dbt Cloud. Instead, [run the `dbt snapshot` command](#how-snapshots-work) in the IDE.

:::

<VersionBlock lastVersion="1.8" >

- To configure snapshots in versions 1.8 and earlier, refer to [Configure snapshots in versions 1.8 and earlier](#configure-snapshots-in-versions-18-and-earlier). These versions use an older syntax where snapshots are defined within a snapshot block in a `.sql` file, typically located in your `snapshots` directory.
@@ -70,7 +64,7 @@
[updated_at](/reference/resource-configs/updated_at): column_name
[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes): true | false
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): dictionary

[dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): string
```

</File>
@@ -87,13 +81,14 @@
| [check_cols](/reference/resource-configs/check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] |
| [updated_at](/reference/resource-configs/updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at |
| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True |
| [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current) | Set a custom value, typically a far-future date, for `dbt_valid_to` in current snapshot records. When set, dbt uses this value instead of `NULL` for `dbt_valid_to` in current records. | No | string |
| [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names) | Customize the names of the snapshot meta fields | No | dictionary |


- In versions prior to v1.9, the `target_schema` (required) and `target_database` (optional) configurations defined a single schema or database to build a snapshot across users and environment. This created problems when testing or developing a snapshot, as there was no clear separation between development and production environments. In v1.9, `target_schema` became optional, allowing snapshots to be environment-aware. By default, without `target_schema` or `target_database` defined, snapshots now use the `generate_schema_name` or `generate_database_name` macros to determine where to build. Developers can still set a custom location with [`schema`](/reference/resource-configs/schema) and [`database`](/reference/resource-configs/database) configs, consistent with other resource types.
- A number of other configurations are also supported (for example, `tags` and `post-hook`). For the complete list, refer to [Snapshot configurations](/reference/snapshot-configs).
- You can configure snapshots from both the `dbt_project.yml` file and a `config` block. For more information, refer to the [configuration docs](/reference/snapshot-configs).


### Add a snapshot to your project

To add a snapshot to your project follow these steps. For users on versions 1.8 and earlier, refer to [Configure snapshots in versions 1.8 and earlier](#configure-snapshots-in-versions-18-and-earlier).
@@ -112,6 +107,7 @@
unique_key: id
strategy: timestamp
updated_at: updated_at
dbt_valid_to_current: "to_date('9999-12-31')" # Specifies that current records should have `dbt_valid_to` set to `'9999-12-31'` instead of `NULL`.

```
</File>
@@ -172,6 +168,15 @@

</Expandable>


<Expandable alt_header="Use dbt_valid_to_current for easier date range queries">

By default, `dbt_valid_to` is `NULL` for current records. However, if you set the [`dbt_valid_to_current` configuration](/reference/resource-configs/dbt_valid_to_current) (available in Versionless and 1.9 and higher), `dbt_valid_to` will be set to your specified value (such as `9999-12-31`) for current records.

This lets you filter on date ranges with plain comparisons on `dbt_valid_to`, without adding a separate `is null` condition to pick up current records.
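
As a rough illustration of the difference, the two queries below both look up the version of each record that was valid at a given point in time. The snapshot name `orders_snapshot` and the date are hypothetical, and the second query assumes every current record already carries the configured value rather than `NULL`.

```sql
-- Default behavior: current records have dbt_valid_to = NULL,
-- so the filter needs an explicit NULL branch.
select *
from {{ ref('orders_snapshot') }}  -- hypothetical snapshot name
where dbt_valid_from <= cast('2024-10-15' as timestamp)
  and (dbt_valid_to > cast('2024-10-15' as timestamp) or dbt_valid_to is null);

-- With dbt_valid_to_current set to a far-future date:
-- a plain range comparison also matches current records.
select *
from {{ ref('orders_snapshot') }}  -- hypothetical snapshot name
where dbt_valid_from <= cast('2024-10-15' as timestamp)
  and dbt_valid_to > cast('2024-10-15' as timestamp);
```

If the snapshot still contains current records with `NULL` (for example, rows inserted before the config was set), the first form remains the safer one.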
Collaborator

What does "simplifies your SQL queries by avoiding NULL checks" mean?

Contributor Author

@mirnawong1 mirnawong1 Nov 15, 2024

i meant users not needing to include checks or add'l logic to check for null records.

i can remove if it's confusing or clarify further!


</Expandable>

<Expandable alt_header="Ensure your unique key is really unique">

The unique key is used by dbt to match rows up, so it's extremely important to make sure this key is actually unique! If you're snapshotting a source, I'd recommend adding a uniqueness test to your source ([example](https://github.com/dbt-labs/jaffle_shop/blob/8e7c853c858018180bef1756ec93e193d9958c5b/models/staging/schema.yml#L26)).
@@ -204,12 +209,14 @@
### How snapshots work

When you run the [`dbt snapshot` command](/reference/commands/snapshot):
* **On the first run:** dbt will create the initial snapshot table — this will be the result set of your `select` statement, with additional columns including `dbt_valid_from` and `dbt_valid_to`. All records will have a `dbt_valid_to = null`.
* **On the first run:** dbt will create the initial snapshot table — this will be the result set of your `select` statement, with additional columns including `dbt_valid_from` and `dbt_valid_to`. All records will have a `dbt_valid_to = null` or the value specified in [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) (available in Versionless and 1.9 and higher) if configured.
* **On subsequent runs:** dbt will check which records have changed or if any new records have been created:
- The `dbt_valid_to` column will be updated for any existing records that have changed
- The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null`
- The `dbt_valid_to` column will be updated for any existing records that have changed.
- The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null` or the value configured in `dbt_valid_to_current` (available in Versionless and 1.9 and higher).

Note, these column names can be customized to your team or organizational conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config.
#### Note
- These column names can be customized to your team or organizational conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config.

- If you have set the `dbt_valid_to_current` configuration option, `dbt_valid_to` is set to your specified value (such as `9999-12-31`) instead of `NULL` for current records that dbt inserts from then on.
Collaborator

I think we're missing some clarity that this is set to NULL (or whatever you have set dbt_valid_to_current to for current records)


Snapshots can be referenced in downstream models the same way as referencing models — by using the [ref](/reference/dbt-jinja-functions/ref) function.
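
For instance, a downstream model that keeps only the current version of each record might look like the sketch below. The snapshot name `my_snapshot` is a placeholder, and the filter assumes `dbt_valid_to_current` is set to `to_date('9999-12-31')`; the exact date syntax depends on your data platform.

```sql
-- Hypothetical downstream model selecting current records from a snapshot.
select *
from {{ ref('my_snapshot') }}
where dbt_valid_to = to_date('9999-12-31')  -- value configured in dbt_valid_to_current
   or dbt_valid_to is null                  -- current records inserted before the config was set
```

Without `dbt_valid_to_current`, the `is null` condition alone identifies current records.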

@@ -394,12 +401,14 @@

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.

Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless), these column names can be customized to your team or organizational conventions via the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless):
- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
- Use the [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) config to set a custom value, typically a far-future date, for `dbt_valid_to` in current snapshot records. When set, dbt uses this value instead of `NULL` for `dbt_valid_to` in current records.
Collaborator

again, i would rephrase this as suggested above


| Field | Meaning | Usage |
| -------------- | ------- | ----- |
| dbt_valid_from | The timestamp when this snapshot row was first inserted | This column can be used to order the different "versions" of a record. |
| dbt_valid_to | The timestamp when this row became invalidated. | The most recent snapshot record will have `dbt_valid_to` set to `null`. |
| dbt_valid_to | The timestamp when this row became invalidated. <br /> For current records, this is `NULL` by default <VersionBlock firstVersion="1.9"> or the value specified in `dbt_valid_to_current`.</VersionBlock> | The most recent snapshot record will have `dbt_valid_to` set to `NULL` <VersionBlock firstVersion="1.9"> or the specified value. </VersionBlock> |
| dbt_scd_id | A unique key generated for each snapshotted record. | This is used internally by dbt |
| dbt_updated_at | The updated_at timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt |

3 changes: 2 additions & 1 deletion website/docs/docs/dbt-versions/release-notes.md
@@ -19,6 +19,7 @@ Release notes are grouped by month for both multi-tenant and virtual private clo
\* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## November 2024
- **New**: Use the [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) config to set a custom value, typically a far-future date, for `dbt_valid_to` in current snapshot records. When set, dbt uses this value instead of `NULL` for `dbt_valid_to` in current records.
Collaborator

again, rephrase as suggested above

- **Fix**: This update improves [dbt Semantic Layer Tableau integration](/docs/cloud-integrations/semantic-layer/tableau) making query parsing more reliable. Some key fixes include:
- Error messages for unsupported joins between saved queries and ALL tables.
- Improved handling of queries when multiple tables are selected in a data source.
@@ -50,7 +51,7 @@ Release notes are grouped by month for both multi-tenant and virtual private clo
- [Python SDK](https://docs.getdbt.com/docs/dbt-cloud-apis/sl-python) is now generally available

</Expandable>

- **Behavior change:** [Multi-factor authentication](/docs/cloud/manage-access/mfa) is now enforced on all users who log in with username and password credentials.
- **Enhancement**: The dbt Semantic Layer JDBC now allows users to paginate `semantic_layer.metrics()` and `semantic_layer.dimensions()` for metrics and dimensions using `page_size` and `page_number` parameters. Refer to [Paginate metadata calls](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) for more information.
- **Enhancement**: The dbt Semantic Layer JDBC now allows you to filter your metrics to include only those that contain a specific substring, using the `search` parameter. If no substring is provided, the query returns all metrics. Refer to [Fetch metrics by substring search](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) for more information.
102 changes: 102 additions & 0 deletions website/docs/reference/resource-configs/dbt_valid_to_current.md
@@ -0,0 +1,102 @@
---
resource_types: [snapshots]
description: "Snapshot dbt_valid_to_current custom date"
datatype: string
default_value: {NULL}
id: "dbt_valid_to_current"
---

Collaborator

I don't know where we want this, but we should probably have a migration callout/warning like what we have for the other new configs cc: @dbeatty10

Available in 1.9 or with [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud.

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: my_snapshot
    config:
      dbt_valid_to_current: "to_date('9999-12-31')"

```

</File>

<File name='snapshots/<filename>.sql'>

```sql
{{
    config(
        unique_key='id',
        strategy='timestamp',
        updated_at='updated_at',
        dbt_valid_to_current="to_date('9999-12-31')"
    )
}}
```

</File>

<File name='dbt_project.yml'>

```yml
snapshots:
  [<resource-path>](/reference/resource-configs/resource-path):
    +dbt_valid_to_current: "to_date('9999-12-31')"
```

</File>

## Description

Use the `dbt_valid_to_current` config to set a custom value, typically a far-future date, for `dbt_valid_to` in current snapshot records. When set, dbt uses this value instead of `NULL` for `dbt_valid_to` in current records.
Collaborator

again, rephrase as suggested above

thinking about this more, i do like that we're calling out that the main use case for this is "a future date"


This approach makes it easier to assign a custom date, work in a join, or perform range-based filtering that requires an end date.
Collaborator

"date date"


## Usage

- **Date expressions** &mdash; Provide a hardcoded date expression compatible with your data platform, such as `to_date('9999-12-31')`. Note that syntax may vary by warehouse (for example, `to_date('YYYY-MM-DD')` or `date(YYYY, MM, DD)`).

- **Jinja limitation** &mdash; `dbt_valid_to_current` only accepts static SQL expressions. Jinja expressions (like `{{ var('my_future_date') }}`) are not supported.
- **Deferral and `state:modified`** &mdash; Changes to `dbt_valid_to_current` are compatible with deferral and `--select state:modified`. When this configuration changes, it'll appear in `state:modified` selections, raising a warning to manually make the necessary snapshot updates.

## Default

By default, `dbt_valid_to` is set to `NULL` for current (most recent) records in your snapshot table. This means that these records are still valid and have no defined end date.

If you prefer to use a specific value instead of `NULL` for `dbt_valid_to` in current and future records, you can use the `dbt_valid_to_current` configuration option. For example, you can set a date in the far future, such as `9999-12-31`.

The value assigned to `dbt_valid_to_current` should be a string representing a valid date or timestamp, depending on your database's requirements. Use expressions that work within the data platform.

### Managing records
Collaborator

Is there a more specific descriptor here for the heading?

- **For existing records** &mdash; To avoid any unintentional data modification, dbt will _not_ automatically adjust the current value in the existing `dbt_valid_to` column. Existing current records will still have `dbt_valid_to` set to `NULL`.
Collaborator

Ah here's the migration callout - I think we just want to make sure this warning exists elsewhere as well!


- **For new records** &mdash; Any new records inserted after applying the `dbt_valid_to_current` configuration will have `dbt_valid_to` set to the specified value (for example, '9999-12-31'), instead of `NULL`.

This means your snapshot table will have current records with `dbt_valid_to` values of both `NULL` (from existing data) and the new specified value (from new data). If you'd rather have consistent `dbt_valid_to` values for current records, you can manually update existing records in your snapshot table where `dbt_valid_to` is `NULL` to match your `dbt_valid_to_current` value. Rebuilding the snapshot table would also make the values consistent, but it discards the history the snapshot has already captured, so treat it as a last resort.
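
The manual update described above could be sketched as follows. This is an assumption-laden example rather than an official migration script: the relation name `analytics.snapshots.my_snapshot` is hypothetical, the expression should match whatever you set in `dbt_valid_to_current`, and the date syntax may differ on your data platform. Consider backing up the snapshot table before running it.

```sql
-- Check how many current records would be affected before changing anything.
select count(*) as current_records_with_null
from analytics.snapshots.my_snapshot  -- hypothetical relation name
where dbt_valid_to is null;

-- Backfill existing current records so they match dbt_valid_to_current.
update analytics.snapshots.my_snapshot
set dbt_valid_to = to_date('9999-12-31')
where dbt_valid_to is null;
```

Run statements like these directly in your warehouse, outside of dbt, since dbt does not adjust existing `dbt_valid_to` values on its own.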
Collaborator

"rebuild your snapshot table" would be very risky (lose all of your historical data!)


## Example

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: my_snapshot
    config:
      strategy: timestamp
      updated_at: updated_at
      dbt_valid_to_current: "to_date('9999-12-31')"
    columns:
      - name: dbt_valid_from
        description: The timestamp when the record became valid.
      - name: dbt_valid_to
        description: >
          The timestamp when the record ceased to be valid. For current records,
          this is either `NULL` or the value specified in `dbt_valid_to_current`
          (like `'9999-12-31'`).
```

</File>

The resulting snapshot table contains the configured `dbt_valid_to` column value:

| id | dbt_scd_id | dbt_updated_at | dbt_valid_from | dbt_valid_to |
| -- | -------------------- | -------------------- | -------------------- | -------------------- |
| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 9999-12-31 ... |
| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | 9999-12-31 ... |

1 change: 1 addition & 0 deletions website/sidebars.js
@@ -978,6 +978,7 @@ const sidebarSettings = {
"reference/resource-configs/updated_at",
"reference/resource-configs/invalidate_hard_deletes",
"reference/resource-configs/snapshot_meta_column_names",
"reference/resource-configs/dbt_valid_to_current",
],
},
{