Initial "wireframe" for model contracts #2890

Merged
merged 18 commits into from
Feb 27, 2023
41 changes: 33 additions & 8 deletions website/dbt-versions.js
@@ -1,4 +1,9 @@
exports.versions = [
{
version: "1.5",
EOLDate: "2024-04-26",
isPrerelease: true,
},
{
version: "1.4",
EOLDate: "2024-01-25",
@@ -22,6 +27,34 @@ exports.versions = [
]

exports.versionedPages = [
{
"page": "docs/collaborate/publish/model-contracts",
"firstVersion": "1.5",
},
{
"page": "docs/collaborate/publish/model-access",
"firstVersion": "1.5",
},
{
"page": "docs/collaborate/publish/model-versions",
"firstVersion": "1.5",
},
{
"page": "reference/resource-configs/contract",
"firstVersion": "1.5",
},
{
"page": "reference/resource-properties/constraints",
"firstVersion": "1.5",
},
{
"page": "reference/dbt-jinja-functions/local-md5",
"firstVersion": "1.4",
},
{
"page": "reference/warehouse-setups/fal-setup",
"firstVersion": "1.3",
},
Comment on lines +51 to +57
Collaborator Author


No functional change here, I just reordered the list so it's in a consistent (descending) order by firstVersion. I figure, in the future, we could remove items from this list as we deprecate older versions.

Contributor


Thank you!

{
"page": "reference/dbt-jinja-functions/set",
"firstVersion": "1.2",
@@ -50,12 +83,4 @@ exports.versionedPages = [
"page": "reference/dbt-jinja-functions/print",
"firstVersion": "1.1",
},
{
"page": "reference/dbt-jinja-functions/local-md5",
"firstVersion": "1.4",
},
{
"page": "reference/warehouse-setups/fal-setup",
"firstVersion": "1.3",
},
]
61 changes: 61 additions & 0 deletions website/docs/docs/collaborate/publish/model-access.md
@@ -0,0 +1,61 @@
---
title: "Model access"
---

:::info Beta functionality
This functionality is new in v1.5! These docs exist to provide a high-level overview of what's to come. Specific syntax is liable to change.

For more details, and to leave your feedback, check out the GitHub discussion:
* ["Model groups & access" (dbt-core#6730)](https://github.com/dbt-labs/dbt-core/discussions/6730)
:::

## Related documentation
* TK: `groups`
* TK: `access` modifiers

### Groups

Models can be grouped together under a common designation, with a shared owner.

Why define model `groups`?
- It turns implicit relationships into an explicit grouping
- It enables you to mark certain models as "private," for use _only_ within that group

### Access modifiers

Some models (not all of them) are designed to be shared across groups.

For background, see [access modifiers](https://en.wikipedia.org/wiki/Access_modifiers) in programming languages.

| Keyword | Meaning |
|-----------|----------------------|
| private | same group |
| protected | same project/package |
| public | everybody* |

By default, all models are "protected." This means that they can be referenced by other models in the same project.
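
The rules in the table above can be sketched in a few lines of Python. This is an illustration only, not how dbt implements access checks: the `Model` class and `can_reference` function are hypothetical names, and the caveat behind the asterisk on "everybody" is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a dbt model; not a dbt-core class."""
    name: str
    project: str
    group: Optional[str] = None
    access: str = "protected"  # the default described in the text above


def can_reference(consumer: Model, producer: Model) -> bool:
    """Return True if `consumer` may reference `producer` under the table's rules."""
    if producer.access == "public":
        return True  # everybody
    if producer.access == "protected":
        return consumer.project == producer.project  # same project/package
    if producer.access == "private":
        # same group; a model without a group cannot match anything
        return producer.group is not None and consumer.group == producer.group
    raise ValueError(f"unknown access level: {producer.access!r}")
```

Under these assumptions, a private model in the `cx` group is reachable only from other `cx` models, while a model with no explicit `access` is reachable from anywhere in its own project.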

:::info Under construction 🚧
More to come! The syntax below is suggestive only; it does not yet work.
:::

<File name="models/marts/customers.yml">

```yaml
groups:
  - name: cx
    owner:
      name: Customer Success Team
      email: cx@jaffle.shop

models:
  - name: dim_customers
    group: cx
    access: public
  # this is an intermediate transformation -- relevant to the CX team only
  - name: int__customer_history_rollup
    group: cx
    access: private
```

</File>
86 changes: 86 additions & 0 deletions website/docs/docs/collaborate/publish/model-contracts.md
@@ -0,0 +1,86 @@
---
title: "Model contracts"
---

:::info Beta functionality
This functionality is new in v1.5! These docs exist to provide a high-level overview of what's to come. Specific syntax is liable to change.

For more details, and to leave your feedback, check out the GitHub discussion:
* ["Model contracts" (dbt-core#6726)](https://github.com/dbt-labs/dbt-core/discussions/6726)
:::

## Related documentation
* [`contract`](resource-configs/contract)
* [`columns`](resource-properties/columns)
* [`constraints`](resource-properties/constraints)

## Why define a contract?

Defining a dbt model is as easy as writing a SQL `select` statement or a Python DataFrame transformation. Your query naturally produces a dataset whose column names and types follow from the columns you select and the transformations you apply.

This is great for quick, iterative development. For some models, though, constantly changing the shape of the returned dataset poses a risk when other people and processes are querying that model. It's better to define a set of **upfront guarantees** about the shape of your model. We call this set of guarantees a "contract." While building your model, dbt will verify that the model's transformation produces a dataset matching its contract, or it will fail to build.

## How to define a contract

Let's say you have a model with a query like:

<File name="models/marts/dim_customers.sql">

```sql
-- lots of SQL

final as (

    select
        -- lots of columns
    from ...

)

select * from final
```
</File>

Your contract **must** include every column's `name` and `data_type` (where `data_type` matches the type understood by your data platform). If your model is being materialized as `table` or `incremental`, you may optionally specify that certain columns must be `not_null` (i.e. contain zero null values). Depending on your data platform, you may also be able to define additional `constraints` that are enforced while the model is being built.

Finally, you configure your model with `contract: true`.

<File name="models/marts/customers.yml">

```yaml
models:
  - name: dim_customers
    config:
      contract: true
    columns:
      - name: customer_id
        data_type: int
        not_null: true
      - name: customer_name
        data_type: string
      ...
```

</File>

When building a model with a defined contract, dbt will do two things differently:
1. dbt will run a "pre-flight" check to ensure that the model's query will return a set of columns with names and data types matching the ones you have defined.
2. dbt will pass the column names, types, `not_null` and other constraints into the DDL statements it submits to the data platform, where they will be enforced while building the table.
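
The "pre-flight" check in step 1 can be pictured as a comparison between two column-to-type mappings. The sketch below is a simplified illustration, not dbt's implementation: `preflight_check` is a hypothetical function, types are compared as plain strings, and column order is ignored. How dbt actually normalizes or compares platform-specific types is not specified here.

```python
from typing import Dict, List


def preflight_check(contract: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Compare contracted column names/types with those the query would return.

    Returns a list of human-readable mismatches; an empty list means the
    query's shape matches the contract.
    """
    errors: List[str] = []
    for name, expected_type in contract.items():
        if name not in actual:
            errors.append(f"missing column: {name}")
        elif actual[name] != expected_type:
            errors.append(
                f"type mismatch for {name}: expected {expected_type}, got {actual[name]}"
            )
    # The contract must list every column, so extras count as mismatches too.
    for name in actual:
        if name not in contract:
            errors.append(f"unexpected column: {name}")
    return errors
```

In this sketch, a non-empty result corresponds to the case where the model fails to build.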

## FAQs

### Which models should have contracts?

Any model can define a contract. It's especially important to define contracts for "public" models that are being shared with other groups, teams, and (soon) dbt projects.

### How are contracts different from tests?

A model's contract defines the **shape** of the returned dataset.

[Tests](tests) are a more flexible mechanism for validating the content of your model. So long as you can write the query, you can run the test. Tests are also more configurable, via `severity` and custom thresholds, and easier to debug after finding failures, because the model has already been built, and the relevant records can be materialized in the data warehouse by [storing failures](resource-configs/store_failures).

In blue/green deployments (docs link TK), ... <!-- TODO write more here -->

To draw the parallel with software APIs:
- The structure of the API response is the contract
- Quality and reliability ("uptime") are also **crucial**, but not part of the contract per se.
28 changes: 28 additions & 0 deletions website/docs/docs/collaborate/publish/model-versions.md
@@ -0,0 +1,28 @@
---
title: "Model versions"
---

:::info Beta functionality
This functionality is new in v1.5! These docs exist to provide a high-level overview of what's to come. Specific syntax is liable to change.

For more details, and to leave your feedback, check out the GitHub discussion:
* ["Model versions" (dbt-core#6736)](https://github.com/dbt-labs/dbt-core/discussions/6736)
:::

API versioning is **a hard problem** in software engineering. It's also very important. Our goal is to _make a hard thing possible_.

## Related documentation
* TK: `version` & `latest` (_not_ [this one](project-configs/version))
* TK: `deprecation_date`

## Why version a model?

If a model defines a ["contract"](model-contracts) (a set of guarantees for its structure), it's also possible to change that model's contract in a way that "breaks" the previous set of guarantees.

One approach is to force every consumer of the model to immediately handle the breaking change, as soon as it's deployed to production. While this may work at smaller organizations, or while iterating on an immature set of data models, it doesn't scale much beyond that.

Instead, the owner of the model can create a **new version**, and provide a **deprecation window**, during which consumers can migrate from the old version to the new.

In the meantime, anywhere that model is used downstream, it can be referenced at a specific version.

As a model approaches its deprecation date, consumers of that model will be notified. Once the date is reached, the model goes away.
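
The deprecation window can be thought of as a simple date comparison. The following sketch is purely illustrative: `version_status`, the status labels, and the 30-day warning period are assumptions made for the example, not dbt syntax or behavior.

```python
from datetime import date, timedelta


def version_status(today: date, deprecation_date: date, warn_days: int = 30) -> str:
    """Classify a model version relative to its deprecation date.

    `warn_days` is a hypothetical notice period chosen for this example.
    """
    if today >= deprecation_date:
        return "removed"      # the date has been reached
    if deprecation_date - today <= timedelta(days=warn_days):
        return "deprecating"  # inside the notice period: warn consumers
    return "active"
```

A consumer pinned to an older version would see "deprecating" during the notice period, which is the cue to migrate to the new version.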
57 changes: 57 additions & 0 deletions website/docs/guides/migration/versions/02-upgrading-to-v1.5.md
@@ -0,0 +1,57 @@
---
title: "Trying v1.5 (prerelease)"
description: New features and changes in dbt Core v1.5
---

:::info
v1.5 is currently available as a **beta prerelease.** Availability in dbt Cloud coming soon!
:::

### Resources

- [Changelog](https://github.com/dbt-labs/dbt-core/blob/main/CHANGELOG.md)
- [CLI Installation guide](/docs/get-started/installation)
- [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud)
- [Release schedule](https://github.com/dbt-labs/dbt-core/issues/6715)

**Planned final release:** April 26, 2023

dbt Core v1.5 is a feature release, with two big additions planned:
1. **"Models as APIs,"** the first phase of [multi-project deployments](https://github.com/dbt-labs/dbt-core/discussions/6725)
2. An initial **Python API for dbt-core,** supporting programmatic invocations at parity with the CLI

## What to know before upgrading

dbt Labs is committed to providing backward compatibility for all versions 1.x, with the exception of any changes explicitly mentioned below. If you encounter an error upon upgrading, please let us know by [opening an issue](https://github.com/dbt-labs/dbt-core/issues/new).

### Breaking changes

As part of our refactor of `dbt-core` internals, we need to make some **very precise** changes to runtime configuration. The net result of these changes is more sensible configuration options, clearer documentation, cleaner APIs, and a more legible codebase.

Wherever possible, we will aim to provide backwards compatibility and deprecation warnings for at least one minor version, before actually removing the old functionality. In those cases, we still reserve the right to fully remove the backward-compatible functionality in a future v1.x minor version of `dbt-core`.

Changes planned for v1.5:
- Renaming ["global configs"](global-configs) for consistency ([dbt-core#6903](https://github.com/dbt-labs/dbt-core/issues/6903))
- Moving `log-path` and `target-path` out of `dbt_project.yml`, for consistency with other global configs ([dbt-core#6882](https://github.com/dbt-labs/dbt-core/issues/6882))

### For consumers of dbt artifacts (metadata)

The manifest schema version will be updated to `v9`. Specific changes to be noted here.

### For maintainers of adapter plugins

Forthcoming: GH discussion detailing interface changes, and offering a forum for Q&A

## New and changed documentation

:::caution Under construction 🚧
More to come!
:::

### Publishing models as APIs
- [Model contracts](model-contracts) ([#2839](https://github.com/dbt-labs/docs.getdbt.com/issues/2839))
- [Model access](model-access) ([#2840](https://github.com/dbt-labs/docs.getdbt.com/issues/2840))
- [Model versions](model-versions) ([#2841](https://github.com/dbt-labs/docs.getdbt.com/issues/2841))

### dbt-core Python API
- Auto-generated documentation ([#2674](https://github.com/dbt-labs/docs.getdbt.com/issues/2674)) for dbt-core CLI & Python API for programmatic invocations
7 changes: 5 additions & 2 deletions website/docs/reference/model-configs.md
@@ -105,6 +105,7 @@ models:
[+](plus-prefix)[full_refresh](full_refresh): <boolean>
[+](plus-prefix)[meta](meta): {<dictionary>}
[+](plus-prefix)[grants](grants): {<dictionary>}
[+](plus-prefix)[constraints_enabled](constraints_enabled): true | false

```

@@ -134,6 +135,7 @@ models:
[full_refresh](full_refresh): <boolean>
[meta](meta): {<dictionary>}
[grants](grants): {<dictionary>}
[constraints_enabled](constraints_enabled): true | false
```

</File>
@@ -157,8 +159,9 @@ models:
[schema](resource-configs/schema)="<string>",
[alias](resource-configs/alias)="<string>",
[persist_docs](persist_docs)={<dict>},
[meta](meta)={<dict>}
[grants](grants)={<dict>}
[meta](meta)={<dict>},
[grants](grants)={<dict>},
[constraints_enabled](constraints_enabled)=true | false
) }}

```