Feature: Ingest DBT Contract Information as a DataHub Data Contract #11927

@matthew-coudert-cko

Description

We (Checkout.com) have been using DataHub's data contract feature more heavily over the past few months, and have implemented a custom mapping between DBT contracts and DataHub data contracts. We propose adding this to the native DBT Core ingestion, with the following functionality:

  1. An enforced DBT contract prevents breaking changes (column removals or column type changes), so it is equivalent to a schema contract in DataHub.
  2. DBT tests carrying a configurable tag (default: contract) have their assertions added to the data contract.
  3. Optionally, DBT constraints that are enforced by the target data platform (e.g. not_null in Snowflake) could also be added to the contract as assertions that always pass.

Example DBT YAML:

- name: dbt_contract_test_view
  description: This view is used to test the data contract checks for the dbt models.
  config:
    contract:
      enforced: true # this adds a schema contract to the DataHub data contract
  columns:
    - name: urn
      data_type: text
      description: The urn of the object.
      data_tests:
        - unique:
            tags: ['contract'] # this test is included in the data contract
        - not_null # this test is not
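
To make the proposed mapping concrete, here is a minimal Python sketch of the selection logic applied to the example above. It is not the DataHub ingestion code: `ContractSketch`, `build_contracts`, and `CONTRACT_TAG` are hypothetical names, the `demo` project name is made up, and the input is a simplified dictionary shaped like the `nodes` section of a dbt `manifest.json` (only the fields used here are included). Point 3 (platform-enforced constraints) is left out.

```python
from dataclasses import dataclass, field

CONTRACT_TAG = "contract"  # hypothetical default; the proposal makes this configurable


@dataclass
class ContractSketch:
    """Hypothetical stand-in for a data contract; not a DataHub class."""
    model_id: str
    has_schema_contract: bool = False
    assertions: list = field(default_factory=list)  # (column, test name) pairs


def build_contracts(nodes: dict, tag: str = CONTRACT_TAG) -> dict:
    """Map models and their tagged tests to contract sketches."""
    contracts = {}
    # Point 1: an enforced dbt contract becomes a schema contract.
    for node_id, node in nodes.items():
        if node.get("resource_type") != "model":
            continue
        enforced = node.get("config", {}).get("contract", {}).get("enforced", False)
        contracts[node_id] = ContractSketch(node_id, has_schema_contract=bool(enforced))
    # Point 2: tests carrying the tag are attached to the models they depend on.
    for node in nodes.values():
        if node.get("resource_type") != "test" or tag not in node.get("tags", []):
            continue
        test_name = node.get("test_metadata", {}).get("name")
        for parent in node.get("depends_on", {}).get("nodes", []):
            if parent in contracts:
                contracts[parent].assertions.append((node.get("column_name"), test_name))
    # Models with neither a schema contract nor tagged tests get no contract.
    return {k: c for k, c in contracts.items() if c.has_schema_contract or c.assertions}


# Simplified manifest nodes corresponding to the YAML example above.
MODEL_ID = "model.demo.dbt_contract_test_view"
nodes = {
    MODEL_ID: {
        "resource_type": "model",
        "config": {"contract": {"enforced": True}},
    },
    "test.demo.unique_urn": {
        "resource_type": "test",
        "tags": ["contract"],
        "column_name": "urn",
        "test_metadata": {"name": "unique"},
        "depends_on": {"nodes": [MODEL_ID]},
    },
    "test.demo.not_null_urn": {
        "resource_type": "test",
        "tags": [],
        "column_name": "urn",
        "test_metadata": {"name": "not_null"},
        "depends_on": {"nodes": [MODEL_ID]},
    },
}

contracts = build_contracts(nodes)
print(contracts[MODEL_ID])
```

In this sketch the view ends up with a schema contract plus a single `unique` assertion on `urn`; the untagged `not_null` test is ignored, matching the comments in the YAML.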

We're happy to contribute this if there's appetite, and to put it behind a feature flag in the DBT ingestion config as well.
