ID Stitching dbt Package

This dbt package stitches together identifiers in an ID graph table.

Overview

The primary output of this package is the id_graph model. A few intermediate models are used to build it.

Model	Description
queries	Generates select statements which pull IDs from your tables.
edges	Combines the results of those select statements into a table of edges (IDs) the first time it is run, and matches edges on subsequent runs.
check_edges	Determines whether any edges remain to be matched.
id_graph	Creates an ID graph table.

Installation

Check dbt Hub for the latest installation instructions, or read the docs for more information on installing packages.

Configuration

Set ID columns and IDs to exclude in dbt_project.yml:

vars:
  id-columns: ('anonymous_id', 'user_id', 'email')
  ids-to-exclude: ('sources','user@company.com')

This package searches your data warehouse for tables that include multiple columns defined in id-columns. Any IDs defined in ids-to-exclude are disregarded.

Usage

The edges model must be run enough times to match all edges (IDs). Five or six passes are usually sufficient. The check_edges model shows 0 when all edges have been matched. Edit your job commands (dbt Cloud) or the run.sh script (dbt CLI) to run the edges model as many times as necessary.

dbt Cloud

Create a job with the following commands:

dbt run --full-refresh --select queries edges
dbt run --select edges
dbt run --select edges
dbt run --select edges
dbt run --select edges
dbt run --select edges check_edges id_graph

dbt CLI

Run the included run.sh shell script:

./run.sh

Additional instrumentation can be created to evaluate the check_edges model and determine programmatically whether the edges model needs to be run again.
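One way such instrumentation might look is sketched below. It is not the included run.sh and not part of this package: stitch_edges and count_unmatched_edges are hypothetical names, and how you read the value of check_edges depends on your warehouse, so that helper is left as a placeholder to fill in. The sketch reruns the edges model until the check reports 0 or a pass limit is reached, then builds id_graph.

```shell
#!/bin/sh
# Sketch only: rerun the edges model until check_edges reports 0.

# HYPOTHETICAL helper, not provided by the package. Replace the body with a
# command that prints the number shown by the check_edges model, for example
# a query run through your warehouse's command-line client.
count_unmatched_edges() {
  echo "count_unmatched_edges is not implemented" >&2
  return 1
}

# Usage: stitch_edges [max_passes]   (defaults to 6 passes)
stitch_edges() {
  stitch_max="${1:-6}"
  stitch_pass=1
  stitch_left=1

  # Pass 1: rebuild queries and edges from scratch.
  dbt run --full-refresh --select queries edges || return 1

  # Later passes: match edges again, then refresh and read the check.
  while [ "$stitch_pass" -lt "$stitch_max" ]; do
    stitch_pass=$((stitch_pass + 1))
    dbt run --select edges check_edges || return 1
    stitch_left="$(count_unmatched_edges)" || return 1
    if [ "$stitch_left" -eq 0 ]; then
      break
    fi
  done

  if [ "$stitch_left" -ne 0 ]; then
    echo "warning: edges still unmatched after $stitch_max passes" >&2
  fi

  # Build the final ID graph from the current state of edges.
  dbt run --select id_graph
}
```

After sourcing the file and filling in count_unmatched_edges, calling stitch_edges 8 would allow up to eight passes. The pass limit is kept so that a check that never reaches 0 cannot loop forever.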

License

MIT
