Rework sources to be dbt models rather than manually created #188

jaypeedevlin · 2022-08-31T02:24:35Z

Closes #153, #190 and unblocks #175, #177

This PR replaces our manual schema-generation macros with incremental models that use the on_schema_change configuration, which is designed to make schema evolution less painful.
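As a rough illustration of the approach described above, a source table can be written as an empty incremental model whose only job is to declare columns and types. The sketch below is an assumption about the general shape of such a model, not the exact file in this PR: the column list is abbreviated, the on_schema_change value is just one of dbt's documented options (ignore, fail, append_new_columns, sync_all_columns), and the dummy CTE reflects the "Bigquery where/from workaround" commit in this branch.

```sql
{# Hypothetical sketch of models/sources/sources.sql; columns abbreviated. #}
{{
    config(
        materialized='incremental',
        on_schema_change='append_new_columns'
    )
}}

/* BigQuery rejects `where` without `from`, so select from a dummy CTE. */
with dummy_cte as (
    select 1 as foo
)

/* The model returns zero rows; it exists only to define the table's schema. */
select
    cast(null as {{ type_string() }}) as command_invocation_id,
    cast(null as {{ type_string() }}) as node_id,
    cast(null as {{ type_timestamp() }}) as run_started_at,
    cast(null as {{ type_string() }}) as source_name
from dummy_cte
where 1 = 0
```

Under this setup, a column added to the select list would be appended to the existing table on the next run rather than requiring a hand-written alter table statement.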

Of note, it bumps the require-dbt-version to >= 1.2.0.

Additionally, this removes the dependency on dbt_utils by combining the new cross-database macros in dbt Core >= 1.2.0 with a copied version of the surrogate_key macro.
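To make the dependency change concrete, here is a minimal before/after sketch. It assumes the type macros are called through the dbt namespace that dbt Core 1.2.0 introduced for cross-database macros, and that the copied surrogate_key macro is called through this package's own namespace; the column names passed to surrogate_key are illustrative only.

```sql
{# Before: both macros come from the dbt_utils package. #}
select
    cast(null as {{ dbt_utils.type_string() }}) as name,
    {{ dbt_utils.surrogate_key(['command_invocation_id', 'node_id']) }} as example_key

{# After: type macro from dbt Core >= 1.2.0, surrogate_key copied into this package. #}
select
    cast(null as {{ dbt.type_string() }}) as name,
    {{ dbt_artifacts.surrogate_key(['command_invocation_id', 'node_id']) }} as example_key
```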

NiallRees · 2022-09-06T10:16:38Z

models/sources/sources.sql

+    cast(null as {{ type_string() }}) as name,
+    cast(null as {{ type_string() }}) as identifier,
+    cast(null as {{ type_string() }}) as loaded_at_field,
+    cast(null as {{ type_json() }}) as freshness


The freshness column was previously an ARRAY type on Snowflake, which currently leads to an error when I install this branch into a project that already uses dbt_artifacts and run:

dbt run -m dbt_artifacts

10:11:48 Database Error in model sources (models/sources/sources.sql)
10:11:48 002023 (22000): SQL compilation error:
10:11:48 Expression type does not match column data type, expecting ARRAY but got OBJECT for column FRESHNESS
10:11:48 compiled SQL at target/run/dbt_artifacts/models/sources/sources.sql

Hmm, I think this will be tricky to solve in a way that's compatible with both Snowflake and BigQuery, given that one uses ARRAY and the other JSON.

dbt_artifacts/macros/create_sources_table_if_not_exists.sql

Lines 36 to 66 in 9146d71

{% macro snowflake__get_create_sources_table_if_not_exists_statement(database_name, schema_name, table_name) -%}
create table {{database_name}}.{{schema_name}}.{{table_name}} (
    command_invocation_id STRING,
    node_id STRING,
    run_started_at TIMESTAMP_TZ,
    database STRING,
    schema STRING,
    source_name STRING,
    loader STRING,
    name STRING,
    identifier STRING,
    loaded_at_field STRING,
    freshness ARRAY
)
{%- endmacro %}

{% macro bigquery__get_create_sources_table_if_not_exists_statement(database_name, schema_name, table_name) -%}
create table {{database_name}}.{{schema_name}}.{{table_name}} (
    command_invocation_id STRING,
    node_id STRING,
    run_started_at TIMESTAMP,
    database STRING,
    schema STRING,
    source_name STRING,
    loader STRING,
    name STRING,
    identifier STRING,
    loaded_at_field STRING,
    freshness JSON
)
{%- endmacro %}

One option would be to create a specific type helper for just this column, but that seems a bit suboptimal IMO. Do you have any ideas @NiallRees?

Maybe we could offer a one-time migration macro for the source tables and cut a new major version with this new method?

I think I'd be inclined to use a column-specific type helper (just an if-Snowflake/else branch in the model rather than a macro) instead of adding more complexity to the migration process.
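The if-Snowflake/else idea suggested here could look roughly like the fragment below. This is a sketch under assumptions rather than the change that was eventually committed: it branches on dbt's target.type so that existing Snowflake tables keep their ARRAY column, and falls back to the type_json() helper from the diff above for every other adapter.

```sql
{# Sketch only: fragment of the model's select list, not a complete model. #}
select
    cast(null as {{ type_string() }}) as loaded_at_field,
    {% if target.type == 'snowflake' %}
    /* Existing Snowflake tables were created with freshness as ARRAY. */
    cast(null as array) as freshness
    {% else %}
    /* Other adapters use whatever the type_json() helper resolves to. */
    cast(null as {{ type_json() }}) as freshness
    {% endif %}
```

Keeping the branch inline in the model avoids a dedicated macro for a single column, at the cost of a little adapter-specific logic in the model file.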

@NiallRees done and ready for another test.

NiallRees

Tested successfully on Snowflake, BigQuery and Databricks when tables already existed.

Rework sources to be dbt models rather than manually created

3dd4f60

jaypeedevlin requested a review from NiallRees August 31, 2022 02:24

Bigquery where/from workaround

e3631ec

Add missing BQ column

6259c3d

Fix BQ array type

d6df399

Another JSON column

eee7262

jaypeedevlin mentioned this pull request Aug 31, 2022: [CT-1110] [Feature] Cross-database macro for type_boolean() dbt-labs/dbt-core#5739 (Closed, 3 tasks)

NiallRees reviewed Aug 31, 2022: README.md (outdated)

Merge branch 'main' into JD/rework_sources

9f6af83

NiallRees reviewed Sep 2, 2022: macros/surrogate_key.sql (outdated)

Update macros/surrogate_key.sql

79606a0

jaypeedevlin mentioned this pull request Sep 4, 2022: sql server compatibility #194 (Open)

NiallRees reviewed Sep 6, 2022: README.md (outdated)

Update README.md

dfa8b26

NiallRees reviewed Sep 6, 2022

Adjust source freshness type for Snowflake

f57ed7a

NiallRees approved these changes Sep 8, 2022

jaypeedevlin merged commit f9fe8ec into main Sep 13, 2022

jaypeedevlin deleted the JD/rework_sources branch September 13, 2022 03:02

This was referenced Sep 13, 2022:
Provide source data as JSON blob #199 (Closed)
Dynamic database and schema targets #201 (Closed)

patkearns10 mentioned this pull request Oct 1, 2022: add more fields to tests #213 (Closed)
