Fix expect_row_values_to_have_data_for_every_n_datepart errors when both start and end dates are set #115

jeremyyeo · 2021-10-06T02:17:15Z

Hey @clausherther, just implementing @barberscott's fix for #113 here.

Tested with the following:

model / schema

-- dim_dates.sql
SELECT *
  FROM (VALUES ('2020-01-01'::DATE), ('2020-01-02'::DATE), ('2020-01-03'::DATE), ('2020-01-04'::DATE))
    AS my_table(date_at)

# dim_dates.yml
version: 2
models:
  - name: dim_dates
    columns:
      - name: date_at
        tests:
          - accepted_values:
              values: ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"]
    tests:
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          # Expect this to PASS.
          date_col: date_at
          date_part: day
          test_start_date: "2020-01-01"
          test_end_date: "2020-01-04"
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          # Expect this to PASS.
          date_col: date_at
          date_part: day
          test_end_date: "2020-01-04"
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          # Expect this to FAIL.
          date_col: date_at
          date_part: day
          test_start_date: "2019-12-01"
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          # Expect this to FAIL.
          date_col: date_at
          date_part: day
          test_end_date: "2020-12-01"
      - dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart:
          # Expect this to FAIL.
          date_col: date_at
          date_part: day
          test_start_date: "2020-01-01"
          test_end_date: "2020-01-06"

dbt test -m dim_dates output

Running with dbt=0.20.2
Found 4 models, 14 tests, 0 snapshots, 0 analyses, 540 macros, 0 operations, 0 seed files, 0 sources, 0 exposures

15:08:39 | Concurrency: 1 threads (target='dev')
15:08:39 | 
15:08:39 | 1 of 6 START test accepted_values_dim_dates_date_at__2020_01_01__2020_01_02__2020_01_03__2020_01_04 [RUN]
15:08:42 | 1 of 6 PASS accepted_values_dim_dates_date_at__2020_01_01__2020_01_02__2020_01_03__2020_01_04 [PASS in 3.26s]
15:08:42 | 2 of 6 START test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2019_12_01 [RUN]
15:08:47 | 2 of 6 FAIL 31 dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2019_12_01 [FAIL 31 in 4.69s]
15:08:47 | 3 of 6 START test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_04 [RUN]
15:08:51 | 3 of 6 PASS dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_04 [PASS in 4.35s]
15:08:51 | 4 of 6 START test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_04__2020_01_01 [RUN]
15:08:55 | 4 of 6 PASS dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_04__2020_01_01 [PASS in 3.89s]
15:08:55 | 5 of 6 START test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_06__2020_01_01 [RUN]
15:08:59 | 5 of 6 FAIL 1 dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_06__2020_01_01 [FAIL 1 in 3.82s]
15:08:59 | 6 of 6 START test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_12_01 [RUN]
15:09:04 | 6 of 6 FAIL 331 dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_12_01 [FAIL 331 in 4.60s]
15:09:04 | 
15:09:04 | Finished running 6 tests in 30.05s.

Completed with 3 errors and 0 warnings:

Failure in test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2019_12_01 (models/testing_dbt_expectations/dim_dates.yml)
  Got 31 results, configured to fail if != 0

  compiled SQL at target/compiled/snowflake/models/testing_dbt_expectations/dim_dates.yml/schema_test/dbt_expectations_expect_row_va_983cdf96730122e510782406c066c373.sql

Failure in test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_01_06__2020_01_01 (models/testing_dbt_expectations/dim_dates.yml)
  Got 1 result, configured to fail if != 0

  compiled SQL at target/compiled/snowflake/models/testing_dbt_expectations/dim_dates.yml/schema_test/dbt_expectations_expect_row_va_bdb191a0149f9b0d3c08c5c8db6ab40d.sql

Failure in test dbt_expectations_expect_row_values_to_have_data_for_every_n_datepart_dim_dates_date_at__day__2020_12_01 (models/testing_dbt_expectations/dim_dates.yml)
  Got 331 results, configured to fail if != 0

  compiled SQL at target/compiled/snowflake/models/testing_dbt_expectations/dim_dates.yml/schema_test/dbt_expectations_expect_row_va_4cd32499e35de4d43a0427496f6139d9.sql

Done. PASS=3 WARN=0 ERROR=3 SKIP=0 TOTAL=6
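
For anyone checking the failure counts above by hand, here is a rough Python sketch that reproduces them. It is not the macro's implementation. It assumes the test compares the model against a daily date spine whose end date is exclusive, and that a missing bound falls back to the min/max of `date_at`; both assumptions are inferred from the counts in the log, and `missing_days` is a made-up helper name.

```python
from datetime import date, timedelta


def missing_days(present, start=None, end=None):
    """Return the days in [start, end) that have no row in `present`.

    Assumptions (inferred from the log, not from the macro source):
    the end bound is exclusive, and an unset bound falls back to the
    min/max of the data.
    """
    start = start or min(present)
    end = end or max(present)
    n_days = (end - start).days
    # Build the daily spine, excluding the end date itself.
    spine = [start + timedelta(days=i) for i in range(n_days)]
    have = set(present)
    return [d for d in spine if d not in have]


# The four rows of dim_dates.sql.
rows = [date(2020, 1, d) for d in (1, 2, 3, 4)]

both_pass = missing_days(rows, start=date(2020, 1, 1), end=date(2020, 1, 4))
end_only_pass = missing_days(rows, end=date(2020, 1, 4))
start_only_fail = missing_days(rows, start=date(2019, 12, 1))
end_only_fail = missing_days(rows, end=date(2020, 12, 1))
both_fail = missing_days(rows, start=date(2020, 1, 1), end=date(2020, 1, 6))

print(len(both_pass), len(end_only_pass))  # 0 0   (the two PASS cases)
print(len(start_only_fail))                # 31    (all of December 2019)
print(len(end_only_fail))                  # 331   (2020-01-05 .. 2020-11-30)
print(len(both_fail))                      # 1     (2020-01-05)
```

The exclusive end bound is what makes the `2020-12-01` case come out at 331 rather than 332, and the `2020-01-06` case at 1 rather than 2.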

clausherther

Thanks! Hoping to cut a release this weekend, pending a related PR.

clausherther · 2021-10-08T16:04:49Z

README.md

@@ -914,7 +914,13 @@ tests:

 Expects model to have values for every grouped `date_part`.

-For example, this tests whether a model has data for every `day` (grouped on `date_col`) from either a specified `start_date` and `end_date`, or for the `min`/`max` value of the specified `date_col`.
+For example, this tests whether a model has data for every `day` (grouped on `date_col`) between either:


@jeremyyeo fyi, just updated the README a bit. Thanks for adding the extra documentation!

clausherther · 2021-10-08T16:05:01Z

dbt_project.yml

@@ -5,7 +5,7 @@
 name: 'dbt_expectations'
 version: '0.4.0'

-require-dbt-version: [">=0.20.0", "<0.21.0"]
+require-dbt-version: [">=0.20.0", "<0.22.0"]


FYI, rebased to support 0.21
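
A small Python sketch of what the widened range admits. This only compares numeric `major.minor.patch` tuples; dbt's own version matching follows semantic versioning and also handles pre-release tags, which this ignores. `parse` and `satisfies` are hypothetical helper names, not dbt functions.

```python
import operator

# Only the two operators used in the diff above are supported here.
OPS = {">=": operator.ge, "<": operator.lt}


def parse(version):
    """Turn '0.21.0' into (0, 21, 0) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, specs):
    """True if `version` meets every spec such as '>=0.20.0' or '<0.22.0'."""
    for spec in specs:
        # Check '>=' before '<' so the longer prefix wins.
        op = ">=" if spec.startswith(">=") else "<"
        if not OPS[op](parse(version), parse(spec[len(op):])):
            return False
    return True


old_range = [">=0.20.0", "<0.21.0"]
new_range = [">=0.20.0", "<0.22.0"]

print(satisfies("0.20.2", old_range), satisfies("0.20.2", new_range))  # True True
print(satisfies("0.21.0", old_range), satisfies("0.21.0", new_range))  # False True
print(satisfies("0.22.0", new_range))                                  # False
```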

clausherther · 2021-10-08T16:05:14Z

macros/schema_tests/distributional/expect_row_values_to_have_data_for_every_n_datepart.sql

-{%- set dr = run_query(sql) -%}
-{%- set db_start_date = dr.columns[0].values()[0].strftime('%Y-%m-%d') -%}
-{%- set db_end_date = dr.columns[1].values()[0].strftime('%Y-%m-%d') -%}
+{% endif %}


* add `interval` argument for checking presence every n-date_parts instead of every date_part
* update docs for explaining new `interval` arg [expect_row_values_to_have_data_for_every_n_datepart](https://github.com/calogica/dbt-expectations/tree/0.4.2#expect_row_values_to_have_data_for_every_n_datepart)
* expect_table_columns_to_match_ordered_list: refactor row_number to use loop.index (#112)
* Fixes #111 - refactor row_number to use loop.index
* Update CHANGELOG
* handle data types for `mod` and incorporate windowing
  This test will handle the mod function, which only takes integer arguments, more stably. It also aggregates row counts across intervals when joining on the date spine to correctly detect data presence in the target model
  update conditions based on interval
  update styling
  update styling
* Add support for dbt 0.21 (#116)
* Update README.md
* remove unintentional styling changes
* Fix expect_row_values_to_have_data_for_every_n_datepart errors when both start and end dates are set (#115)
* fix none when both test dates are set
* Add support for dbt 0.21 (#116)
* Update README.md
* fix none when both test dates are set
* Update README
  Co-authored-by: Claus Herther <claus@calogica.com>
* add `interval` argument for checking presence every n-date_parts instead of every date_part
* update docs for explaining new `interval` arg [expect_row_values_to_have_data_for_every_n_datepart](https://github.com/calogica/dbt-expectations/tree/0.4.2#expect_row_values_to_have_data_for_every_n_datepart)
* handle data types for `mod` and incorporate windowing
  This test will handle the mod function, which only takes integer arguments, more stably. It also aggregates row counts across intervals when joining on the date spine to correctly detect data presence in the target model
  update conditions based on interval
  update styling
  update styling
* remove unintentional styling changes
* Change datepart param and fix formatting
* Reformat join to match prior style
* simplify tie-out of model data to spine with interval truncation
  the condition added to the model_data CTE is meant to emulate (kind of) Snowflake's [`TIME_SLICE`](https://docs.snowflake.com/en/sql-reference/functions/time_slice.html), which should allow exact matches to the base_dates CTE for better time bucketing
* Add schema test
* replace calls to subquery with calls directly to columns in `model_data` CTE
* add comments/examples for new interval additions
  Co-authored-by: Claus Herther <claus@calogica.com>
  Co-authored-by: Jeremy Yeo <jeremyyeo@users.noreply.github.com>

jeremyyeo and others added 3 commits October 6, 2021 14:55

fix none when both test dates are set

bf094a5

Add support for dbt 0.21 (calogica#116)

2028910

Update README.md

6114264

clausherther self-requested a review October 8, 2021 15:50

jeremyyeo and others added 3 commits October 8, 2021 08:52

fix none when both test dates are set

b957036

Update README

b74f840

merge README

f3eae90

clausherther approved these changes Oct 8, 2021

clausherther merged commit b3c5231 into calogica:main Oct 8, 2021

This was referenced Oct 8, 2021

expect_row_values_to_have_data_for_every_n_datepart fails when both a start and end date are passed #113

Closed

Feature/add interval arg to values every n datepart #110

Merged
