[Bug] Unit Test fails when using a versioned model as an input #10528

kbrock91 · 2024-08-06T14:22:04Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Unit tests fail if an input reference is a versioned model and the version is not explicitly defined in both the model AND the unit test.

Expected Behavior

The unit test should default to the latest version specified in the yml config for the input versioned model.

Steps To Reproduce

In the dbt Cloud IDE, create a versioned model (e.g. stg_tpch_orders, with versions 1 and 2), a second model (e.g. customer_tier) that references the versioned model, and a unit test on customer_tier that uses the versioned model as an input.
Run dbt build --select customer_tier

If the customer_tier model does not reference a specific version, the unit test always fails, whether or not the unit test input specifies a version. The code below covers this scenario.
If the customer_tier model does reference a specific version, the unit test fails when the unit test input does not specify a version. The unit test does pass if you explicitly specify ref('stg_tpch_orders', v=2) in both places.
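
For illustration, a minimal sketch of the workaround from the second case, assuming the pinned version is 2 as in the text above. Only the orders input is shown; the model-side change is given as a comment because it lives in customer_tier.sql rather than in the yml file.

```yaml
# Workaround sketch: pin the same version in the model and in the unit test.
# In customer_tier.sql, the orders CTE would read:
#   select * from {{ ref('stg_tpch_orders', v=2) }}
unit_tests:
  - name: tiers_are_working
    model: customer_tier
    given:
      - input: ref('stg_tpch_orders', v=2)  # must match the ref used in the model
        format: dict
        rows:
          - {customer_key: 629, total_price: 163443}
    # the stg_tpch_customers input and the expect block stay as in the full test
```

This only sidesteps the error; it does not give the expected behavior of defaulting to latest_version when no version is specified.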

Code for reference

schema.yml for stg_tpch_orders

  - name: stg_tpch_orders
    description: staging layer for orders data
    versions:
      - v: 1
        columns:
          - include: all
            exclude: [comment]
        deprecation_date: 2024-8-31 00:00:00.00+00:00
      - v: 2
        columns:
          - include: all
    latest_version: 2

customer_tier.sql

{{
    config(
        materialized='table'
    )
}}

with customer as (
    select * from {{ ref('stg_tpch_customers') }}
),

orders as (
    select * from {{ ref('stg_tpch_orders') }}
),
final as (
    select
        customer.customer_key,
        sum(orders.total_price) as lifetime_value,
        case 
            when lifetime_value <= 200000 then 'tier1'
            when lifetime_value > 2000000 then 'tier2'
            when lifetime_value between 1000000 and 1999999 then 'tier3'
            when lifetime_value between 0 and 999999 then 'tier4' 
        end as tier_name, 
        max(orders.comment) as comment
    from customer
        inner join orders
            on customer.customer_key = orders.customer_key
    group by 1
)

select * from final

unit test:

unit_tests:
  - name: tiers_are_working
    description: "check if the logic for tiering is working correctly"
    model: customer_tier
    given:
      - input: ref('stg_tpch_customers')
        format: dict
        rows:
          - {customer_key: 629}
          - {customer_key: 4}
          - {customer_key: 1}
          - {customer_key: 26}

      - input: ref('stg_tpch_orders') 
        format: dict
        rows:
          - {customer_key: 629, total_price: 163443}
          - {customer_key: 4, total_price: 4134568}
          - {customer_key: 1,  total_price: 1428872}
          - {customer_key: 26, total_price: 418512}

    expect:
      rows:
        - {customer_key: 629,  tier_name: tier1}
        - {customer_key: 4,    tier_name: tier2}
        - {customer_key: 1,    tier_name: tier3}
        - {customer_key: 26,   tier_name: tier4}

Relevant log output

14:20:52 Compilation Error in unit_test tiers_are_working (models/marts/intermediate/intermediate.yml)
  Unit_Test 'unit_test.analytics.customer_tier.tiers_are_working' (models/marts/intermediate/intermediate.yml) depends on a node named 'stg_tpch_orders' which was not found

Environment

- OS:
- Python:
- dbt: Latest Version in dbt Cloud IDE

Which database adapter are you using with dbt?

No response

Additional Context

No response


dbeatty10 · 2024-08-09T13:23:51Z

EDIT: ignore the comment below; it worked for me when I tried it again today (2024-08-29).

With dbt-core 1.8.3, even specifying it in both places didn't work for me. See below for details.

models/fct_orders.sql

select *
from {{ ref('stg_orders', v=2) }}

models/_unit_tests.yml

unit_tests:
  - name: test_10528
    model: fct_orders
    given:
      - input: ref('stg_orders', v=2) 
        format: dict
        rows:
          - {id: 2}
    expect:
        rows:
          - {id: 2}

models/_models.yml

models:
  - name: stg_orders
    versions:
      - v: 1
        columns:
          - include: all
            exclude: [added_column]
      - v: 2
        columns:
          - include: all
    latest_version: 2

models/stg_orders_v1.sql

select 1 as id

models/stg_orders_v2.sql

select 1 as id, 2 as added_column

dbeatty10 · 2024-08-29T18:35:40Z

The reprex below has three cases:

❌ test_10528_a: model does not reference a specific version
❌ test_10528_b: model does reference a specific version but it is not the latest version
✅ test_10528_c: model does reference a specific version and it is the latest version

The first is a simplification of the example given in this bug report.
The second was described in this bug report and also reported in #10623.
The third works fine without issues. This scenario still worked when a prerelease version of the model was added to the project.

Reprex

models/_unit_tests.yml

unit_tests:

  # ❌
  - name: test_10528_a
    description: model **does not** reference a specific version
    model: fct_orders_a
    given:
      - input: ref('stg_orders')
        format: dict
        rows:
          - {id: 2}
    expect:
        rows:
          - {id: 2}

  # ❌
  - name: test_10528_b
    description: model **does** reference a specific version **but** it is not the latest version
    model: fct_orders_b
    given:
      # This will work if updated to ref('stg_orders', v=1) though
      - input: ref('stg_orders')
        format: dict
        rows:
          - {id: 3}
    expect:
        rows:
          - {id: 3}

  # ✅
  - name: test_10528_c
    description: model **does** reference a specific version **and** it is the latest version
    model: fct_orders_c
    given:
      - input: ref('stg_orders')
        format: dict
        rows:
          - {id: 4}
    expect:
        rows:
          - {id: 4}

models/_models.yml

models:
  - name: stg_orders
    latest_version: 2
    versions:
      - v: 1
        columns:
          - include: all
            exclude: [added_column]
      - v: 2
        columns:
          - include: all

models/stg_orders_v1.sql

select 1 as id

models/stg_orders_v2.sql

select 1 as id, 2 as added_column

models/fct_orders_a.sql

select *
from {{ ref('stg_orders') }}

models/fct_orders_b.sql

select *
from {{ ref('stg_orders', v=1) }}

models/fct_orders_c.sql

select *
from {{ ref('stg_orders', v=2) }}

Run these commands:

dbt run --empty
dbt test --select test_10528_a test_10528_b test_10528_c

Get this output:

$ dbt test --select test_10528_a test_10528_b test_10528_c

18:34:36  Running with dbt=1.8.0
18:34:39  Registered adapter: duckdb=1.8.3
18:34:39  Found 5 models, 1 analysis, 410 macros, 3 unit tests
18:34:39  
18:34:39  Concurrency: 1 threads (target='dev')
18:34:39  
18:34:39  1 of 3 START unit_test fct_orders_a::test_10528_a .............................. [RUN]
18:34:39  1 of 3 ERROR fct_orders_a::test_10528_a ........................................ [ERROR in 0.03s]
18:34:39  2 of 3 START unit_test fct_orders_b::test_10528_b .............................. [RUN]
18:34:39  2 of 3 ERROR fct_orders_b::test_10528_b ........................................ [ERROR in 0.01s]
18:34:39  3 of 3 START unit_test fct_orders_c::test_10528_c .............................. [RUN]
18:34:39  3 of 3 PASS fct_orders_c::test_10528_c ......................................... [PASS in 0.16s]
18:34:39  
18:34:39  Finished running 3 unit tests in 0 hours 0 minutes and 0.35 seconds (0.35s).
18:34:39  
18:34:39  Completed with 2 errors and 0 warnings:
18:34:39  
18:34:39    Compilation Error in unit_test test_10528_a (models/_unit_tests.yml)
  Unit_Test 'unit_test.my_project.fct_orders_a.test_10528_a' (models/_unit_tests.yml) depends on a node named 'stg_orders' which was not found
18:34:39  
18:34:39    Compilation Error in unit_test test_10528_b (models/_unit_tests.yml)
  Unit_Test 'unit_test.my_project.fct_orders_b.test_10528_b' (models/_unit_tests.yml) depends on a node named 'stg_orders' with version '1' which was not found
18:34:39  
18:34:39  Done. PASS=1 WARN=0 ERROR=2 SKIP=0 TOTAL=3

kbrock91 added the bug and triage labels Aug 6, 2024

dbeatty10 added the model_versions and unit tests labels Aug 6, 2024

dbeatty10 removed the triage label Aug 9, 2024

dbeatty10 mentioned this issue Aug 29, 2024

[Bug] Runtime Error in unit_test when using versioned models #10623

Closed

2 tasks

clairegchen mentioned this issue Oct 17, 2024

[Bug] Unit test fails with versioned inputs #10880

Closed

2 tasks

devmessias mentioned this issue Oct 21, 2024

fix: unit tests with versioned refs #10889

Merged

5 tasks

MichelleArk closed this as completed in #10889 Nov 14, 2024

dbeatty10 mentioned this issue Nov 24, 2024

[Bug] Unit tests don't recognize versioned sources #11039

Closed

2 tasks
