
[ADAP-1008] [Bug] dbt does not create _dbt_max_partition for subsequent runs of models that are incremental (insert_overwrite) + partitioned + contract enforced. #1002

jeremyyeo opened this issue Nov 7, 2023 · 4 comments · May be fixed by #1369
Labels
bug (Something isn't working) · model_contracts · partitioning (Related to creating, replacing, or pruning partitions to avoid full table scans)

Comments


jeremyyeo commented Nov 7, 2023

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

As per the title, if a model is:

  • Incremental
  • Partitioned
  • Using incremental_strategy = 'insert_overwrite'
  • Contract enforced

then subsequent (incremental) runs do not declare the _dbt_max_partition variable, whereas they do when the model's contract is not enforced.

This results in an error if the model references _dbt_max_partition in its body.

Expected Behavior

We should be declaring _dbt_max_partition regardless of contract enforcement.
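
Concretely, a subsequent run of the contracted model should still emit the same declaration that the non-contracted run produces (taken from the logs below):

declare _dbt_max_partition date default (
      select max(updated_at) from `cse-sandbox-319708`.`dbt_jyeo`.`inc`
      where updated_at is not null
    );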

Steps To Reproduce

  1. dbt project setup.
# dbt_project.yml
name: my_dbt_project
profile: all
config-version: 2
version: "1.0.0"

models:
  my_dbt_project:
    +materialized: table

# models/schema.yml
version: 2
models:
  - name: inc
    config:
      contract:
        enforced: true
    columns:
      - name: id
        data_type: int
      - name: updated_at
        data_type: date
-- models/inc.sql
{{
    config(
        materialized='incremental',
        partition_by={'field': 'updated_at', 'data_type': 'date'},
        incremental_strategy='insert_overwrite',
        on_schema_change='fail'
    )
}}

with src as (
    select 1 as id, date('2020-01-02') as updated_at
)

select * from src

{% if is_incremental() %}
 where src.updated_at >= _dbt_max_partition
{% endif %}
  2. Do an initial run to create the model, then do an incremental run.
$ dbt run --full-refresh
02:21:39  Running with dbt=1.6.7
02:21:42  Registered adapter: bigquery=1.6.8
02:21:43  Found 1 model, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
02:21:43  
02:21:47  Concurrency: 1 threads (target='bq')
02:21:47  
02:21:47  1 of 1 START sql incremental model dbt_jyeo.inc ................................ [RUN]
02:21:55  1 of 1 OK created sql incremental model dbt_jyeo.inc ........................... [CREATE TABLE (1.0 rows, 0 processed) in 7.60s]
02:21:55  
02:21:55  Finished running 1 incremental model in 0 hours 0 minutes and 12.22 seconds (12.22s).
02:21:55  
02:21:55  Completed successfully
02:21:55  
02:21:55  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

$ dbt run
02:22:17  Running with dbt=1.6.7
02:22:18  Registered adapter: bigquery=1.6.8
02:22:19  Found 1 model, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
02:22:19  
02:22:24  Concurrency: 1 threads (target='bq')
02:22:24  
02:22:24  1 of 1 START sql incremental model dbt_jyeo.inc ................................ [RUN]
02:22:27  BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:e0d131b5-624d-4a45-9996-67ef91b01279&page=queryresults
02:22:27  1 of 1 ERROR creating sql incremental model dbt_jyeo.inc ....................... [ERROR in 2.83s]
02:22:27  
02:22:27  Finished running 1 incremental model in 0 hours 0 minutes and 7.91 seconds (7.91s).
02:22:27  
02:22:27  Completed with 1 error and 0 warnings:
02:22:27  
02:22:27    Database Error in model inc (models/inc.sql)
  Unrecognized name: _dbt_max_partition at [11:26]
02:22:27  
02:22:27  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
  3. Change the model so that the contract is not enforced.
# models/schema.yml
version: 2
models:
  - name: inc
    config:
      contract:
        enforced: false         # this changed #
    columns:
      - name: id
        data_type: int
      - name: updated_at
        data_type: date
  4. Rerun.
$ dbt run
02:24:05  Running with dbt=1.6.7
02:24:07  Registered adapter: bigquery=1.6.8
02:24:09  Found 1 model, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
02:24:09  
02:24:12  Concurrency: 1 threads (target='bq')
02:24:12  
02:24:12  1 of 1 START sql incremental model dbt_jyeo.inc ................................ [RUN]
02:24:23  1 of 1 OK created sql incremental model dbt_jyeo.inc ........................... [SCRIPT (40.0 Bytes processed) in 11.18s]
02:24:23  
02:24:23  Finished running 1 incremental model in 0 hours 0 minutes and 14.71 seconds (14.71s).
02:24:23  
02:24:23  Completed successfully
02:24:23  
02:24:23  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Relevant log output

============================== 15:22:17.430112 | b3049fd7-61a7-4902-bc4f-f59ce3646224 ==============================
15:22:17.430112 [info ] [MainThread]: Running with dbt=1.6.7
15:22:17.431127 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'False', 'cache_selected_only': 'False', 'profiles_dir': '/Users/jeremy/.dbt', 'debug': 'False', 'fail_fast': 'False', 'log_path': '/Users/jeremy/src/dbt-basic/logs', 'version_check': 'True', 'warn_error': 'None', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'log_format': 'default', 'invocation_command': 'dbt run', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'introspect': 'True', 'target_path': 'None', 'static_parser': 'True', 'send_anonymous_usage_stats': 'True'}
15:22:18.819881 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x128872730>]}
15:22:18.832027 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12889ad30>]}
15:22:18.834134 [info ] [MainThread]: Registered adapter: bigquery=1.6.8
15:22:18.865058 [debug] [MainThread]: checksum: a051d2bc88277f3be74306f0393e0e8e6f29724fe11a36c13ebfccd4b87560d8, vars: {}, profile: , target: , version: 1.6.7
15:22:18.866356 [debug] [MainThread]: Partial parsing not enabled
15:22:19.879405 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x128a960d0>]}
15:22:19.893653 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x128a9cf40>]}
15:22:19.894365 [info ] [MainThread]: Found 1 model, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
15:22:19.895026 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1288be7c0>]}
15:22:19.896912 [info ] [MainThread]: 
15:22:19.898170 [debug] [MainThread]: Acquiring new bigquery connection 'master'
15:22:19.899766 [debug] [ThreadPool]: Acquiring new bigquery connection 'list_cse-sandbox-319708'
15:22:19.900850 [debug] [ThreadPool]: Opening a new connection, currently in state init
15:22:23.253263 [debug] [ThreadPool]: Re-using an available connection from the pool (formerly list_cse-sandbox-319708, now list_cse-sandbox-319708_dbt_jyeo)
15:22:23.255154 [debug] [ThreadPool]: Opening a new connection, currently in state closed
15:22:24.956855 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x128ab4e80>]}
15:22:24.959767 [info ] [MainThread]: Concurrency: 1 threads (target='bq')
15:22:24.961602 [info ] [MainThread]: 
15:22:24.968430 [debug] [Thread-1  ]: Began running node model.my_dbt_project.inc
15:22:24.969843 [info ] [Thread-1  ]: 1 of 1 START sql incremental model dbt_jyeo.inc ................................ [RUN]
15:22:24.972164 [debug] [Thread-1  ]: Re-using an available connection from the pool (formerly list_cse-sandbox-319708_dbt_jyeo, now model.my_dbt_project.inc)
15:22:24.973451 [debug] [Thread-1  ]: Began compiling node model.my_dbt_project.inc
15:22:24.993249 [debug] [Thread-1  ]: Writing injected SQL for node "model.my_dbt_project.inc"
15:22:24.995790 [debug] [Thread-1  ]: Timing info for model.my_dbt_project.inc (compile): 15:22:24.974394 => 15:22:24.994714
15:22:24.997186 [debug] [Thread-1  ]: Began executing node model.my_dbt_project.inc
15:22:25.111007 [debug] [Thread-1  ]: Opening a new connection, currently in state closed
15:22:25.157037 [debug] [Thread-1  ]: On model.my_dbt_project.inc: select * from (
        

with src as (
    select 1 as id, date('2020-01-02') as updated_at
)

select * from src


 where src.updated_at >= _dbt_max_partition

    ) as __dbt_sbq
    where false and current_timestamp() = current_timestamp()
    limit 0

15:22:26.812904 [debug] [Thread-1  ]: BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:fe5232bb-817e-45f8-8dcd-2db2d9549284&page=queryresults
15:22:26.814935 [debug] [Thread-1  ]: BigQuery adapter: Retry attempt 1 of 1 after error: BadRequest('Unrecognized name: _dbt_max_partition at [11:26]')
15:22:27.762705 [debug] [Thread-1  ]: BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:e0d131b5-624d-4a45-9996-67ef91b01279&page=queryresults
15:22:27.766506 [error] [Thread-1  ]: BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:e0d131b5-624d-4a45-9996-67ef91b01279&page=queryresults
15:22:27.769446 [debug] [Thread-1  ]: Timing info for model.my_dbt_project.inc (execute): 15:22:24.997725 => 15:22:27.768551
15:22:27.802477 [debug] [Thread-1  ]: Database Error in model inc (models/inc.sql)
  Unrecognized name: _dbt_max_partition at [11:26]
15:22:27.803602 [debug] [Thread-1  ]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': 'b3049fd7-61a7-4902-bc4f-f59ce3646224', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x128a7d640>]}
15:22:27.804784 [error] [Thread-1  ]: 1 of 1 ERROR creating sql incremental model dbt_jyeo.inc ....................... [ERROR in 2.83s]
15:22:27.806016 [debug] [Thread-1  ]: Finished running node model.my_dbt_project.inc
15:22:27.809678 [debug] [MainThread]: Connection 'master' was properly closed.
15:22:27.810566 [debug] [MainThread]: Connection 'model.my_dbt_project.inc' was properly closed.
15:22:27.811235 [info ] [MainThread]: 
15:22:27.812022 [info ] [MainThread]: Finished running 1 incremental model in 0 hours 0 minutes and 7.91 seconds (7.91s).
15:22:27.813640 [debug] [MainThread]: Command end result
15:22:27.824234 [info ] [MainThread]: 
15:22:27.825180 [info ] [MainThread]: Completed with 1 error and 0 warnings:
15:22:27.826052 [info ] [MainThread]: 
15:22:27.827293 [error] [MainThread]:   Database Error in model inc (models/inc.sql)
  Unrecognized name: _dbt_max_partition at [11:26]
15:22:27.828458 [info ] [MainThread]: 
15:22:27.830409 [info ] [MainThread]: Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
15:22:27.831684 [debug] [MainThread]: Command `dbt run` failed at 15:22:27.831525 after 10.44 seconds
15:22:27.832296 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10e9c3640>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x12889ad30>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10e1d0040>]}
15:22:27.832905 [debug] [MainThread]: Flushing usage events



============================== 15:24:05.583169 | 61c605bd-a7b0-4363-854f-959a89b441cf ==============================
15:24:05.583169 [info ] [MainThread]: Running with dbt=1.6.7
15:24:05.584255 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'False', 'cache_selected_only': 'False', 'profiles_dir': '/Users/jeremy/.dbt', 'version_check': 'True', 'debug': 'False', 'log_path': '/Users/jeremy/src/dbt-basic/logs', 'fail_fast': 'False', 'warn_error': 'None', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'invocation_command': 'dbt run', 'introspect': 'True', 'target_path': 'None', 'log_format': 'default', 'send_anonymous_usage_stats': 'True'}
15:24:07.730739 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11da7f730>]}
15:24:07.746267 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11daa8d30>]}
15:24:07.747803 [info ] [MainThread]: Registered adapter: bigquery=1.6.8
15:24:07.782139 [debug] [MainThread]: checksum: a051d2bc88277f3be74306f0393e0e8e6f29724fe11a36c13ebfccd4b87560d8, vars: {}, profile: , target: , version: 1.6.7
15:24:07.783419 [debug] [MainThread]: Partial parsing not enabled
15:24:09.097377 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dc6efa0>]}
15:24:09.117368 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dc5f280>]}
15:24:09.118362 [info ] [MainThread]: Found 1 model, 0 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
15:24:09.119150 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dacb310>]}
15:24:09.121352 [info ] [MainThread]: 
15:24:09.122788 [debug] [MainThread]: Acquiring new bigquery connection 'master'
15:24:09.125153 [debug] [ThreadPool]: Acquiring new bigquery connection 'list_cse-sandbox-319708'
15:24:09.127209 [debug] [ThreadPool]: Opening a new connection, currently in state init
15:24:10.924567 [debug] [ThreadPool]: Re-using an available connection from the pool (formerly list_cse-sandbox-319708, now list_cse-sandbox-319708_dbt_jyeo)
15:24:10.926349 [debug] [ThreadPool]: Opening a new connection, currently in state closed
15:24:12.635469 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dc6e160>]}
15:24:12.637981 [info ] [MainThread]: Concurrency: 1 threads (target='bq')
15:24:12.639029 [info ] [MainThread]: 
15:24:12.643045 [debug] [Thread-1  ]: Began running node model.my_dbt_project.inc
15:24:12.644197 [info ] [Thread-1  ]: 1 of 1 START sql incremental model dbt_jyeo.inc ................................ [RUN]
15:24:12.645697 [debug] [Thread-1  ]: Re-using an available connection from the pool (formerly list_cse-sandbox-319708_dbt_jyeo, now model.my_dbt_project.inc)
15:24:12.646534 [debug] [Thread-1  ]: Began compiling node model.my_dbt_project.inc
15:24:12.663998 [debug] [Thread-1  ]: Writing injected SQL for node "model.my_dbt_project.inc"
15:24:12.666397 [debug] [Thread-1  ]: Timing info for model.my_dbt_project.inc (compile): 15:24:12.647119 => 15:24:12.665929
15:24:12.667068 [debug] [Thread-1  ]: Began executing node model.my_dbt_project.inc
15:24:12.778245 [debug] [Thread-1  ]: Opening a new connection, currently in state closed
15:24:12.824908 [debug] [Thread-1  ]: On model.my_dbt_project.inc: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "bq", "node_id": "model.my_dbt_project.inc"} */
declare _dbt_max_partition date default (
      select max(updated_at) from `cse-sandbox-319708`.`dbt_jyeo`.`inc`
      where updated_at is not null
    );
  
    

    create or replace table `cse-sandbox-319708`.`dbt_jyeo`.`inc__dbt_tmp`
      
    partition by updated_at
    

    OPTIONS(
      expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 12 hour)
    )
    as (
      

with src as (
    select 1 as id, date('2020-01-02') as updated_at
)

select * from src


 where src.updated_at >= _dbt_max_partition

    );
  
15:24:14.487935 [debug] [Thread-1  ]: BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:5fb3bb45-840f-47ed-8a07-5454721520d3&page=queryresults
15:24:18.852479 [debug] [Thread-1  ]: 
    In `cse-sandbox-319708`.`dbt_jyeo`.`inc`:
        Schema changed: False
        Source columns not in target: []
        Target columns not in source: []
        New column types: []

15:24:18.886625 [debug] [Thread-1  ]: Writing runtime sql for node "model.my_dbt_project.inc"
15:24:18.887955 [debug] [Thread-1  ]: On model.my_dbt_project.inc: /* {"app": "dbt", "dbt_version": "1.6.7", "profile_name": "all", "target_name": "bq", "node_id": "model.my_dbt_project.inc"} */

   
      -- generated script to merge partitions into `cse-sandbox-319708`.`dbt_jyeo`.`inc`
      declare dbt_partitions_for_replacement array<date>;

      
      
        -- 1. temp table already exists, we used it to check for schema changes
      -- 2. define partitions to update
      set (dbt_partitions_for_replacement) = (
          select as struct
              -- IGNORE NULLS: this needs to be aligned to _dbt_max_partition, which ignores null
              array_agg(distinct date(updated_at) IGNORE NULLS)
          from `cse-sandbox-319708`.`dbt_jyeo`.`inc__dbt_tmp`
      );

      -- 3. run the merge statement
      

    merge into `cse-sandbox-319708`.`dbt_jyeo`.`inc` as DBT_INTERNAL_DEST
        using (
        select
        * from `cse-sandbox-319708`.`dbt_jyeo`.`inc__dbt_tmp`
      ) as DBT_INTERNAL_SOURCE
        on FALSE

    when not matched by source
         and date(DBT_INTERNAL_DEST.updated_at) in unnest(dbt_partitions_for_replacement) 
        then delete

    when not matched then insert
        (`id`, `updated_at`)
    values
        (`id`, `updated_at`)

;

      -- 4. clean up the temp table
      drop table if exists `cse-sandbox-319708`.`dbt_jyeo`.`inc__dbt_tmp`

  


  

    
15:24:19.300265 [debug] [Thread-1  ]: BigQuery adapter: https://console.cloud.google.com/bigquery?project=cse-sandbox-319708&j=bq:US:ca0b4551-c329-4f7e-9b99-2de5c6edc8a7&page=queryresults
15:24:23.821804 [debug] [Thread-1  ]: Timing info for model.my_dbt_project.inc (execute): 15:24:12.667530 => 15:24:23.821497
15:24:23.823149 [debug] [Thread-1  ]: Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '61c605bd-a7b0-4363-854f-959a89b441cf', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dd70c40>]}
15:24:23.824138 [info ] [Thread-1  ]: 1 of 1 OK created sql incremental model dbt_jyeo.inc ........................... [SCRIPT (40.0 Bytes processed) in 11.18s]
15:24:23.825139 [debug] [Thread-1  ]: Finished running node model.my_dbt_project.inc
15:24:23.827270 [debug] [MainThread]: Connection 'master' was properly closed.
15:24:23.827839 [debug] [MainThread]: Connection 'model.my_dbt_project.inc' was properly closed.
15:24:23.828368 [info ] [MainThread]: 
15:24:23.828993 [info ] [MainThread]: Finished running 1 incremental model in 0 hours 0 minutes and 14.71 seconds (14.71s).
15:24:23.830364 [debug] [MainThread]: Command end result
15:24:23.841392 [info ] [MainThread]: 
15:24:23.842184 [info ] [MainThread]: Completed successfully
15:24:23.842728 [info ] [MainThread]: 
15:24:23.843421 [info ] [MainThread]: Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
15:24:23.844534 [debug] [MainThread]: Command `dbt run` succeeded at 15:24:23.844398 after 18.31 seconds
15:24:23.845091 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x103b10640>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dd6b7f0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x11dc8e580>]}
15:24:23.845631 [debug] [MainThread]: Flushing usage events

Environment

- OS: macOS
- Python: 3.9.13
- dbt-core: 1.6.7
- dbt-bigquery: 1.6.8

Additional Context

Basically, when the contract is enforced, we do not emit:

declare _dbt_max_partition date default (
      select max(updated_at) from `cse-sandbox-319708`.`dbt_jyeo`.`inc`
      where updated_at is not null
    );

Plus other DDL/DML.

I think that, in the contract-enforced scenario, we first try to figure out what the data types are via this introspection query:

15:22:25.157037 [debug] [Thread-1  ]: On model.my_dbt_project.inc: select * from (
        

with src as (
    select 1 as id, date('2020-01-02') as updated_at
)

select * from src


 where src.updated_at >= _dbt_max_partition

    ) as __dbt_sbq
    where false and current_timestamp() = current_timestamp()
    limit 0

We need to know the data types to tell whether any of them have drifted, and thus whether to raise an error or append new columns (the on_schema_change config).

Unfortunately, by that point the following logic:

{% if is_incremental() %}
 where src.updated_at >= _dbt_max_partition
{% endif %}

has already resolved, and since _dbt_max_partition is not declared until later in the materialization, that introspection query errors.
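
In the meantime, a possible user-side workaround (a rough, untested sketch, not the actual fix) is to keep the contract enforced but avoid referencing the _dbt_max_partition scripting variable in the model body, e.g. by fetching the cutoff with run_query so the compiled SQL only contains a literal date:

{% if is_incremental() %}
  {# hypothetical workaround: compute the max partition in Jinja instead of relying on _dbt_max_partition #}
  {% set max_partition_result = run_query("select max(updated_at) from " ~ this) %}
  {% set max_partition = max_partition_result.columns[0].values()[0] %}
  {% if max_partition is not none %}
 where src.updated_at >= date('{{ max_partition }}')
  {% endif %}
{% endif %}

That trades the single scripted declare for an extra query at compile time, so it is a stopgap rather than a substitute for fixing the introspection query.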

@jeremyyeo jeremyyeo added bug Something isn't working triage labels Nov 7, 2023
@github-actions github-actions bot changed the title [Bug] dbt does not create _dbt_max_partition for subsequent runs of models that are incremental (insert_overwrite) + partitioned + contract enforced. [ADAP-1008] [Bug] dbt does not create _dbt_max_partition for subsequent runs of models that are incremental (insert_overwrite) + partitioned + contract enforced. Nov 7, 2023
@dbeatty10 dbeatty10 added triage and removed triage labels Nov 7, 2023
@jeremyyeo
Contributor Author

Shower thought: Is it just me, or are there just so many knobs (configs) for BQ models? 😁


tanghyd commented Jan 8, 2024

👍 I have also encountered this error when trying to build an incremental (insert_overwrite) + partitioned + contract enforced table: Unrecognized name: _dbt_max_partition. Setting enforced: false for the model contract configuration enables the table to successfully build.


This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 14, 2024
@carolinabtt carolinabtt linked a pull request Oct 15, 2024 that will close this issue
@carolinabtt

I created a PR with a fix for this issue by updating the bigquery__get_empty_subquery_sql macro. Hope it helps.

@github-actions github-actions bot removed the Stale label Oct 17, 2024