
[CT-10785] Microbatch models should respect full_refresh model config #10788

Merged

Conversation

QMalcolm
Contributor

@QMalcolm QMalcolm commented Sep 26, 2024

Resolves #10785

Problem

In dbt-core, it is generally expected that values passed via CLI flags take precedence over model-level configs. However, `full_refresh` on a model is an exception to this rule, wherein the model config takes precedence. This config exists specifically to prevent accidental full refreshes of large incremental models, as doing so can be costly. It is in fact best practice to set `full_refresh=False` on incremental models.

Prior to this PR, microbatch models did not follow this rule: the CLI flag `--full-refresh` took precedence over the model config `full_refresh`. That meant that if `--full-refresh` was supplied, a microbatch model would full refresh even if `full_refresh=False` was set on the model. This PR solves that problem.

Solution

Update `_is_incremental` to prefer the `model.config.full_refresh` value over the `RuntimeConfig.args.FULL_REFRESH` value provided by CLI flags.
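The precedence rule described above can be sketched as a small standalone function. This is an illustration only, not the actual dbt-core implementation: the names `should_full_refresh`, `Args`, and `ModelConfig` are hypothetical stand-ins for dbt-core's runtime objects.

```python
# Hedged sketch of the precedence rule this PR enforces. The class and
# function names below are illustrative, not dbt-core's real API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Args:
    # Stand-in for the parsed CLI flags (e.g. --full-refresh).
    FULL_REFRESH: bool = False


@dataclass
class ModelConfig:
    # None means the model does not pin the setting, so the CLI flag decides.
    full_refresh: Optional[bool] = None


def should_full_refresh(model_config: ModelConfig, args: Args) -> bool:
    # The model-level config, when explicitly set, wins over the CLI flag.
    if model_config.full_refresh is not None:
        return model_config.full_refresh
    return args.FULL_REFRESH


# With full_refresh=False pinned on the model, --full-refresh is ignored.
assert should_full_refresh(ModelConfig(full_refresh=False), Args(FULL_REFRESH=True)) is False
# With no model-level setting, the CLI flag decides.
assert should_full_refresh(ModelConfig(), Args(FULL_REFRESH=True)) is True
```

Before this change, microbatch batch creation effectively checked the CLI flag first, which is why `--full-refresh` could override a model's `full_refresh=False`.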

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

@QMalcolm QMalcolm requested a review from a team as a code owner September 26, 2024 20:31
@cla-bot cla-bot bot added the cla:yes label Sep 26, 2024
@dbt-labs dbt-labs deleted a comment from github-actions bot Sep 26, 2024

codecov bot commented Sep 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.09%. Comparing base (41e4836) to head (4d5381e).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10788      +/-   ##
==========================================
+ Coverage   86.34%   89.09%   +2.75%     
==========================================
  Files         182      183       +1     
  Lines       23288    23321      +33     
==========================================
+ Hits        20107    20779     +672     
+ Misses       3181     2542     -639     
| Flag | Coverage | Δ |
|---|---|---|
| integration | 86.32% <100.00%> | -0.02% ⬇️ |
| unit | 62.21% <100.00%> | ? |

Flags with carried forward coverage won't be shown.

| Components | Coverage | Δ |
|---|---|---|
| Unit Tests | 62.21% <100.00%> | ∅ |
| Integration Tests | 86.32% <100.00%> | -0.02% ⬇️ |

Contributor

@ChenyuLInx ChenyuLInx left a comment


The code looks good but I have some questions about the new behavior.

1. Is this a behavior change for the existing incremental model also?
2. My understanding is that with the new override order, the user cannot do a full refresh without changing the code. Is this correct? If yes, is this what we want? @graciegoheen
Feel free to dismiss my review if my understanding is wrong.

My understanding was wrong

@QMalcolm
Contributor Author

QMalcolm commented Sep 26, 2024

> The code looks good but I have some questions about the new behavior.
>
> Is this a behavior change for the existing incremental model also? My understanding is with the new overwrite order, the user can not do a full refresh without changing the code. Is this correct? If yes is this what we want? @graciegoheen
>
> Feel free to dismiss my review if my understanding is wrong.

Good question! I was also confused when I first learned about it 😅

For clarification, this is not new behavior for incremental models. This is already how incremental models handle the `full_refresh` model config; this change only makes microbatch models follow the same pattern. Here is the documentation on the `full_refresh` model config.

@QMalcolm QMalcolm dismissed ChenyuLInx’s stale review September 26, 2024 21:39

Dismissing review as reviewer suggested

Contributor

@ChenyuLInx ChenyuLInx left a comment


🚢

@QMalcolm QMalcolm merged commit d8b1bf5 into main Sep 26, 2024
68 of 71 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--10785-microbatch-models-respect-full-refresh-config branch September 26, 2024 21:43

Successfully merging this pull request may close these issues.

Ensure microbatch models respect model level full_refresh config during batch creation