Feature/defer tests #2706

beckjake · 2020-08-14T19:14:57Z

resolves #2701

Description

Support deferred tests.

I also prevented --defer from doing anything on snapshots/seeds more explicitly. I think the CLI argument will default to not being set regardless of the environment variable, but it's even easier to be sure if we just disable that code path entirely for snapshots and seeds.

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

jtcohen6 · 2020-08-15T00:53:31Z

This works! However, I'm noticing that the model references being tested are always deferred.

We need the ability to defer only some of the model references. For instance:

dbt test -m unique_fct_sprints_name --defer --state .state

I would expect this to execute my test against the production (deferred) version of fct_sprints. Whereas

dbt test -m fct_sprints --defer --state .state

Since fct_sprints is included in my selection criteria, I would expect the fct_sprints reference to not be deferred, and the tests to execute against my development version of fct_sprints.

I realize #2701 wasn't clear about this and I updated/clarified there. I also get that dbt test --models selection is weird; are the model nodes themselves actually included in the final set of selected nodes?

In any case, we just need some way for the following:

dbt test -m state:modified+ --defer --state path/to/artifacts

For any models that are modified, run all tests (whether new/modified or not) against those models' current (un-deferred) namespace
For all tests that are new/modified, if they select from a non-modified model, defer the reference to that model

Does that make sense? Is it possible?

beckjake · 2020-08-17T13:54:05Z

I also get that dbt test --models selection is weird; are the model nodes themselves actually included in the final set of selected nodes?

Yeah. This is just how test selection works: Your selector indicates the nodes whose tests should be run, not the nodes that should be actually selected.

Previously I suggested making dbt test -m foo do nothing, and requiring users to run dbt test -m foo+ to get their tests of the foo model. That was shot down because some folks felt that requiring the + to get your tests running would be too confusing.

The actual process for the test selector goes like:

for each node selector, dbt evaluates that selector and finds all the selected nodes
- dbt then adds all dependent test nodes, to each selector
- appropriate set operations are applied to these expanded sets of nodes
nodes are then filtered based on their resource types (so non-tests are removed).

So I guess this is more of a "sort of" than a "yeah".

jtcohen6 · 2020-08-17T16:34:14Z

Ok! It sounds like this is going to be trickier than I initially appreciated.

Let's say I have a project with model_a that already exists in a production schema. If I create:

a modified model, model_b, that selects from model_a
a new test that selects from model_a (but model_a is not itself new or modified)

Then dbt ls -s state:modified returns me:

random.model_b # modified model
random.schema_test.unique_model_a_column_a # unmodified model, new test

When I:

dbt run -m state:modified+ --defer --state path/to/prod/artifacts

I will create model_b (but not model_a) in_ my development schema. My version of model_b deferred its reference of model_a to use the production namespace.

When it comes time to test, I can either defer:

dbt test -m state:modified+ --defer --state path/to/previous/run

This will defer all model references to the production run manifest. The test on model_b will reference the old version of model_b in the production namespace, so it actually will not test my modifications.

Or not defer:

dbt test -m state:modified+ path/to/previous/run

This will fail because it includes the new test model_a, but model_a does not exist in my production namespace.

N.B. Deferral works seamlessly if model_b is new, because then it doesn't exist in the deferred manifest and dbt falls back to the current namespace.

I think there are two potential approaches here:

Much trickier deferral logic, based on whether the model itself (or just a test node that depends on the model) is included in the dbt test selection criteria.
Selector-based, where there's some syntax that enables bifurcation:

dbt test -m [modified_models] --state path/to/previous/run
dbt test -m [modified_tests_on_unmodified_models] --defer --state path/to/previous/run

Is such a selection syntax possible today? A good answer here would get us what we need, without needing to make test deferral much more complex. (It still wouldn't solve for relationships or data tests that depend on multiple models, though, where there's a possibility of one model being modified and another being unmodified.)

Otherwise, I think we should hold off on adding support for dbt test --defer. The only CI-related scenario where it's needed is when a user adds a test, or changes the severity of an existing test, without making any changes to the model it depends on. That's more common than an edge case, but it's not so common as to be a prerequisite for the beta version of slim CI. I'd rather document it as unsupported, rather than implement it now and significantly alter the implementation later.

cla-bot bot added the cla:yes label Aug 14, 2020

beckjake force-pushed the feature/defer-tests branch from 0bea57a to 4d854cc Compare August 14, 2020 19:15

beckjake changed the base branch from dev/marian-anderson to feature/state-modified-selector August 14, 2020 19:15

Base automatically changed from feature/state-modified-selector to dev/marian-anderson August 14, 2020 19:46

defer tests, too

280803c

beckjake force-pushed the feature/defer-tests branch from 4d854cc to 280803c Compare August 14, 2020 19:46

beckjake requested review from jtcohen6, gshank and kwigley August 14, 2020 19:46

gshank approved these changes Aug 14, 2020

View reviewed changes

kwigley approved these changes Aug 17, 2020

View reviewed changes

jtcohen6 mentioned this pull request Sep 15, 2020

Support deferred tests #2701

Closed

jtcohen6 closed this Sep 15, 2020

This was referenced Dec 12, 2020

Defer iff unselected reference does not exist in current env #2946

Merged

Feature: defer tests #2954

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/defer tests #2706

Feature/defer tests #2706

beckjake commented Aug 14, 2020 •

edited

Loading

jtcohen6 commented Aug 15, 2020

beckjake commented Aug 17, 2020 •

edited

Loading

jtcohen6 commented Aug 17, 2020 •

edited

Loading

Feature/defer tests #2706

Feature/defer tests #2706

Conversation

beckjake commented Aug 14, 2020 • edited Loading

Description

Checklist

jtcohen6 commented Aug 15, 2020

beckjake commented Aug 17, 2020 • edited Loading

jtcohen6 commented Aug 17, 2020 • edited Loading

beckjake commented Aug 14, 2020 •

edited

Loading

beckjake commented Aug 17, 2020 •

edited

Loading

jtcohen6 commented Aug 17, 2020 •

edited

Loading