[CT-1914] Store links to tests on model node in manifest #6746
Comments
This would be very useful (aside from the performance improvements) and the inverse would also be useful, tests can only have one parent model/source so a |
Small clarification here - tests can have multiple parents, e.g. the |
@jtcohen6 can I clarify what I mean here: a test can only have one real parent, i.e. the model which it tests. This lets us ask 'did the model pass its tests', regardless of what other models/sources are involved in that test. It's a good point, but can I check, for the following model example:

version: 2
models:
  - name: orders
    columns:
      - name: customer_id
        tests:
          - relationships:
              to: ref('customers')
              field: id

If I run dbt test --select orders, the test will run, but if I run dbt test --select customers, it won't? This is what I mean by a parent, and it is very useful to know when writing downstream testing logic that handles test failures or passes. |
My interest that sparked the original Slack conversation was rooted in working with codegen: if you generate YAML, you can pull existing descriptions from the graph, but you cannot pull tests out of the graph the same way. Having a test property would expand the potential for more automation in generating and updating documentation. @adammarples I agree that the way a @jtcohen6 in this proposal, is |
Inspired by community Slack thread
Background
Generally, each manifest node records only its parents, in the form of depends_on.nodes. While we do have a method to create a full "parent map" and "child map" in the manifest, which is included in manifest.json and leveraged in dbt-docs, this is still an expensive process.

Problems
For consumers of dbt metadata (whether via Jinja {{ graph }} shenanigans, or parsing manifest.json after the fact), it feels harder than it should to figure out the tests defined on a model. That's in sharp contrast to the very simple UX of sticking a unique test "right on" a model.

Also: performance. The add_test_edges method is a known performance bottleneck in large projects / DAGs with many tests. Roughly half the time of that method is just getting the unique_id for each test that depends on a given node.

Running dbt build --exclude fqn:* in our sample performance project (with partial parsing enabled), _get_tests_for_node is just under 25% of the overall time.

Proposal
What if we added a new tests property, allowing each model to know about the tests defined on it? For generic tests, specified in the same file as its parent's (yaml) configuration, we should be able to add that link during parsing, without any additional lookups.

This would feel directionally correct for more future thinking around:
Questions:
relationships)? Could we update the second parent's node (add the link) during ref resolution?