Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-1635] Populate adapter_response for tests + sources #2964

Closed
jtcohen6 opened this issue Dec 17, 2020 · 6 comments · Fixed by #6645
Closed

[CT-1635] Populate adapter_response for tests + sources #2964

jtcohen6 opened this issue Dec 17, 2020 · 6 comments · Fixed by #6645
Labels
artifacts dbt tests Issues related to built-in dbt testing functionality enhancement New feature or request jira Team:Adapters Issues designated for the adapter area of the code

Comments

@jtcohen6
Copy link
Contributor

jtcohen6 commented Dec 17, 2020

Describe the feature

#2747 requested that dbt log bytes processed for all query types, including tests, source-freshness, and run-operations. This prompted the first part of a solution in #2961: adding an adapter_response dict to record database-specific information about the queries dbt has run. In v0.19.0, that will include bytes_processed for all materializations (model runs, seeds, snapshots).

We'd like to record bytes processed for tests and source snapshots, too. What does this really mean?

  • dbt test should populate adapter_response in run_results.json
  • dbt source snapshot-freshness should populate adapter_response in sources.json

(I'm less clear on how we'd manage this in run-operation, since it doesn't write to run_results.)

Eventually, this has the potential to dovetail nicely with:

  • #2079: add a new log line to the "summary" stdout, with the total bytes processed from the invocation. In an ideal world, users could get a human-friendly rounded number in stdout and the raw numbers in run_results.json for analysis/aggregation.
  • Configurable node status in log output #2580: the ability to define which result information is populated to message in stdout (maybe via Jinja macro??)

Describe alternatives you've considered

  • Not doing this

Additional context

  • Most immediately relevant to BigQuery (bytes processed)
  • Useful for any database which sends back significant information in its response

Who will this benefit?

  • Projects with larger datasets, where filtering is important to limit cost of test queries
@jtcohen6 jtcohen6 added the enhancement New feature or request label Dec 17, 2020
@jtcohen6 jtcohen6 added this to the Oh-Twenty [TBD] milestone Dec 17, 2020
@jtcohen6 jtcohen6 changed the title Record bytes processed for tests Populate adapter_response for tests + sources Dec 22, 2020
@jtcohen6 jtcohen6 added the dbt tests Issues related to built-in dbt testing functionality label Dec 31, 2020
@jtcohen6 jtcohen6 removed this from the Margaret Mead milestone May 10, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Nov 7, 2021

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Nov 7, 2021
@kwigley kwigley removed the stale Issues that have gone stale label Nov 16, 2021
@kwigley kwigley reopened this Nov 16, 2021
@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label May 15, 2022
@jtcohen6 jtcohen6 added help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors and removed stale Issues that have gone stale labels Jun 1, 2022
@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jun 1, 2022

I think this may be as simple as using result.adapter_response to populate the RunResult for tests + freshness results, where currently we're setting them to just {}:

https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/include/global_project/macros/materializations/models/table/create_table_as.sql

adapter_response={},

We can look to the run task for the reference implementation:

adapter_response = {}
if isinstance(result.response, dbtClassMixin):
adapter_response = result.response.to_dict(omit_none=True)

@jtcohen6
Copy link
Contributor Author

One thing we'd need to change to get this working for dbt source freshness: Currently, the collect_freshness macro returns just its table result. We need it to return the response object as well, which we can then serialize into an adapter_response dictionary.

To illustrate the changes that I think would be needed: a066c1e

@Fleid Fleid added Team:Adapters Issues designated for the adapter area of the code jira and removed help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors labels Dec 8, 2022
@github-actions github-actions bot changed the title Populate adapter_response for tests + sources [CT-1635] Populate adapter_response for tests + sources Dec 8, 2022
@sebas-bdr
Copy link

hey @jtcohen6, this feature would be great.
I noticed the help_wanted label was recently removed, not sure (from reading the docs) whether this means contributions won't be accepted. But if possible I'd like to contribute.

To illustrate the changes that I think would be needed: a066c1e

Regarding testing, would the following code change suffice (specifically for source freshness)? sebas-bdr@572f143

@jamwalla
Copy link

I would love to have this feature to see how much my tests are costing!

I was able to get it working pretty easily for tests by returning the adapter response alongside the TestResultData object in TestRunner.execute_test:

def execute_test(self, test: TestNode, manifest: Manifest) -> TestResultData:

Same way that the ModelRunner does it:
adapter_response = result.response.to_dict(omit_none=True)
Then the adapter response can get passed up through TestRunner.execute.

Maybe not the prettiest solution but wanted to advocate for this being very useful + straightforward! thx dbt devs <3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
artifacts dbt tests Issues related to built-in dbt testing functionality enhancement New feature or request jira Team:Adapters Issues designated for the adapter area of the code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants