Transactional json logging #1806

beckjake · 2019-10-04T19:57:37Z

This is based on #1801, so once that is merged I'll rebase this against dev/louisa-may-alcott.

drewbanin

This broadly looks very good to me. I think there are a couple more log lines we'll want to add here:

- At the beginning of the run, publish a log line that reports the number of resources that are going to be run in the invocation (e.g. `"node_count": 17`). This will help track completion percentage as the rest of the logs flow.
- In the "Began running model" log line, do we have a node index? That will help order the nodes in a list (I imagine doing this by timestamp could be tricky, but doable if necessary).

And a couple of questions:

- how does the exc_info field get populated?
- can we better structure the field values around success/error states? It looks like model_state can either be passed, or a literal error message. If possible, it would be good to present a field which has a value in {success, error, skipped} that can be used in conjunction with a free-text field which describes the error (if one occurs).

Let me know if any of these things are more involved to implement than they appear at first blush. We have a lot of options here!
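To make the node count and node index suggestions concrete, here is a minimal sketch of how a log consumer might compute completion percentage from JSON log lines. The `event` field and its values (`run_started`, `node_started`, `node_finished`) are hypothetical names chosen for illustration, as is `node_index`; only `node_count` appears in the comment above, and none of this is taken from the PR's actual log format.

```python
import json
from typing import Iterable


def completion_percent(log_lines: Iterable[str]) -> float:
    """Return the percentage of nodes finished so far, based on JSON log lines."""
    node_count = None
    finished = set()
    for line in log_lines:
        record = json.loads(line)
        # The run-start line announces how many nodes will run in total.
        if "node_count" in record:
            node_count = record["node_count"]
        # "event" and "node_finished" are assumed names, not dbt's real fields.
        if record.get("event") == "node_finished":
            finished.add(record["node_index"])
    if not node_count:
        return 0.0
    return 100.0 * len(finished) / node_count


example_lines = [
    json.dumps({"event": "run_started", "node_count": 4}),
    json.dumps({"event": "node_started", "node_index": 1}),
    json.dumps({"event": "node_finished", "node_index": 1}),
    json.dumps({"event": "node_started", "node_index": 2}),
]
print(completion_percent(example_lines))  # 25.0
```

Tracking finished nodes by index rather than by timestamp avoids the ordering problem mentioned above when several nodes run concurrently.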

beckjake · 2019-10-07T23:59:55Z

> how does the exc_info field get populated?

If you log a message with exc_info=True in an except block (and maybe in a finally block if there's an active exception), both the standard library logging module and logbook populate that field for you.
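A minimal sketch of that behavior using only the standard library `logging` module (logbook is not shown here). The `ListHandler` class and the logger name are made up for the example; the handler just collects records so the `exc_info` attribute can be inspected.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps every record in a list for inspection."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("exc_info_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output off the root logger
handler = ListHandler()
logger.addHandler(handler)

# Logged outside an except block: no exception information is attached.
logger.info("no exception here")

try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True tells logging to capture sys.exc_info() for this record.
    logger.error("model failed", exc_info=True)

plain, failed = handler.records
print(plain.exc_info)      # None
print(failed.exc_info[0])  # <class 'ZeroDivisionError'>
```

A JSON formatter can then serialize `record.exc_info` (for example via `logging.Formatter.formatException`) into the `exc_info` field of the output line.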

> can we better structure the field values around success/error states? It looks like model_state can either be passed, or a literal error message. If possible, it would be good to present a field which has a value in {success, error, skipped} that can be used in conjunction with a free-text field which describes the error (if one occurs).

Yeah, we can do this, I think. Part of the problem is that dbt's error, status, etc fields in results are a bit ad-hoc. I'm probably being generous with that. We should probably fix the underlying data structures instead of trying to fix it on the logger side, which is a bit involved.

beckjake · 2019-10-08T00:15:58Z

I'm going to resolve that error/status issue by just calling str on things that might not be strings, instead of cleaning up the underlying behavior - we can squash any issues with weird outputs as we find them.

drewbanin

Just a couple more cosmetic updates to make here, but this LGTM when those are addressed.

We will almost certainly want to add more detailed information here over time, but we can revisit some additional fields/tagging in the scope of a future issue.

drewbanin · 2019-10-09T17:27:55Z

core/dbt/node_runners.py

+        if result.error:
+            return {'model_status': 'error', 'model_error': str(result.error)}
+        elif result.skip:
+            return {'model_status': 'skipped'}
+        elif result.fail:
+            return {'model_status': 'failed'}
+        elif result.warn:
+            return {'model_status': 'warn'}
+        else:
+            return {'model_status': 'passed'}


can we change all of these from model_status to node_status for consistency? Many of these results will be generated from nodes which are not models.

drewbanin · 2019-10-09T17:28:17Z

core/dbt/node_runners.py

@@ -436,6 +448,12 @@ def on_skip(self):
            'Freshness: nodes cannot be skipped!'
        )

+    def get_result_status(self, result) -> Dict[str, str]:
+        if result.error:
+            return {'model_status': 'error', 'model_error': str(result.error)}


Let's also make this node_error instead of model_error
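For reference, a sketch of what the status helper could look like with both renames applied (`node_status` and `node_error`). This is an illustration of the requested change, not the code that was merged; it is written as a free function rather than a method, and `fake_result` is a hypothetical stand-in for dbt's result objects that only carries the four attributes the diff reads.

```python
from types import SimpleNamespace
from typing import Any, Dict


def get_result_status(result: Any) -> Dict[str, str]:
    """Map a result object onto a structured status plus optional error text."""
    if result.error:
        # str() guards against error values that are not already strings.
        return {"node_status": "error", "node_error": str(result.error)}
    elif result.skip:
        return {"node_status": "skipped"}
    elif result.fail:
        return {"node_status": "failed"}
    elif result.warn:
        return {"node_status": "warn"}
    else:
        return {"node_status": "passed"}


def fake_result(**overrides: Any) -> SimpleNamespace:
    """Hypothetical result object with only the attributes used above."""
    fields = {"error": None, "skip": False, "fail": False, "warn": False}
    fields.update(overrides)
    return SimpleNamespace(**fields)


print(get_result_status(fake_result()))
print(get_result_status(fake_result(error=ValueError("bad sql"))))
```

Keeping the status in a fixed vocabulary and the free-text message in a separate key lets consumers filter on `node_status` without parsing error strings.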

drewbanin · 2019-10-09T18:56:03Z

core/dbt/logger.py

+            ('schema', 'node_schema'),
+            ('database', 'node_database'),
+            ('name', 'node_name'),
+            ('original_file_path', 'node_path')


can we also include the resource type here?
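A rough sketch of how an attribute-to-log-key mapping like the one in the diff might be applied, with the resource type added. The `resource_type` log key name, the `NODE_LOG_KEYS` constant, and the `node_log_fields` helper are all assumptions for illustration; the real names live in core/dbt/logger.py and may differ.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple

# (node attribute, log key) pairs; the first four come from the diff above.
NODE_LOG_KEYS: List[Tuple[str, str]] = [
    ("schema", "node_schema"),
    ("database", "node_database"),
    ("name", "node_name"),
    ("original_file_path", "node_path"),
    ("resource_type", "resource_type"),  # assumed key name for the new field
]


def node_log_fields(node: Any) -> Dict[str, Any]:
    """Collect the loggable attributes that are present on a node."""
    fields: Dict[str, Any] = {}
    for attr, log_key in NODE_LOG_KEYS:
        value = getattr(node, attr, None)
        if value is not None:
            fields[log_key] = value
    return fields


# Hypothetical node; real dbt nodes carry many more attributes.
example_node = SimpleNamespace(
    schema="analytics",
    database="warehouse",
    name="orders",
    original_file_path="models/orders.sql",
    resource_type="model",
)
print(node_log_fields(example_node))
```

Skipping attributes that are missing or `None` keeps the log line compact for nodes that lack a given field.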

drewbanin

LGTM!

beckjake requested a review from drewbanin October 4, 2019 19:57

cla-bot bot added the cla:yes label Oct 4, 2019

beckjake force-pushed the feature/add-docs-generate-rpc branch from 02feb69 to 2031e23 Compare October 7, 2019 14:16

beckjake changed the base branch from feature/add-docs-generate-rpc to dev/louisa-may-alcott October 7, 2019 15:56

Jacob Beck added 2 commits October 7, 2019 11:58

Transactional logging

aa254b3

Address feedback, appease flake8/mypy/unit tests

52ac98e

beckjake force-pushed the feature/transactional-json-logging branch from 4b227a6 to 52ac98e Compare October 7, 2019 15:58

drewbanin reviewed Oct 7, 2019

PR feedback

4018e55

Add node count/total node to log lines Make model status structured

beckjake requested a review from drewbanin October 8, 2019 12:50

drewbanin reviewed Oct 9, 2019

PR feedback

2b9c5ae

s/model/node/ include resource_type

drewbanin approved these changes Oct 10, 2019

beckjake merged commit 9819c3a into dev/louisa-may-alcott Oct 10, 2019

beckjake deleted the feature/transactional-json-logging branch October 10, 2019 13:22
