Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging: restore previous (small) behaviors #4341

Merged
merged 10 commits into from
Nov 29, 2021
Merged

Conversation

jtcohen6
Copy link
Contributor

From #4260:

  • (size: S) Find a way to configure logger before firing any events, e.g. avoid text logs preceding JSON logs
$ dbt --log-format json run
{"data": null, "event_data_serialized": false, "invocation_id": "455dcb72-fcb5-41dd-bc64-4e0279a45a7c", "level": "info", "log_version": 1, "msg": "Running with dbt=1.0.0-rc2", "pid": 81044, "ts": "2021-11-25T14:56:50.296716"}
{"data": {"code": "Z046", "log_fmt": null, "msg": "* Deprecation Warning:\nThe `source-paths` config has been deprecated in favor of `model-paths`. Please\nupdate your `dbt_project.yml` configuration to reflect this change."}, "event_data_serialized": true, "invocation_id": "455dcb72-fcb5-41dd-bc64-4e0279a45a7c", "level": "warn", "log_version": 1, "msg": "* Deprecation Warning:\nThe `source-paths` config has been deprecated in favor of `model-paths`. Please\nupdate your `dbt_project.yml` configuration to reflect this change.", "pid": 81044, "ts": "2021-11-25T14:56:50.358948"}
{"data": {"code": "Z046", "log_fmt": null, "msg": "* Deprecation Warning:\nThe `data-paths` config has been deprecated in favor of `seed-paths`. Please\nupdate your `dbt_project.yml` configuration to reflect this change."}, "event_data_serialized": true, "invocation_id": "455dcb72-fcb5-41dd-bc64-4e0279a45a7c", "level": "warn", "log_version": 1, "msg": "* Deprecation Warning:\nThe `data-paths` config has been deprecated in favor of `seed-paths`. Please\nupdate your `dbt_project.yml` configuration to reflect this change.", "pid": 81044, "ts": "2021-11-25T14:56:50.360507"}
{"data": {"code": "W006", "stat_line": "2 models, 1 test, 0 snapshots, 0 analyses, 355 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 1 metric"}, "event_data_serialized": true, "invocation_id": "455dcb72-fcb5-41dd-bc64-4e0279a45a7c", "level": "info", "log_version": 1, "msg": "Found 2 models, 1 test, 0 snapshots, 0 analyses, 355 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 1 metric", "pid": 81044, "ts": "2021-11-25T14:56:50.485785"}
  • (size: S) feature parity for dbt ls + workflow automation: set stdout logger to WARN level instead of INFO (see here + here), or better yet replace print()
    here
    with an actual fired event that can be filtered on + piped out (??)
$ dbt ls
14:57:26 | [ warn  ] | * Deprecation Warning:
The `source-paths` config has been deprecated in favor of `model-paths`. Please
update your `dbt_project.yml` configuration to reflect this change.
14:57:26 | [ warn  ] | * Deprecation Warning:
The `data-paths` config has been deprecated in favor of `seed-paths`. Please
update your `dbt_project.yml` configuration to reflect this change.
testy.ephemeral_model
testy.my_model

For both of these, I took the goal as providing exact parity with legacy logging behavior, with as small a lift as possible. That means, if we hook into legacy logger methods from main.py by way of task methods in a few subtle ways (pre_init_hook, set_log_format), we should set the same configs for the new event logger. If we hate this and want to keep config centralized, we can take up a more serious refactor.

  • (size: S) include snowplow tracking events in file logging (not cli) when available (i.e. without breaking dbt clean)

This was just a matter of adding File back to tracking events. I'm able to dbt clean, with logs/ as a clean target, without dbt throwing up an error that it can't find the file it expects to write to. If I had to guess, the switch from WatchedFileHandler to RotatingFileHandler may have fixed this?

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change

@jtcohen6 jtcohen6 mentioned this pull request Nov 25, 2021
26 tasks
Copy link
Contributor

@nathaniel-may nathaniel-may left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requesting changes just so I can have a question answered and some comments. Otherwise this seems like an ok fix to me.

@@ -56,6 +58,7 @@ def __init__(self, args, config):
def pre_init_hook(cls, args):
"""A hook called before the task is initialized."""
log_manager.stderr_console()
event_logger.STDOUT_LOG.level = logging.WARN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely not a fan of reaching in and modifying STDOUT_LOG like this, but I'm fine with it as a quick fix. Can we just leave a comment here stating that this is a hack and describe what the right way is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, are there info-level logs we're trying to filter out? I'm not sure why we're setting it to warn here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - the list task wants to filter out all INFO-level logs, so that only its printed output appears in the command line:

dbt-core/core/dbt/logger.py

Lines 517 to 521 in f72b603

# this is used by `dbt ls` to allow piping stdout to jq, etc
def stderr_console(self):
"""Output to stderr at WARNING level instead of stdout"""
self._output_handler.stream = self.stderr
self._output_handler.level = logbook.WARNING

The way we've done this in the past is by setting the stdout logger level to WARN instead. It will still include warnings + errors in stdout.

Comment on lines 66 to 70
if args.log_format == 'json':
log_manager.format_json()
event_logger.format_json = True
else:
log_manager.format_text()
Copy link
Contributor

@nathaniel-may nathaniel-may Nov 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like these blocks won't need to exist once we refactor the early logs right? can we comment to that effect so the refactor goes smoother?

@@ -2365,7 +2365,7 @@ def message(self) -> str:


@dataclass
class SendingEvent(DebugLevel, Cli):
class SendingEvent(DebugLevel, Cli, File):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome! glad this works now 💪

@@ -48,7 +48,7 @@ def setup_event_logger(log_path):
this.format_color = True if flags.USE_COLORS else False
# TODO this default should live somewhere better
log_dest = os.path.join(log_path, 'dbt.log')
level = logging.DEBUG if flags.DEBUG else logging.INFO
level = logging.DEBUG if flags.DEBUG else this.STDOUT_LOG.level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set it to WARN below what's the path for setting it back to INFO for runs without the --debug flag? I don't want those runs to be stuck on warn instead of info.

Copy link
Contributor Author

@jtcohen6 jtcohen6 Nov 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, totally fair question. This would be an issue any place where we invoke dbt multiple times within the same CLI instantiation, i.e. in integration tests or in the dbt Server.

We could set the log level to INFO within the base (non-list) task pre_init_hook? Perhaps even include the debug logic there too?

Ultimately, we'll want to refactor this so that the logger is being instantiated/configured as soon as flags are available, and before all of this other stuff is happening

Copy link
Contributor

@nathaniel-may nathaniel-may Nov 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so now that I think about it, wouldn't it get reset to info when we call setup_event_logger if we DON'T change this line? do we need it to stay warn longer than that?

If so maybe we could change the signature of this function:

def setup_event_logger(log_path, level_override=None):
    ...
        level = level_override or (logging.DEBUG if flags.DEBUG else logging.INFO)
    ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just trying to keep rogue mutations to a minimum so we can refactor this fairly easily in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathaniel-may Just gave this approach a try, I do think it's slightly better

@cla-bot cla-bot bot added the cla:yes label Nov 29, 2021
Copy link
Contributor

@nathaniel-may nathaniel-may left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the comments! I think this level_override looks pretty good. Happy for it to stick around for commands like list if we need it.

@jtcohen6 jtcohen6 merged commit 8d2351d into main Nov 29, 2021
@jtcohen6 jtcohen6 deleted the logging-status-quo-ante branch November 29, 2021 18:05
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
* Log formatting from flags earlier

* WARN-level stdout for list task

* Readd tracking events to File

* PR feedback, annotate hacks

* Revert "PR feedback, annotate hacks"

This reverts commit 5508fa2.

* This is maybe better

* Annotate main.py

* One more comment in base.py

* Update changelog

automatic commit by git-black, original commits:
  8d2351d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants