Output run hooks (#696) #1440
Some comments about the hook output, let me know if you'd like to discuss!
I'm actually ok with the Concurrency .... line where it is -- the on-run-start|end hooks are run serially, so it actually doesn't make a ton of sense to show the Concurrency line before that.
Do we even need this line anymore? I think we added it when concurrency was new to dbt, and we wanted to make sure everyone knew about it :)
Let me think a little harder about what, if anything, we should be showing here... Maybe it's just Running 7 models or something like that instead?
core/dbt/task/run.py
compiled = compile_node(adapter, self.config, hook,
                        self.manifest, extra_context)
statement = compiled.wrapped_sql
dbt.ui.printer.print_timestamped_line(
    '{} of {} START {}'.format(idx, len(ordered_hooks), statement)
Can we format this like all of the other timestamped lines that dbt prints?
07:58:48 | 1 of 2 START hook: grant usage on schema......... [RUN]
07:58:48 | 1 of 2 OK hook: grant usage on schema......... [GRANT in 0.4s]
07:58:48 | 2 of 2 START hook: grant select on ......... [RUN]
07:58:48 | 2 of 2 OK hook: grant select on......... [GRANT in 2.8s]
I think we'll also want to reshape/truncate the hook SQL. It's not uncommon to do operational things in these hooks (like creating UDFs, or inserting into an audit table), and they currently take over the stdout!
I imagine we can do something like (pseudocode):
hook_text = hook.replace("\n", " ")[0:60] + "..."
This doesn't need to be the exact same SQL that dbt is actually running, it just needs to be a stand-in for a name! Maybe some day it will make sense to name hooks, but today is not that day :)
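A minimal sketch of that truncation idea, for illustration only. The function name `truncate_hook_sql` and the `max_len` parameter are hypothetical, not part of dbt. Unlike the one-line pseudocode, this version collapses all runs of whitespace (not just newlines) and only appends "..." when something was actually cut off:

```python
def truncate_hook_sql(sql, max_len=60):
    """Turn arbitrary hook SQL into a short, single-line label.

    Hypothetical helper: the label is a stand-in for a hook name,
    not the exact SQL that gets executed.
    """
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    flat = " ".join(sql.split())
    if len(flat) <= max_len:
        return flat
    # Cut at max_len characters and mark the label as truncated.
    return flat[:max_len] + "..."
```

The full, untruncated query would still be available in the logs, so the label can be cut aggressively.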
Let's also include a line for when each hook completes, and a status if it's easy to grab. I think a lot of folks don't actually have insight into how long their hooks currently take, and I bet it will be helpful to expose this info to them!
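One way the START and completion lines could be produced is sketched below. This is not dbt's printer code: `format_hook_line`, `run_hook_with_output`, the `execute` callable, and the `emit` parameter are all assumptions made for illustration, and the status string (for example "GRANT") is assumed to come back from whatever executes the hook.

```python
import time
from datetime import datetime


def format_hook_line(idx, total, verb, label, status, width=60):
    """Build one timestamped, dot-padded line (hypothetical helper)."""
    prefix = "{} of {} {} hook: {}".format(idx, total, verb, label)
    # Pad with dots so the bracketed status lands in a consistent column.
    padded = prefix.ljust(width, ".")
    stamp = datetime.now().strftime("%H:%M:%S")
    return "{} | {} [{}]".format(stamp, padded, status)


def run_hook_with_output(idx, total, label, execute, emit=print):
    """Run one hook, emitting a START line before it and an OK line after.

    `execute` is a hypothetical zero-argument callable that runs the
    hook and returns a status string such as "GRANT".
    """
    emit(format_hook_line(idx, total, "START", label, "RUN"))
    started = time.time()
    status = execute()
    elapsed = time.time() - started
    result = "{} in {:.1f}s".format(status, elapsed)
    emit(format_hook_line(idx, total, "OK", label, result))
    return elapsed
```

Passing `emit` as a parameter keeps the sketch testable; in practice the lines would go through whatever printer the rest of the run output uses.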
Here's an example hook for testing:
on-run-start:
- "{{ load_udfs() }}"
-- macros/udfs.sql (imagine this did something useful....)
CREATE OR REPLACE FUNCTION no_op()
RETURNS void AS $$
$$ LANGUAGE SQL;
We can also be pretty aggressive with the truncation here -- the point is just to show that hooks are running, and to indicate how long they have taken. If folks want to dig further into that, they can find the full queries in the logs!
Truncate long run-hooks, replacing newlines with spaces
Include run-hook status/timing in output
Clean up run-hook execution a bit
I find it useful sometimes, though I don't know if it needs to go to stdout.
In actually seeing this, I think the output is just too verbose to be really useful here. I knew it would be a lot of text, but I'm not thrilled with the way this actually looks. It's the Let's plan to "name" hooks at some point in the future. That might look like:
For now though, I think it would be a good idea to just output the package, hook type, and hook index. So:
The actual formatting of
I'm mostly amenable to this, but worth pointing out that it will never be red - failing a hook results in the run failing entirely and halting.
Hah, I was going to mention that, but wasn't sure if it was still the case. We should probably do something different there, but that can happen in a separate PR! If nothing else, this PR is a step in the right direction... users will be able to see which hook failed in the stdout now :)
Co-Authored-By: Drew Banin <drew@fishtownanalytics.com>
Order hooks in a deterministic way:
- in the root project, hooks in index order
- foreach dependency project in alphabetical order, hooks in index order
Since these hooks always have a valid "index", no need to devise a default
Add some tests
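The ordering rule in this commit message can be expressed as a single sort key. The sketch below is an illustration under assumptions, not the code from the PR: the `Hook` tuple, its field names, and `order_hooks` are hypothetical stand-ins for dbt's real hook nodes.

```python
from collections import namedtuple

# Hypothetical stand-in for a hook node; real nodes carry more fields.
Hook = namedtuple("Hook", ["package_name", "index", "sql"])


def order_hooks(hooks, root_project_name):
    """Root project hooks first, then dependencies alphabetically.

    Within each project, hooks keep their declared index order.
    """
    def sort_key(hook):
        # False sorts before True, so root project hooks come first.
        is_dependency = hook.package_name != root_project_name
        return (is_dependency, hook.package_name, hook.index)

    return sorted(hooks, key=sort_key)
```

Because every hook is assumed to have a valid index, the key needs no default value, which matches the note in the commit message.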
This was a little different than the original spec, but I'm super happy with where we ended up! Thanks for your help with iterating on this -- ship it!
Fixes #696
Output information about the on-run-start and on-run-end hooks before and after running dbt.
One wrinkle: the output isn't exactly as described in the issue. It's a bit tricky to move that Concurrency line before hooks run. Instead you get this order.
We definitely can move that Concurrency line up, and the output would probably look a bit nicer, but it involves moving that into the task level instead of the node runner level and rearranging some things.