-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse the dbt ast for config calls and evaluate them outside of the parsing context #2714
Comments
@drewbanin @gshank There's a chance an approach like this could be a massive improvement over the way we parse and construct |
This would definitely be a big step in the direction we need to go. In a simple project I use for testing, I see 'refs', 'depends_on', 'unrendered_config', and of course the corresponding entries in the child_map and parent_map as the pieces that are updated in the parse-time jinja rendering. |
This fateful issue set us down some important paths. I'll always remember it fondly. (closing in favor of #3680) |
Describe the feature
dbt should hook into jinja's code generator and use it to figure out
config()
values without needing to evaluate jinja.Currently we evaluate jinja models in parse mode to collect ref, source, and config calls. Those have the benefit of letting users do powerful things like control flow, etc, but come with the downside of requiring us to evaluate the jinja twice: once in "parse" mode (
execute=False
) and once in "compile" mode (execute=True
).As part of jinja's compilation, we can get access to the full jinja AST. One thing we could do is extract config() calls from the AST itself instead of having to go through the trouble of evaluating it with the context. The benefit here is you don't need to know the context to build the AST (you only need it to evaluate it). So that means we can build the AST, stop, and then investigate the parts of the AST we care about with the information that would be available in the parse context, and not need to actually evaluate jinja.
This is generally extensible to:
depends_on.macros
in a way that doesn't care about control flow, so we can catch every macro dependencydepends_on.macros
for macros themselvesvar
andenv_var
calls.We'll probably have to come up with some restrictions. We can do this pretty simply as a first pass, and fall back to the existing behavior (figuring out
config
by evaluating the jinja) when we reach the boundaries of our ability to figure it out. As a first pass, I think we can pretty safely say that any/all of these are ok to punt on for this pass (optimization fences, sort of):var
andenv_var
calls:{{ config(materialized=var('materialization_type')) }}
. We actually know these before we even get to parsing (right?).{{ config(materialized=my_macro()) }}
{% macro my_macro() %} {{ config(materialized='view') {% endmacro %}
{% set materialized_value = 'view' %} {{ config(materialized=materialized_value) }}
**
in config arguments{{ config(**kwargs) }}
{% set materialization_macro = my_macro %}{{ config(materialized=my_macro()) }}
I haven't spent enough time poking around the AST to tell you for sure what's easy/what's hard, but I have ordered them in my rough guess at easiest to most difficult. It should be possible to figure out all of these, though some may be more complex than others. I think getting var/env_var support into this PR should be pretty easy (and seems the most obviously desirable).
I think in the longer term, as this behavior gets more useful, we'll end up wanting to restrict what can call
ref
,source
, andconfig
to just disallow anything we can't get through. That's totally out of scope for this issue, but it's worth calling out now!It's important that we feel comfortable scrapping this if we develop the feature (or part of it) and discover we end up hitting the optimization fence everywhere and don't have any hope of figuring it out. I do feel vaguely like there's got to be some huge crippling complication I'm missing, but I don't really see anything obvious - we do have all the information we need at parse-time!
Describe alternatives you've considered
Just don't do this
Who will this benefit?
This is more of a longer-term benefit:
The text was updated successfully, but these errors were encountered: