Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

Closed
Tracked by #7301
jtcohen6 opened this issue Jun 6, 2023 · 8 comments
Closed
Tracked by #7301

[SPIKE] [CT-2650] Materialization macros should support dispatch #7799

jtcohen6 opened this issue Jun 6, 2023 · 8 comments
Assignees
Labels
spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 6, 2023

Copied from #4646

Adapter-specific materializations work differently from adapter-specific macros. Namely:

Let's make materialization macro generation / discovery work more like dispatch. We'd need to offer backwards compatibility / avoiding breaking changes for folks who currently rely on the materialization_<name>_<adapter> construction.


This is important in cases where we want to "call" a materialization macro from within another macro (perhaps from within another materialization). I found a concrete use case for doing this in my spiking on dbt clone (#7258), whereby the clone materialization would in certain instances want to "shell out" to the view materialization.

Specific comment & spot in the spiking code where this is relevant: #7258 (comment)

Depending on the implementation we pursue for #7442, this could also be a relevant capability there.

@github-actions github-actions bot changed the title Materialization macros should support dispatch [CT-2650] Materialization macros should support dispatch Jun 6, 2023
@jtcohen6 jtcohen6 assigned jtcohen6 and unassigned jtcohen6 Jun 6, 2023
@jtcohen6 jtcohen6 added the tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality label Jun 6, 2023
@emmyoop emmyoop added the spike label Jun 12, 2023
@emmyoop emmyoop changed the title [CT-2650] Materialization macros should support dispatch [SPIKE] [CT-2650] Materialization macros should support dispatch Jun 12, 2023
@jtcohen6 jtcohen6 mentioned this issue Jun 20, 2023
6 tasks
@aranke
Copy link
Member

aranke commented Jun 23, 2023

Also came up in #7881 (comment)

@gshank
Copy link
Contributor

gshank commented Jun 23, 2023

Materializations are a variant of Jinja macro (jinja extension) which is created in core/dbt/clients/jinja.py, in MaterializationExtension. They are saved in the manifest.macros dictionary with names such as: "macro.dbt.materialization_incremental_default"

In order to use adapter.dispatch and use its macro resolution code we would need to normalize the names. (Could we have synonyms?) (Note: we currently have two macro resolvers, MacroResolver and MacroNamespace.)

In the "def execute" method in dbt/task/run.py and dbt/task/clone.py we do manifest.get_materialization_macro_name. Instead we would have to execute adapter.dispatch to get the materialization macro, which would return a MacroGenerator object so it wouldn't need to be created. We don't currently call adapter.dispatch in Python, but the materialization macros are the "launcher" macros that get everything started so this might be an exception.

Concerns here are the inconsistency in naming conventions between materializations and "normal" macros and the changes that would need to be made in the adapter repos. It remains to be seen how much incompatibility there would be. Possibly we might be able to have a temporary mapping from old to new materialization macros. But it's also possible that this would be a fairly big breaking change.

@jtcohen6
Copy link
Contributor Author

Users define these as:

{% materialization table, adapter="snowflake" %}

Lowest-risk solution is add the materialization macro by both names:

"macro.dbt.materialization_table_snowflake"
"macro.dbt.snowflake__materialization_table"

The main thing this gets us is allowing someone to write this and have it work:

adapter.dispatch("materialization_table")

We should raise a deprecation warning if people are still calling a materialization directly by the old name. In order to do this, we need to update our own materialization lookup logic (get_materialization_macro_name), so that we're fetching it by the new (rather than old) name.

Potential acceptance criteria:

  • Add materialization macro to manifest by both names
  • Change materialization macro lookup logic
  • Add deprecation logic when old materialization macro name is called explicitly

@aranke @gshank to keep discussing

@gshank
Copy link
Contributor

gshank commented Jun 28, 2023

On thinking about this a bit more, I think it's probably a bad idea to have duplicated macros with different names. It would complicate partial parsing quite a bit. I'm now thinking that the best way to do this would be to add code in adapter.dispatch so that if we fail to find a materialization macro we use special resolution code which will mangle the name into the "standard" materialization name.

@jtcohen6
Copy link
Contributor Author

Having a way to call a materialization as a function (with reliable dispatch + ability to pass in arguments instead of patching the context) would also give us more options for implementing this Mesh UX improvement:

@gshank
Copy link
Contributor

gshank commented Jun 30, 2023

In conversation with Kshitij just now, it seems that the main changes need to be in core/dbt/context/macros.py, in MacroNamespace.get_from_package where we can add special processing of "materialization" macros. The "get_from_package" method is called from adapter.dispatch, which comes from the adapter "contextproperty" of the ProviderContext in core/dbt/context/providers.py. The actual "dispatch" method is in the BaseDatabaseWrapper.

The materialization macro can be found in the "execute" method of the clone task (and probably the same method in runnable.py) by doing context.adapter.dispatch from the constructed context. This should already be a MacroGenerator callable, so we wouldn't need to construct a MacroGenerator as we do further down in "execute".

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jul 5, 2023

@gshank @aranke Thanks for continuing the conversation! I'm going to close out this spike, and open a new issue to implement what we've landed on above

@jtcohen6
Copy link
Contributor Author

jtcohen6 commented Jul 5, 2023

Closing in favor of #8031

@jtcohen6 jtcohen6 closed this as completed Jul 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spike tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

No branches or pull requests

4 participants