Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make internal macros use macro dispatch to be overridable in child adapters #320

Merged
merged 5 commits into from
Apr 6, 2022

Conversation

ueshin
Copy link
Contributor

@ueshin ueshin commented Apr 4, 2022

resolves #319

Description

Makes internal macros use macro dispatch to be overridable in child adapters.

The target macros are:

  • file_format_clause
  • location_clause
  • options_clause
  • comment_clause
  • partition_cols
  • clustered_cols
  • create_temporary_view

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-spark next" section.

@cla-bot cla-bot bot added the cla:yes label Apr 4, 2022
@ueshin
Copy link
Contributor Author

ueshin commented Apr 5, 2022

@jtcohen6 Do you think this will be backported to 1.0.latest? If not, we might need to apply the original approach to the 1.0.latest branch in dbt-databricks. Thanks.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One requested change, then this is good to go. Thanks for the quick work!

@jtcohen6 Do you think this will be backported to 1.0.latest? If not, we might need to apply the original approach to the 1.0.latest branch in dbt-databricks. Thanks.

@leahwicz What do you think? I'd be willing to backport this to 1.0.latest, given that:

  • it's a total non-change for dbt-spark end users (overriding location_clause() by name will still work just as before)
  • it will enable dbt-databricks to patch this behavior for folks migrating from dbt-spark v1.0.latest to dbt-databricks v1.0.latest

@@ -1,19 +1,33 @@
{% macro file_format_clause() %}
{{ return(adapter.dispatch('file_format_clause')()) }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and the other macros below, could you provide the second argument to dispatch (macro_namespace) as 'dbt'?

Suggested change
{{ return(adapter.dispatch('file_format_clause')()) }}
{{ return(adapter.dispatch('file_format_clause', 'dbt')()) }}

This enables users to install packages with overrides for "global" (core/plugin) macros, and choose to prefer the package's version over the global version by specifying a project config like:

# dbt_project.yml
dispatch:
  - macro_namespace: dbt
    search_order: ['my_project', 'package_with_custom_overrides', 'dbt']

Docs: https://docs.getdbt.com/reference/dbt-jinja-functions/dispatch#overriding-global-macros

(Meta comment: The syntax here has developed organically over time, so it can be a bit clunky. Some thoughts about how we could make it more ergonomic in the future: dbt-labs/dbt-core#4646)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks for letting me know! Updated.

Comment on lines 30 to 32
def dispatch(macro_name):
return getattr(template.module, f'spark__{macro_name}')
self.default_context['adapter'].dispatch = dispatch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@leahwicz
Copy link
Contributor

leahwicz commented Apr 5, 2022

@jtcohen6 if this is just a minimal internal change, the yes we can backport it to 1.0.latest to get into the RC.

@ueshin let me know what your timeline is for getting this change in so I can plan accordingly for the RC release. I was hoping to get to it in the next day or 2 if possible

@ueshin
Copy link
Contributor Author

ueshin commented Apr 5, 2022

@leahwicz Thanks for taking care of this issue.

let me know what your timeline is for getting this change in so I can plan accordingly for the RC release. I was hoping to get to it in the next day or 2 if possible

As I've addressed @jtcohen6's comment, this must be good to go unless there are any other comments.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @ueshin!

One integration test failed for a reason that appears unrelated to the changes here. Rerunning now. If it fails again I'll try taking a deeper look.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Apr 5, 2022

Rerunning now. If it fails again I'll try taking a deeper look.

The same test failed again, only when running against a Databricks cluster, only via ODBC connection: test_delta_comments_databricks_cluster

Trying to dig in to figure out why.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Apr 5, 2022

I haven't been able to reproduce locally, or by running the exact same SQL against a cluster in our Databricks workspace.

Looks like this one has failed flakily for us before. Here's a run from 10 days ago, against main, same exact failure.

It also sounds like other users have also seen the same error message crop up intermittently: #293

Spark adapter: ('42000', '[42000] [Simba][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table\n\tat org.apache.spark.sql.hive.thriftserver.Spa (80) (SQLExecDirectW)')

We could enable query retries (retry_all) for our integration tests?

I'm trying one last rerun, but just about willing to say this is a flaky failure, and unrelated to any substantive changes in this PR.

@jtcohen6 jtcohen6 merged commit fe77af7 into dbt-labs:main Apr 6, 2022
jtcohen6 pushed a commit that referenced this pull request Apr 6, 2022
…apters (#320)

* Make internal macros use macro dispatch to be overridable in child adapters.

* changelog

* Address a comment.

* Fix.

* Fix.
jtcohen6 added a commit that referenced this pull request Apr 6, 2022
…apters (#320) (#322)

* Make internal macros use macro dispatch to be overridable in child adapters.

* changelog

* Address a comment.

* Fix.

* Fix.

Co-authored-by: Takuya UESHIN <ueshin@databricks.com>
@ueshin
Copy link
Contributor Author

ueshin commented Apr 6, 2022

@jtcohen6 Thanks for merging and backporting this!

@ueshin ueshin deleted the macro_dispatch branch April 6, 2022 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-456] Make internal macros use macro dispatch to be overridable in child adapters.
3 participants