
Fix persist_docs for columns #180

Merged · 4 commits · Jun 15, 2021
Conversation

jtcohen6
Contributor

@jtcohen6 jtcohen6 commented Jun 14, 2021

follow up to #84, #170

Description

Although #170 implemented spark__alter_column_comment, we were still missing the call to the persist_docs macro itself in the relevant materializations, since relation-level docs are handled within the create_x_as macro DDL and only column-level docs need a separate step.

So this PR:

  • Defines spark__persist_docs, which adds column descriptions only
  • Adds a call to persist_docs to the table + seed materializations. It's already in the snapshot materialization, and views cannot persist column descriptions.
  • Adds an integration test modeled off the one here. Key difference: column descriptions don't show up in show table extended in ... like '*', so instead of pulling from catalog.json, we have to run describe extended for each table we want to check.
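For reference, the new macro is small. A sketch of spark__persist_docs along these lines, assuming the standard persist_docs dispatch signature from dbt's adapter macro conventions:

```sql
{% macro spark__persist_docs(relation, model, for_relation, for_columns) -%}
  {# Relation-level comments are already set in the create_x_as DDL, #}
  {# so this macro only handles column-level descriptions. #}
  {% if for_columns and config.persist_column_docs() and model.columns %}
    {% do alter_column_comment(relation, model.columns) %}
  {% endif %}
{%- endmacro %}
```

The for_relation argument is intentionally ignored here, which is why the macro description says "column descriptions only."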

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Jun 14, 2021
@jtcohen6
Contributor Author

I forgot that Databricks SQL endpoints can't create parquet tables. I'll update when I get a chance tomorrow.

@@ -81,10 +81,7 @@
{%- set agate_table = load_agate_table() -%}
{%- do store_result('agate_table', response='OK', agate_table=agate_table) -%}

{{ run_hooks(pre_hooks, inside_transaction=False) }}
Contributor

@leahwicz leahwicz Jun 14, 2021


Why do the run_hooks calls change? Is it related to the persist_docs change or is this related to something else?

Contributor Author


This isn't really related to the specific change in the PR at all, I just noticed because I was editing related lines. We clearly copied these lines from the default seed materialization a few years back. The default materialization, which was originally written for Redshift/Postgres, assumes a transactional database. Some hooks may want to run before the main transaction starts (begin), after it starts, before it ends (commit), or after it ends.

Spark doesn't have transactions, so there's no need to distinguish between inside_transaction = True|False. I figured I would simplify it while here, but I'd also be fine reverting those lines for PR cleanliness.
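As a simplified illustration (not the exact materialization code), the default pattern brackets the transaction with two hook calls, while on Spark one call covers everything:

```sql
{# Default (transactional) pattern: hooks run outside, then inside, the transaction #}
{{ run_hooks(pre_hooks, inside_transaction=False) }}
-- BEGIN is issued here on transactional adapters like Postgres/Redshift
{{ run_hooks(pre_hooks, inside_transaction=True) }}

{# On Spark there is no transaction, so a single call is equivalent: #}
{{ run_hooks(pre_hooks) }}
```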

@jtcohen6
Contributor Author

I'll cherry-pick to 0.20.latest after merging

@jtcohen6 jtcohen6 merged commit a8a85c5 into master Jun 15, 2021
@jtcohen6 jtcohen6 deleted the fix/persist-docs-columns branch June 15, 2021 16:40
jtcohen6 added a commit that referenced this pull request Jun 15, 2021
* Fix persist_docs for columns

* Disable parquet model on endpoint

* Rm parquet model, not worth the fuss

* Update changelog [skip ci]
@binhnefits

Hi, I was wondering why persist_docs is not called for incremental models?

@jtcohen6
Contributor Author

jtcohen6 commented Oct 7, 2021

@binhnefits I think that was an oversight on my part! Would you be able to open a separate issue for that?

Also, if you'd be interested in contributing the fix for it: I think it would just look like adding {% do persist_docs(target_relation, model) %} to the end of the incremental materialization.
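For illustration, that suggested call would sit near the end of the incremental materialization, roughly like this (a sketch, with the existing incremental logic elided):

```sql
{% materialization incremental, adapter='spark' -%}
  {# ... existing incremental build logic ... #}

  {% do persist_docs(target_relation, model) %}

  {{ return({'relations': [target_relation]}) }}
{%- endmaterialization %}
```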
