Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added last_altered query to Snowflake catalog macro #2732

Merged

Conversation

Mr-Nobody99
Copy link
Contributor

@Mr-Nobody99 Mr-Nobody99 commented Aug 31, 2020

resolves ##2728

Description

Added addition column using snowflakes last_altered col of information schema, which should allow dbt-docs to pick up the additional info from the cataglog.json to display in details sections for models.

Checklist

  • I have signed the CLA
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have run this code in development and it appears to resolve the stated issue
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Aug 31, 2020
@jtcohen6 jtcohen6 changed the base branch from dev/marian-anderson to dev/0.18.1 September 9, 2020 17:07
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pro: just by adding this to the catalog query, it automatically shows up in dbt-docs. Neat!

  • I left a few comments from a UX perspective
  • I think you'll need to amend the integration tests, since they're not expecting an additional stats field
  • I just switched the base branch to v0.18.1. Could you merge changes from that branch and add a changelog entry?

(bytes is not null) as "stats:bytes:include",

'Last Modified' as "stats:last_modified:label",
to_varchar(last_altered, 'mon dd, yyyy-HH24:MI:SS') as "stats:last_modified:value",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should pick between displaying either:

  • a concise, correct value: to_varchar(convert_timezone('UTC', last_altered), 'yyyy-mm-dd HH24:MI'||' UTC')
  • a user-friendly value: to_varchar(last_altered, 'mon dd, hh12:mi am')

I lean toward the former, what do you think?

'Last Modified' as "stats:last_modified:label",
to_varchar(last_altered, 'mon dd, yyyy-HH24:MI:SS') as "stats:last_modified:value",
'The timestamp for last update/change' as "stats:last_modified:description",
(last_altered is not null) as "stats:last_modified:include"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last-modified timestamp is a little less intuitive for views vs. tables. Views will include most recent data, whether or not the view definition has been modified recently. So really, this value just indicates the last time the view definition potentially changed, but says nothing about the underlying data.

I'd expect folks with database experience to be comfortable with that distinction, but that's not a background we can presume all dbt-docs viewers to have.

Additionally, views haven't historically had other stats, so this value is presented alone without context. (Hence the failing integration tests, where views are not expected to have stats.)

What do you think about adjusting this to

Suggested change
(last_altered is not null) as "stats:last_modified:include"
(last_altered is not null and table_type = "BASE TABLE") as "stats:last_modified:include"

@jtcohen6
Copy link
Contributor

I think the integration tests are failing because they're running in the Mr-Nobody99 CircleCI org, rather than the Fishtown-Analytics one. The same thing happened over here; I'm not exactly sure of the cause, but it looks like there's a way to force CircleCI to pick up the Fishtown org + requisite env vars.

'Last Modified' as "stats:last_modified:label",
to_varchar(convert_timezone('UTC', last_altered), 'yyyy-mm-dd HH24:MI'||'UTC') as "stats:last_modified:value",
'The timestamp for last update/change' as "stats:last_modified:description",
(last_altered is not null and table_type="BASE_TABLE") as "stats:last_modified:include"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double quotes on Snowflake indicates a quoted identifier (i.e. column name), you need single quotes here to indicate a string literal.

Suggested change
(last_altered is not null and table_type="BASE_TABLE") as "stats:last_modified:include"
(last_altered is not null and table_type='BASE TABLE') as "stats:last_modified:include"

@jtcohen6
Copy link
Contributor

@Mr-Nobody99 I think we caught you amidst the branch switching. Sorry about that. You could try rebasing, but given that your changes were self-contained, I'd recommend resetting your local branch against dev/0.18.1, re-adding your changes (snowflake catalog query + changelog), and force-pushing your remote branch.

If that's not something you feel up to, let me know and I can try to help out.

@Mr-Nobody99 Mr-Nobody99 force-pushed the feature/add-snowflake-last-modified branch from 79efd33 to 713e781 Compare September 22, 2020 19:56
@Mr-Nobody99
Copy link
Contributor Author

Mr-Nobody99 commented Sep 23, 2020

@jtcohen6 Thanks for the advice, I hard reset my branch against the upstream dev/0.18.1 and it seems to pass more of the tests now than previously. The value in the test log seems correct however it is still failing, If you have any insight on the cause of the test failure that would be much appreciated.

@beckjake
Copy link
Contributor

That snowflake test failure is because the catalog tests haven't been updated to include your changes! In test/integration/029_docs_generate_tests/test_docs_generate.py, you'll likely need to update the _snowflake_stats method to include a last_modified field that matches your changes. For the value field you'll probably just want to set ANY.

@Mr-Nobody99
Copy link
Contributor Author

@beckjake That was the thing, much appreciated.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @Mr-Nobody99!

@jtcohen6 jtcohen6 merged commit cb0e625 into dbt-labs:dev/0.18.1 Sep 24, 2020
@Mr-Nobody99 Mr-Nobody99 deleted the feature/add-snowflake-last-modified branch September 24, 2020 21:42
@Mr-Nobody99
Copy link
Contributor Author

@jtcohen6 happy to help 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants