Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/add new model fct_hard_coded_references to capture all models that have hard coded references #246

Merged
merged 11 commits into from
Dec 21, 2022

Conversation

graciegoheen
Copy link
Collaborator

@graciegoheen graciegoheen commented Dec 2, 2022

This is a:

  • bug fix PR with no breaking changes
  • new functionality

Link to Issue

Closes #240 and #189

Description & motivation

Create a new model fct_hard_coded_references that captures all models that have hard coded reference(s) in their raw sql

TO DO:

  • add text to readme
  • consider only returning the unique list of raw references, in alphabetic order
  • should the list be limited if too big?
  • add documentation to dag.yml

Integration Test Screenshot

Screen Shot 2022-12-02 at 2 41 26 PM

Checklist

  • I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • BigQuery
    • Postgres
    • Redshift
    • Snowflake
    • Databricks
    • DuckDB
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)

@b-per
Copy link
Collaborator

b-per commented Dec 5, 2022

As Joel mentioned in another issue, would it make sense to load the regexes from another package (and then should they be in dbt-codegen or should they be in their own package and be imported both here and in dbt-codegen) instead of copy/pasting all of them?

@graciegoheen
Copy link
Collaborator Author

graciegoheen commented Dec 5, 2022

Hmm - I think it could make sense to load regexes from another package but this model vs. codegen use different regexes with some overlap. Might be worth brainstorming what exactly the regex macros should look like, since these two use cases are different. Would also worry about version dependency if someone installed both this package and codegen, but those packages had different versions of a regex package installed.

Is this blocking? Or do you want me to open up a new issue for this package & codgen but continue finalizing this PR?

@graciegoheen
Copy link
Collaborator Author

Tagging @joellabes who might have good advice here :)

@joellabes
Copy link
Contributor

If they're similar but not the same, trying to shoehorn them into the same behaviour is probably a bad idea.

should they be in dbt-codegen or should they be in their own package and be imported both here and in dbt-codegen

If you're going to do this, I'd rather its own package - codegen shouldn't have any direct relationship to project-evaluator.

IMO you could continue with it duplicated, but if you have a third use case where this comes up then extracting it probably makes sense. Very unscientific heuristics here so feel free to disagree!

Comment on lines 50 to 52
# second matching group
# opening {{, 0 or more whitespace character(s), var, 0 or more whitespace character(s), an opening parenthesis, 0 or more whitespace character(s), 1 or 0 quotation mark
({{\s*var\s*\(\s*[\'\"]?)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So beautiful ❤️

README.md Outdated Show resolved Hide resolved
README.md Outdated
<details>
<summary><b>Example</b></summary>

blah blah
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree 👍

@b-per
Copy link
Collaborator

b-per commented Dec 9, 2022

It looks great and super useful.
I added some comments but they are about minor things.

@graciegoheen graciegoheen marked this pull request as ready for review December 20, 2022 17:09
@graciegoheen graciegoheen requested a review from b-per December 20, 2022 17:09
README.md Outdated
<details>
<summary><b>Exceptions</b></summary>

TO DO: Are there any exceptions anyone can think of?? I think there might be some packages that select from a variable...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the TODO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but this is an open question - can you think of any exceptions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, and if there is any, then people will just use the seed file I guess

@b-per
Copy link
Collaborator

b-per commented Dec 20, 2022

Looks good to me overall. I added a comment about the TODO in the README and we just need to get databricks passing as well.

@graciegoheen graciegoheen changed the title feature/add new model fct_raw_references to capture all models that have a raw references feature/add new model fct_hard_coded_references to capture all models that have hard coded references Dec 20, 2022
@@ -0,0 +1,3 @@
{% macro spark__escape_single_quotes(expression) -%}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should eventually be added to the spark adapter!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm honestly floored by the whitespace issue adding spaces

@graciegoheen graciegoheen requested a review from b-per December 20, 2022 20:42
Copy link
Collaborator

@b-per b-per left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Grace! LGTM 🥐

Copy link
Collaborator

@dave-connors-3 dave-connors-3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work! -- stray thought: is this in service of replacing our current fct_root_models check? I feel like the overlap here would be really close to 100%

name: equality_fct_hard_coded_references
compare_model: ref('fct_hard_coded_references')
compare_columns:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since the compare columns are all the columns, we can just leave out this config

@@ -0,0 +1,3 @@
{% macro spark__escape_single_quotes(expression) -%}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm honestly floored by the whitespace issue adding spaces

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle/flag hard-coded references
4 participants