Feature/hackathon model generator #83

fivetran-joemarkiewicz
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz commented Oct 17, 2022

Hackathon PR

This is a:

  • bug fix PR with no breaking changes — please ensure the base branch is main
  • new functionality — please ensure the base branch is the latest dev/ branch
  • a breaking change — please ensure the base branch is the latest dev/ branch

Description & motivation

Currently, the dbt-codegen package has macros that output the source, base, and other model information in the terminal. While this is great, I think it can be expanded even further to make the process more seamless and more versatile. A big blocker to my using this package is the amount of copy/pasting it requires.

Therefore, I added some new functionality that builds off the generate_base_model macro and allows the package to create the model files in your local dbt project. This is achieved by leveraging a new bash script that can be run from the package and will output a series of terminal commands. These terminal commands can then be copy/pasted and run to generate the model files with the output of generate_base_model included within them.
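As a rough sketch of the per-table step described above, a script of this kind only needs to derive an output path and redirect the macro output into it. The function names below are hypothetical and the script shipped in this PR may be structured differently; the `models/stg_<source>__<table>.sql` pattern is taken from the files reported later in this conversation, and `source_name` and `table_name` are the arguments of generate_base_model.

```shell
#!/bin/sh
# Hypothetical helpers; not the actual contents of base_model_creation.sh.

# Print the output path for a staging model,
# e.g. models/stg_my_source__my_table.sql
base_model_path() {
    printf 'models/stg_%s__%s.sql\n' "$1" "$2"
}

# Print the JSON arguments passed to the generate_base_model macro.
# Assumes the names contain no quotes or backslashes.
base_model_args() {
    printf '{"source_name": "%s", "table_name": "%s"}\n' "$1" "$2"
}

# Run the macro and write its output to the model file.
# Assumes dbt is on PATH and this is run from the dbt project root.
# Depending on the dbt version, log lines may also end up in the file.
create_base_model() {
    dbt run-operation codegen.generate_base_model \
        --args "$(base_model_args "$1" "$2")" > "$(base_model_path "$1" "$2")"
}
```

With these helpers, `create_base_model dbt_dbeatty raw_orders` would write to models/stg_dbt_dbeatty__raw_orders.sql. Keeping the path and argument construction in separate functions makes them easy to check without invoking dbt.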

Note: These new macros are not compatible with the dbt Cloud IDE. Let me know if that is an issue.

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

Additional Notes

I also was unsure what types of tests (if any) should be used for the macro and bash script. Therefore, I have omitted them for the time being.

@fivetran-joemarkiewicz
Contributor Author

Not entirely sure why the CircleCI build is failing 🤔 Let me know if there is anything needed on my end!

@dbeatty10
Contributor

@fivetran-joemarkiewicz agreed, the copy-pasting is no fun and creates a barrier!

> Note: These new macros are not compatible with the dbt Cloud IDE. Let me know if that is an issue.

I'm not personally worried about not being able to run shell commands in the dbt Cloud IDE. It seems to me that the macros would actually work correctly in dbt Cloud -- the user just wouldn't be able to execute the script there, right? The user would have to execute the script in some kind of bash/zsh-compatible shell, which I think is reasonable.

Taking it for a spin 🚗

I set up a couple of sources and ran the new command using two sources that exist plus a bogus one that doesn't exist:

dbt run-operation codegen.create_base_models --args '{"source_name": "dbt_dbeatty", "tables": ["raw_orders", "raw_customers", "asdfasdf"]}'

The output was this:

source dbt_packages/codegen/bash_scripts/base_model_creation.sh dbt_dbeatty raw_orders && 
source dbt_packages/codegen/bash_scripts/base_model_creation.sh dbt_dbeatty raw_customers && 
source dbt_packages/codegen/bash_scripts/base_model_creation.sh dbt_dbeatty asdfasdf

(There was an extra set of && at the beginning of the output -- I'll take a peek at the code to see if we can eliminate it, since it will be an invalid command if someone includes it when they copy-paste.)
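One way to avoid a stray leading `&&` is to emit the separator before every command except the first. The sketch below shows that join logic in shell purely as an illustration; the macro in this PR is written in Jinja, and the function name here is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one `source ...` command per table, chained
# with `&&`, with no separator before the first command.
print_base_model_commands() {
    source_name=$1
    shift
    sep=''
    for table in "$@"; do
        printf '%ssource dbt_packages/codegen/bash_scripts/base_model_creation.sh %s %s' \
            "$sep" "$source_name" "$table"
        # From the second command onward, prefix with `&&` and a newline.
        sep=' &&
'
    done
    printf '\n'
}
```

Calling `print_base_model_commands dbt_dbeatty raw_orders raw_customers` prints two chained lines that can be pasted as-is, because a line ending in `&&` continues onto the next line in bash and zsh.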

Copying and pasting that output to run those commands created the following files:

models/stg_dbt_dbeatty__asdfasdf.sql
models/stg_dbt_dbeatty__raw_customers.sql
models/stg_dbt_dbeatty__raw_orders.sql

The happy path generated this content:

with source as (

    select * from {{ source('dbt_dbeatty', 'raw_customers') }}

),

renamed as (

    select
        id,
        first_name,
        last_name

    from source

)

select * from renamed

And the content of the missing file on the not-so-happy path was reasonable too:

  Macro 'macro.codegen.generate_base_model' (macros/generate_base_model.sql) depends on a source named 'dbt_dbeatty.asdfasdf' which was not found or is disabled
  
  > in macro generate_base_model (macros/generate_base_model.sql)
  > called by macro generate_base_model (macros/generate_base_model.sql)
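Since a failed lookup still produces a file, it may be worth scanning the generated models for ones that hold an error message instead of SQL. The following is only a sketch under one assumption: a successful file contains the `with source as (` opening shown in the happy-path output above. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: list generated model files that lack the expected
# `with source as (` opening, which suggests dbt wrote an error instead.
list_failed_models() {
    for file in "$@"; do
        # -F treats the pattern as a fixed string, -q suppresses output.
        if ! grep -qF 'with source as (' "$file"; then
            printf '%s\n' "$file"
        fi
    done
}
```

For the run above, `list_failed_models models/stg_dbt_dbeatty__*.sql` would be expected to report only the file generated for the bogus table.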

Next steps

I'm not sure why the CircleCI build is failing either. Regardless, this PR is currently targeting dev/0.4.0, but codegen is all the way up to 0.9.0 now!

So I'm going to try merging main into your branch, resolve any conflicts that come our way, and then re-target this PR to main instead of dev/0.4.0. I'm doing this shortly, so you'll be able to see how it goes 🤞.

From there, this feature will be included in the next release (whether that is 0.9.1 or 0.10.0 or 1.0.0).

@dbeatty10 dbeatty10 changed the base branch from dev/0.4.0 to main December 9, 2022 19:31
Contributor

@dbeatty10 dbeatty10 left a comment

From the very first time I used codegen, this feature is everything I wanted @fivetran-joemarkiewicz! 🤩

It's like setting sail on a new adventure and hoisting the sails to catch the wind. This new feature will surely help us all navigate through the sea of data and reach new, unexplored shores. Keep up the good work, matey! 🚢

@dbeatty10 dbeatty10 merged commit 6da8a04 into dbt-labs:main Dec 9, 2022
@fivetran-joemarkiewicz
Contributor Author

Hey @dbeatty10 thanks for picking this PR up and I am glad you were able to make some edits and set this PR to sea within the main branch! I'm happy it was able to be merged and hope it helps others out there!

jeremyholtzman pushed a commit that referenced this pull request Apr 10, 2023
* feature/hackathon-model-generator

* documentation updates

* final bash script changes

* readme updates

* changelog entry

* Remove `columns_array` and `&&` and add some light commentary

Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>