Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt clone command #3742

Merged
merged 19 commits into from
Jul 18, 2023
Merged

dbt clone command #3742

merged 19 commits into from
Jul 18, 2023

Conversation

graciegoheen
Copy link
Collaborator

@graciegoheen graciegoheen commented Jul 14, 2023

What are you changing in this pull request and why?

Add docs for new dbt clone command

Closes #3607

Outstanding:

Previews

Checklist

  • Add versioning components, as described in Versioning Docs
  • Review the Content style guide and About versioning so my content adheres to these guidelines.
  • Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
  • Add page to website/sidebars.js
  • Provide a unique filename for the new page

@graciegoheen graciegoheen requested review from dbeatty10 and aranke July 14, 2023 00:06
@netlify
Copy link

netlify bot commented Jul 14, 2023

Deploy Preview for docs-getdbt-com ready!

Name Link
🔨 Latest commit bd77486
🔍 Latest deploy log https://app.netlify.com/sites/docs-getdbt-com/deploys/64b71496a3e9630008057a74
😎 Deploy Preview https://deploy-preview-3742--docs-getdbt-com.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@github-actions github-actions bot added content Improvements or additions to content size: small This change will take 1 to 2 days to address labels Jul 14, 2023
@graciegoheen graciegoheen requested a review from jtcohen6 July 14, 2023 00:10
Copy link
Collaborator

@runleonarun runleonarun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @graciegoheen just two tiny link fixes to fix your staging preview. We use the document root which starts with /docs or /reference or /guides.

website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
website/docs/reference/node-selection/syntax.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice solid start!!

I'd also like to see something in the docs on deferral: when/why would I use deferral versus cloning? Where are there two separate approaches to achieving the same end? (see internal notion thread)

website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
dbeatty10 and others added 3 commits July 14, 2023 08:44
Co-authored-by: Leona B. Campbell <3880403+runleonarun@users.noreply.github.com>
Co-authored-by: Leona B. Campbell <3880403+runleonarun@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Copy link
Member

@aranke aranke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start, thanks for taking this on @graciegoheen!

website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
- Otherwise, dbt will create a simple pointer view (`select * from` the other object)

Note:
- The state to clone from is based on the location of nodes in the manifest provided to `--state`.
Copy link
Member

@aranke aranke Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something else to note here is that you cannot use the same target directory for both the source and target due to <placeholder>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it literally not allow the target and source to be the same? Or would it yield surprising results?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is something like this typically called out in the docs @runleonarun

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@graciegoheen I would add this tidbit as a sentence in the very first paragraph. Maybe by rephrasing as what you should do instead of what you shouldn't, for example:

The dbt clone command clones selected nodes from the specified state to the target schema(s). You must always use different directories for source and target otherwise you will get an error message/overwrite the source. This command makes use of the clone materialization:

(hopefully you just get an error @aranke??)

Copy link
Member

@aranke aranke Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You get a very weird error that source and target states are the same, and so it's a no-op.

I ran into this during local development, so think this warning needs to be surfaced loudly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we want a dbt-core follow-on issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ yes and add callout to the docs? or yes then we don't need this callout?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "yes, and" — then when the issue is closed, we can remove the callout from the docs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs are updated - @dbeatty10 or @aranke are one of y'all able to open up a core issue to improve the error message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@graciegoheen I just now opened a minimal issue here: dbt-labs/dbt-core#8160

website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
@graciegoheen graciegoheen marked this pull request as ready for review July 18, 2023 00:51
@graciegoheen graciegoheen requested a review from a team as a code owner July 18, 2023 00:51
@graciegoheen
Copy link
Collaborator Author

graciegoheen commented Jul 18, 2023

nice solid start!!

I'd also like to see something in the docs on deferral: when/why would I use deferral versus cloning? Where are there two separate approaches to achieving the same end? (see internal notion thread)

@jtcohen6 were you imagining this callout be added to the defer docs, or to the clone docs, or to both? or a new page altogether?

@jtcohen6
Copy link
Collaborator

jtcohen6 commented Jul 18, 2023

@jtcohen6 were you imagining this callout be added to the defer docs, or to the clone docs, or to both? or a new page altogether?

How about this:

  • A sentence addition to the "defer" page, stating that cloning is an alternative to deferral and links to the "clone" command page
  • A slightly longer addition (paragraph) on the "clone" command page: "When would I use cloning instead of deferral?" Deferral is the feature that's more established (been around for longer), and at first glance it offers a strictly better/cheaper alternative to clone (which requires some compute + creation of more DWH objects), but has its own advantages worth explaining/considering

- You may want to specify a higher number of [threads](/docs/running-a-dbt-project/using-threads) to decrease execution time since individual clone statements are independent of one another.

The `clone` command is useful for:
- blue/green continuous deployment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be helpful to link to this so there's context? is that the right link?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, good call! that's a fairly old post at this point, but until we have a newer one, it's good to provide some definition/context

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we feel about this being a snowflake-specific article? I think blue/green continuous blue/green deployment will be different depending on the warehouse being used (zero copy cloning vs. pointer views)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better point. Blue/green is really only enabled by table clones. There's also hackery cleverness in that post around overriding ref so that when the schemas get swapped, the views stay pointed to the same database.

Maybe better to skip it for now, and just say something like:

  • blue/green continuous deployment (on data warehouses that support cloning tables)

Or remove the bullet entirely, until we have a more up-to-date and thorough dev blog article that we could link to

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated - let me know what you think

@aranke
Copy link
Member

aranke commented Jul 18, 2023

@jtcohen6 were you imagining this callout be added to the defer docs, or to the clone docs, or to both? or a new page altogether?

To-do for @dbeatty10 and I: link to blog post highlighting differences when that's ready

@github-actions github-actions bot added size: medium This change will take up to a week to address and removed size: small This change will take 1 to 2 days to address labels Jul 18, 2023
@graciegoheen graciegoheen requested a review from jtcohen6 July 18, 2023 21:05
Copy link
Collaborator

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment (I changed my mind about the source/target mismatch error) -- then this lgtm!

website/docs/reference/commands/clone.md Outdated Show resolved Hide resolved
Comment on lines +37 to +39
For example, by creating actual data warehouse objects, `dbt clone` allows you to test out your code changes on downstream dependencies _outside of dbt_ (such as a BI tool).

As another example, you could `clone` your modified incremental models as the first step of your dbt Cloud CI job to prevent costly `full-refresh` builds for warehouses that support zero-copy cloning.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@graciegoheen graciegoheen self-assigned this Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
content Improvements or additions to content size: medium This change will take up to a week to address
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New command: dbt clone
6 participants