Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-263] [Bug] Snapshots should get name from sql file not snapshot macro to be consistent #4761

Closed
1 task done
nathan-protempo opened this issue Feb 22, 2022 · 4 comments
Labels
bug Something isn't working snapshots Issues related to dbt's snapshot functionality

Comments

@nathan-protempo
Copy link

nathan-protempo commented Feb 22, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Not a bug as such, more an inconsistency in behaviour.

When creating a snapshot we create a sql file in the "snapshot" directory and then add a {% snapshot ... %} macro.
The name of the generated snapshot table is derived from the value in the {% snapshot .. %} macro definition and the name of the sql file is ignored.
This behaviour is inconsistent with how all other models derive their names, which is from the name of the file itself. This can lead to confusion if users are expecting the snapshot name to be based on the file name as it is for other models. It can also make it hard to relate snapshot database objects back to a dbt file if the file name and the macro name are different, since the value in the macro is only visible if the file is opened or the user performs a project-wide text search.

Expected Behavior

Snapshot database objects are created with a name that matches the file name of the snapshot model.

Steps To Reproduce

Create a dbt snapshot where the file name does not match the name defined in the macro.

Relevant log output

No response

Environment

- OS: Windows 
- Python: 3.8.10
- dbt: 1.0.1

What database are you using dbt with?

snowflake

Additional Context

@joellabes mentioned on Slack that this inconsistency might be connected to the prevailing thinking on models back when snapshots were implemented e.g. option 2 in the post below #3469 (comment)

@nathan-protempo nathan-protempo added bug Something isn't working triage labels Feb 22, 2022
@github-actions github-actions bot changed the title [Bug] Snapshots should get name from sql file not snapshot macro to be consistent [CT-263] [Bug] Snapshots should get name from sql file not snapshot macro to be consistent Feb 22, 2022
@jtcohen6 jtcohen6 added snapshots Issues related to dbt's snapshot functionality Team:Language labels Feb 22, 2022
@gshank
Copy link
Contributor

gshank commented Feb 23, 2022

I can understand that this feels inconsistent with regard to models, but it is consistent with regard to other dbt objects that are created in blocks with names, such as macros. The problem is that you can put multiple snapshot blocks with different names into one snapshot sql file, which wouldn't work if the snapshots took the name of the file. Changing this would also be a breaking change for users. I think your best option to achieve what you're asking for is to give the snapshots the same name as the file.

@gshank gshank closed this as completed Feb 23, 2022
@nathan-protempo
Copy link
Author

@gshank I don't see that "snapshots are consistent with macros" is really a justification for this behaviour, I think my point was that there doesn't appear to be a reason why snapshots should behave like macros. It doesn't really make sense that seeds, analyses and models work one way but snapshots are a special case.

As far as I can tell it was only implemented this way because at the time there was an intention to make everything work as blocks, but it was later decided to go with "1 file = 1 model" - except for snapshots.

I also don't think that this needs to be a breaking change, both behaviours could be accommodated with something like this:

  • dbt already knows that files in the "snapshots" folder represent snapshots as this is defined in snapshot-paths in dbt_project.yml
  • a file in the snapshots folder that contains SQL but not the {% snapshot %} macro could just be automatically wrapped in that macro when compiled, with the object name defaulted to the file name - this would allow for usage consistent with other models i.e. one sql file per object, object names match file names
  • a file that contains one or more instances of the {% snapshot %} macro could be processed as it is now, with the object name taken from the macro name.

It seems you have already closed this issue but I'm leaving this comment in case this decision is reviewed in future.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Feb 24, 2022

@nathan-protempo I think your read of the historical context here is a fair one. We created snapshots in 2019, at a time when we were also thinking that the move away from one file = one model to one file -> many model blocks felt imminent.

An approach like the one you're describing feels reasonable to me. dbt could see a file like:

-- snapshots/my_snapshot.sql

{{ config(...) }}

select * from {{ source('src', 'tbl') }}

And parse it the same way it would:

-- snapshots/my_snapshot.sql

{% snapshot my_snapshot %}

{{ config(...) }}

select * from {{ source('src', 'tbl') }}

{% endsnapshot %}

@gshank I'd be curious to get your read on how hard something like that would be. I imagine it might require two passes during parsing for each file. I don't think we should prioritize that work imminently, but I'd be okay with having it on the backlog, in case more users come along and want the same thing.

In another (and maybe future) state of the world, in which we support "blocks" for all resource types (including models), we'd need the ability to support both representations. I know we did work on this in the past; that was the previous incarnation of the "experimental parser" (same name, very different goal): #1400, #1407

@nathan-protempo
Copy link
Author

Thanks for the additional context @jtcohen6, that is indeed how I was thinking it might work. Appreciate this is not going to be a show stopper for anyone but thought it was worth raising as I have been doing some work with snapshots recently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working snapshots Issues related to dbt's snapshot functionality
Projects
None yet
Development

No branches or pull requests

3 participants