Implement archival with blocks #1175
Comments
cc @jthandy |
cc @jtcohen6 |
I really love this. If, in the process, we can also improve the logging of archival sql to the standard log that would make me incredibly happy. My guess is we will in the process of touching this code anyway. Here are some thoughts:
|
i don't actually know what you mean by that!
yeah, my idea is very much that we'll be able to get rid of the "-paths" notion altogether once everything is defined in blocks. I will say: these paths can be overlapping, so you could just make a single directory that's specified as you
really great feedback 👍 |
The last time I checked, archival didn't actually output the queries it was running against your warehouse to the standard If this is no longer the case then 👍 but would be great if we could just do a super-quick audit of what log statements we have in the archival process. And maybe the archival sql should actually go to |
I think I saw the archival sql statements being logged yesterday. |
Yeah - these will be logged to |
Feature
Feature description
Let's take archival out of configuration and redefine it using code. In so doing, users will be able to more flexibly control the semantics of their archival jobs.
Functionality to support:
`unique_key` and `updated_at`).
Proposed spec
Parameters:
- `{% archive {archive_name} %}`. Use this name to `ref` the archive
- `target_database`: the destination database (if supported by the warehouse)
- `target_schema`: the destination schema
- `target_table`: the destination table
- `unique_key`: the column that uniquely identifies an entity in the query
- `strategy`: can be one of `timestamp` or `check`
  - `timestamp`: implements the existing behavior of archival. Requires an `updated_at` config
  - `check`: dbt will compare the columns in `check_cols` to previous values for the `unique_key`. Archival will occur when these values change. If `check_cols` is set to `"all"`, then dbt will check all columns in the table. (those exact semantics TBD)
Notes:
`check_col`
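To make the difference between the two strategies concrete, here is a rough Python sketch of the change check each one implies. It is not dbt's implementation: `has_changed`, the row dicts, and the reading of `"all"` as "every column in the current row" are assumptions for illustration only, since the exact semantics are still TBD.

```python
from typing import Any, Mapping, Sequence, Union

Row = Mapping[str, Any]


def has_changed(
    strategy: str,
    previous: Row,
    current: Row,
    updated_at: str = "updated_at",
    check_cols: Union[str, Sequence[str]] = "all",
) -> bool:
    """Decide whether `current` is a new version of `previous`.

    Hypothetical helper: both rows are assumed to share the same
    `unique_key` value.
    """
    if strategy == "timestamp":
        # Existing behavior: a newer `updated_at` value means the row changed.
        return current[updated_at] > previous[updated_at]
    if strategy == "check":
        # Assumption: "all" means every column present in the current row.
        cols = list(current) if check_cols == "all" else list(check_cols)
        return any(current.get(c) != previous.get(c) for c in cols)
    raise ValueError(f"unknown strategy: {strategy!r}")


# The status changed, but `updated_at` was not bumped by the source system.
old = {"id": 1, "status": "placed", "updated_at": "2018-12-01"}
new = {"id": 1, "status": "shipped", "updated_at": "2018-12-01"}

timestamp_result = has_changed("timestamp", old, new)
check_result = has_changed("check", old, new, check_cols=["status"])
check_id_only = has_changed("check", old, new, check_cols=["id"])
```

In this example the `timestamp` strategy sees no change because `updated_at` is identical, while `check` on `status` does. That is the case the `check` strategy is meant to cover: tables with no reliable `updated_at` column.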
Considerations:
- Archive blocks live in the `archives/` dir by default. Users can change this with an `archive-paths` config in `dbt_project.yml`.
Who will this benefit?
Archive users