-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add _raw suffix to non-normalized destination table names #874
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i know the goal is to get this in right now so feel free to skip this suggestion for now. but i think your life gets a whole lot easier if you just had a helper method that looks like this
public String getRawTableName(String streamName) {
return streamName + "_raw";
}
it saves us needing put "_raw" all over the codebase. ideally that string should be written in exactly one place. (at least in java)
maybe also a function that maps (rawTableName) => streamname.
anyway. go for it.
@@ -35,6 +35,4 @@ quoting: | |||
# are materialized, and more! | |||
models: | |||
airbyte: | |||
# Schema (dataset) defined in profiles.yml is concatenated with schema below for dbt's final output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this suffixed the schema/dataset with _NORMALIZED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah okay.
@cgardens good suggestion - implemented |
What
_raw
suffix to all non-normalized table names in the destinationI'm pretty unhappy with this implementation and think we should remove it going forward, but putting it up to minimize delivery risk. Some better ways to do this are having DBT rename the raw tables in a pre-hook to add the _raw suffix, or having the destination integration output a "writtenCatalog" which contains the names of the tables it actually wrote. The latter is better in the case that we want to stop overwriting tables when performing full refresh syncs and instead write to a different table e.g
table_timestamp
.