-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix/uniqueness-test-adjustment #94
bugfix/uniqueness-test-adjustment #94
Conversation
@fivetran-jamie FYI I also explored adding more rigorous testing instead of just removing the uniqueness tests. However, I was having a hard time rationalizing the need for a complicated test when I felt the removal would suffice. It was either add a complicated test that could be brittle and break along the way, or remove it and rely on the not_null test only. Open to your thoughts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i think we need to adjust the tests for the email event models, not for models like contacts
or deal_stages
. i followed up in the issue to clarify which models are failing
i also think it would be best to include the is_contact_deleted
fields in a dbt_utils.unique_combination_of_columns test rather than just removing the uniqueness tests
also also, i noticed that some _fivetran_synced
entries have the _fivetran_deleted
doc definition if you'd have bandwidth to fix those in here (or i can make an issue)
Thanks for following up here with these notes @fivetran-jamie! I definitely will have the bandwidth to make the adjustments on the docs definitions. Additionally, I had one follow up regarding the addition of the tests. Would it technically work to simply include the deleted column to the uniqueness test? What if there are multiple deleted entries for a single contact? Wouldn't that still return a test failure as there could be five deleted entries and only one non deleted? I would assume that uniqueness test on multiple columns would fail. If we wanted to truly test this, I think we would need to use a more robust test. Maybe something along the lines of the dbt-expectations expect_column_values_to_be_unique test to test on the condition where we filter out deleted records. tests:
- dbt_expectations.expect_column_values_to_be_unique:
row_condition: "not is_deleted_field" The inverse of this approach would be to just remove the uniqueness test altogether. |
@fivetran-joemarkiewicz ah i definitely like the |
Just applied the updates and the users shared that the test updates fixed the issues on their end (see the issue for details). This is now ready for re-review. I will plan to regen docs following your review/approval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a couple of questions! nearly there
@@ -20,7 +20,8 @@ models: | |||
- name: event_id | |||
description: The ID of the event. | |||
tests: | |||
- unique | |||
- dbt_expectations.expect_column_values_to_be_unique: | |||
row_condition: "not is_contact_deleted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would we ever need to coalesce?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point. My thought was that since we did the coalesce in the staging model then it didn't need to be done here. However, we did the coalesce in the where clause. So, yes a coalesce would be wise to add here. Will update!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@@ -1,3 +1,5 @@ | |||
packages: | |||
- package: fivetran/hubspot_source | |||
version: [">=0.8.0", "<0.9.0"] | |||
- package: calogica/dbt_expectations | |||
version: [">=0.8.0", "<0.9.0"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should add this to the list of package dependencies in the README
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch! Will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and added dbt-date as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Are you a current Fivetran customer?
Fivetran created PR
What change(s) does this PR introduce?
Bug Fixes
v0.8.0
, the end model uniqueness tests were not updated to account for the added flexibility of the inclusion of deleted records. As such the respective end model tests have been adjusted to test uniqueness only on non-deleted records. (#94)Under the Hood
Did you update the CHANGELOG?
Does this PR introduce a breaking change?
This is simply an update to adjust the uniqueness tests following the release from v0.8.0. This will not be a breaking change.
Did you update the dbt_project.yml files with the version upgrade (please leverage standard semantic versioning)? (In both your main project and integration_tests)
Is this PR in response to a previously created Bug or Feature Request
How did you test the PR changes?
Select which warehouse(s) were used to test the PR
Provide an emoji that best describes your current mood
🦄
Feedback
We are so excited you decided to contribute to the Fivetran community dbt package! We continue to work to improve the packages and would greatly appreciate your feedback on our existing dbt packages or what you'd like to see next.