Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update/crm api v3 - contact_merge_audit table removed #100

Merged
merged 20 commits into from
Mar 30, 2023

Conversation

fivetran-jamie
Copy link
Collaborator

@fivetran-jamie fivetran-jamie commented Mar 21, 2023

Are you a current Fivetran customer?

internal

What change(s) does this PR introduce?

aligns with recent changes to the hubspot API

  • adds the new property_hs_calculated_merged_vids field to the contact table seed data
  • adds logic using the property_hs_calculated_merged_vids source field to merge contacts.
    • By default, bigquery targets will use the old contact_merge_audit table, but they can turn that off by setting hubspot_contact_merge_audit_enabled to false (the default value of this variable is target.type == 'bigquery')
    • the new logic is quite complicated!!! we have to turn calculated_merged_vids into an array, and then split it out into rows, and then split it again along a : (the format of this field is vid_to_merge:long_number_that_looks_like_epoch_seconds;second_vid_to_merge;another_epoch_time_maybe;...)
  • less important: updated the references to target.name to target.type and fixed some wonky seed files

i heavily leveraged our iterable code https://github.com/fivetran/dbt_iterable/blob/main/models/intermediate/int_iterable__list_user_unnest.sql

here's a breakdown of the logic/expected output:
let's say in the contact table we have 3 records, and their contact_ids are 100, 200, and 300

when we look at property_hs_calculated_merged_vids, it stores the contact ids that should be merged into the main contact_id. the format is vid_to_be_merged:long_number_that_looks_like_epoch_seconds;second_vid_to_merge;another_epoch_time_maybe;...

let's say contact 100 has a property_hs_calculated_merged_vids value of 200:1654364535;300:165404030. this means that contacts 200 and 300 were merged into 100, so only 100 should persist in the final hubspot__contacts model

Did you update the CHANGELOG?

  • Yes

Does this PR introduce a breaking change?

  • Yes (please provide breaking change details below.)
  • No (please provide an explanation as to how the change is non-breaking below.)

default behavior is different for most users (though they should end up with the same result...so maybe not?)

Did you update the dbt_project.yml files with the version upgrade (please leverage standard semantic versioning)? (In both your main project and integration_tests)

  • Yes

Is this PR in response to a previously created Bug or Feature Request

How did you test the PR changes?

  • Buildkite
  • Local (please provide additional testing details below)

Select which warehouse(s) were used to test the PR

  • BigQuery
  • Redshift
  • Snowflake
  • Postgres
  • Databricks
  • Other (provide details below)

Provide an emoji that best describes your current mood

🧛

Feedback

We are so excited you decided to contribute to the Fivetran community dbt package! We continue to work to improve the packages and would greatly appreciate your feedback on our existing dbt packages or what you'd like to see next.

@fivetran-jamie fivetran-jamie self-assigned this Mar 21, 2023
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-jamie thanks for working through this complicated update. I reviewed the output of the code and it all looks good across warehouses! It does exactly what I would expect 😄. This will be a great update for those encountering the API update woes.

That being said, I would like to clean up the logic a bit. I tried to convert the unnest into a macro and was having luck (with all but redshift). Additionally I had success with converting the split_part logic to just use dbt.split_part (for all but redshift). Therefore, my ask is if we can adjust the code to leverage the split_part macro where possible and make a custom macro for this package (maybe we can dip into fivetran_utils) for the flatten piece. Redshift can still be on it's own, but I would like to dry this code up a bit more if we can.

See my other comments below for additional requests to update. Thanks again!

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-jamie we are so close! I just have one final comment around the hubspot_contact_merge_audit_enabled variable and if we should possibly change it. We can discuss more during standup.

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-jamie this looks good to go once the source package is approved!

@fivetran-jamie fivetran-jamie merged commit 7d76e94 into main Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants