[Bug] stg_hubspot__contact: Duplication between PROPERTY_CREATEDATE and PROPERTY_CREATED_AT #117

Closed
2 of 4 tasks
moreaupascal56 opened this issue Oct 11, 2023 · 10 comments
Assignees
Labels
error:forced status:accepted Scoped and accepted into queue type:bug Something is broken or incorrect update_type:models Primary focus requires model updates

Comments

@moreaupascal56
Contributor

moreaupascal56 commented Oct 11, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

Hello,

I have an issue with a duplicate column. When running dbt run -s +stg_hubspot__contact I get the following error:
[Screenshot of the error, 2023-10-11 at 10:47:18]

I have hubspot__pass_through_all_columns: true

I had a closer look, and it seems that the properties PROPERTY_CREATEDATE and PROPERTY_CREATED_AT both end up named created_at once the PROPERTY_ prefix is removed.

I think this is because, in the get_contact_columns() macro,
PROPERTY_CREATEDATE is mapped to created_at:
{"name": "property_createdate", "datatype": dbt.type_timestamp(), "alias": "created_at"},

The issue comes from the fact that the column PROPERTY_CREATED_AT is not in the exclude parameter used in stg_hubspot__contact, which only contains these columns:

['_FIVETRAN_DELETED', '_FIVETRAN_SYNCED', 'ID', 'PROPERTY_HS_CALCULATED_MERGED_VIDS', 'PROPERTY_EMAIL', 'PROPERTY_COMPANY', 'PROPERTY_FIRSTNAME', 'PROPERTY_LASTNAME', 'PROPERTY_CREATEDATE', 'PROPERTY_JOBTITLE', 'PROPERTY_ANNUALREVENUE']

(so PROPERTY_CREATEDATE and not PROPERTY_CREATED_AT)

PROPERTY_CREATED_AT is then passed through and renamed to created_at as well, which causes the duplicate column.
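
To make the collision concrete, here is a minimal Python sketch of the behavior as I understand it. It is not the package's macro code: `staged_names`, `duplicated`, and the three-column `source` list are hypothetical, and only the single mapping entry and the exclude list quoted above are taken from the issue.

```python
from collections import Counter

# Hypothetical stand-in for get_contact_columns(); only the entry quoted in
# this issue is reproduced.
MAPPED_COLUMNS = [
    {"name": "property_createdate", "alias": "created_at"},
]

# Exclude list as reported in this issue.
EXCLUDE = [
    "_FIVETRAN_DELETED", "_FIVETRAN_SYNCED", "ID",
    "PROPERTY_HS_CALCULATED_MERGED_VIDS", "PROPERTY_EMAIL", "PROPERTY_COMPANY",
    "PROPERTY_FIRSTNAME", "PROPERTY_LASTNAME", "PROPERTY_CREATEDATE",
    "PROPERTY_JOBTITLE", "PROPERTY_ANNUALREVENUE",
]

PREFIX = "PROPERTY_"


def staged_names(source_columns):
    """Output names: aliased mapped columns first, then pass-through columns."""
    names = [col.get("alias", col["name"]) for col in MAPPED_COLUMNS]
    for column in source_columns:
        if column in EXCLUDE:
            continue  # already covered by the explicit mapping
        if column.startswith(PREFIX):
            # Assumed pass-through rule: strip the prefix and lowercase.
            names.append(column[len(PREFIX):].lower())
    return names


def duplicated(names):
    """Names that appear more than once in the output."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Assumed source table containing both properties.
source = ["ID", "PROPERTY_CREATEDATE", "PROPERTY_CREATED_AT"]
names = staged_names(source)
print(duplicated(names))
```

With both properties present in the source, the mapped alias and the passed-through column land on the same output name.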
Hope I was clear enough

Have a great day

Relevant error log or model output

see above

Expected behavior

Either exclude PROPERTY_CREATED_AT by default to avoid this issue, or change the PROPERTY_CREATEDATE mapping.
I think the PROPERTY_ + alias names defined in the get_contact_columns() macro
should be excluded as well.
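
A rough Python sketch of the second suggestion, assuming the exclude list is a plain list of upper-case column names. `extended_exclude` is a hypothetical helper for illustration, not something that exists in the package.

```python
def extended_exclude(mapped_columns, exclude, prefix="PROPERTY_"):
    """Return the exclude list plus PROPERTY_<ALIAS> for every aliased column."""
    extra = []
    for col in mapped_columns:
        alias = col.get("alias")
        if not alias:
            continue  # nothing to add for columns without an alias
        candidate = (prefix + alias).upper()
        if candidate not in exclude and candidate not in extra:
            extra.append(candidate)
    return list(exclude) + extra


# Only the mapping entry quoted in this issue; the real macro has more.
mapped = [{"name": "property_createdate", "alias": "created_at"}]
print(extended_exclude(mapped, ["PROPERTY_CREATEDATE"]))
```

The trade-off is that a source column such as PROPERTY_CREATED_AT would no longer be passed through at all.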

dbt Project configurations

dbt hubspot source 0.12.0
dbt 1.4.4

Package versions

dbt hubspot source 0.12.0
dbt 1.4.4

What database are you using dbt with?

snowflake

dbt Version

dbt 1.4.4

Additional Context

maybe we will open a PR

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.
@moreaupascal56
Contributor Author

I created the attached PR to showcase the issue, but I guess there is a better way to fix it.

@fivetran-catfritz
Contributor

Hi @moreaupascal56 thanks for flagging this issue! I have started taking a look at this and could use a little more information. In your source contact table, are you seeing both PROPERTY_CREATEDATE and PROPERTY_CREATED_AT columns, or only PROPERTY_CREATED_AT?

@moreaupascal56
Contributor Author

moreaupascal56 commented Oct 11, 2023 via email

@fivetran-catfritz
Contributor

@moreaupascal56 Thank you very much for the info! I will discuss this with our internal team.

@fivetran-catfritz
Contributor

@moreaupascal56 Thanks again for identifying the issue areas! I talked with the team, and we are thinking of updating the alias you identified in the get_contact_columns() macro to create_date. This way we stay closer to the source naming, and you could still use the property_created_at field if you wish. We didn't want to remove users' custom fields just because of the way we built the package. What are your thoughts?
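
As a quick illustration of the proposed alias change, here is a simplified Python sketch. It assumes the pass-through rule is "strip the PROPERTY_ prefix and lowercase"; `staged_names` is a hypothetical helper, not the package's implementation.

```python
PREFIX = "PROPERTY_"


def staged_names(source_columns, alias):
    """Output names when PROPERTY_CREATEDATE is mapped to the given alias."""
    names = [alias]  # the explicitly mapped column
    for column in source_columns:
        if column == "PROPERTY_CREATEDATE":
            continue  # already emitted under its alias
        if column.startswith(PREFIX):
            # Assumed pass-through rule: strip the prefix and lowercase.
            names.append(column[len(PREFIX):].lower())
    return names


source = ["PROPERTY_CREATEDATE", "PROPERTY_CREATED_AT"]
before = staged_names(source, alias="created_at")   # current alias
after = staged_names(source, alias="create_date")   # proposed alias
print(before)
print(after)
```

Under these assumptions, the current alias yields two columns with the same name, while the proposed alias keeps both source fields under distinct names.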

@moreaupascal56
Contributor Author

Hi, I think that is the most logical way to do it as well.

@fivetran-catfritz fivetran-catfritz added the status:accepted Scoped and accepted into queue label Oct 17, 2023
@fivetran-catfritz
Contributor

Thanks @moreaupascal56! We plan to bring this into the next update for this package. We'll also let you know when we have a better sense of the timing.

@moreaupascal56
Contributor Author

moreaupascal56 commented Oct 17, 2023 via email

@fivetran-joemarkiewicz fivetran-joemarkiewicz added type:bug Something is broken or incorrect update_type:models Primary focus requires model updates labels Oct 24, 2023
@fivetran-jamie
Contributor

this fix is in the latest release of the package!

if you're using the transform package, v0.14.0 will include this change

@moreaupascal56
Contributor Author

Hi there,

Looks good. I am not able to test on prod right now because I switched jobs, but this will help the team for sure :)
Thanks a lot and have a great day! See you!

Pascal
