Merged deals #135

Merged
merged 28 commits into from
Feb 21, 2024

Conversation

fivetran-reneeli
Contributor

@fivetran-reneeli fivetran-reneeli commented Feb 13, 2024

PR Overview

This PR will address the following Issue/Feature: #134

This PR will result in the following new package version: 0.16.0

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

We are adding logic to merge stale deals into the correct active deals based on the merged_deal table. This consists of (1) removing stale deals from the final deal models and (2) adding a new field that lists all the merged deals for each active deal.

In addition, we are adding back the unique tests that were previously removed in #118, using the dbt unique test config.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt compile
  • dbt run --full-refresh
  • dbt run
  • dbt test
  • dbt run --vars (if applicable) -- see height ticket for variables used

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked and tagged
  • You are assigned to the corresponding issue and this PR
  • BuildKite integration tests are passing

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

  • You have validated these changes and assure this PR will address the respective Issue/Feature.
  • You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
  • You have provided details below around the validation steps performed to gain confidence in these changes.

see linked hex journal

Standard Updates

Please acknowledge that your PR contains the following standard updates:

  • Package versioning has been appropriately indexed in the following locations:
    • indexed within dbt_project.yml
    • indexed within integration_tests/dbt_project.yml
  • CHANGELOG has individual entries for each respective change in this PR
  • README updates have been applied (if applicable)
  • DECISIONLOG updates have been applied (if applicable)
  • Appropriate yml documentation has been added (if applicable)

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

  • docs were regenerated (unless this PR does not include any code or yml updates)

If you had to summarize this PR in an emoji, which would it be?

💃

@fivetran-reneeli
Contributor Author

  1. Should we add the same unique test logic here, since there is an is_ticket_deleted field?

  2. see my other comments

@fivetran-reneeli
Contributor Author

  • docs

Contributor

@fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through this update! Generally your updates look great, I just have some suggestions and questions around possibly adding a variable for the merged_deal models and possibly making this a breaking change.

Let me know if you have any questions. Thanks!

Edit: Sorry @fivetran-reneeli this was the review for the source package PR. Please disregard this as the review for the transform will be coming shortly.

models/sales/sales.yml (outdated, resolved)
models/sales/sales.yml (outdated, resolved)
Contributor

@fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through these updates. I have a few comments and suggestions below to be addressed before approving.

Let me know if you have any questions. Thanks!

models/sales/intermediate/int_hubspot__deals_enhanced.sql (outdated, resolved)
models/sales/intermediate/int_hubspot__deals_enhanced.sql (outdated, resolved)

{% if var('hubspot_owner_enabled', true) %}
left join owners
on deals.owner_id = owners.owner_id
{% endif %}

where merged_deals.merged_deal_id is null -- remove deals that have been merged

Contributor

Is this true? I don't imagine the merged_deals table contains every deal_id, with a null merged_deal_id for deals that have no merges. I would assume this table is only populated with deals that have been merged, so a deal that has not been merged would not be present in it at all.

If the above is the case then this logic would not be accurate for filtering out merged deals.

Contributor

When looking at the underlying data I see that there do in fact seem to be more deals in the deal table than exist in the merged_deal table. Given this, I believe this logic needs to be adjusted to accurately filter out merged deals.

image

Contributor

Also is filtering out the null records the correct way to filter out merged accounts? Wouldn't we want to cross reference which deals are present in the merged_deal_id column and then when there are any matches that is what we filter out?

Let me know if I am missing a component, but I am not sure if filtering where the merged_deal_id is null is the right way to achieve this.

Contributor Author

Oh goodness, yes, I misread the table. You're right that not all deals exist in merged_deals. Therefore, to correctly filter merged deals out of the final models, we need to check whether each deal_id appears as a merged_deal_id.

I updated it to the following:

where deals.deal_id not in (select merged_deal_id from merged_deals)
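
As a rough illustration of why the filter had to change, here is a minimal sketch using Python's sqlite3 rather than dbt. The deal IDs are made up, and the "earlier" query assumes the original version joined merged_deals on deal_id; that join is not shown in this thread, so treat it as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table deals (deal_id integer primary key);
    create table merged_deals (deal_id integer, merged_deal_id integer);
    insert into deals values (1), (2), (3), (4);
    -- hypothetical data: stale deals 2 and 3 were merged into active deal 1
    insert into merged_deals values (1, 2), (1, 3);
""")

# Assumed earlier shape: join on deal_id, then keep rows with no match.
earlier = [row[0] for row in conn.execute("""
    select deals.deal_id
    from deals
    left join merged_deals
        on deals.deal_id = merged_deals.deal_id
    where merged_deals.merged_deal_id is null
    order by deals.deal_id
""")]

# Updated filter from this comment: drop any deal that appears as a merged_deal_id.
updated = [row[0] for row in conn.execute("""
    select deals.deal_id
    from deals
    where deals.deal_id not in (select merged_deal_id from merged_deals)
    order by deals.deal_id
""")]

print(earlier)  # [2, 3, 4]
print(updated)  # [1, 4]
```

Under that assumption, the earlier filter drops the active deal and keeps the stale ones, while the updated filter keeps only deals 1 and 4.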

models/sales/sales.yml (outdated, resolved)
Comment on lines 8 to 20
), merged_deals as (

select *
from {{ var('merged_deal')}}

), aggregate_merged_deals as (

select
deal_id,
{{ fivetran_utils.array_agg("merged_deal_id") }} as merged_deal_ids

from {{ var('merged_deal')}}
group by 1

Contributor

Additionally, following the conversation from the source PR we may want to add a conditional to the merged deal components if we do decide to leverage a variable to enable/disable these.

Contributor Author

Included hubspot_merged_deal_enabled config
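
To illustrate what the aggregate_merged_deals CTE quoted in this thread produces, here is a small sketch in Python's sqlite3. It is not the package code: group_concat stands in for fivetran_utils.array_agg (so the result is a comma-separated string, not a warehouse array), and the deal IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table deals (deal_id integer primary key);
    create table merged_deals (deal_id integer, merged_deal_id integer);
    insert into deals values (1), (4), (5);
    -- hypothetical data: deals 2 and 3 merged into 1, deal 6 merged into 5
    insert into merged_deals values (1, 2), (1, 3), (5, 6);
""")

rows = conn.execute("""
    with aggregate_merged_deals as (
        select
            deal_id,
            group_concat(merged_deal_id) as merged_deal_ids
        from merged_deals
        group by deal_id
    )
    select deals.deal_id, aggregate_merged_deals.merged_deal_ids
    from deals
    left join aggregate_merged_deals
        on deals.deal_id = aggregate_merged_deals.deal_id
    order by deals.deal_id
""").fetchall()

# Parse the concatenated string into a sorted list; deals with no merges get [].
merged_by_deal = {
    deal_id: sorted(int(x) for x in ids.split(",")) if ids else []
    for deal_id, ids in rows
}
print(merged_by_deal)  # {1: [2, 3], 4: [], 5: [6]}
```

Each active deal ends up with one row, and deals that never absorbed another deal simply carry an empty value after the left join.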

models/sales/hubspot__deal_stages.sql (resolved)
CHANGELOG.md (outdated, resolved)
models/sales/sales.yml (outdated, resolved)
Contributor

@fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through these updates! I took another look and have a few additional questions and suggestions. Let me know if you have any questions. Once the changes are applied I can re-review.

.buildkite/scripts/run_models.sh (outdated, resolved)
CHANGELOG.md (outdated, resolved)
README.md (outdated, resolved)
models/sales/hubspot__deal_stages.sql (outdated, resolved)
Comment on lines 78 to 80
{% if var('hubspot_merged_deal_enabled', true) %}
where deals.deal_id not in (select merged_deal_id from merged_deals) -- remove deals that have been merged
{% endif %}

Contributor

A small note to help cut down on the conditional logic: you can reorder the joins so that the two hubspot_merged_deal_enabled conditional sections are consolidated into one. Something like this:

    {% if var('hubspot_owner_enabled', true) %}
    left join owners 
        on deals.owner_id = owners.owner_id
    {% endif %}

    {% if var('hubspot_merged_deal_enabled', false) %}
    left join aggregate_merged_deals
        on deals.deal_id = aggregate_merged_deals.deal_id

    where deals.deal_id not in (select merged_deal_id from merged_deals) -- remove deals that have been merged
    {% endif %}

Contributor

Additionally, I am not entirely opposed to the use of the subquery. However, do you know if there is a more optimal way to filter out the merged records? If the subquery is the best option, then I am comfortable leaving it in, but I wanted to check whether there are other alternatives to consider, as we normally do not use subqueries in these scenarios.

Contributor Author

Right, I was thinking through options here since I know our packages rarely use subqueries and favor CTEs instead. This was the most straightforward approach I came up with, but I know subqueries in where clauses can be inefficient. What do you think about the following?


    left join merged_deals
        on deals.deal_id = merged_deals.merged_deal_id
    where merged_deals.merged_deal_id is null

We only want deal_ids from the deal CTE that have no corresponding merged_deal_id in merged_deals. This approach is less straightforward to read, though.
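
For comparison, a minimal sketch (Python's sqlite3 with hypothetical deal IDs, not the package code) that runs both the `not in` subquery and the left-join form on the same data. The second call adds a row with a null merged_deal_id, which is purely an assumed edge case and may never occur in the real merged_deal table.

```python
import sqlite3

def surviving_deals(merged_rows):
    """Return (not_in_result, anti_join_result) for deals 1 through 4."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table deals (deal_id integer primary key)")
    conn.execute("create table merged_deals (deal_id integer, merged_deal_id integer)")
    conn.executemany("insert into deals values (?)", [(1,), (2,), (3,), (4,)])
    conn.executemany("insert into merged_deals values (?, ?)", merged_rows)

    # Subquery form: exclude deals whose id appears as a merged_deal_id.
    not_in = [row[0] for row in conn.execute("""
        select deals.deal_id
        from deals
        where deals.deal_id not in (select merged_deal_id from merged_deals)
        order by deals.deal_id
    """)]

    # Join form: keep only deals with no matching merged_deal_id.
    anti_join = [row[0] for row in conn.execute("""
        select deals.deal_id
        from deals
        left join merged_deals
            on deals.deal_id = merged_deals.merged_deal_id
        where merged_deals.merged_deal_id is null
        order by deals.deal_id
    """)]

    conn.close()
    return not_in, anti_join

print(surviving_deals([(1, 2), (1, 3)]))     # ([1, 4], [1, 4])
print(surviving_deals([(1, 2), (1, None)]))  # ([], [1, 3, 4])
```

On clean data the two forms agree. They diverge only if merged_deal_id can be null: `not in` against a set containing null returns no rows under standard SQL three-valued logic, whereas the join form is unaffected.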

CHANGELOG.md (outdated, resolved)
@fivetran-joemarkiewicz
Contributor

@fivetran-reneeli additionally I am now seeing a large number of test failures when running dbt test. Do you see these on your end as well?

image

@fivetran-reneeli
Contributor Author

fivetran-reneeli commented Feb 20, 2024

Hi @fivetran-joemarkiewicz, I ran dbt test again and now I'm seeing all the tests pass, with everything compiling as expected.

image

Contributor

@fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through this! This overall looks great and I just have a few final suggestions.

Also, we should probably let the original issue creator know that this is ready and maybe we can send this over to them to try before release.

README.md (outdated, resolved)
models/sales/intermediate/int_hubspot__deals_enhanced.sql (outdated, resolved)
packages.yml Outdated
Comment on lines 2 to 3
# - package: fivetran/hubspot_source
# version: [">=0.14.0", "<0.15.0"]

Contributor

Reminder to update before merge

@fivetran-reneeli fivetran-reneeli linked an issue Feb 21, 2024 that may be closed by this pull request
4 tasks
fivetran-reneeli and others added 4 commits February 21, 2024 15:51
Co-authored-by: Joe Markiewicz <74217849+fivetran-joemarkiewicz@users.noreply.github.com>
Co-authored-by: Joe Markiewicz <74217849+fivetran-joemarkiewicz@users.noreply.github.com>
@fivetran-reneeli fivetran-reneeli linked an issue Feb 21, 2024 that may be closed by this pull request
4 tasks
@fivetran-reneeli fivetran-reneeli merged commit b4f961a into main Feb 21, 2024
7 of 9 checks passed
@fivetran-reneeli fivetran-reneeli deleted the merged_deals branch February 21, 2024 22:01
Development

Successfully merging this pull request may close these issues.

[Bug] Merged deals don't get remove in final models
[Feature] Leverage macro to test uniqueness of models
2 participants