Source Hubspot: refactor PropertyHistory stream #14666

davydov-d

What

https://github.com/airbytehq/oncall/issues/291

How

  • disable incremental mode for this stream
  • remove redundant HTTP requests
  • use the correct HTTP GET params

…tory stream and decrease number of http requests made
@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Jul 13, 2022
davydov-d commented Jul 13, 2022

/test connector=connectors/source-hubspot

🕑 connectors/source-hubspot https://github.com/airbytehq/airbyte/actions/runs/2662579329
❌ connectors/source-hubspot https://github.com/airbytehq/airbyte/actions/runs/2662579329
🐛 https://gradle.com/s/yrywtkjs54mdm

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs1]
=================== 2 failed, 28 passed in 193.57s (0:03:13) ===================

davydov-d commented Jul 13, 2022

/test connector=connectors/source-hubspot

🕑 connectors/source-hubspot https://github.com/airbytehq/airbyte/actions/runs/2663005653
✅ connectors/source-hubspot https://github.com/airbytehq/airbyte/actions/runs/2663005653
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        77      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/tests/test_incremental.py       121     25    79%
source_acceptance_test/utils/common.py                  80     17    79%
source_acceptance_test/tests/test_core.py              294    106    64%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
------------------------------------------------------------------------
TOTAL                                                  960    246    74%
Name                         Stmts   Miss  Cover
------------------------------------------------
source_hubspot/errors.py         6      0   100%
source_hubspot/__init__.py       2      0   100%
source_hubspot/helpers.py       70      3    96%
source_hubspot/streams.py      766     66    91%
source_hubspot/source.py        97     21    78%
------------------------------------------------
TOTAL                          941     90    90%

Build Passed

Test summary info:

All Passed


@pedroslopez pedroslopez left a comment

@davydov-d In the linked issue you mention "due to a bug in the stream we make lots of redundant queries and generate lots of duplicate records" - was the incremental mode causing this? What about the stream makes incremental mode behave incorrectly / prevents us from making it incremental?

davydov-d commented Jul 13, 2022

@davydov-d In the linked issue you mention "due to a bug in the stream we make lots of redundant queries and generate lots of duplicate records" - was the incremental mode causing this? What about the stream makes incremental mode behave incorrectly / prevents us from making it incremental?

@pedroslopez Well, the duplicate records and the outrageous number of requests we used to make had the following causes:

  • we did not use the count param to request the max page size (we used limit instead, so the page size stayed at the default 25 instead of 100, which increased the number of requests)
  • we used to make at least one call per stream slice (30 days), although the API only supports fetching the last 30 days of changes and does not support filtering by time-offset or vid-offset, only pagination.
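
To illustrate the page size point, here is a minimal sketch against an in-memory stand-in for a paginated endpoint. It is not the connector's code: `make_fake_api`, `read_all` and the `offset` param are hypothetical names made up for this example. The only behaviour taken from the comment above is that `count` sets the page size (default 25, max 100) and that `limit` has no effect.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# (records on this page, offset of the next page or None when exhausted)
Page = Tuple[List[dict], Optional[int]]


def make_fake_api(
    total_records: int, default_page_size: int = 25, max_page_size: int = 100
) -> Tuple[Callable[[Dict[str, int]], Page], Dict[str, int]]:
    """Build a hypothetical paginated endpoint plus a request counter."""
    records = [{"vid": i} for i in range(total_records)]
    stats = {"requests": 0}

    def fetch_page(params: Dict[str, int]) -> Page:
        stats["requests"] += 1
        # Only `count` controls the page size; an unknown param such as
        # `limit` is silently ignored, so the default size applies.
        size = min(params.get("count", default_page_size), max_page_size)
        start = params.get("offset", 0)
        chunk = records[start:start + size]
        next_offset = start + size if start + size < len(records) else None
        return chunk, next_offset

    return fetch_page, stats


def read_all(
    fetch_page: Callable[[Dict[str, int]], Page], page_params: Dict[str, int]
) -> Iterator[dict]:
    """Follow the pagination offsets until the endpoint reports no more pages."""
    params = dict(page_params)
    while True:
        chunk, next_offset = fetch_page(params)
        yield from chunk
        if next_offset is None:
            break
        params["offset"] = next_offset


if __name__ == "__main__":
    for page_params in ({"limit": 100}, {"count": 100}):
        fetch_page, stats = make_fake_api(250)
        total = len(list(read_all(fetch_page, page_params)))
        print(page_params, "->", total, "records in", stats["requests"], "requests")
```

With 250 records in this toy setup, passing `limit` costs 10 requests while passing `count` costs 3, for the same data.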

As for incremental mode, I think it's the wrong approach here:

  • the API does not support filtering by the cursor field
  • the API itself returns the most recently updated contacts
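
A rough sketch of why per-slice reads produce duplicates when the endpoint cannot filter by time window. Everything here is hypothetical (`fetch_recent`, `read_per_slice`, `read_once`, and the sample data); the only assumption carried over from this thread is that the endpoint returns the same recent changes no matter which window is requested.

```python
from typing import Dict, Iterator, List, Optional

# Hypothetical sample data standing in for recently changed contacts.
RECENT_CHANGES = [
    {"vid": 1, "updated": 10},
    {"vid": 2, "updated": 20},
    {"vid": 3, "updated": 30},
]


def fetch_recent(params: Optional[Dict[str, int]] = None) -> List[dict]:
    """Hypothetical endpoint: ignores any window params and returns all recent changes."""
    return list(RECENT_CHANGES)


def read_per_slice(slices: List[Dict[str, int]]) -> Iterator[dict]:
    """Old behaviour (sketch): one read per slice, each returning the same records."""
    for stream_slice in slices:
        yield from fetch_recent({"start": stream_slice["start"], "end": stream_slice["end"]})


def read_once() -> Iterator[dict]:
    """Full refresh (sketch): a single pass over the endpoint."""
    yield from fetch_recent()


if __name__ == "__main__":
    slices = [{"start": 0, "end": 30}, {"start": 30, "end": 60}, {"start": 60, "end": 90}]
    sliced = list(read_per_slice(slices))
    unique_vids = {record["vid"] for record in sliced}
    print(len(sliced), "records from sliced reads,", len(unique_vids), "unique")
    print(len(list(read_once())), "records from a single pass")
```

Because the window is never applied server-side, every extra slice just repeats the same records, which is why a single full-refresh pass is enough in this sketch.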


@pedroslopez pedroslopez left a comment

@davydov-d makes sense! Thanks for explaining 😄

davydov-d commented Jul 14, 2022

/publish connector=connectors/source-hubspot

🕑 Publishing the following connectors:
connectors/source-hubspot
https://github.com/airbytehq/airbyte/actions/runs/2668473819


Connector Did it publish? Were definitions generated?
connectors/source-hubspot

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets July 14, 2022 07:10 Inactive
@davydov-d davydov-d merged commit 400f4e6 into master Jul 14, 2022
@davydov-d davydov-d deleted the ddavydov/#291-oncall-source-hubspot-fix-property-history-stream branch July 14, 2022 07:14
@lazebnyi lazebnyi removed their request for review July 18, 2022 18:17