🐛 Source Stripe: fix multiple BankAccounts issues #32146

davydov-d · 2023-11-03T15:46:35Z

What

This PR addresses multiple issues described in https://github.com/airbytehq/oncall/issues/3398 and #31555 that are related to lazy substreams, mostly the BankAccounts stream:

500 error <lambda>() missing 1 required keyword-only argument: 'stream_slice'
Data not expanded when using the BankAccounts stream in the full refresh mode
Data not filtered when using the event-based incremental sync mode
Cursor values not filled when using the full refresh mode

They are all fixed in a single PR because they all depend on one another.

How

Reorder function params of some lambdas like path and extra_request_params
Instantiate the Customers stream twice - first to be an independent stream, second to be a parent for the BankAccounts stream and expand the requested data
Replace a FilteringRecordExtractor with a response_filter callable that is now passed into both IncrementalExtractor and DefaultExtractor so that records are filtered no matter what the sync mode is.
Override the record extractor of the UpdatedCursorIncrementalStripeLazySubStream so that the cursor value is filled in the full refresh sync mode as well
Cover these changes with unit tests
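
The first symptom listed above is a plain Python TypeError and can be reproduced without the connector. The sketch below is not the connector's code: `strict_path`, `tolerant_path`, the `parents/.../children` path and the slice shape are hypothetical, chosen only to show how a keyword-only parameter fails when the caller omits it, and one possible way to make a callable tolerate that.

```python
# Hypothetical callables for illustration only; not taken from source-stripe.

# A lambda whose only parameter is keyword-only: callers MUST pass stream_slice=...
strict_path = lambda *, stream_slice: f"parents/{stream_slice['parent']['id']}/children"

# A variant that tolerates a missing slice and ignores unrelated keyword arguments.
tolerant_path = lambda stream_slice=None, **kwargs: (
    f"parents/{stream_slice['parent']['id']}/children" if stream_slice else "parents"
)


def call_without_slice(path_callable):
    """Call the path callable the way a caller that omits stream_slice would."""
    try:
        return path_callable()
    except TypeError as error:
        return f"TypeError: {error}"


print(call_without_slice(strict_path))    # reports the missing keyword-only argument
print(call_without_slice(tolerant_path))  # falls back to the parent collection path
print(strict_path(stream_slice={"parent": {"id": "p_1"}}))
```

A test that calls each configured stream's path and request-parameter callables with the arguments the framework actually passes would catch this class of mismatch early.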

github-actions · 2023-11-03T15:46:56Z

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

PR name follows PR naming conventions
Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan.
Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
You've updated the connector's metadata.yaml file with any other relevant changes, including a breakingChanges entry for major version bumps. See metadata.yaml docs
Secrets in the connector's spec are annotated with airbyte_secret
All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
Migration guide updated in docs/integrations/<source or destination>/<name>-migrations.md with an entry for the new version, if the version is a breaking change. See migration guide example
If set, you've ensured the icon is present in the platform-internal repo. (Docs)

If the checklist is complete, but the CI check is failing,

Check for hidden checklists in your PR description
Toggle the github label checklist-action-run on/off to re-run the checklist CI.

maxi297

I really like what I see. I would like to see a little bit more coverage to document these cases and make sure we have a safety net in case of regression

maxi297 · 2023-11-06T21:04:20Z

airbyte-integrations/connectors/source-stripe/unit_tests/test_streams.py

-                        "object": "list",
-                        "total_count": 3,
-                        "url": "/v1/invoices/in_1KD6OVIEn5WyEQxn9xuASHsD/lines",
+lazy_substream_test_suite = (


I like this a lot! It feels like this is filling a big testing gap we have for streams where the dataset is hard/impossible to maintain.

The PR addresses the case where the events stream records were not filtered properly. Should we test that using this kind of tests?

I would like to push the idea further: for each stream_cls, do we want to have one test for incremental and one for full_refresh? The reason I'm asking is that the code path is very different for both. For example, bank_accounts relies on the events endpoint for incremental syncs and on the customers endpoint otherwise. One other crazy particularity with Stripe is the pagination in StripeLazySubStream (for the invoice_line_items stream, for example), which seems to be a very specific and interesting case to document/test.
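
To make the pagination case concrete, here is a minimal sketch. It assumes that a lazy substream reads the first page of child items from the parent record itself and requests further pages only while `has_more` is true, continuing after the last seen id (Stripe-style cursor pagination). The `fetch_page` helper, the `lines` key and the in-memory `pages` mapping are hypothetical stand-ins for HTTP calls, not the connector's implementation.

```python
# Illustrative only: a fake in-memory API, not the connector's implementation.

def fetch_page(pages, starting_after):
    """Stand-in for an HTTP request: return the page that follows the given id."""
    return pages[starting_after]


def read_children(parent, pages, nested_key="lines"):
    """Yield nested items from the parent, requesting more pages only when needed."""
    page = parent[nested_key]
    while True:
        yield from page["data"]
        # Stop when the API reports no further pages (or there is no id to continue from).
        if not page.get("has_more") or not page["data"]:
            return
        # Cursor pagination: continue after the last item already seen.
        page = fetch_page(pages, starting_after=page["data"][-1]["id"])


# Parent invoice carrying the first page of line items, with more available.
parent = {"id": "in_1", "lines": {"data": [{"id": "il_1"}], "has_more": True}}
# Remaining pages, keyed by the id they follow.
pages = {"il_1": {"data": [{"id": "il_2"}, {"id": "il_3"}], "has_more": False}}

print([item["id"] for item in read_children(parent, pages)])
```

A test built this way needs two cases to be useful: one where the parent already holds every child (no extra request expected) and one where `has_more` forces a follow-up request.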

@maxi297 all these cases are already covered.

Full refresh + pagination for Invoice Line Items (StripeLazySubStream)

Full refresh + pagination for Subscription Items (StripeLazySubStream)

Full refresh + pagination for Bank Accounts (UpdatedCursorIncrementalStripeLazySubStream)

Full refresh + pagination for Application Fees Refunds (UpdatedCursorIncrementalStripeLazySubStream)

Incremental for Bank Accounts (UpdatedCursorIncrementalStripeLazySubStream)

Incremental for Application Fees Refunds (UpdatedCursorIncrementalStripeLazySubStream)

Each of these cases tests:

Request args. The test would fail if the URL param values do not match those in the expected URL

Record filtering - the number of actual records should be the same as the number of expected records and it is not always the number of records in the mocked response

Cursor value - it is populated in the expected records, so if missing, the test would fail

Ok, so we test per Python class and not per Stripe stream. I think that can be enough for now. What I'm worried about is that I don't see a safety net for a couple of changes you've made. For example:

If we have an event that is not filtered properly

If a dev removes parent=expand_items=["data.sources"] on bank_accounts

Do you think it's worth testing that?

@maxi297 what do you mean by event that is not filtered properly? Did we have such an issue?
Regarding your second concern - I do think we do not have to cover this as we no longer implement a class per stream and this is an input option for instantiating a class

I think we are looking at the same lines of code with two different perspectives that are not mutually exclusive but do not produce the same lines of code for the tests.

I do think we do not have to cover this as we no longer implement a class per stream and this is an input option for instantiating a class

This is true if we check the interface of the classes in isolation. So if we check the tests for UpdatedCursorIncrementalStripeLazySubStream, this case is covered as you mention; unit tests are checking that just fine. Does that mean the source is working as we would like it to work? The answer is "no", as the code was failing in production and the BankAccounts stream was missing records in production. Tomorrow, I could remove this line inadvertently and I would have no safety net to tell me I did something wrong, because this is not caught at the unit test level or the CATs level. Hence, it feels like we need another safety net.

what do you mean by event that is not filtered properly? Did we have such an issue?

I think we had, as the PR mentions "Data not filtered when using the event-based incremental sync mode". There was indeed a change made to filtering on the events stream. Looking more closely at the different cases, I see that this entry would be filtered, so the case is covered. What I fear is that the purpose of the test is not clear (it tests many things: that it performs the right query for incremental, that filtering works, etc.) and, as it isn't explicit, I can remove lines 255 to 262 from the test and it would still pass and still be a valid test. It's very hard to be too explicit, but things are often too implicit.

As a more generic comment, tests have many goals. We are pretty good at using tests to validate explicitly written code in a specific project. This is good, as it ensures that all our small units of code work as expected. However, I think we should use tests for other reasons:

Ensuring the integration between our components works as expected (hence the first point)

Documenting the code (hence my second point)

@maxi297 tests updated: Now we are testing not classes, but source streams. I have also split the generic test into three separate tests to verify that
a. cursor value is populated,
b. data is filtered
c. data is expanded
for both sync modes (if applicable)

The PR has also been split into two separate PRs
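
The split into three single-purpose tests could look roughly like the sketch below. Everything in it is hypothetical: the helpers `fill_cursor`, `keep_object_type` and `expand_sources` stand in for stream behaviour, and the `updated`/`created` field names and record shapes are assumptions made for illustration, not the connector's actual tests.

```python
# Hypothetical helpers standing in for stream behaviour; not the connector's code.

def fill_cursor(record, cursor_field="updated", fallback_field="created"):
    """Return a copy of the record with the cursor field populated."""
    filled = dict(record)
    filled.setdefault(cursor_field, record[fallback_field])
    return filled


def keep_object_type(records, object_type):
    """Drop records whose 'object' does not match the expected type."""
    return [record for record in records if record.get("object") == object_type]


def expand_sources(customer):
    """Yield the nested items carried by an expanded parent record."""
    yield from customer.get("sources", {}).get("data", [])


# One test per guarantee, named after the behaviour it protects.
def test_cursor_value_is_populated():
    assert fill_cursor({"id": "ba_1", "created": 100})["updated"] == 100


def test_data_is_filtered():
    records = [
        {"id": "ba_1", "object": "bank_account"},
        {"id": "card_1", "object": "card"},
    ]
    assert keep_object_type(records, "bank_account") == [records[0]]


def test_data_is_expanded():
    customer = {"id": "cus_1", "sources": {"data": [{"id": "ba_1"}]}}
    assert list(expand_sources(customer)) == [{"id": "ba_1"}]
```

Because each test asserts exactly one behaviour, a failure points directly at the regression, and the test name documents why the assertion exists.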

davydov-d · 2023-11-07T17:50:30Z

@maxi297 I have also merged in another patch regarding a different issue: https://github.com/airbytehq/oncall/issues/3428. The changes may look huge, but both updates share the same codebase, so I think it's worth doing in a single PR. The PR description is updated and the changes are covered with tests; please review once more when you have a chance

maxi297 · 2023-11-07T20:35:12Z

There are 8 changes all within one PR that adds 875 lines of code and removes 433 lines. It is too tedious to find how each of the changes is tested to ensure the proper execution of the source. The fact that a test can test many things at the same time and that those tests have no meaningful names makes things even harder.

I can't review this change effectively. I've tried, but it's already been an hour and I'm not done. Please split this into different PRs (one for each change) so that I can see the impact of the changes more easily.

maxi297

This is very good. There's only one thing I can't find out. My understanding is that this line was changed to fix 500 error <lambda>() missing 1 required keyword-only argument: 'stream_slice'. Is this right? If so, should we have a test for subscription_items to ensure we have a safety net and this does not happen again?

maxi297 · 2023-11-09T14:03:30Z

airbyte-integrations/connectors/source-stripe/unit_tests/conftest.py

+@pytest.fixture()
+def stream_by_name(config):
+    def mocker(stream_name, source_config=config):
+        source = SourceStripe()


I really like that because it uses the streams as configured by the source, so it tests not only the streams but also how the source instantiates them
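
The pattern praised here, resolving a stream by name from the source's own configuration instead of constructing the class by hand, can be sketched as follows. `FakeSource`, `FakeStream` and `make_stream_by_name` are hypothetical stand-ins; the only assumption about the real source is that it exposes a `streams(config)` method returning objects with a `name` attribute.

```python
from dataclasses import dataclass


# FakeStream and FakeSource are hypothetical stand-ins for the real source and streams.
@dataclass
class FakeStream:
    name: str


class FakeSource:
    def streams(self, config):
        # A real source would build fully configured stream instances here.
        return [FakeStream("customers"), FakeStream("bank_accounts")]


def make_stream_by_name(source, default_config):
    """Build a lookup that resolves streams exactly as the source configures them."""

    def stream_by_name(stream_name, source_config=default_config):
        for stream in source.streams(source_config):
            if stream.name == stream_name:
                return stream
        raise ValueError(f"Stream {stream_name!r} not found")

    return stream_by_name


lookup = make_stream_by_name(FakeSource(), default_config={})
print(lookup("bank_accounts"))
```

With a lookup like this, removing a constructor argument in the source's stream list changes what the test receives, which is the safety net requested earlier in the thread.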

maxi297 · 2023-11-09T14:05:23Z

airbyte-integrations/connectors/source-stripe/unit_tests/test_streams.py

-    requests_mock.get(
-        "https://api.stripe.com/v1/invoices",
-        json={
+bank_accounts_full_refresh_test_case = (


This tests Data not expanded when using the BankAccounts stream in the full refresh mode 👍

maxi297 · 2023-11-09T14:07:01Z

airbyte-integrations/connectors/source-stripe/unit_tests/test_streams.py

-        {"id": "il_2", "invoice_id": "in_1KD6OVIEn5WyEQxn9xuASHsD", "object": "line_item"},
-        {"id": "il_3", "invoice_id": "in_1KD6OVIEn5WyEQxn9xuASHsD", "object": "line_item"},
-    ]
+bank_accounts_incremental_test_case = (


This tests Data not filtered when using the event-based incremental sync mode 👍

davydov-d · 2023-11-10T08:33:02Z

This is very good. There's only one thing I can't find out. My understanding is that this line was changed to fix 500 error <lambda>() missing 1 required keyword-only argument: 'stream_slice'. Is this right? If so, should we have a test for subscription_items to ensure we have a safety net and this does not happen again?

@maxi297 Completely forgot about it, thanks. Tests updated again

maxi297

Thanks for adding the test!

fix multiple stripe issues

a85401e

octavia-squidington-iii added the area/connectors, area/documentation, and connectors/source/stripe labels Nov 3, 2023

update changelog

088e0d7

vercel bot deployed to Preview November 3, 2023 15:48 View deployment

davydov-d marked this pull request as ready for review November 3, 2023 15:58

davydov-d added the checklist-action-run label Nov 3, 2023

davydov-d requested review from maxi297, bazarnov and a team November 3, 2023 15:58

Automated Commit - Formatting Changes

2220237

vercel bot deployed to Preview November 3, 2023 16:06 View deployment

bazarnov approved these changes Nov 3, 2023

Merge branch 'master' into ddavydov/3398-oncall-bugfix

e4a1eae

vercel bot deployed to Preview November 3, 2023 21:36 View deployment

maxi297 reviewed Nov 6, 2023

octavia-squidington-iv requested a review from a team November 7, 2023 17:38

davydov-d changed the title ~~🐛 Source Stripe: fix multiple BankAccount issues~~ 🐛 Source Stripe: fix BankAccounts, Refunds, CheckoutSessions and CheckoutSessionsLineItems issues Nov 7, 2023

vercel bot deployed to Preview November 7, 2023 17:38 View deployment

davydov-d added the breaking-change Don't merge me unless you are ready. label Nov 7, 2023

davydov-d changed the title ~~🐛 Source Stripe: fix BankAccounts, Refunds, CheckoutSessions and CheckoutSessionsLineItems issues~~ 🚨 🚨 Source Stripe: fix BankAccounts, Refunds, CheckoutSessions and CheckoutSessionsLineItems issues Nov 7, 2023

davydov-d force-pushed the ddavydov/3398-oncall-bugfix branch 2 times, most recently from aecc74d to e4a1eae Compare November 8, 2023 08:35

davydov-d changed the title ~~🚨 🚨 Source Stripe: fix BankAccounts, Refunds, CheckoutSessions and CheckoutSessionsLineItems issues~~ 🐛 Source Stripe: fix multiple BankAccounts issues Nov 8, 2023

davydov-d removed the breaking-change Don't merge me unless you are ready. label Nov 8, 2023

test connector streams instead of classes

5b19277

vercel bot deployed to Preview November 8, 2023 17:04 View deployment

davydov-d and others added 4 commits November 8, 2023 17:13

Automated Commit - Formatting Changes

2f0b204

update tests

7c20080

Merge branch 'ddavydov/3398-oncall-bugfix' of github.com:airbytehq/airbyte into ddavydov/3398-oncall-bugfix

1e4a000

Automated Commit - Formatting Changes

2697f7b

maxi297 reviewed Nov 9, 2023

davydov-d added 2 commits November 10, 2023 10:31

update tests

8a7571a

Merge branch 'ddavydov/3398-oncall-bugfix' of github.com:airbytehq/airbyte into ddavydov/3398-oncall-bugfix

4ce3635

Automated Commit - Formatting Changes

32190b3

maxi297 approved these changes Nov 13, 2023

davydov-d merged commit a05a293 into master Nov 13, 2023
18 checks passed

davydov-d deleted the ddavydov/3398-oncall-bugfix branch November 13, 2023 13:15
