Concurrent CDK: source-stripe: concurrent execution for incremental syncs #32057
Closed
3 of 4 tasks
Labels
connectors/source/stripe
team/critical-connectors
team/extensibility
type/enhancement
New feature or request
What area the feature impact?
Connectors
Relevant Information
Goal: Reduce first incremental sync duration
Once the Concurrent CDK supports incremental syncs with state conversion, stripe can now leverage this.
Why only the first incremental sync?There are a couple of issues with stripe at the moment. For examplehttps://airbytehq-team.slack.com/archives/C02U9R3AF37/p1697571185299049https://airbytehq-team.slack.com/archives/C02U9R3AF37/p1697836660843669Having parent stream but state are not per partition have risk in terms of data loss. We don't expect Concurrent CDK to aggravate this issue but we still fear it might add a layer of complexity on top of the connectorfull_refresh and first incremental syncs leverage mostly the same code path (i.e. stripe has this logic of consuming different endpoint if there is a state is empty example)There is not much value to running things concurrent once the first incremental sync is done as it should be up-to-date and hence should only have one sliceThere are other issues that were documented as part of this spreadsheet. So instead of doing all streams first incremental sync concurrent, we will work on a stream by stream basis as follow:
UpdatedCursorIncrementalStripeStreamPersonsSetupAttemptsFor now, we won't tackle:
IncrementalStripeStream
as they are affected by https://github.com/airbytehq/airbyte-internal-issues/issues/2368 which will change the cursor and hence incremental syncsapplication_fees_refunds
,bank_accounts
andcheckout_sessions_line_items
as their cursor field is not working as is (see https://github.com/airbytehq/airbyte-internal-issues/issues/2435 )refunds
as the implementation will probably change toIncrementalStripeStream
(see https://github.com/airbytehq/airbyte-internal-issues/issues/2450)[DONE] Missing CDK feature:
The cursor needs to know what is the lowest boundary expected. Let's assume the following partitions are generated:
If the second is successful but not the first, no state should be emitted. This is different than the logic we implemented as we only take the upper boundary of the first continuous range which in that case would be 3.
Acceptance criteria
UpdatedCursorIncrementalStripeStreamPersonsSetupAttemptsThe text was updated successfully, but these errors were encountered: