Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix concurrent deadlock #34454

Merged
merged 8 commits into from
Jan 24, 2024
Merged

Fix concurrent deadlock #34454

merged 8 commits into from
Jan 24, 2024

Conversation

maxi297
Copy link
Contributor

@maxi297 maxi297 commented Jan 23, 2024

What

Addresses the second issue from https://github.com/airbytehq/oncall/issues/3602

How

  • Adding priority on the queue so that records are processed faster and hence free more memory we did not add a priority queue because of the complexity it adds. Indeed, the ConcurrentReadProcessor assume a certain sequence of event which is not true if the order change
  • Block on queue if size > 10_000
  • Only block for futures during PartitionEnqueuer. This will be done as part of the thread now hence the need for a lock to make it thread safe

Here is the diagram of the possible cases in terms of what populates the queue and what populates the list of futures:
image

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

We should not see the case where the source hangs and dies on _queue.Empty. My personal connection on https://cloud.airbyte.com/workspaces/8ba9a84d-ef49-41fb-9aaa-9f75fb0f351e/connections/bf6ede5c-e346-4489-a965-63a381eb3c30 is working now.

TODO

Re-release stripe with this new CDK version

@maxi297 maxi297 requested a review from a team as a code owner January 23, 2024 20:54
Copy link

vercel bot commented Jan 23, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 24, 2024 5:26pm

@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Jan 23, 2024
@maxi297 maxi297 requested review from girarda and clnoll January 23, 2024 21:10
concurrent_stream_processor: ConcurrentReadProcessor,
) -> Iterable[AirbyteMessage]:
while airbyte_message_or_record_or_exception := queue.get():
try:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests were very minimal on this class and adding this did not break anything. Given the urgency, I'll ask for a review right now but I'll go back and add some tests to make sure we document the behavior of this class


@staticmethod
def _get_priority(value) -> int:
# The order of the `isinstance` is a bit funky but it is order in terms of which QueueItem we expect to see the most
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this raise an exception on unexpected type to ensure we don't forget to update the method if we ever add a new one?

Alternative would be to set a default priority.

edit: I like this test with the caveat that the current implement actually returns an Optional[int]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the test as you were commenting.

For the link, it seems like it is pointing to a blank like. Can you be more explicit about which method return Optional[int]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_priority does not return anything if the value is not of the expected type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's raising an exception though right? So the return type is indeed just int and not Optional[int], right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where the method is raising an exception. As far as I can tell, the check relies on the unit test to verify all types of QueueItems are handled. is that what you mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this line?

Copy link
Contributor

@clnoll clnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @maxi297! LGTM. Just one logging request. Also would you mind putting your diagram in the PR description?

Copy link
Contributor

@girarda girarda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🙏

@maxi297 maxi297 merged commit b9c1897 into master Jan 24, 2024
25 checks passed
@maxi297 maxi297 deleted the maxi297/fix-concurrent-deadlock branch January 24, 2024 17:35
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants