
made subscription broadcast multithreaded #741

Merged
1 commit merged into master from shub/multitheading-subscribers on Jan 25, 2024

Conversation

@Shubham8287 (Contributor) commented Jan 22, 2024

Description of Changes

Made broadcast_commit_event multithreaded to leverage read locks.

@Shubham8287 Shubham8287 force-pushed the shub/multitheading-subscribers branch 2 times, most recently from dcba3af to 4fa41e5 Compare January 22, 2024 13:42
@Shubham8287 Shubham8287 self-assigned this Jan 22, 2024
@coolreader18 (Collaborator) commented Jan 22, 2024

Right now I mostly don't see the point of this change. Is there a plan to start calling these methods from other threads in the future? With the current changeset I don't see any real operational difference. Is the goal to run multiple ModuleSubscriptionActor commands in parallel? begin_tx() in broadcast_commit_event takes a lock on the datastore anyway, so it would still run in serial.
I'm not sure this is a good idea as implemented here, since commands could theoretically run out of order relative to the order they were sent, which is bad. Also, I think I just realized the deadlock we're getting is possibly write starvation: the read() locks collectively hold on to the rwlock constantly, with the write() waiter stuck forever.
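For readers following along, here is a minimal, hypothetical sketch of the write-starvation pattern being described. This is not SpacetimeDB code, and whether the writer actually starves depends on the fairness policy of the RwLock implementation in use:

```rust
// Hypothetical sketch, not SpacetimeDB code: many readers whose read() guards
// constantly overlap can leave a write() waiter blocked indefinitely on a
// reader-preferring RwLock, which from the outside looks like a deadlock.
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

fn main() {
    let lock = Arc::new(RwLock::new(0u64));

    // Readers: each holds the shared lock long enough that another reader
    // has usually re-acquired it before this one releases.
    for _ in 0..8 {
        let lock = Arc::clone(&lock);
        thread::spawn(move || loop {
            let _guard = lock.read().unwrap();
            thread::sleep(Duration::from_millis(10));
        });
    }

    thread::sleep(Duration::from_millis(50)); // let the readers saturate the lock

    // Writer: whether this call ever returns depends on the lock's fairness
    // policy; a reader-preferring implementation can starve it forever.
    let mut guard = lock.write().unwrap();
    *guard += 1;
    println!("writer acquired the lock");
}
```

Write-preferring implementations (for example parking_lot's or tokio's RwLock) queue new readers behind a waiting writer, which avoids this failure mode.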

@joshua-spacetime (Collaborator) left a comment

> begin_tx() in broadcast_commit_event takes a lock on the datastore anyway, so it would still run in serial.

It takes a shared lock now.

> commands could run out-of-order from the order they were sent

@coolreader18 is correct. If txn A runs (commits) before txn B, then we need to ensure that broadcast_commit_event is called for A before it is called for B. Unfortunately, we would introduce inconsistency if we multithread subscriptions apart from their originating transactions.

@joshua-spacetime (Collaborator) requested changes

Requesting changes because I think we can only multithread the initial subscription calls, not the incremental evaluations unfortunately. At least not under the current concurrency model.

```rust
        sub.remove_subscriber(client_id);
        !sub.subscribers().is_empty()
    })
}

async fn _broadcast_commit_event(&mut self, mut event: ModuleEvent, tx: &mut Tx) -> Result<(), DBError> {
    let futures = FuturesUnordered::new();
async fn broadcast_commit_event(&mut self, mut event: ModuleEvent) -> Result<(), DBError> {
```
@joshua-spacetime (Collaborator) commented Jan 23, 2024

This function really shouldn't be async. The reason is that subscription evaluations need an ordering that is serially equivalent to that of the original transactions.

If txn A inserts a row, and then txn B deletes that same row, it would be incorrect to evaluate the effects of B before the effects of A.

Note this does not necessarily apply to the initial subscribe call eval.

@Shubham8287 (Contributor, author) replied

Got it, this would require some more work I guess. Thanks for pointing it out.

Collaborator replied

@Shubham8287 just reiterating the solution that we discussed earlier:

If each reducer holds an exclusive lock on the ModuleSubscriptionManager, we can evaluate all of the subscription queries for a single transaction concurrently, and at the same time preserve the ordering across transactions.

That is, reducers still run sequentially and do not complete until all of the required subscription queries have been evaluated. However, within a given reducer, the queries can be run in parallel.
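A rough sketch of that scheme, using placeholder types rather than the actual SpacetimeDB API: the exclusive lock keeps transactions serialized, while the work inside each broadcast fans out across threads.

```rust
// Hypothetical sketch of the scheme described above, not the real SpacetimeDB
// types: transactions take the manager's exclusive lock in commit order, and the
// per-subscription query evaluation inside that lock fans out across threads.
use std::sync::Mutex;
use std::thread;

struct Subscription {
    queries: Vec<String>, // stand-in for a subscription's QuerySet
}

struct ModuleSubscriptionManager {
    subscriptions: Vec<Subscription>,
}

impl ModuleSubscriptionManager {
    /// Called once per committed transaction while the caller holds the
    /// exclusive lock, so evaluations stay ordered across transactions.
    fn broadcast_commit_event(&self, tx_label: &str) {
        thread::scope(|s| {
            // Within a single transaction, each subscription's queries can be
            // evaluated in parallel against a shared (read-only) datastore view.
            for sub in &self.subscriptions {
                s.spawn(move || {
                    for query in &sub.queries {
                        println!("{tx_label}: evaluating {query}");
                    }
                });
            }
        }); // the scope joins all evaluations before the next transaction proceeds
    }
}

fn main() {
    let manager = Mutex::new(ModuleSubscriptionManager {
        subscriptions: vec![
            Subscription { queries: vec!["SELECT * FROM players".into()] },
            Subscription { queries: vec!["SELECT * FROM chat_messages".into()] },
        ],
    });

    // Reducers/transactions take the exclusive lock one at a time...
    for tx in ["tx A", "tx B"] {
        let guard = manager.lock().unwrap();
        // ...but each transaction's subscription evaluations run in parallel.
        guard.broadcast_commit_event(tx);
    }
}
```

Joining all evaluations before releasing the lock is what preserves the serial ordering across transactions.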

@Shubham8287 (Contributor, author) replied

Thanks for summarizing.
I have changed the way we intend to achieve parallelism: instead of trying to run broadcast_commit_event calls in parallel, the method now spawns tasks to evaluate each subscription's QuerySet in parallel, while keeping the subscription actor itself single-threaded.

@Shubham8287 (Contributor, author) commented

> I don't see any real operational difference. Is the goal to run multiple ModuleSubscriptionActor commands in parallel? begin_tx() in broadcast_commit_event takes a lock on the datastore anyway, so it would still run in serial.

begin_tx() holds only a read lock, so we can have parallel broadcast_commit_event executions.

> the deadlock we're getting is possibly write-starving; the read() locks collectively holding on to the rwlock constantly, with the write() waiter stuck forever

This could be true.

@Shubham8287 Shubham8287 force-pushed the shub/multitheading-subscribers branch 3 times, most recently from fbf4331 to a85496b Compare January 24, 2024 21:06
@Shubham8287 Shubham8287 force-pushed the shub/multitheading-subscribers branch 2 times, most recently from ecde151 to df0f0d8 Compare January 24, 2024 21:58
added type alias for Subscription Vec

fix tx init  seq

parallelising incr eval

lint

remove uneccessary _arc()
@Shubham8287 Shubham8287 force-pushed the shub/multitheading-subscribers branch from df0f0d8 to c4053d5 Compare January 25, 2024 10:35
@jdetter (Collaborator) left a comment

I've tested this thoroughly using the bots and I'm not seeing the deadlock that we had before. The performance here is really good so let's try to get this in asap 👍

@joshua-spacetime (Collaborator) commented

Feel free to merge @Shubham8287. The consistency issue can be handled separately #758.

@Shubham8287 Shubham8287 added this pull request to the merge queue Jan 25, 2024
Merged via the queue into master with commit c80cd35 Jan 25, 2024
5 checks passed
@Shubham8287 Shubham8287 deleted the shub/multitheading-subscribers branch January 25, 2024 20:35