Ensure that transactions are wrapped asynchronously in the parent database object instead of blocking on transactions' drop function. #1223
Conversation
let tx_dropper_joiner = tokio::task::spawn(async move {
    let mut conn_rx = conn_rx;
    while let Some(conn) = conn_rx.recv().await {
        conn.batch_execute("ROLLBACK").await.unwrap_or_else(|e| {
Looking at the documentation, maybe batch() rather than batch_execute() is better when only doing a single query?
batch_execute() is what was used in postgres_async::Transaction, so it seems like the right thing to do. Not sure.
>,
    // Note: This exists to enforce that a transaction never outlives the database object,
    // given that all connections have 'static lifetimes
    _marker: std::marker::PhantomData<&'a ()>,
}

impl<'a> Drop for ApiServerPostgresTransactionalRw<'a> {
    fn drop(&mut self) {
        if !self.finished {
Since getting a connection has been changed from get() to get_owned(), should we always send() the connection back to the pool regardless of finished?
There's no need to send the connection back if finished, because the transaction has already been committed or rolled back. The connection should be destroyed so that the database object retrieves a fresh connection next time.
Nice. Once again channels save the day. I think this should be a rule of thumb now: whenever something seems impossible, consider using channels to solve the problem.
One last thought - too late now given you've implemented everything already and it works, but tokio_postgres::Transaction can possibly do a lot of the transactional heavy lifting.
Channels aren't really the solution here; they were always part of the attempts. What really saved the day was get_owned, because if the connections had a lifetime tied to anything less than 'static, none of this would have worked, since tokio tasks enforce that.
Using Transaction is not really possible because it holds a reference to the connection, which would lead to a self-referential transaction object in our struct, and that is a no-no in Rust. In fact, this is how it was done in the sync version of the code, and why CI isn't passing there (I wanted a comment from reviewers on whether Ouroboros lifetimes of connections are covariant). Then I realized that I could circumvent all that complexity by getting rid of Transaction and creating the transactions myself, which is what I did here. Without that solution, there would have been issues because Ouroboros doesn't allow async borrows of self-referential variables.
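The manual approach described above can be sketched with a mock connection. Note that Conn here is a hypothetical stand-in that just records executed SQL, not the real pooled-connection type: instead of holding a tokio_postgres::Transaction that borrows the connection, the wrapper owns the connection outright and issues BEGIN/COMMIT as plain queries, so no self-referential struct ever arises.

```rust
// Minimal sketch of manual transaction management over an OWNED connection,
// avoiding the self-referential borrow that a Transaction type would require.
// `Conn` is a hypothetical stand-in that records the SQL it executes.

struct Conn {
    executed: Vec<String>,
}

impl Conn {
    fn new() -> Self {
        Conn { executed: Vec::new() }
    }
    fn batch_execute(&mut self, sql: &str) {
        self.executed.push(sql.to_string());
    }
}

struct ManualTx {
    conn: Conn, // owned outright: no lifetime tied to a pool borrow
    finished: bool,
}

impl ManualTx {
    fn begin(mut conn: Conn) -> Self {
        conn.batch_execute("BEGIN");
        ManualTx { conn, finished: false }
    }
    fn commit(mut self) -> Conn {
        self.conn.batch_execute("COMMIT");
        self.finished = true;
        self.conn // hand the connection back for reuse
    }
}

fn main() {
    let conn = ManualTx::begin(Conn::new()).commit();
    assert_eq!(conn.executed, vec!["BEGIN", "COMMIT"]);
    println!("ok");
}
```

Because the wrapper owns the connection, it can also be moved into a tokio task (which requires 'static) without any borrow-checker fight.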
impl Drop for TransactionalApiServerPostgresStorage {
    fn drop(&mut self) {
        // Since the whole connection pool will be destroyed, we can safely abort all connections
        self.tx_dropper_joiner.abort();
Is abort really the clean thing to do here? If there is some connection waiting to be rolled back in the channel and the dropper task is aborted, the transaction is not going to be finalised. Or did I miss something?
Not exactly. The lower-level connection will be dropped since the connection pool will be destroyed, hence the session will be dropped, and hence the transaction will be aborted anyway. At least this is my understanding of how postgres and mysql work from my research into this problem, but I'm open to being corrected here.
Ok, if it's so, that sounds acceptable, albeit still not the cleanest. If I understand it correctly, some transactions may not be explicitly rolled back by issuing the ROLLBACK query, relying instead on the connection cleanup procedure to effectively abort them. Is that about right?
Yes, but I'm not sure whether there's a misunderstanding, so let me elaborate.
This is exclusively at the end of life of the database, which is probably obvious.
So, IF we're shutting down the system (hence the database object, not a transaction object, is about to be dropped), and there are connections that carried transactions that still haven't been rolled back (their transaction objects are destroyed by definition, because a transaction cannot outlive the database given the lifetime constraint with that phantom data), then aborting will, yes, cause the connection cleanup to effectively roll back those transactions: the sessions of the connection pool will be ended, which cancels all low-level postgres transactions that were open.
Having said that, again, this depends on my understanding of how database connections/sessions work and on my research into this problem. I'm happy to be challenged on that.
Now the alternative is that we have to block when shutting down the database. I'm happy to do that if it turned out to be necessary... in fact it's like a line of code in tokio, but I consider it less clean because of my eternal fear of deadlocks.
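For illustration, that blocking alternative can be sketched with std channels (a stand-in for the async channel used in the PR; the types and names here are hypothetical, not the PR's actual code): on shutdown, the database drains every connection still queued for rollback before returning, at the cost of blocking the destructor on however long the finalization takes.

```rust
use std::sync::mpsc;

// Sketch of the blocking-shutdown alternative: instead of aborting the
// dropper task, drain every connection still queued for rollback.
// `Conn` and the rollback bookkeeping are stand-ins for real pooled
// connections issuing a "ROLLBACK" query.

struct Conn(u32);

struct Database {
    conn_tx: mpsc::Sender<Conn>,
    conn_rx: mpsc::Receiver<Conn>,
}

impl Database {
    fn new() -> Self {
        let (conn_tx, conn_rx) = mpsc::channel();
        Database { conn_tx, conn_rx }
    }

    /// Called by a transaction's Drop to hand its connection back.
    fn return_conn(&self, conn: Conn) {
        self.conn_tx.send(conn).expect("database outlives transactions");
    }

    /// Blocking shutdown: finalize every pending connection before returning.
    fn shutdown(self) -> Vec<u32> {
        drop(self.conn_tx); // close the channel so the drain loop terminates
        let mut rolled_back = Vec::new();
        while let Ok(conn) = self.conn_rx.recv() {
            // Real code would issue "ROLLBACK" on `conn` here.
            rolled_back.push(conn.0);
        }
        rolled_back
    }
}

fn main() {
    let db = Database::new();
    db.return_conn(Conn(1));
    db.return_conn(Conn(2));
    assert_eq!(db.shutdown(), vec![1, 2]);
    println!("drained");
}
```

The deadlock worry mentioned above is real: if the drain ran inside an async destructor on the same runtime thread that must make progress for the rollbacks to complete, shutdown could hang, which is why aborting and relying on session teardown was chosen instead.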
Now the alternative is that we have to block when shutting down the database. I'm happy to do that if it turned out to be necessary... in fact it's like a line of code in tokio, but I consider it less clean because of my eternal fear of deadlocks.
That could be mitigated by using spawn_blocking for the cleanup task, but then you won't be able to use the single-threaded tokio runtime. It's already the case in mempool because of the blocking subsystem handle business, but it's not great and I'd like to avoid that if possible.
This merges to the postgres-async branch and fixes the issue that transactions that aren't ended explicitly would block the thread until they do that.
Now, with this solution, transactions that aren't finished will simply use their Drop trait to send themselves back to the parent (owning) database object, which will take care of decommissioning the transaction asynchronously.