fix(iroh): remove quinn::Endpoint::wait_idle from iroh::Endpoint::close process #3165

Merged (2 commits) on Feb 4, 2025
36 changes: 17 additions & 19 deletions iroh/src/endpoint.rs
@@ -991,10 +991,13 @@ impl Endpoint {
/// of `0` and an empty reason. Though it is best practice to close those
/// explicitly beforehand with a custom error code and reason.
///
/// It will then make a best effort to wait for all close notifications to be
/// acknowledged by the peers, re-transmitting them if needed. This ensures the
/// peers are aware of the closed connections instead of having to wait for a timeout
/// on the connection. Once all connections are closed or timed out, the magic socket is closed.
/// This will not wait for the [`quinn::Endpoint`] to drain connections.
///
/// To ensure no data is lost, design protocols so that the last *sender*
/// of data in the protocol calls [`Connection::closed`], and `await`s until
/// it receives a "close" message from the *receiver*. Once the *receiver*
/// gets the last data in the protocol, it should call [`Connection::close`]
/// to inform the *sender* that all data has been received.
///
/// Be aware however that the underlying UDP sockets are only closed
/// on [`Drop`], bearing in mind the [`Endpoint`] is only dropped once all the clones
@@ -1393,21 +1396,16 @@ impl Connection {
///
/// # Gracefully closing a connection
///
/// Only the peer last receiving application data can be certain that all data is
/// delivered. The only reliable action it can then take is to close the connection,
/// potentially with a custom error code. The delivery of the final CONNECTION_CLOSE
/// frame is very likely if both endpoints stay online long enough, calling
/// [`Endpoint::close`] will wait to provide sufficient time. Otherwise, the remote peer
/// will time out the connection, provided that the idle timeout is not disabled.
///
/// The sending side can not guarantee all stream data is delivered to the remote
/// application. It only knows the data is delivered to the QUIC stack of the remote
/// endpoint. Once the local side sends a CONNECTION_CLOSE frame in response to calling
/// [`close`] the remote endpoint may drop any data it received but is as yet
/// undelivered to the application, including data that was acknowledged as received to
/// the local endpoint.
///
/// [`close`]: Connection::close
/// Only the peer last **receiving** application data can be certain that all data is
/// delivered.
///
/// To communicate to the last **sender** of the application data that all the data was received, we recommend designing protocols that follow this pattern:
///
/// 1) The **sender** sends the last data. It then calls [`Connection::closed`]. This will wait until it receives a CONNECTION_CLOSE frame from the other side.
/// 2) The **receiver** receives the last data. It then calls [`Connection::close`] and provides an error_code and/or reason.
/// 3) The **sender** checks that the error_code is the expected error code.
Contributor:
This provides no guidance on how the receiver of the last data ensures that the close is sent and delivered. Which possibly is no longer possible right now?

Member:
Yeah - this seems impossible at the moment, see also this failing test case that's fixed by reverting this PR: 82334ca

///
/// If the `close`/`closed` dance is not done, or is interrupted at any point, the connection will eventually time out, provided that the idle timeout is not disabled.
#[inline]
pub fn close(&self, error_code: VarInt, reason: &[u8]) {
self.inner.close(error_code, reason)
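
As a rough illustration of the close/closed pattern described in the doc comment above, a minimal sketch of the two roles is shown below. This is only a sketch: the re-export paths `iroh::endpoint::{Connection, ConnectionError, VarInt}` and the use of error code `0` as the "all received" signal are assumptions for this example, not part of the diff.

```rust
use iroh::endpoint::{Connection, ConnectionError, VarInt};

/// Sender side: after writing the last application data, wait for the
/// receiver to close the connection before tearing anything down.
async fn sender_finish(conn: &Connection) {
    // `closed()` resolves once the connection is closed, e.g. when the
    // receiver calls `close()` after reading everything.
    match conn.closed().await {
        ConnectionError::ApplicationClosed(close)
            if close.error_code == VarInt::from_u32(0) =>
        {
            // The receiver signalled that it got all the data.
        }
        other => {
            // Timed out or closed with an unexpected code: delivery is uncertain.
            eprintln!("unexpected close: {other}");
        }
    }
}

/// Receiver side: after reading the last application data, close the
/// connection with the agreed-upon code so the sender's `closed()` resolves.
fn receiver_finish(conn: &Connection) {
    conn.close(VarInt::from_u32(0), b"done");
}
```

With a protocol shaped like this, delivery of the final data is confirmed by the protocol itself rather than by `Endpoint::close` waiting on `quinn::Endpoint::wait_idle`, which is what this PR removes.
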
9 changes: 8 additions & 1 deletion iroh/src/magicsock.rs
@@ -1785,12 +1785,19 @@ impl Handle {
/// Only the first close does anything. Any later closes are no-ops.
/// Polling the socket ([`AsyncUdpSocket::poll_recv`]) will return [`Poll::Pending`]
/// indefinitely after this call.
///
/// This will not wait for the [`quinn::Endpoint`] to drain connections.
///
/// To ensure no data is lost, design protocols so that the last *sender*
/// of data in the protocol calls [`crate::endpoint::Connection::closed`], and `await`s until
/// it receives a "close" message from the *receiver*. Once the *receiver*
/// gets the last data in the protocol, it should call [`crate::endpoint::Connection::close`]
/// to inform the *sender* that all data has been received.
#[instrument(skip_all, fields(me = %self.msock.me))]
pub(crate) async fn close(&self) {
trace!("magicsock closing...");
// Initiate closing all connections, and refuse future connections.
self.endpoint.close(0u16.into(), b"");
self.endpoint.wait_idle().await;

if self.msock.is_closed() {
return;
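
For endpoint-level shutdown after this change, a caller-side sketch is shown below. It assumes the zero-argument, infallible async `Endpoint::close()` of iroh around this release; the function and argument values are illustrative, not from the diff.

```rust
use iroh::{endpoint::Connection, Endpoint};

/// Shut down after the protocol's last message, assuming this side is the
/// *receiver* of the final data.
async fn shutdown(endpoint: Endpoint, conn: Connection) {
    // Receiver role: tell the peer that everything arrived.
    conn.close(0u32.into(), b"done");
    // After this PR, `Endpoint::close()` no longer calls
    // `quinn::Endpoint::wait_idle`, so it does not wait for connections to
    // drain; any delivery guarantee must come from the close/closed dance
    // performed by the protocol itself.
    endpoint.close().await;
}
```
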