Skip to content

Commit

Permalink
mptcp: refactor shutdown and close
Browse files Browse the repository at this point in the history
We must not close the subflows before all the MPTCP level
data, comprising the DATA_FIN has been acked at the MPTCP
level, otherwise we could be unable to retransmit as needed.

__mptcp_wr_shutdown() shutdown is responsible to check for the
correct status and close all subflows. Is called by the output
path after spooling any data and at shutdown/close time.

In a similar way, __mptcp_destroy_sock() is responsible to clean-up
the MPTCP level status, and is called when the msk transition
to TCP_CLOSE.

The protocol level close() does not force anymore the TCP_CLOSE
status, but orphan the msk socket and all the subflows.
Orphaned msk sockets are forciby closed after a timeout or
when all MPTCP-level data is acked.

There is a caveat about keeping the orphaned subflows around:
the TCP stack can asynchronusly call tcp_cleanup_ulp() on them via
tcp_close(). To prevent accessing freed memory on later MPTCP
level operations, the msk acquires a reference to each subflow
socket and prevent subflow_ulp_release() from releasing the
subflow context before __mptcp_destroy_sock().

The additional subflow references are released by __mptcp_done()
and the async ULP release is detected checking ULP ops. If such
field has been already cleared by the ULP release path, the
dangling context is freed directly by __mptcp_done().

Co-developed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  • Loading branch information
Paolo Abeni authored and kuba-moo committed Nov 16, 2020
1 parent eaa2ffa commit e16163b
Show file tree
Hide file tree
Showing 5 changed files with 273 additions and 110 deletions.
2 changes: 1 addition & 1 deletion net/mptcp/options.c
Original file line number Diff line number Diff line change
Expand Up @@ -492,7 +492,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
bool ret = false;

mpext = skb ? mptcp_get_ext(skb) : NULL;
snd_data_fin_enable = READ_ONCE(msk->snd_data_fin_enable);
snd_data_fin_enable = mptcp_data_fin_enabled(msk);

if (!skb || (mpext && mpext->use_map) || snd_data_fin_enable) {
unsigned int map_size;
Expand Down
6 changes: 2 additions & 4 deletions net/mptcp/pm_netlink.c
Original file line number Diff line number Diff line change
Expand Up @@ -416,14 +416,13 @@ void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk)
list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
int how = RCV_SHUTDOWN | SEND_SHUTDOWN;
long timeout = 0;

if (msk->pm.rm_id != subflow->remote_id)
continue;

spin_unlock_bh(&msk->pm.lock);
mptcp_subflow_shutdown(sk, ssk, how);
__mptcp_close_ssk(sk, ssk, subflow, timeout);
__mptcp_close_ssk(sk, ssk, subflow);
spin_lock_bh(&msk->pm.lock);

msk->pm.add_addr_accepted--;
Expand Down Expand Up @@ -452,14 +451,13 @@ void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk, u8 rm_id)
list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
int how = RCV_SHUTDOWN | SEND_SHUTDOWN;
long timeout = 0;

if (rm_id != subflow->local_id)
continue;

spin_unlock_bh(&msk->pm.lock);
mptcp_subflow_shutdown(sk, ssk, how);
__mptcp_close_ssk(sk, ssk, subflow, timeout);
__mptcp_close_ssk(sk, ssk, subflow);
spin_lock_bh(&msk->pm.lock);

msk->pm.local_addr_used--;
Expand Down
Loading

0 comments on commit e16163b

Please sign in to comment.