Server hang when proxying over multiple subflows #178
@darkwrat, do you see this with the v5.11 stable kernels from Fedora?
Yes, I can reproduce the server hang on 5.11.11-200.fc33.x86_64, but not on 5.10.23-200.fc33.
Hi @darkwrat, thank you for this complete bug report! When your VPS is stuck, is it possible for you to trigger a SysRq? But ideally, it would be nice to generate a kernel crash dump with KDump. I guess there is documentation about that on the Fedora website, but here is already a guide from Red Hat: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/kernel_administration_guide/kernel_crash_dump_guide Also, hopefully this […]
Managed to get a kernel crash dump with sysrq-c, but not with the actual crash (…).
Right, let's forget about the list corruption in the scope of this one. I will create a separate one once I have more details.
So I've put it together and crashed the system with sysrq-c after the hang. Here is the dump, hope it's of some value: https://hel1.trail5.net/crash-sysrq-c.tar.gz
And one more with sysrq-w before sysrq-c: https://hel1.trail5.net/crash-sysrq-w-c.tar.gz
I have managed to run decode_stacktrace on the dmesg output from the crash-sysrq-w-c file above, hope it saves some time. Decoded stacktraces: […]
@matttbe Hi, I have collected the data you requested. Do you still think we should split the issue? As the dmesg outputs indicate, there is something wrong with […].
The soft lockup is likely caused by a loop in the dfrag list. That in turn is likely just another kind of corruption, like the one causing the kernel BUG. So the two splats are likely different symptoms of the same root cause. It looks like 5.12-rc5 already has all the relevant fixes previously staged in our devel branch. Could you please try to reproduce the issue by running a debug kernel on the VPS (package name: kernel-debug on Fedora)? That should have KASAN enabled and could help pin-point the list corruption (it could splat early with a relevant backtrace pointing to the buggy code). Beware: KASAN and the other debug options enabled there will slow down the kernel a lot! Thanks! Paolo
@darkwrat: thank you for the collected data, very useful! We discussed that at our last meeting: see Paolo's message. Sorry for the delay in the answers: around Easter, many people -- including myself -- are on PTO or have less time to look because their colleagues are on PTO :)
I've made some attempts at capturing the KASAN output today. @pabeni the Fedora kernel-debug packages seem to leave KASAN disabled (since https://src.fedoraproject.org/rpms/kernel/c/6a28602fa738744deaf29ee931c02193843c43f8?branch=rawhide). So I've followed this article (https://fedoraproject.org/wiki/Building_a_custom_kernel) and built the latest rawhide kernel with a KASAN-enabled config.
The soft lockup is easily reproducible, but kdump does not work properly with my KASAN kernel: the console just contains output from before the crash, and there is nothing from the kdump kernel itself. I've spent some time playing with the kdump setup, without luck. I kept reproducing the issue until I stumbled upon a splat without a soft lockup, but I guess it's almost as useless as the previous ones. I still got no KASAN errors. decode_stacktrace stumbles on it, btw (it leaves empty brackets where source references should be).
Should I explore more ways to capture dmesg output for the soft lockup? For example, set up a more controlled test environment, or find a hoster with a serial console for the VPS. @matttbe Now it's my turn to apologize for being slow to respond. :) And it looks like I'll have to commute to my parents' place in the country on weekends to get something tested, as I'm back at the office for the day job. I've also glanced through the meeting notes -- could you tell me which repo and branches you want me to test this against? Thanks
@pabeni is it possible that […]?
First things first, thanks for all the effort spent here! This looks quite elusive. I suspect there is some bad interaction with CONFIG_PREEMPT=y. One random shot in the dark would be repeating the scenario with that disabled (# CONFIG_PREEMPT is not set). Another wild guess/debug attempt would be the following patch: […]
(beware: it will increase the memory usage). Your KASAN configuration LGTM. I'm guessing KASAN does not catch the use-after-free due to the allocation strategy: KASAN is likely fooled by the page frag allocation usage. I'll investigate whether we can add some KASAN annotations to the code to help KASAN understand the dfrag lifecycle.
I'm not sure I understood your question. list_add_tail() initializes &dfrag->list and adds it to the rtx_queue tail. &dfrag->list is expected to be uninitialized (not in use) before the list_add_tail() call. I think the problem is caused by some already-in-use dfrag being re-used (re-inserted) before being freed. It would be great if you could attempt any of the above, thanks!
That is already the case for rawhide kernels (# CONFIG_PREEMPT is not set).
I'll give it a try, presumably tomorrow.
Please disregard the question, I've read it wrong. And thank you for the clarification!
Whoops, did not know that.
Uhm... due to the above, the already-low expectations just went even lower :( BTW I notice there is a sendto() in the relevant backtrace, which is quite unexpected for [mp]tcp. Looking at the ss-server source (current git at https://github.com/shadowsocks/shadowsocks-libev), it looks like sendto() could be invoked on TCP sockets only if fastopen is enabled: https://github.com/shadowsocks/shadowsocks-libev/blob/master/src/server.c#L518 Note that we currently don't support MSG_FASTOPEN for MPTCP sockets; the relevant syscall should fail with -EOPNOTSUPP.
shadowsocks-libev-3.3.5-4.gitb5d6225.fc33.x86_64 -- git hash b5d6225 -- got it from https://copr.fedorainfracloud.org/coprs/outman/shadowsocks-libev/
Got it from the run when the soft lockup doesn't happen, so that it would not be truncated.

list_add corruption. prev->next should be next (ffff9e24037fb930), but was 768c12efb29c0dac. (prev=ffff9e24051218d0).
Call Trace: […]

strace -f -ttt -vv -o ssst ./use_mptcp.sh /usr/bin/ss-server -c /etc/shadowsocks-libev/config.json -u -v
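To make the hypothesis above more concrete, here is a minimal, self-contained userspace model (not kernel code; the names struct dfrag, page[] and rtx_queue are illustrative only) of a data fragment that lives inside a shared page frag and gets overwritten while it is still linked into the retransmit queue. A DEBUG_LIST-style check then fires with exactly the shape of the splat quoted above, with a garbage "but was" value standing in for 768c12efb29c0dac:

```c
/*
 * Minimal userspace model -- not kernel code -- of the suspected failure
 * mode: an mptcp_data_frag lives inside a page_frag; if another user of
 * the same page frag (plain TCP in the real bug) writes payload over
 * that memory while the dfrag is still linked into rtx_queue, its
 * embedded list pointers turn into payload bytes, and the next
 * list_add_tail() trips a DEBUG_LIST-style check.
 */
#include <stdio.h>
#include <string.h>

struct list_head { struct list_head *next, *prev; };

struct dfrag {                         /* stand-in for struct mptcp_data_frag */
	struct list_head list;
	int data_len;
};

static void list_init(struct list_head *h) { h->next = h->prev = h; }

/* kernel-style list_add_tail() with a DEBUG_LIST-style sanity check */
static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (prev->next != head)
		printf("list_add corruption. prev->next should be next (%p), "
		       "but was %p. (prev=%p)\n",
		       (void *)head, (void *)prev->next, (void *)prev);

	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
}

int main(void)
{
	static _Alignas(16) char page[4096];   /* the shared page frag */
	struct list_head rtx_queue;
	struct dfrag *df1, *df2;

	list_init(&rtx_queue);

	/* MPTCP carves a dfrag out of the page frag and queues it */
	df1 = (struct dfrag *)page;
	df1->data_len = 100;
	list_add_tail(&df1->list, &rtx_queue);

	/* plain TCP re-uses the same page frag bytes for payload */
	memset(page, 'A', sizeof(struct dfrag));

	/* the next dfrag insertion now sees garbage in prev->next */
	df2 = (struct dfrag *)(page + 256);
	df2->data_len = 200;
	list_add_tail(&df2->list, &rtx_queue);

	return 0;
}
```

The reported garbage pointer (0x4141...) plays the role of 768c12efb29c0dac above: the dfrag's list pointers were overwritten with data bytes rather than valid pointers.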
I believe it's just glibc using sendto/recvfrom in place of the send/recv syscalls; it is not related to fast open. https://github.com/bminor/glibc/blob/595c22ecd8e87a27fd19270ed30fdbae9ad25426/sysdeps/unix/sysv/linux/send.c#L28
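For reference, the linked glibc wrapper essentially forwards send() to sendto() with a NULL destination address, which is why a sendto frame in the backtrace does not imply MSG_FASTOPEN was used. A paraphrased sketch (plain_send is an illustrative name):

```c
/* Paraphrase of glibc's sysdeps/unix/sysv/linux/send.c: on Linux,
 * send(fd, buf, len, flags) is implemented as
 * sendto(fd, buf, len, flags, NULL, 0). */
#include <sys/types.h>
#include <sys/socket.h>

ssize_t plain_send(int fd, const void *buf, size_t len, int flags)
{
	return sendto(fd, buf, len, flags, NULL, 0);
}
```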
Maxim reported several issues when forcing a TCP transparent proxy to use the MPTCP protocol for the inbound connections. He also provided a clean reproducer.

The problem boils down to 'mptcp_frag_can_collapse_to()' assuming that only MPTCP will use the given page_frag. If others - e.g. the plain TCP protocol - allocate page fragments, we can end-up re-using already allocated memory for mptcp_data_frag.

Fix the issue ensuring that the to-be-expanded data fragment is located at the current page frag end.

v1 -> v2:
- added missing fixes tag (Mat)

Closes: #178
Reported-and-tested-by: Maxim Galaganov <max@internet.ru>
Fixes: 18b683b ("mptcp: queue data for mptcp level retransmission")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Maxim reported several issues when forcing a TCP transparent proxy to use the MPTCP protocol for the inbound connections. He also provided a clean reproducer.

The problem boils down to 'mptcp_frag_can_collapse_to()' assuming that only MPTCP will use the given page_frag. If others - e.g. the plain TCP protocol - allocate page fragments, we can end-up re-using already allocated memory for mptcp_data_frag.

Fix the issue ensuring that the to-be-expanded data fragment is located at the current page frag end.

v1 -> v2:
- added missing fixes tag (Mat)

Closes: multipath-tcp/mptcp_net-next#178
Reported-and-tested-by: Maxim Galaganov <max@internet.ru>
Fixes: 18b683b ("mptcp: queue data for mptcp level retransmission")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 29249ea upstream.

Maxim reported several issues when forcing a TCP transparent proxy to use the MPTCP protocol for the inbound connections. He also provided a clean reproducer.

The problem boils down to 'mptcp_frag_can_collapse_to()' assuming that only MPTCP will use the given page_frag. If others - e.g. the plain TCP protocol - allocate page fragments, we can end-up re-using already allocated memory for mptcp_data_frag.

Fix the issue ensuring that the to-be-expanded data fragment is located at the current page frag end.

v1 -> v2:
- added missing fixes tag (Mat)

Closes: multipath-tcp/mptcp_net-next#178
Reported-and-tested-by: Maxim Galaganov <max@internet.ru>
Fixes: 18b683b ("mptcp: queue data for mptcp level retransmission")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stable inclusion
from stable-5.10.42
commit 3267a061096efc91eda52c2a0c61ba76e46e4b34
bugzilla: 55093
CVE: NA

--------------------------------

commit 29249ea upstream.

Maxim reported several issues when forcing a TCP transparent proxy to use the MPTCP protocol for the inbound connections. He also provided a clean reproducer.

The problem boils down to 'mptcp_frag_can_collapse_to()' assuming that only MPTCP will use the given page_frag. If others - e.g. the plain TCP protocol - allocate page fragments, we can end-up re-using already allocated memory for mptcp_data_frag.

Fix the issue ensuring that the to-be-expanded data fragment is located at the current page frag end.

v1 -> v2:
- added missing fixes tag (Mat)

Closes: multipath-tcp/mptcp_net-next#178
Reported-and-tested-by: Maxim Galaganov <max@internet.ru>
Fixes: 18b683b ("mptcp: queue data for mptcp level retransmission")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
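For readers trying to map the commit description above to code, here is a minimal sketch of the kind of check it describes, using simplified stand-in types rather than the kernel's struct page_frag / struct mptcp_data_frag; it is an illustration of the logic, not the upstream diff. An existing fragment may only be extended when it still ends exactly at the current page-frag offset; once another user (plain TCP in this bug) has allocated from the same page frag, the offsets no longer line up and a new fragment must be started instead of re-using already allocated memory:

```c
/* Illustrative model (simplified types, not the upstream diff) of the
 * collapse check described in the commit message above. */
#include <stdbool.h>
#include <stdio.h>

struct page_frag { unsigned int offset, size; };   /* simplified */
struct data_frag { unsigned int offset, data_len; };

static bool frag_can_collapse_to(const struct page_frag *pfrag,
				 const struct data_frag *df)
{
	return df &&
	       pfrag->size - pfrag->offset > 0 &&
	       /* the dfrag must still own the tail of the page frag */
	       pfrag->offset == df->offset + df->data_len;
}

int main(void)
{
	struct page_frag pfrag = { .offset = 1300, .size = 4096 };
	struct data_frag df = { .offset = 300, .data_len = 1000 };

	printf("can collapse: %d\n", frag_can_collapse_to(&pfrag, &df));

	/* plain TCP takes 500 bytes from the same page frag... */
	pfrag.offset += 500;
	printf("can collapse after a foreign allocation: %d\n",
	       frag_can_collapse_to(&pfrag, &df));
	return 0;
}
```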
I am trying to set up a shadowsocks proxy over a connection with multiple MPTCP
subflows, so as to aggregate the bandwidth of two of my USB LTE modems at home.
I have NATting middleboxes on both routes. I can successfully download a
single file and clearly see traffic flowing with tcpdump on both interfaces. When
I try web surfing over this proxy, the remote VPS just hangs: I cannot get a
shell through the hoster's VNC (it doesn't respond to input and the cursor isn't
blinking), there is no SSH, and CPU usage is stuck at 100%. Rebooting the VPS helps.
I have also set:
kernel.panic = 1
kernel.panic_on_oops = 1
...and the VPS reboots on its own, but I cannot obtain dmesg output.
I am using the LD_PRELOAD trick to enable upstream MPTCP in ss-server via
use_mptcp.sh from the mptcp-tools repo. For ss-local I have just patched the
socket calls to use IPPROTO_MPTCP, as sketched below.
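For context, a minimal sketch of what that patching amounts to: creating the stream sockets with IPPROTO_MPTCP (protocol 262) and falling back to plain TCP if the kernel or headers lack MPTCP support. The function name and fallback behavior are illustrative, not ss-local's actual code:

```c
/* Minimal sketch of creating an upstream-MPTCP socket; IPPROTO_MPTCP may
 * be missing from older libc headers, so define it if needed. */
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

int open_stream_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0) {
		/* kernel without MPTCP support: fall back to plain TCP */
		perror("socket(IPPROTO_MPTCP), falling back to TCP");
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	}
	return fd;
}
```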
I have caught the freeze on video, and can provide client-side pcaps.
-- https://hel1.trail5.net/hang.webm
-- https://hel1.trail5.net/x.pcap.zst
The kernels come from Fedora Rawhide. I am too clumsy to roll my own right now,
but I will make some attempts in the following weeks.
My understanding is that no client should be able to make a server's kernel
unresponsive over the internet. I have never solved an issue with a completely
stuck system before, but I could provide someone with a VPS and point my
ss-local at that system to enable them to debug the issue firsthand.
Aside from the server issue, there is sometimes a BUG in dmesg on the client
machine (the ss-local one). It is a physical box. After the BUG there is still
a shell, but some commands get stuck forever. I usually reboot it via
sysrq-trigger afterwards.
Routing configuration on the client side: