Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset retransmit timer on ack regardless of the current timer state #953

Closed
wants to merge 1 commit into from

Conversation

HKalbasi
Copy link
Contributor

As I explained in #949 (comment) in the current state if you have a link with limited bandwidth and high latency, timer state becomes Timer::Retransmit and won't become Timer::Idle since there is always some packets in the flight, and current code prevents the delay to be updated, so there would be an unnecessary retransmission which drops the bandwidth. After this PR, I'm able to utilize the full bandwidth of a link with high latency.

This PR doesn't completely fix the issue I mentioned above, there is still some problems when the middle switch drops packets, but it makes things much better.

@HKalbasi
Copy link
Contributor Author

@Dirbaio @whitequark sorry for the ping. Can you please review this? I need to make sure this is in the correct path in order to be able to investigate more on #949

@whitequark
Copy link
Contributor

@HKalbasi Sorry, I don't fully understand the description of this PR. I believe the retransmit code was originally written according to one or more of the TCP RFCs linked in the header. Could you please investigate which RFCs require this behavior (or whether the behavior never matched them in first place), link to the relevant portions of the documents, and explain how the changes correspond to them? Thanks!

@HKalbasi
Copy link
Contributor Author

Sure! The rfc6298 says:

When an ACK is received that acknowledges new data, restart the retransmission timer so that it will expire after RTO seconds (for the current value of RTO).

But the code in the set_for_retransmit method prevents the timer being restarted if it is already active and in the Timer::Retransmit state. We may want to also change where we call the set_for_retransmit function, but this patch works for me in terms of preventing unnecessary retransmission which drops the bandwidth.

@Dirbaio
Copy link
Member

Dirbaio commented Sep 10, 2024

Fix looks good to me.

Can you add a test for it? i.e. a test that would previously fail and now works with your fix.

@whitequark
Copy link
Contributor

Sure! The rfc6298 says:

When an ACK is received that acknowledges new data, restart the retransmission timer so that it will expire after RTO seconds (for the current value of RTO).

Thanks. Please add the RFC to the top of tcp.rs next to other RFC references.

@Dirbaio
Copy link
Member

Dirbaio commented Nov 27, 2024

I was looking through this in more detail to add more tests, and i'm not convinced it's correct anymore.

set_for_retransmit() is called on egress when we send more data. Therefore it corresponds to RFC6298 5.1 (emphasis mine):

(5.1) Every time a packet containing data is sent (including a retransmission), if the timer is not running, start it running so that it will expire after RTO seconds (for the current value of RTO).

which matches what the code was already doing.

Rule 5.2 is implemented here

self.timer.set_for_idle(cx.now(), self.keep_alive);

(5.2) When all outstanding data has been acknowledged, turn off the retransmission timer.

so I think the bug is we're just missing implementing 5.3 somewhere.

(5.3) When an ACK is received that acknowledges new data, restart the retransmission timer so that it will expire after RTO seconds (for the current value of RTO).

@HKalbasi
Copy link
Contributor Author

Sorry for the late response. These kind of problems made my tcp connections unreliable so I ended up dropping the smoltcp in favor of a modified linux kernel tcp stack, and I have lost interest in continuing this PR. I'm still looking forward to smoltcp, I think it is one of the cleanest implementations of the tcp/ip stack, and I hope some day these bugs will be fixed. But until that day, I think it would be nice to advertise it somewhere that smoltcp can't really maintain large bandwidth in a real network with high and unpredictable latency and loss. I wasn't aware of that fact and it damaged my reputation.

About this PR, you are completely right. This PR updates the retransmission timer when it sends more data, but it should update it when it receives more acks. It looks like another guy is also interested in fixing this, so I will close this in favor of their PR #1018

@HKalbasi HKalbasi closed this Dec 15, 2024
@whitequark
Copy link
Contributor

You did agree to the following terms when you started using smoltcp:

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

@HKalbasi
Copy link
Contributor Author

I didn't mean anything bad, I still wish the best for smoltcp. I just wanted to explain why I didn't continue this PR, and that request was just a kindly ask to help other people set their expectations about smoltcp correctly, not a legal requirement or anything like that, so feel free to ignore it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants