Fault-free packet ingress #483

ryan-summers · 2021-05-31T15:43:03Z

This PR addresses portions of #281 by implementing fail-free packet ingression. In this model, all available RX tokens are ingressed in a fail-free manner. Any failures are logged to the debug interface.

Open questions:

Should the RxToken::consume() function be fallible? What does it mean when an RX packet cannot be dequeued? Is there any action that can be taken here?

Dirbaio · 2021-05-31T16:22:04Z

Should the RxToken::consume() function be fallible? What does it mean when an RX packet cannot be dequeued? Is there any action that can be taken here?

Maybe we can change consume (in both tokens) to not be fallible.

If there's no packet in the rx queue, or no buffer space to queue the response, Device::receive should return None.
If there's no buffer space to queue a new packet, Device::transmit should return None.
consume can then be guaranteed to never fail:

    fn consume<F>(self, timestamp: Instant, f: F)
        where F: FnOnce(&mut [u8]);

The current Device trait can "fail" in 2 places: transmit/receive returning None, or consume returning an error. Reducing that to 1 is nice.

It simplifies the trait, there's now only 1 way to implement it.
Allows polling to be more efficient: If transmit reports it's not ready, iface can completely skip polling sockets. Now it always polls sockets and then discards the packet later if Device is not ready.

Looking at impls in the field:

smoltcp impls return None in transmit/receive.
embassy returns None in transmit/receive.
stm32-eth returns errors in consume. https://github.com/stm32-rs/stm32-eth/blob/0ab4556e762c232ac7f2fc3116733359395ec4fc/src/smoltcp_phy.rs#L72
stm32h7xx-hal returns None in transmit/receive: https://github.com/stm32-rs/stm32h7xx-hal/blob/443eacdc040b24191a4878d94d578f7016d50256/src/ethernet/eth.rs#L774

So yeah, simplifying the trait would ensure consistency among impls on this.

ryan-summers · 2021-06-09T09:26:15Z

What about some implementation using an external network stack in raw mode (e.g. W5500)? What if a phy::Device impl encountered an SPI error during transmission of the TX token (or reception of RX packet)? If we remove the fallibility of the token consumption, it implies that send/receive of packets cannot fail. While that's true for some implementations, it's not always true (e.g. if some external interface encounters an error).

From what I understand, the Device::transmit() and Device::receive() simply indicates if there's buffer space available to send/receive the packets. Whereas the Token::consume() indicates whether or not the send/receive could be completed.

That being said, what would we expect the corrective action is if the token consumption fails? In this implementation, we log and silently ignore the failure. In the current implementation, the error gets bubbled up the network stack (albeit, the error is a smoltcp::Error, which in most cases is silently ignored anyways).

I don't have any problem with implementation an interface that requires an infallible PHY API, it's just something we should consciously decide here.

ryan-summers · 2021-06-09T09:37:05Z

In any case, I think we should consider changing the PHY API in a different PR. That ends up touching a lot of other code

Dirbaio

Makes sense to split in multiple PRs, this change is already useful indivifually. Thank you! :)

Dirbaio · 2021-06-09T14:39:24Z

From what I understand, the Device::transmit() and Device::receive() simply indicates if there's buffer space available to send/receive the packets. Whereas the Token::consume() indicates whether or not the send/receive could be completed.

Well, the docs can be certainly improved on this.

The way I see it (or the way I think it should work) is:

receive returns tokens when a packet has been received (completely received, DMA'd from hardware into RAM, FCS check passes, etc). It makes no sense at all for the "actual receive" (consume) to fail. Stuff like receiving garbage on the line should just be ignored. Receiving garbage is not receiving at all.
transmit returns tokens when the driver is "ready to accept a packet for transmission". This means having buffer space, the hardware being "not busy", etc. If there are transient transmit errors (such as a collision or whatever), the driver should keep the packet in the queue and try transmitting it later. Dropping it and returning an error is less efficient: smoltcp will try transmitting again, but that means running all the TcpSocket logic again and constructing the packet again. It's better if the driver keeps the already-constructed packet. If there's something really bad going on (like no Ethernet cable plugged in) drivers may silently drop everything, or never return TxTokens.

what would we expect the corrective action is if the token consumption fails? In this implementation, we log and silently ignore the failure. In the current implementation, the error gets bubbled up the network stack (albeit, the error is a smoltcp::Error, which in most cases is silently ignored anyways).

For receive, yes. For transmit there's one slight difference: TcpSocket knows the transmit has failed, so it skips updating internal state. This makes it so it'll immediately try to transmit the same packet again next time, instead of having to wait for timeout. However if drivers only return TxToken when transmit is guaranteed to succeed, this is no longer an issue.

tldr: IMO drivers should only return tokens when rx/tx is guaranteed to succeed, so consume should be infallible.

ryan-summers added 7 commits May 31, 2021 17:25

Adding prototype fix for fail-free ingress

d0e3ebd

Fail-free ingress

6091350

cleaning up implementation

f4d4ba0

Reverting phy changes

d86d639

Removing unneeded changes

fcfe355

Simplifying loop logic

5a0ac07

Fixing clippy

b9fb265

Sanitizing log macro to allow usage in match arm directly

fb0ac23

ryan-summers marked this pull request as ready for review June 9, 2021 09:38

Merge branch 'master' into rs/issue-281/ethernet-dos

77b77fa

Dirbaio approved these changes Jun 9, 2021

View reviewed changes

Dirbaio merged commit f0c711c into smoltcp-rs:master Jun 9, 2021

ryan-summers deleted the rs/issue-281/ethernet-dos branch June 9, 2021 14:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fault-free packet ingress #483

Fault-free packet ingress #483

ryan-summers commented May 31, 2021 •

edited

Loading

Dirbaio commented May 31, 2021

ryan-summers commented Jun 9, 2021

ryan-summers commented Jun 9, 2021

Dirbaio left a comment

Dirbaio commented Jun 9, 2021

Fault-free packet ingress #483

Fault-free packet ingress #483

Conversation

ryan-summers commented May 31, 2021 • edited Loading

Dirbaio commented May 31, 2021

ryan-summers commented Jun 9, 2021

ryan-summers commented Jun 9, 2021

Dirbaio left a comment

Choose a reason for hiding this comment

Dirbaio commented Jun 9, 2021

ryan-summers commented May 31, 2021 •

edited

Loading