Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fault-free packet ingress #483

Merged
merged 9 commits into from
Jun 9, 2021

Conversation

ryan-summers
Copy link
Contributor

@ryan-summers ryan-summers commented May 31, 2021

This PR addresses portions of #281 by implementing fail-free packet ingression. In this model, all available RX tokens are ingressed in a fail-free manner. Any failures are logged to the debug interface.

Open questions:

  • Should the RxToken::consume() function be fallible? What does it mean when an RX packet cannot be dequeued? Is there any action that can be taken here?

@Dirbaio
Copy link
Member

Dirbaio commented May 31, 2021

Should the RxToken::consume() function be fallible? What does it mean when an RX packet cannot be dequeued? Is there any action that can be taken here?

Maybe we can change consume (in both tokens) to not be fallible.

  • If there's no packet in the rx queue, or no buffer space to queue the response, Device::receive should return None.
  • If there's no buffer space to queue a new packet, Device::transmit should return None.
  • consume can then be guaranteed to never fail:
    fn consume<F>(self, timestamp: Instant, f: F)
        where F: FnOnce(&mut [u8]);

The current Device trait can "fail" in 2 places: transmit/receive returning None, or consume returning an error. Reducing that to 1 is nice.

  • It simplifies the trait, there's now only 1 way to implement it.
  • Allows polling to be more efficient: If transmit reports it's not ready, iface can completely skip polling sockets. Now it always polls sockets and then discards the packet later if Device is not ready.

Looking at impls in the field:

So yeah, simplifying the trait would ensure consistency among impls on this.

@ryan-summers
Copy link
Contributor Author

What about some implementation using an external network stack in raw mode (e.g. W5500)? What if a phy::Device impl encountered an SPI error during transmission of the TX token (or reception of RX packet)? If we remove the fallibility of the token consumption, it implies that send/receive of packets cannot fail. While that's true for some implementations, it's not always true (e.g. if some external interface encounters an error).

From what I understand, the Device::transmit() and Device::receive() simply indicates if there's buffer space available to send/receive the packets. Whereas the Token::consume() indicates whether or not the send/receive could be completed.

That being said, what would we expect the corrective action is if the token consumption fails? In this implementation, we log and silently ignore the failure. In the current implementation, the error gets bubbled up the network stack (albeit, the error is a smoltcp::Error, which in most cases is silently ignored anyways).

I don't have any problem with implementation an interface that requires an infallible PHY API, it's just something we should consciously decide here.

@ryan-summers
Copy link
Contributor Author

In any case, I think we should consider changing the PHY API in a different PR. That ends up touching a lot of other code

@ryan-summers ryan-summers marked this pull request as ready for review June 9, 2021 09:38
Copy link
Member

@Dirbaio Dirbaio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to split in multiple PRs, this change is already useful indivifually. Thank you! :)

@Dirbaio Dirbaio merged commit f0c711c into smoltcp-rs:master Jun 9, 2021
@ryan-summers ryan-summers deleted the rs/issue-281/ethernet-dos branch June 9, 2021 14:26
@Dirbaio
Copy link
Member

Dirbaio commented Jun 9, 2021

From what I understand, the Device::transmit() and Device::receive() simply indicates if there's buffer space available to send/receive the packets. Whereas the Token::consume() indicates whether or not the send/receive could be completed.

Well, the docs can be certainly improved on this.

The way I see it (or the way I think it should work) is:

  • receive returns tokens when a packet has been received (completely received, DMA'd from hardware into RAM, FCS check passes, etc). It makes no sense at all for the "actual receive" (consume) to fail. Stuff like receiving garbage on the line should just be ignored. Receiving garbage is not receiving at all.
  • transmit returns tokens when the driver is "ready to accept a packet for transmission". This means having buffer space, the hardware being "not busy", etc. If there are transient transmit errors (such as a collision or whatever), the driver should keep the packet in the queue and try transmitting it later. Dropping it and returning an error is less efficient: smoltcp will try transmitting again, but that means running all the TcpSocket logic again and constructing the packet again. It's better if the driver keeps the already-constructed packet. If there's something really bad going on (like no Ethernet cable plugged in) drivers may silently drop everything, or never return TxTokens.

what would we expect the corrective action is if the token consumption fails? In this implementation, we log and silently ignore the failure. In the current implementation, the error gets bubbled up the network stack (albeit, the error is a smoltcp::Error, which in most cases is silently ignored anyways).

For receive, yes. For transmit there's one slight difference: TcpSocket knows the transmit has failed, so it skips updating internal state. This makes it so it'll immediately try to transmit the same packet again next time, instead of having to wait for timeout. However if drivers only return TxToken when transmit is guaranteed to succeed, this is no longer an issue.

tldr: IMO drivers should only return tokens when rx/tx is guaranteed to succeed, so consume should be infallible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants