Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snet: move SCMPHandler to cooked UDP Conn, don't propagate SCMP errors by default #4563

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

matzf
Copy link
Contributor

@matzf matzf commented Jul 3, 2024

  • snet.PacketConn is a "raw" socket. Having a special hook to handle SCMP packets, the SCMPHandler, seems unnecessary here, as any user of this raw socket can (and often wants to) just handle these packets explicitly.
    The SCMPHandler is thus moved from the "raw" PacketConn, to the "cooked" Conn and invoked only there. This makes use of the raw socket for things like sending and reacting to SCMPs much more straight forward.
  • Options like the SCMPHandler and ReplyPather can be set in each Dial/Listen call, overriding the default set in the snet.SCIONNetwork.
  • The snet.DefaultSCMPHandler no longer propagates any errors up. Path revocations are registered, but not directly reported back to the caller of Read/ReadFrom. That is, it's now equivalent to using the (removed) snet.SCMPPropagationStopper. See proposal: re-think SCMP error propagation in snet library #4389
  • Misc related snet API tweaks/cleanup; remove some unused structs, remove OpError, expose setting buffer sizes from snet.Conn (used by quic-go if available).

WIP: simply ignoring the SCMP errors seems undesirable in some places. The "end2end revocation test" relied on these errors and is no longer working. In the end2end test application itself, tighter timeouts for the ping/pong roundtrip might be sufficient to detect path failures quickly enough. However, I also noticed that path lookups themselves are not functioning well here and I suspect that in some place, the CS is also using SCMP errors for rapid detection of path failures. I need to investigate where this is, and what would be a good way to address this.

Perhaps this PR could be split into two parts; 1) moving the SCMPHandler from PacketConn to Conn, and 2) stopping the error propagation by default.

@jiceatscion
Copy link
Contributor

This change is Reviewable

Copy link
Contributor

@jiceatscion jiceatscion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the places where the default stopper behaviour hurts are known, I guess a non-default handler can used, right? So, is your concern about backward compatibility?

Reviewed 14 of 15 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @matzf)


gateway/pathhealth/pathwatcher.go line 241 at r2 (raw file):

		}
		if err != nil {
			logger.Info("Unexpected error when reading probe reply", "err", err)

Do we intend to log for all cases of error, even the SCMP errors that we handle? The old code seemed to only log non-scmp errors. Something to do with what the built-in handler did?


pkg/snet/conn.go line 166 at r2 (raw file):

	}
	if o.replyPather == nil {
		o.replyPather = DefaultReplyPather{}

Ok, so this would seem to be the only opportunity to apply the default if WithReplyPather wasn't used, but... what is it that enforces the use of apply(). Using it seems optional (and frankly I don't understand why it exists). Just chaining calls to "WithXyz()" would be my natural way of building options (eventhough I don't really see the point of that pattern in Go). I must be mis-understanding the intended usage pattern.


pkg/snet/reader.go line 78 at r2 (raw file):

	}

	udp := pkt.Payload.(UDPPayload)

Why do this earlier than before? The rest could still fail.


scion/ping/ping.go line 315 at r2 (raw file):

	switch r := pkt.Payload.(type) {
	case snet.SCMPEchoReply:
		if r.Identifier != expectedId {

is that really the way it works? We don't just drop it? It could be a late reply that we had given up on getting, could it not? Why must we care?


scion/traceroute/traceroute.go line 322 at r2 (raw file):

	switch r := pkt.Payload.(type) {
	case snet.SCMPTracerouteReply:
		if r.Identifier != expectedId {

same question as before about echo.

Copy link
Contributor

@JordiSubira JordiSubira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 15 files at r1, 1 of 1 files at r2, 14 of 14 files at r3, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @jiceatscion and @matzf)


control/revhandler.go line 29 at r3 (raw file):

// RevocationHandler handles raw revocations from the snet stack and inserts
// them into the

them into the RevCache

Code quote:

them into the

pkg/snet/scmp.go line 54 at r3 (raw file):

// revocation handler is configured, it is informed of any received interface
// down messages.
// It never returns an error "in line", so snet.Conn.Read/ReadFrom will not be

My doubts are twofold:

  • Firstly, the current implementation of the RevocationHandler interface, do not propagate any error. So I guess we keep the error because some future implementation could propagate errors?
  • Assuming that at some point the RevocationHandler will propagate internal errors when handling SCMP packets, I do not fully understand this statement. If self.RevocationHandler.Revoke returns an error, I see this as an "SCMP error" or error related to SCMP, so I think we should clarify what this means.

pkg/snet/scmp.go line 69 at r3 (raw file):

	scmp, ok := pkt.Payload.(SCMPPayload)
	if !ok {
		return nil

Related to my lack of understanding in the description. Is this a SCMP error that we shouldn't propagate? I only see that Handle is called for "cases" matching 'SCMPPayload', do we want to just ignore if at some point we introduce some execution flow that provides other packets?


scion/ping/ping.go line 315 at r2 (raw file):

Previously, jiceatscion wrote…

is that really the way it works? We don't just drop it? It could be a late reply that we had given up on getting, could it not? Why must we care?

I tnink the identifier now must be equal to the source port used for sending out the echo request, in the first place. So it should match the value. I guess considering this, the logic is to end up delivering this packet to the ErrHandler that then decides what to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants