Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Announce and discover peers over simple UDP broadcasts #243

Closed
wants to merge 1 commit into from
Closed

Announce and discover peers over simple UDP broadcasts #243

wants to merge 1 commit into from

Conversation

ehmry
Copy link

@ehmry ehmry commented Nov 15, 2017

This patch adds a broadcastService that broadcasts the local peer ID and listens for IDs from remote peers. IP broadcasting is limited to IPv4, but is simplier than multicasting and only intended for local area networks.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks useful. This is a preliminary code review, I don't have enough background to comment much on the discovery mechanism itself.

// fudge the interval to dampen oscillations with other nodes
margin := int(interval) >> 2
interval += time.Duration(rand.Intn(margin))
interval -= time.Duration(rand.Intn(margin))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit funky. Any reason not to jus to just add a smaller random delay to the interval (rather than add and then subtract one)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add and subtract so the the interval can jitter by a positive or negative amount. And I like getting funky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but negative jitter is equivalent to just reducing the default interval (you could also just subtract margin/1).

for _, n := range bs.notifees {
go n.HandlePeerFound(pi)
}
bs.lk.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Best to defer this call instead.

func (m *broadcastService) RegisterNotifee(n Notifee) {
m.lk.Lock()
m.notifees = append(m.notifees, n)
m.lk.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we generally defer unlocks when possible.

if found != -1 {
m.notifees = append(m.notifees[:found], m.notifees[found+1:]...)
}
m.lk.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defer

}
}
if found != -1 {
m.notifees = append(m.notifees[:found], m.notifees[found+1:]...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't really care about order, you can just swap the entry you want to remove for the last entry. Also, remember to null out the last entry, otherwise, it won't be garbage collected.

found := -1
for i, notif := range m.notifees {
if notif == n {
found = i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaner to just put the removal code inside here and then return (assuming a deferred unlock).

case <-ticker.C:
bs.sendBroadcast()

case <-ctx.Done():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to make canceling the context equivalent to calling close, we should call close here. Actually, we should defer a call to close so it gets called regardless.

}

func (bs *broadcastService) recvBroadcasts(ctx context.Context) {
var buf = make([]byte, 48)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no exit conditions here.

}

func (bs *broadcastService) Close() error {
bs.listenSock.Close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also cancel the context (which means you'll need to make the context cancelable and store the cancel function). I'm not really a fan of using contexts in this way but we're already doing it for the other broadcast services so oh well...

@Kubuxu
Copy link
Member

Kubuxu commented Nov 15, 2017

I was working on similar feature for cjdns but explicitly IPv6, it allows for better granularity but more over it works on mobile. I think if we are to add something like that we are better off using IPv6.

@whyrusleeping whyrusleeping requested a review from a user November 16, 2017 04:50
@ehmry
Copy link
Author

ehmry commented Nov 16, 2017

FWIW I haven't written go for few years so I'm open to any review.

@Kubuxu I had considered trying to do this with multicast and the 'all-systems' address, which is supported for IPv6 as well, but multicast seems overly complicated. My intention here is simply discovering nodes within a small physical network so for the sake of simplicity IPv4 only seems justified.

My immediate use case is actually to discover IPFS nodes that are broadcasting on port 4001 and then connect to them on 5001 as an API client.

Broadcast the local peer id from UDP sockets bound to the same IP
addresses and ports as the TCP listening sockets. Listen for these
broadcast and discover peers by deducing the peer address from received
packets.
@ehmry
Copy link
Author

ehmry commented Nov 17, 2017

@Stebalien thanks for the review, I've updated the PR following your recommendations.

// fudge the interval to dampen oscillations with other nodes
margin := int(interval) >> 2
interval += time.Duration(rand.Intn(margin))
interval -= time.Duration(rand.Intn(margin))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but negative jitter is equivalent to just reducing the default interval (you could also just subtract margin/1).


func (bs *broadcastService) recvBroadcasts(ctx context.Context) {
var buf = make([]byte, 48)
for bs.active {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking this without a lock isn't guaranteed to work. Try re-running your tests with the -race flag. Assuming that doesn't hit one of our bugs..., that should catch things like this.

IP: addr.IP,
Port: addr.Port,
})
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistic but... you can reduce the indentation level by using:

if error != nil {
    return
}
// ...


// send broadcasts on a periodic timeout
ticker := time.NewTicker(interval)
for bs.active {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, can't read/write to a variable from different threads without using some synchronization mechanism (lock, channel, etc.).

Really, I'd just get rid of this variable and use the context.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bool.

I know, I'm referring to https://golang.org/ref/mem. For example, given:

package main

func main() {
	var active = true

	go func() {
		active = false
	}()

	for active {
	}
}

Go may choose to allow the second go routine to loop forever (optimizing out the check of active) because no synchronization primitives were used. Try running the above code with the race detector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a race condition, it doesn't matter if its just a bool.

defer m.lk.Unlock()
for i, notif := range m.notifees {
if notif == n {
m.notifees[i] = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an array filled with nils is probably not the best way to go. Instead, use https://github.com/golang/go/wiki/SliceTricks#delete-without-preserving-order. That will simplify RegisterNotifee and handleBroadcast as well.

func (bs *broadcastService) recvBroadcasts(ctx context.Context) {
var buf = make([]byte, 48)
for bs.active {
n, rAddr, err := bs.listenSock.ReadFromUDP(buf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure most errors from this function are fatal. That is, we can't continue listening on the socket (we'd end up in a rapid, infinite loop). It's probably better to log the error, stop the service, and return.

You could also try checking to see if the error is temporary (cast to net.Error and check if err.Temporary() is true).

//
func bindBroadcast(ip net.IP, port int) (*net.UDPConn, error) {
conn, err := net.ListenUDP("udp4", &net.UDPAddr{IP: ip, Port: port})
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The go idiom is to (almost) always write:

if err != nil {
    return nil, err
}

That helps prevent nesting/confusing control flow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that typing the same early return idiom over and over and over is worse for understanding control flow.

Copy link
Member

@Stebalien Stebalien Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just how go is written (that is, consistency is king). However, I'd disagree:

  1. It makes it very easy to recognize where errors are being returned.
  2. It reduces indentation.

Now, one can argue that there should be a better way to return errors in go but that's a separate issue (and not something we can do anything about).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you should think of go's error handling as a monad; when an error occurs you return.
Otherwise the control flow is linear, the less branching the better.

}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to return an error if we fail to bind to anything. I wouldn't stop IPFS from starting in that case but, if we return an error and log it, it'll make debugging easier.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I looked at the mdns file I saw three different methods of logging errors, so I thought it cleaner not to log at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use github.com/ipfs/go-log. Any other methods of logging should be fixed. However, I'd just suggest returning an error and logging it from go-ipfs.

Copy link
Member

@Kubuxu Kubuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ehmry any interest in continuing with this change?

Copy link
Contributor

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may seem like a small thing, but it's important to reduce control flow indentation for error checking.
It makes code easier to read, and reduces cognitive overhead.
So please fix your control flow/indentation!

if err != nil {
bs.Close()
return nil, err
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for an else branch here, you are returning in the case of error -- this just adds indentation and cognitive overhead.

var buf = make([]byte, 48)
for bs.active {
n, rAddr, err := bs.listenSock.ReadFromUDP(buf)
if err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil {
 continue
}

Again, this reduces indentation level and cognitive overhead.


// send broadcasts on a periodic timeout
ticker := time.NewTicker(interval)
for bs.active {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a race condition, it doesn't matter if its just a bool.

//
func bindBroadcast(ip net.IP, port int) (*net.UDPConn, error) {
conn, err := net.ListenUDP("udp4", &net.UDPAddr{IP: ip, Port: port})
if err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you should think of go's error handling as a monad; when an error occurs you return.
Otherwise the control flow is linear, the less branching the better.

@ehmry ehmry closed this Mar 7, 2018
@ehmry
Copy link
Author

ehmry commented Mar 7, 2018

I'm just not going to think of go at all.

marten-seemann pushed a commit that referenced this pull request Apr 21, 2022
marten-seemann pushed a commit that referenced this pull request Aug 17, 2022
* feat: harden encoding/decoding functions against panics

Part of #1389

These kinds of functions:

1. Handle user input.
2. Often have out-of-bounds, null pointer, etc bugs.
3. Have completely isolated logic where local panics are unlikely to
   cause memory corruption elsewhere.

* test: add a panic catcher test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants