-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disassociate RT membership from connectivity #50
Conversation
daa33aa
to
ae0bcf5
Compare
table.go
Outdated
} | ||
|
||
// NewRoutingTable creates a new routing table with a given bucketsize, local ID, and latency tolerance. | ||
func NewRoutingTable(bucketsize int, localID ID, latency time.Duration, m peerstore.Metrics) *RoutingTable { | ||
// Passing a nil PeerValidationFnc disables periodic table cleanup. | ||
func NewRoutingTable(ctx context.Context, bucketsize int, localID ID, latency time.Duration, m peerstore.Metrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious; why not make the peerValidationFnc
one of the Option
s? We could potentially convert the other args (latency, etc) into Option
s as well, as long as we're changing the signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using contexts to stop services is really an anti-pattern (yeah, I know we use it everywhere...). We should provide a close function.
Otherwise, the routing table could stop before the DHT (context cancellation isn't ordered).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien Could you elaborate a bit on this ? Why is it an anti-pattern ? Also, could you direct me to some code which uses a Close function like you mention ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yusefnapora I've made the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context cancellation is designed for requests: https://golang.org/pkg/context/.
- Previous answer: Simulate DONT_HAVE for older peers ipfs/go-bitswap#248 (comment)
- Lots of discussion here: Discussion: Adopting a pattern/library for managing long lived go-routines ipfs/kubo#5810
Basically, if we use contexts, everything will shutdown at the same time in a random order without taking dependencies into account. If we use explicit close methods, we can close a service and then have it close the sub-components it owns, etc.
We've had problems in the past where we'd, e.g., close the datastore before we're done using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien Thanks for this. I've made the change. Also dug through the goprocess
library for this & now have a pretty good idea of how it works internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally speaking I don't think using context this way is bad. goprocess
is the antipattern, not context + cancellation. Additionally goprocess
turns non-trivial goroutines into relatively speaking expensive operations.
table.go
Outdated
} | ||
|
||
// NewRoutingTable creates a new routing table with a given bucketsize, local ID, and latency tolerance. | ||
func NewRoutingTable(bucketsize int, localID ID, latency time.Duration, m peerstore.Metrics) *RoutingTable { | ||
// Passing a nil PeerValidationFnc disables periodic table cleanup. | ||
func NewRoutingTable(ctx context.Context, bucketsize int, localID ID, latency time.Duration, m peerstore.Metrics, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using contexts to stop services is really an anti-pattern (yeah, I know we use it everywhere...). We should provide a close function.
Otherwise, the routing table could stop before the DHT (context cancellation isn't ordered).
@yusefnapora @Stebalien Have made the changes save one, where I'd like some more information on why using contexts for service cancellation is an anti-pattern. Please take a look when you can. |
@Stebalien I believe other than the one comment where we are waiting for @raulk , all your comments have been addressed. Let me know if you have any additional concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mostly looks right but I'm not awake enough at the moment to give a proper review. However, I do have a question.
|
ping @Stebalien |
Eh, that's probably not an issue for now. We're not going for speed here, we're trying to do a background cleanup. I guess we could be too slow, but that's probably not the end of the world. Regardless, we should set a pretty short timeout.
Yes, but let's do that in a followup patch. The current code should be "good enough" and there are some tricky edge-cases here (eg., don't reconnect to the same overloaded peer every 10 seconds). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
} | ||
log.Infof("failed to validated candidate=%s", c) | ||
// remove candidate | ||
rt.HandlePeerDead(c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually necessary? We've already removed it from the queue and it shouldn't be in the table. I guess it can't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's necessary as we remove it from the replacement cache as part of HandlePeerDead
.
(merge when ready) |
@raulk @Stebalien @aschmahmann @dirkmc
For libp2p/go-libp2p-kad-dht#283
1. Address management is not a part of this PR & will be next.
2. DHT changes will be made once this goes in.
3. IMO, moving the RT inside
go-libp2p-kad-dht
is a bigger refactoring that will involve multiple discussions and should be done as a separate stream of work.