-
Notifications
You must be signed in to change notification settings - Fork 453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: discover and connect to closest peers #798
feat: discover and connect to closest peers #798
Conversation
const { | ||
setDelayedInterval, | ||
clearDelayedInterval | ||
} = require('set-delayed-interval') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created this module for us to use here, but also in:
- https://github.com/libp2p/js-libp2p/blob/0.30.x/src/circuit/index.js#L43-L88
- https://github.com/libp2p/js-libp2p-kad-dht/blob/master/src/random-walk.js#L46-L89
It is always tricky to implement a "setInterval" when the recurrent task is a promise. Other than that, if we need a delay beforehand, things get even more complex as we need to guarantee that the timer can be stopped for any of the pauses.
When looking at the relay
logic I found a bug. If we clear the timeout when await this._libp2p.contentRouting.provide(cid)
is running, then the timeout will be assigned again afterward. We should be checking if we had timeout before creating the new one. This module aims to avoid introducing these issues when dealing with this use cases.
It would be great to have your 👀 on the module: vasco-santos/set-delayed-interval#1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd really like to get rid of this whole boot delay logic. In reality we should probably have some async boot function and when that finishes these other systems initiate, but I think we can revisit this when we update the config logic. I'll review the module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to orchestrate subsystems like that. I will add this comment to the configuration issue for us to remember
enabled: true, | ||
interval: 300e3, | ||
bootDelay: 10e3 | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing as it's not quite peer routing. This is really the refresh manager. In go-libp2p since they just have the DHT right now they run the table refreshes when they search for their closest peers. This happens when DHT bootstrap is run at startup and then on an interval. Ideally we would not use a boot delay timer as this is finicky, and instead run once we've finished a bootstrap phase. Since we don't really have this concept atm, I think a delay is fine for now.
Go runs this every 10 minutes by default right now, we should just match that for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to change this into 10 minutes yes. Regarding the namings, thinking on changing this to:
peerRouting: {
refreshManager: { ... }
}
The peerRouting is the scope where we make these queries, so it seems that this configs should live bound to it. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added as suggested for now
const { | ||
setDelayedInterval, | ||
clearDelayedInterval | ||
} = require('set-delayed-interval') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd really like to get rid of this whole boot delay logic. In reality we should probably have some async boot function and when that finishes these other systems initiate, but I think we can revisit this when we update the config logic. I'll review the module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just 1 comment to discuss later when we look at the disco api/config updates.
I'll let you merge this and will review the other delayed interval PR so you can combine them.
|
||
// If we don't have a result, we need to provide an error to keep trying | ||
if (!result || Object.keys(result).length === 0) { | ||
const result = await pAny( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to block this PR on this (as we're doing it in other content/peer routing calls), but we should think about the parallel actions this is doing. Most people shouldn't be using two, but in the change they are, the parallel calls could cause some odd behavior. This is probably worth discussing in more depth when we look at the discovery api / configuration updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I will cross post it here for us to discuss
This PR adds a peer routing service to discover closest peers recurrently over time. The discovered addresses will be added to the AddressBook and the peers will be dialled if the autoDial is enabled and the connectionManager threshold is not reached. This will enable peers to better advertise themselves on the network as their closest peers will have their address records.
Following up this PR, we should work on maintaining connections to our closest peers, as they will get push notifications whenever address changes occur. Further optimisations might be done, like trigger a find task when the announce addresses change. Keeping connections and more proactive connections should probably come together with connection manager overhaul as we need to implement dial strategies that can be used for different other subsystems. Once this PR is merged I will report the current state in #704
Related to #704
Needs: