Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement MdnsDiscovery #177

Merged
merged 38 commits into from
Sep 19, 2023
Merged

feat: Implement MdnsDiscovery #177

merged 38 commits into from
Sep 19, 2023

Conversation

tomasciccola
Copy link
Contributor

@tomasciccola tomasciccola commented Aug 15, 2023

MDNS Discovery

  1. Broadcast an MDNS message, advertising the “Mapeo” service, and the port that it is listening on. Not going to advertise project ids (e.g. which projects a device is part of). Call service mapeo. Currently module we are using: https://gitlab.com/gravitysoftware/dnssd.js
  2. Browse for other devices advertising ‘mapeo’ service and try to open a connection with any discovered devices
  3. Listen for incoming connections
  4. hyperswarm/secret-stream handshake with any discovered device.
  5. Deterministically dedupe connections (incoming and outgoing).

This should close #160

const mdnsDiscovery = new MdnsDiscovery({ identityKeyPair })

await mdnsDiscovery.start() // start listening on port, start advertising port, and start browse and connect.

// before emitting 'connection', we dedup the connection and do the secret-stream handshake.
mdnsDiscovery.on('connection', (stream: NoiseStream) => {})

await mdnsDiscovery.stop() // Stop accepting new connections, stop advertising, stop trying to connect to any new devices, gracefully close existing connections.
`socket.close()` - https://nodejs.org/dist/latest-v18.x/docs/api/net.html#serverclosecallback 

gmaclennan and others added 2 commits August 14, 2023 16:51
Co-authored-by: tomasciccola <tomasciccola@users.noreply.github.com>
@tomasciccola tomasciccola self-assigned this Aug 15, 2023
@tomasciccola
Copy link
Contributor Author

On this last commit I'm having this issue where on an initiated connection socket.remoteAddress (or socket.address) can be undefined, I don't know why but I suspect that is related to the reusing of this.#handleConnection listener for the callback of the started tcp server and the connection to peers adverticing mapeo.
The thing is that if I uncomment line 78 ( if(!socket.remoteAddress) return) then peers won't be able to find each other. I started working on this at the end of the day so the culprit is probably my brain...

@tomasciccola
Copy link
Contributor Author

Heh, had a really silly bug passing the connection handler to the tcp client. Still getting some duplicated connections though...

Copy link
Member

@gmaclennan gmaclennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this - this is tricky stuff to get right. Left a bunch of comments that hopefully get this working - I can be available to work together on this today if needed.

src/discovery/mdns.js Outdated Show resolved Hide resolved
const socket = net.connect(
service.port,
service.addresses && service.addresses[0]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
)
socket.once('connect', () => {

src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Show resolved Hide resolved
})

secretStream.on('close', () => {
this.#connections.delete(remoteAddress)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to test what happens when:

  1. secretStream throws an error
  2. socket throws an error before secretStream connects
    Regardless I believe we need to attach an error listener to secretStream - Node has a funny thing where you need to attach error listeners or it will throw an uncaught error and crash.
Suggested change
this.#connections.delete(remoteAddress)
this.#socketConnections.delete(remoteAddress)
this.#noiseConnections.delete(secretStream.remotePublicKey)

src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Outdated Show resolved Hide resolved
tests/discovery.js Outdated Show resolved Hide resolved
Copy link
Member

@gmaclennan gmaclennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this - this is tricky stuff to get right. Left a bunch of comments that hopefully get this working - I can be available to work together on this today if needed.

@tomasciccola
Copy link
Contributor Author

I've been noticing instability in terms of peers finding each other. With no changes to the code it seems that sometimes two peers find each other, and sometimes they don't...
I don't know what to make up of it, but maybe I have a bug from the beginning that I'm not seeing...

@gmaclennan
Copy link
Member

Try logging the initial dedupe based on remoteaddress. I have a feeling that there is a race condition here where both clients can end up closing both sockets. It might be better to remove this check altogether and just use the dedupe on the noise stream.

@tomasciccola tomasciccola changed the title Implement MdnsDiscovery feat: Implement MdnsDiscovery Aug 16, 2023
Tomás Ciccola and others added 9 commits August 17, 2023 10:22
Co-authored-by: tomasciccola <tomasciccola@users.noreply.github.com>
I dunno if this is the best solution (or if `secretStream` and `server`
should share the same `close` handler), but it works...
It seems that hypercore replication from on instance to another is failing.
There's probably a bug I haven't catch yet...
@tomasciccola
Copy link
Contributor Author

Missing for this to be merged are more tests (testing with more peers, and probably going to look to test that exist on main now an add them)

Tomás Ciccola added 2 commits August 28, 2023 15:39
It was on the `handleConnection` which meant adding unnecessary handlers
for the same event (for each peer connected)
Copy link
Member

@gmaclennan gmaclennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests still need some work.

What we need to test is:

  1. Every device will find every other device and establish a connection
  2. There is no more than one connection between each device

We want to test this in a scenario that is as close as we can get to how the app will be used.

The current multi-connection test waits for 10 connection events and then stops, and doesn't do any checks about whether all clients have connected or whether there are any duplicate connections. 10 connection events = 5 connections. For 10 peers to connect to each other we need n(n-1)/2 connections, or n(n - 1) connection events.

In order to test for duplicate connections there is no great way to do this other than just "wait" for a certain period of time. I think 5 seconds should be enough.

To simulate a real-life situation, we should not start all the discoveries at the same time - although that is worth testing too, because simultaneous start can catch certain race conditions.

One way of doing all this:

  1. Start each discovery at a random time in the first 2 seconds of the test
  2. On each connection event push the [localPublicKey, remotePublicKey] tuple to an array.
  3. Wait until 5 seconds total have passed.
  4. Check the array of connection events has nPeers * (nPeers - 1) entries
  5. Check that each peer is connected to every other peer e.g.
for (publicKey of peers) {
   const peerConnections = allConnections.filter(([localKey, remoteKey]) => localKey === publicKey).map(([ , remoteKey]) => remoteKey)
   // Check peerConnections is the same as peers.filter(peerKey => peerKey !== publicKey)
}

If you need help with this I can work on this with you.

I don't think the hypercore replication test is necessary. We will test for that in the replication code - it's not testing anything in the MdnsDiscovery class: it's testing whether you can replicate a hypercore over a noisestream.

src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Outdated Show resolved Hide resolved
log('started advertiser for ' + addr.port)

// find all peers adverticing Mapeo
this.#browser = new dnssd.Browser(dnssd.tcp(SERVICE_NAME))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we start browsing immediately (e.g. before waiting for the server to be listening). Also, don't wait for browser.start() to resolve before starting the server and advertiser e.g.

const browserStartPromise = this.#browser.start()
// ... start server and advertiser
await Promise.all([
  browserStartPromise,
  this.#advertiser.start()
])

Just to make sure that we start everything as soon as possible

'serviceUp',
/** @param {Service} service */
(service) => {
if (service.txt?.id === this.#id) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do do with services without a txt record? Ignore?

} ${socket.remotePort}`
)
if (!socket.remoteAddress) return
if (!isIpPrivate(socket.remoteAddress)) return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a log here - might be useful in the future for debugging

src/discovery/mdns.js Outdated Show resolved Hide resolved
src/discovery/mdns.js Outdated Show resolved Hide resolved
@gmaclennan gmaclennan merged commit ca3b6c7 into main Sep 19, 2023
6 checks passed
@gmaclennan gmaclennan deleted the feat/mdns-discovery branch September 19, 2023 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrapping up MDNS Discovery Implement MdnsDiscovery
3 participants