feat: generic many to many connection ability for NodeConnectionManager #483

Closed
tegefaulkes opened this issue Oct 19, 2022 · 10 comments · Fixed by #491
Assignees
Labels
development Standard development r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@tegefaulkes
Contributor

tegefaulkes commented Oct 19, 2022

Specification

The NodeConnectionManager needs a new method to establish multiple connections at once, given a set of NodeIds and addresses. This will be similar to withConnF and withConnG, but it takes an array of desired NodeIds and a second array of addresses. It will then attempt to establish connections to the set of addresses and find any node in the provided set of nodes.

This would enable us to do generic network entry, where we can specify a single DNS hostname that resolves to all of our seed nodes. We can then connect to one or more of them at the same time using this method.

We will also need some options for this. Depending on the situation, we may want all possible connections, any one of them, the first 3 connections, or just the first connection made.

This would depend on the ability to resolve a hostname to multiple 'A' or 'AAAA' records.
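As a rough sketch of that resolution step, the resolver functions are injected here so the sketch stays testable without network access; in practice they would be Node's `dns.promises.resolve4` and `dns.promises.resolve6` for the 'A' and 'AAAA' records:

```typescript
// Hypothetical sketch: resolve one hostname to every available record by
// fanning out over a list of resolver functions and merging the results.
type Resolver = (hostname: string) => Promise<Array<string>>;

async function resolveAllRecords(
  hostname: string,
  resolvers: Array<Resolver>,
): Promise<Array<string>> {
  // A resolver that fails (e.g. no records of that type) contributes nothing
  const results = await Promise.all(
    resolvers.map((resolve) =>
      resolve(hostname).catch(() => [] as Array<string>),
    ),
  );
  return results.flat();
}
```

With real resolvers this would be called as `resolveAllRecords('testnet.polykey.io', [dns.promises.resolve4, dns.promises.resolve6])`.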

```ts
withMultiConnF(
  [node1, node2, node3],
  ['testnet.polykey.io'],
  (conns: Map<NodeId, NodeConnection>) => {},
);
```

Additional context

Tasks

  1. Add a feature to the NodeConnectionManager to generically establish one or more connections when looking for one or more nodes.
@tegefaulkes
Contributor Author

The first step would be updating the ConnectionForward to take an array of NodeIds and use them for verification.

After that, we need logic for attempting connections to a list of IP addresses resolved from the hostnames. I'll build in concurrency-limiting logic using a Semaphore. I think we'll default to full concurrency, but it would be useful to be able to limit it.
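A minimal sketch of that concurrency limiting, assuming a simple counting semaphore; `connect` is a hypothetical stand-in for the real connection attempt, and passing `Infinity` gives the full-concurrency default:

```typescript
// Hypothetical sketch: a counting semaphore that caps how many connection
// attempts run at once.
class Semaphore {
  private queue: Array<() => void> = [];
  private count: number;
  constructor(limit: number) {
    this.count = limit;
  }
  async acquire(): Promise<void> {
    if (this.count > 0) {
      this.count--;
      return;
    }
    // No permits left: wait until a release hands one over
    await new Promise<void>((resolve) => this.queue.push(resolve));
  }
  release(): void {
    const next = this.queue.shift();
    if (next != null) next();
    else this.count++;
  }
}

async function connectAll(
  hosts: Array<string>,
  connect: (host: string) => Promise<string>,
  limit: number = Infinity,
): Promise<Array<string>> {
  const semaphore = new Semaphore(limit === Infinity ? hosts.length : limit);
  return Promise.all(
    hosts.map(async (host) => {
      await semaphore.acquire();
      try {
        return await connect(host);
      } finally {
        semaphore.release();
      }
    }),
  );
}
```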

@CMCDragonkai
Member

The verifyServerCertificateChain will need to be updated to take an Array<NodeId> instead of a single NodeId.

It would need to try each one; whichever one succeeds is the correct one.

However, some of the checks are independent of the NodeId, so only the checks that involve the NodeId need to loop.
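As a sketch of that shape, the NodeId-independent checks run once while only the NodeId comparison loops over the candidates; `checkCommon` and `certNodeId` are hypothetical stand-ins for the real certificate-chain checks and the NodeId extracted from the certificate:

```typescript
// Hypothetical sketch: verify against an Array of candidate NodeIds,
// returning the NodeId that matched, or undefined if verification fails.
function verifyServerCertificateChain(
  nodeIds: Array<string>,
  certNodeId: string,
  checkCommon: () => boolean,
): string | undefined {
  // Checks that don't involve the NodeId (expiry, signature, ...) run once
  if (!checkCommon()) return undefined;
  // Only the NodeId-dependent check is tried per candidate;
  // whichever candidate succeeds is the correct one
  for (const nodeId of nodeIds) {
    if (nodeId === certNodeId) return nodeId;
  }
  return undefined;
}
```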

@tegefaulkes
Contributor Author

How do we handle multi-node connecting without screwing up how the NodeConnectionManager handles the locking for node connections? I can make it a utility that doesn't rely on the node connections; node connections can be made after the fact and composed.

So with that, we should use the Proxy-level locking to handle this on a per-address basis. We can create a method for the proxy that attempts connections for all of the given addresses and returns a map of NodeId -> address for each established connection. This should be relatively simple at this level. From there, the normal method of connecting with a NodeConnection should work.

I don't think this needs to be a method of the Proxy; I can make it its own utility function. We'll see how it pans out.

@CMCDragonkai
Member

> How do we handle multi-node connecting without screwing up how the NodeConnectionManager handles the locking for node connections? I can make it a utility that doesn't rely on the node connections; node connections can be made after the fact and composed.

As talked about yesterday, this multinode resolver should be something like:

```ts
([NodeId, NodeId], [Host | Hostname, ...]) => Promise<{ NodeId: NodeConnection }>
```

I just realised the creation of a NodeConnection would be stateful, so yes, it would need to be a method as part of NodeConnectionManager.

If we want to use it as part of withConnF, then you would have something like withMultiConnF.

For locking... I'm not sure yet; maybe some of the locks need to be broken down.

@CMCDragonkai
Member

I'm not sure about making it a method of Proxy. Can this not be done at a higher level? The Proxy should be kept rather simple atm.

@CMCDragonkai
Member

CMCDragonkai commented Nov 1, 2022

The NodeConnectionManager will require some changes:

  1. The usage of `Lock` for creation, and not locking during usage.
  2. The introduction of reference counting instead of read locking for the `NodeConnection`.
  3. The usage of `MultiLockRequest` for the `LockBox` when attempting to connect to multiple NodeIds.

The getConnection may need to be changed to getConnections; it would then do a multi-lock for the purposes of creating those connections. This is better than iterating over locks, because it avoids deadlock.
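A minimal sketch of why one multi-lock request avoids the deadlock that iterating locks can cause: if every caller acquires all of its keys through a single request in a canonical (sorted) order, two callers can never each hold the other's next lock. `Lock` and `LockBox` here are hypothetical stand-ins, not the real async-locks classes:

```typescript
// Hypothetical sketch: a promise-chained mutex
class Lock {
  private tail: Promise<void> = Promise.resolve();
  async withLock<T>(f: () => Promise<T>): Promise<T> {
    const prev = this.tail;
    let release!: () => void;
    this.tail = new Promise<void>((r) => (release = r));
    await prev; // wait for the previous holder
    try {
      return await f();
    } finally {
      release();
    }
  }
}

// Hypothetical sketch: a keyed collection of locks with multi-lock acquisition
class LockBox {
  private locks: Map<string, Lock> = new Map();
  private getLock(key: string): Lock {
    let lock = this.locks.get(key);
    if (lock == null) {
      lock = new Lock();
      this.locks.set(key, lock);
    }
    return lock;
  }
  // Acquire every key in sorted order as one request, then run f
  async withMultiLock<T>(
    keys: Array<string>,
    f: () => Promise<T>,
  ): Promise<T> {
    const sorted = [...keys].sort();
    const run = async (i: number): Promise<T> => {
      if (i >= sorted.length) return f();
      return this.getLock(sorted[i]).withLock(() => run(i + 1));
    };
    return run(0);
  }
}
```

Because both callers sort their keys, requests for `["A", "B"]` and `["B", "A"]` serialize instead of deadlocking.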

The multiple NodeIds undergo a transformation.

You want to connect to 2 nodes:

```
[NodeIdA, NodeIdB]
```

Finding the nodes on the node graph:

```
{ NodeIdA: NodeAddress, NodeIdB: NodeAddress }
```

Resolving via DNS (assuming NodeAddress is a hostname `testnet.polykey.io`):

```
{ NodeIdA: [Host, Host, Host], NodeIdB: [Host, Host, Host] }
```

Connecting via strategy 1 vs 2:

```
{ NodeIdA: NodeConnection, NodeIdB: NodeConnection }
```

The last transformation could happen in 2 ways.

The first way is to attempt to connect to each NodeId iteratively or in parallel. But this translates to trying each IP/Host one at a time, until the NodeId is successful.

```
{ NodeIdA: [IP1, IP2, IP5], NodeIdB: [IP1, IP2, IP4] }
```

This can result in wasted connection establishments: connecting to IP1 here could be wasted if NodeIdB is actually the representative NodeId for IP1.

The second strategy is to group all the IP addresses (uniquely), try all of them while classifying each by its NodeId, and continue until you have acquired a connection for every NodeId or exhausted all the candidates. In this case, suppose you connect to IP1, which turns out to be NodeIdB; then you try IP2 and get NodeIdA, and now you're done with no wasted connections.

However, if the NodeIds can be reached via multiple potential IPs, then you could still waste connections if you've already made a connection to a given NodeId. Currently this would be unlikely, until we have MatrixAI/js-mdns#1, where private IPs take precedence over public IPs.
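Strategy 2 could be sketched roughly like this; `connect` is a hypothetical stand-in for the proxy connection attempt, returning the NodeId the remote certificate reveals (or undefined on failure):

```typescript
// Hypothetical sketch of strategy 2: union the candidate addresses, try
// them one by one, classify each success by the NodeId it reveals, and stop
// once every wanted NodeId has a connection or candidates are exhausted.
async function connectMulti(
  nodeIds: Array<string>,
  candidates: Map<string, Array<string>>, // NodeId -> possible hosts
  connect: (host: string) => Promise<string | undefined>,
): Promise<Map<string, string>> {
  // Group all hosts uniquely
  const hosts = new Set<string>();
  for (const hs of candidates.values()) for (const h of hs) hosts.add(h);
  const wanted = new Set(nodeIds);
  const established = new Map<string, string>(); // NodeId -> host
  for (const host of hosts) {
    if (wanted.size === 0) break; // every NodeId acquired
    const nodeId = await connect(host);
    // Only keep the connection if it maps to a NodeId we still want
    if (nodeId != null && wanted.delete(nodeId)) {
      established.set(nodeId, host);
    }
  }
  return established;
}
```

With the example above, connecting to IP1 yields NodeIdB and IP2 yields NodeIdA, so only two attempts are made.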

@CMCDragonkai
Member

I think in this instance, strategy 2 will be more efficient because it's unlikely for our NodeIds to each have multiple accessible IPs. It's more likely that we just don't know which IP correlates to which NodeId.

With respect to locking, the proxy locks per remote address, while the node connection manager locks per node ID. These locking concepts still make sense. It's just that at the node connection manager, the locks must be generalised to multi-locking, while also changing to Lock instead of RWLockWriter, because read locks shouldn't be used for "usage". It should use a reference count instead.

@CMCDragonkai
Member

Changing to a reference count will address the expiry TTLs as well. One example is:

  1. When the ref count reaches 0, set the expiry timer.
  2. When the connection is used, clear the expiry timer.
  3. If the expiry timer activates, it destroys the connection.

tegefaulkes added a commit that referenced this issue Nov 1, 2022
@tegefaulkes
Contributor Author

One concern with the multi-node connection: we still have to find a node before we connect to it, and that is still fundamentally a single NodeId -> address process. If we have to do a find-node operation before a multi-node connection, then we defeat the point of the 2nd strategy by doing a simple single-node find for each node sequentially.

I may have to update the findNode process to support finding multiple nodes simultaneously. It would have to use a priority queue for each node and share information about already-visited nodes between the queues.
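A rough sketch of what sharing visited state between per-target searches could look like; `getCloserNodes` is a hypothetical stand-in for a Kademlia query, and plain arrays stand in for the per-target priority queues:

```typescript
// Hypothetical sketch: each target keeps its own queue of nodes to contact,
// but all queues consult one shared visited set, so no node is contacted
// twice across the whole multi-target find.
async function findNodes(
  targets: Array<string>,
  start: Array<string>,
  getCloserNodes: (target: string, node: string) => Promise<Array<string>>,
): Promise<Map<string, Array<string>>> {
  const visited = new Set<string>(); // shared between all queues
  const contacted = new Map<string, Array<string>>();
  for (const target of targets) contacted.set(target, []);
  const queues = new Map<string, Array<string>>(
    targets.map((t): [string, Array<string>] => [t, [...start]]),
  );
  let progress = true;
  while (progress) {
    progress = false;
    // Round-robin one step of each target's search
    for (const [target, queue] of queues) {
      const node = queue.shift();
      if (node == null) continue;
      progress = true;
      if (visited.has(node)) continue; // already contacted by some queue
      visited.add(node);
      contacted.get(target)!.push(node);
      queue.push(...(await getCloserNodes(target, node)));
    }
  }
  return contacted;
}
```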

But if we're only doing multi-connections to the seed nodes, then we don't actually do any finding using Kademlia; we should already have the required connection information.

findNode still needs to be updated to make use of the resolveHostname though.

@CMCDragonkai
Member

CMCDragonkai commented Nov 2, 2022

The findNode method currently pings every NodeAddress it finds (by establishing a proxy connection).

This makes strategy 2 infeasible, because it would only work if establishing connections occurred after finding addresses.

Ideally findNode shouldn't be doing this, it should just be providing a stream of candidate node addresses across the network. And something else tries to connect to those addresses.

So any multi-node connection system will end up doubling up on connection establishment work.

We can optimise this later by first collapsing node connections with proxy connections when we upgrade the network system.

Then we can subsequently change the way we find nodes and connect to them to be more generic and parallelised. This can make use of async generators to connect a source of candidate node addresses to candidate hosts and then to realised node connections.

That means at the moment, we will just continue using findNode as it is implemented and do the equivalent of strategy 1.

tegefaulkes added a commit that referenced this issue Nov 3, 2022
@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices labels Jul 10, 2023