feat: generic many to many connection ability for NodeConnectionManager #483

Closed
tegefaulkes opened this issue Oct 19, 2022 · 10 comments · Fixed by #491
Assignees
Labels
development Standard development r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@tegefaulkes
Contributor

tegefaulkes commented Oct 19, 2022

Specification

The NodeConnectionManager needs a new method to establish multiple connections at once, given a set of NodeIds and addresses. This will be similar to withConnF and withConnG, but it takes an array of desired NodeIds and a second array of addresses. It will then attempt to establish connections to the set of addresses and find any node in the provided set of nodes.

This would enable us to do generic network entry, where we can specify a single DNS hostname that resolves to all of our seed nodes. We can then connect to one or more of them at the same time using this method.

We will also need some options for this. Depending on the situation, we may want all possible connections, any one of them, the first 3 connections, or just the first connection made.

This would depend on the ability to resolve a hostname to multiple 'A' or 'AAAA' records.
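As a rough sketch of that resolution step, the resolver functions are injected here so the sketch stays testable without network access; in practice they would be Node's `dns.promises.resolve4` and `dns.promises.resolve6` for the 'A' and 'AAAA' records:

```typescript
// Hypothetical sketch: resolve one hostname to every available record by
// fanning out over a list of resolver functions and merging the results.
type Resolver = (hostname: string) => Promise<Array<string>>;

async function resolveAllRecords(
  hostname: string,
  resolvers: Array<Resolver>,
): Promise<Array<string>> {
  // A resolver that fails (e.g. no records of that type) contributes nothing
  const results = await Promise.all(
    resolvers.map((resolve) =>
      resolve(hostname).catch(() => [] as Array<string>),
    ),
  );
  return results.flat();
}
```

With real resolvers this would be called as `resolveAllRecords('testnet.polykey.io', [dns.promises.resolve4, dns.promises.resolve6])`.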

```ts
withMultiConnF(
  [node1, node2, node3],
  ['testnet.polykey.io'],
  (conns: Map<NodeId, NodeConnection>) => {},
);
```

Additional context

Tasks

  1. Add a feature to the NodeConnectionManager to generically establish one or more connections when looking for one or more nodes.
@tegefaulkes
Contributor Author

The first step would be updating the ConnectionForward to take an array of NodeIds and use them for verification.

After that, we need logic for attempting connections to a list of IP addresses resolved from the hostnames. I'll build in concurrency-limiting logic using a Semaphore. I think we'll default to full concurrency, but it would be useful to be able to limit it.
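A minimal sketch of that concurrency limiting, assuming a simple counting semaphore; `connect` is a hypothetical stand-in for the real connection attempt, and passing `Infinity` gives the full-concurrency default:

```typescript
// Hypothetical sketch: a counting semaphore that caps how many connection
// attempts run at once.
class Semaphore {
  private queue: Array<() => void> = [];
  private count: number;
  constructor(limit: number) {
    this.count = limit;
  }
  async acquire(): Promise<void> {
    if (this.count > 0) {
      this.count--;
      return;
    }
    // No permits left: wait until a release hands one over
    await new Promise<void>((resolve) => this.queue.push(resolve));
  }
  release(): void {
    const next = this.queue.shift();
    if (next != null) next();
    else this.count++;
  }
}

async function connectAll(
  hosts: Array<string>,
  connect: (host: string) => Promise<string>,
  limit: number = Infinity,
): Promise<Array<string>> {
  const semaphore = new Semaphore(limit === Infinity ? hosts.length : limit);
  return Promise.all(
    hosts.map(async (host) => {
      await semaphore.acquire();
      try {
        return await connect(host);
      } finally {
        semaphore.release();
      }
    }),
  );
}
```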

@CMCDragonkai
Member

The verifyServerCertificateChain will need to be updated to take an Array<NodeId> instead of a single NodeId.

It would need to try each one; whichever one succeeds is the correct one.

However, some of the checks are independent of the NodeId, so only the checks that involve the NodeId need to loop.
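As a sketch of that shape, the NodeId-independent checks run once while only the NodeId comparison loops over the candidates; `checkCommon` and `certNodeId` are hypothetical stand-ins for the real certificate-chain checks and the NodeId extracted from the certificate:

```typescript
// Hypothetical sketch: verify against an Array of candidate NodeIds,
// returning the NodeId that matched, or undefined if verification fails.
function verifyServerCertificateChain(
  nodeIds: Array<string>,
  certNodeId: string,
  checkCommon: () => boolean,
): string | undefined {
  // Checks that don't involve the NodeId (expiry, signature, ...) run once
  if (!checkCommon()) return undefined;
  // Only the NodeId-dependent check is tried per candidate;
  // whichever candidate succeeds is the correct one
  for (const nodeId of nodeIds) {
    if (nodeId === certNodeId) return nodeId;
  }
  return undefined;
}
```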

@tegefaulkes
Contributor Author

How do we handle multi-node connecting without screwing up how the NodeConnectionManager handles the locking for node connections? I can make it a utility that doesn't rely on the node connections; node connections can be made after the fact and composed.

So with that, we should use the Proxy-level locking to handle this on a per-address basis. We can create a method for the proxy that attempts connections for all of the given addresses and returns a map of NodeId -> address for each established connection. This should be relatively simple at this level. From there, the normal method of connecting with a NodeConnection should work.

I don't think this needs to be a method of the Proxy; I can make it its own utility function. We'll see how it pans out.

@CMCDragonkai
Member

> How do we handle multi-node connecting without screwing up how the NodeConnectionManager handles the locking for node connections? I can make it a utility that doesn't rely on the node connections; node connections can be made after the fact and composed.

As talked about yesterday, this multinode resolver should be something like:

```ts
([NodeId, NodeId], [Host | Hostname, ...]) => Promise<{ NodeId: NodeConnection }>
```

I just realised the creation of a NodeConnection would be stateful, so yes, it would need to be a method as part of NodeConnectionManager.

If we want to use it as part of withConnF, then you would have something like withMultiConnF.

For locking... I'm not sure yet; maybe some of the locks need to be broken down.

@CMCDragonkai
Member

I'm not sure about making it a method of Proxy. Can this not be done at a higher level? The Proxy should be kept rather simple atm.

@CMCDragonkai
Member

CMCDragonkai commented Nov 1, 2022

The NodeConnectionManager will require some changes:

  1. The usage of `Lock` for creation, and not locking during usage.
  2. The introduction of reference counting instead of read locking for the `NodeConnection`.
  3. The usage of `MultiLockRequest` for the `LockBox` when attempting to connect to multiple NodeIds.

The getConnection may need to be changed to getConnections; it would then do a multi-lock for the purposes of creating those connections. This is better than iterating over locks, because it avoids deadlock.
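A minimal sketch of why one multi-lock request avoids the deadlock that iterating locks can cause: if every caller acquires all of its keys through a single request in a canonical (sorted) order, two callers can never each hold the other's next lock. `Lock` and `LockBox` here are hypothetical stand-ins, not the real async-locks classes:

```typescript
// Hypothetical sketch: a promise-chained mutex
class Lock {
  private tail: Promise<void> = Promise.resolve();
  async withLock<T>(f: () => Promise<T>): Promise<T> {
    const prev = this.tail;
    let release!: () => void;
    this.tail = new Promise<void>((r) => (release = r));
    await prev; // wait for the previous holder
    try {
      return await f();
    } finally {
      release();
    }
  }
}

// Hypothetical sketch: a keyed collection of locks with multi-lock acquisition
class LockBox {
  private locks: Map<string, Lock> = new Map();
  private getLock(key: string): Lock {
    let lock = this.locks.get(key);
    if (lock == null) {
      lock = new Lock();
      this.locks.set(key, lock);
    }
    return lock;
  }
  // Acquire every key in sorted order as one request, then run f
  async withMultiLock<T>(
    keys: Array<string>,
    f: () => Promise<T>,
  ): Promise<T> {
    const sorted = [...keys].sort();
    const run = async (i: number): Promise<T> => {
      if (i >= sorted.length) return f();
      return this.getLock(sorted[i]).withLock(() => run(i + 1));
    };
    return run(0);
  }
}
```

Because both callers sort their keys, requests for `["A", "B"]` and `["B", "A"]` serialize instead of deadlocking.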

The multiple NodeIds undergo a transformation.

You want to connect to 2 nodes:

```
[NodeIdA, NodeIdB]
```

Finding the nodes on the node graph:

```
{ NodeIdA: NodeAddress, NodeIdB: NodeAddress }
```

Resolving via DNS (assuming NodeAddress is a hostname `testnet.polykey.io`):

```
{ NodeIdA: [Host, Host, Host], NodeIdB: [Host, Host, Host] }
```

Connecting via strategy 1 vs 2:

```
{ NodeIdA: NodeConnection, NodeIdB: NodeConnection }
```

The last transformation could happen in 2 ways.

The first way is to attempt to connect to each NodeId iteratively or in parallel. But this translates to trying each IP/Host one at a time, until the NodeId is successful.

```
{ NodeIdA: [IP1, IP2, IP5], NodeIdB: [IP1, IP2, IP4] }
```

This can result in wasted connection establishments: connecting to IP1 here could be wasted if NodeIdB is actually the representative NodeId for IP1.

The second strategy is to group all the IP addresses (uniquely), try all of them while classifying each by its NodeId, and continue until you have acquired a connection for every NodeId or exhausted all the candidates. In this case, suppose you connect to IP1, which turns out to be NodeIdB; then you try IP2 and get NodeIdA, and now you're done with no wasted connections.

However, if the NodeIds can be reached via multiple potential IPs, then you could still waste connections if you've already made a connection to a given NodeId. Currently this would be unlikely, until we have MatrixAI/js-mdns#1, where private IPs take precedence over public IPs.
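Strategy 2 could be sketched roughly like this; `connect` is a hypothetical stand-in for the proxy connection attempt, returning the NodeId the remote certificate reveals (or undefined on failure):

```typescript
// Hypothetical sketch of strategy 2: union the candidate addresses, try
// them one by one, classify each success by the NodeId it reveals, and stop
// once every wanted NodeId has a connection or candidates are exhausted.
async function connectMulti(
  nodeIds: Array<string>,
  candidates: Map<string, Array<string>>, // NodeId -> possible hosts
  connect: (host: string) => Promise<string | undefined>,
): Promise<Map<string, string>> {
  // Group all hosts uniquely
  const hosts = new Set<string>();
  for (const hs of candidates.values()) for (const h of hs) hosts.add(h);
  const wanted = new Set(nodeIds);
  const established = new Map<string, string>(); // NodeId -> host
  for (const host of hosts) {
    if (wanted.size === 0) break; // every NodeId acquired
    const nodeId = await connect(host);
    // Only keep the connection if it maps to a NodeId we still want
    if (nodeId != null && wanted.delete(nodeId)) {
      established.set(nodeId, host);
    }
  }
  return established;
}
```

With the example above, connecting to IP1 yields NodeIdB and IP2 yields NodeIdA, so only two attempts are made.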

@CMCDragonkai
Member

I think in this instance, strategy 2 will be more efficient because it's unlikely for our NodeIds to each have multiple accessible IPs. It's more likely that we just don't know which IP correlates to which NodeId.

With respect to locking, the proxy locks per remote address, while the node connection manager locks per node ID. These locking concepts still make sense. It's just that at the node connection manager, the locks must be generalised to multi-locking, while also changing to Lock instead of RWLockWriter, because read locks shouldn't be used for "usage". It should use a reference count instead.

@CMCDragonkai
Member

Changing to a reference count will address the expiry TTLs as well. One example is:

  1. When the ref count reaches 0, set the expiry timer.
  2. When the connection is used, clear the expiry timer.
  3. If the expiry timer activates, it destroys the connection.

tegefaulkes added a commit that referenced this issue Nov 1, 2022
@tegefaulkes
Contributor Author

One concern with the multi-node connection: we still have to find a node before we connect to it, and that is still fundamentally a single NodeId -> address process. If we have to do a find-node operation before a multi-node connection, then we defeat the point of the 2nd strategy by doing a simple single-node find for each node sequentially.

I may have to update the findNode process to support finding multiple nodes simultaneously. It would have to use a priority queue for each node and share information about already-visited nodes between the queues.
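A rough sketch of what sharing visited state between per-target searches could look like; `getCloserNodes` is a hypothetical stand-in for a Kademlia query, and plain arrays stand in for the per-target priority queues:

```typescript
// Hypothetical sketch: each target keeps its own queue of nodes to contact,
// but all queues consult one shared visited set, so no node is contacted
// twice across the whole multi-target find.
async function findNodes(
  targets: Array<string>,
  start: Array<string>,
  getCloserNodes: (target: string, node: string) => Promise<Array<string>>,
): Promise<Map<string, Array<string>>> {
  const visited = new Set<string>(); // shared between all queues
  const contacted = new Map<string, Array<string>>();
  for (const target of targets) contacted.set(target, []);
  const queues = new Map<string, Array<string>>(
    targets.map((t): [string, Array<string>] => [t, [...start]]),
  );
  let progress = true;
  while (progress) {
    progress = false;
    // Round-robin one step of each target's search
    for (const [target, queue] of queues) {
      const node = queue.shift();
      if (node == null) continue;
      progress = true;
      if (visited.has(node)) continue; // already contacted by some queue
      visited.add(node);
      contacted.get(target)!.push(node);
      queue.push(...(await getCloserNodes(target, node)));
    }
  }
  return contacted;
}
```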

But if we're only doing multi-connections to the seed nodes, then we don't actually do any finding using Kademlia; we should already have the required connection information.

findNode still needs to be updated to make use of the resolveHostname though.

@CMCDragonkai
Member

CMCDragonkai commented Nov 2, 2022

The findNode method currently pings every NodeAddress it finds (by establishing a proxy connection).

This makes strategy 2 infeasible, because it would only work if establishing connections occurred after finding addresses.

Ideally findNode shouldn't be doing this, it should just be providing a stream of candidate node addresses across the network. And something else tries to connect to those addresses.

So any multi-node connection system will end up doubling up on connection establishment work.

We can optimise this later by first collapsing node connections with proxy connections when we upgrade the network system.

Then we can subsequently change the way we find nodes and connect to them to be more generic and parallelised. This can make use of async generators to connect a source of candidate node addresses to candidate hosts and then to realised node connections.

That means at the moment, we will just continue using findNode as it is implemented and do the equivalent of strategy 1.

tegefaulkes added a commit that referenced this issue Nov 3, 2022
@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices labels Jul 10, 2023