
Peer Sharing - how is the reply to a share request calculated #3958

Closed
njd42 opened this issue Aug 17, 2022 · 4 comments


njd42 commented Aug 17, 2022

  • Do we need to make multiple / repeat requests idempotent?
  • What 'entropy' should be present in the responses?
  • Should we share those known to be ledger peers at all?
    • note that we can't take an IP address and determine whether it belongs to a ledger peer [reverse lookups just don't work]
    • not sharing our ledger peers could help with DDoS resistance

bolt12 commented Aug 18, 2022

After a node receives a Peer Sharing request from another peer, there are some things it needs to consider before responding:

  • Have I replied to this peer recently? If so, we should probably ignore the request. This way we do not keep information about the previous reply (avoiding keeping more state than we have to), and it makes it clear that a node does not benefit much from asking the same peer twice.
  • Another check that might be worth doing is whether we have any information about this peer's willingness to participate in Peer Sharing. However, a node might have reloaded its configuration with a new global configuration flag enabling Peer Sharing participation, and there's no way the request-receiving node can know about this, so maybe it does not make much sense to make this check.
  • Should we rate-limit share requests? Not sure this is an issue if we ignore requests from the same peer and have well-configured targets.

After considering these items, the node needs to calculate its response. There are two ways one could go about this: the requesting end requests an upper bound on the number of peers to fetch, or the replying end decides how many peers to give. I think the better way is for the requesting end to ask for an upper limit, since the other way around the receiving end would have to prune the requests anyway to avoid huge dumps of information and possible resource-usage exploits. Given this, a possible algorithm for computing the response could be:

  • Compute a random number between 0 and the upper limit;
  • Take that many peers from the to-share set, according to some policy.

The to-share set should not include known-to-be-ledger peers, nor peers that have expressed their unwillingness to participate in Peer Sharing, either via configuration file or via handshake.

The policy should also manage the amount of "entropy" in the response (e.g. pick a random percentage of cold/warm peers; maybe even some fake peers).
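The steps above could be sketched as follows (a rough sketch only; the function name and arguments are hypothetical, and the "policy" is reduced to a uniform random sample over the filtered set):

```python
import random


def compute_share_response(known_peers, ledger_peers, unwilling_peers,
                           upper_limit, rng=random):
    """Compute a Peer Sharing reply: filter out ledger peers and peers
    that opted out, pick a random count between 0 and the upper limit,
    and sample that many peers from the to-share set."""
    shareable = [p for p in known_peers
                 if p not in ledger_peers and p not in unwilling_peers]
    count = rng.randint(0, min(upper_limit, len(shareable)))
    return rng.sample(shareable, count)
```

Because the count itself is random, a repeated request does not deterministically reveal the full known-peer set up to the limit.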

Questions: Should we share established peers, or are any known peers enough? How can we introduce more entropy into our response?


coot commented Aug 18, 2022

Have I replied to this peer recently? If so, we should probably ignore the request. This way we do not keep information about the previous reply (avoiding keeping more state than we have to), and it makes it clear that a node does not benefit much from asking the same peer twice.

What if a node restarts and we don't have a cache of known peers implemented? If that information does not persist over the lifetime of a connection, then we won't have any problems.

Q: Should this time delay be a protocol constant?
Which means if violated then we disconnect from that peer?

Another check that might be worth doing is whether we have any information about this peer's willingness to participate in Peer Sharing. However, a node might have reloaded its configuration with a new global configuration flag enabling Peer Sharing participation, and there's no way the request-receiving node can know about this, so maybe it does not make much sense to make this check.

If we also have TTLs, as I suggested here, then this could work.

Should we rate-limit share requests? Not sure this is an issue if we ignore requests from the same peer and have well-configured targets.

This won't be an issue if we set a minimal time between share requests.

I think the better way is for the requesting end to ask for an upper limit.

Yes, this makes sense. We also need to specify an upper bound on the amount of data sent in response. I think the simplest way to do that is to have another protocol-level limit corresponding to something like 20 addresses (the 20 in this case would be the upper bound for a request's upper limit).

Given this, a possible algorithm for computing the response can be:
Compute a random number between 0 and the upper-limit;
Take as many as the generated random number from the to-share set, according to some policy.

If somebody asks us for n peers, we should supply n on a best-effort basis but no more. By suitably crafting the upper bound for n we can control how many peers one can get at most from a single peer.
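The interaction between the request's upper limit, the protocol-level cap, and best-effort supply could be sketched as follows (the cap of 20 follows the example above; the constant and function names are hypothetical):

```python
MAX_SHARED_PEERS = 20  # hypothetical protocol-level cap on any reply


def reply_size(requested: int, shareable: int) -> int:
    """Best-effort reply size: at most the requested amount n, never
    more than the protocol cap, and never more peers than we have."""
    return max(0, min(requested, MAX_SHARED_PEERS, shareable))
```

For example, a request for 50 peers against 100 shareable peers would still yield at most 20.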

Questions: Should we share established peers, or are any known peers enough?

Established peers are a subset of known peers 😄.

How can we introduce more entropy into our response?

Isn't a random choice from our shareable non-ledger known peers enough?


bolt12 commented Aug 18, 2022

What if a node restarts and we don't have a cache of known peers implemented? If that information does not persist over the lifetime of a connection, then we won't have any problems.

Not sure I understand what you're trying to say here. Are you saying that what I said is okay even if we restart the node without any caching?

Q: Should this time delay be a protocol constant?
Which means if violated then we disconnect from that peer?

I don't think this is necessary. Since we'll discourage such cases by returning nothing, I don't think this can be considered bad behaviour. It can also be that for some reason the requesting node restarted, and we don't want to disconnect from a possibly good peer just because of this.

If we also have TTLs, as I suggested here, then this could work.

Good idea, so this TTL mechanism churns asked and shared peers.

This won't be an issue if we set a minimal time between share requests.

Agree! So this should be yet another requirement: having a delay between requests. This delay should probably be a random value each time, to increase entropy.
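A randomised delay between share requests to the same peer could be sketched as follows (the base and jitter values are hypothetical, not protocol constants):

```python
import random


def next_request_delay(base_s: float = 60.0, jitter_s: float = 30.0,
                       rng=random) -> float:
    """Minimum delay before the next share request to the same peer:
    a fixed base plus a random jitter, so timing is less predictable."""
    return base_s + rng.uniform(0.0, jitter_s)
```

The base enforces the agreed minimum spacing, while the jitter keeps request timing from being an exact, observable pattern.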

If somebody asks us for n peers, we should supply n on a best-effort basis but no more. By suitably crafting the upper bound for n we can control how many peers one can get at most from a single peer.

Yes, sounds fair. I wasn't sure if randomizing the upper limit would be the kind of entropy one wants.


@bolt12 bolt12 changed the title Gossip - how is the reply to a request calculated Gossip - how is the reply to a share request calculated Aug 26, 2022
@bolt12 bolt12 changed the title Gossip - how is the reply to a share request calculated Peer Sharing - how is the reply to a share request calculated Aug 26, 2022
@bolt12 bolt12 moved this to In Progress in Ouroboros Network Sep 14, 2022
@bolt12 bolt12 closed this as completed Sep 14, 2022
Repository owner moved this from In Progress to Done in Ouroboros Network Sep 14, 2022