Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add signature verification to gossip #1937

Merged
merged 17 commits into from
Dec 1, 2018

Conversation

sagar-solana
Copy link
Contributor

@sagar-solana sagar-solana commented Nov 28, 2018

Problem

Gossip lacks signature verification for messages, which means data can be spoofed.

Summary of Changes

Added Signatures to replicated gossip values and added verification of values are they get replicated across the network

Fixes #1855 and #1952

@sagar-solana sagar-solana added the work in progress This isn't quite right yet label Nov 28, 2018
@@ -60,13 +60,15 @@ pub enum ClusterInfoError {
pub struct ClusterInfo {
/// The network
pub gossip: CrdsGossip,
/// set the keypair that will be used to sign crds values generated. It is unset only in tests.
keypair: Option<Arc<Keypair>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the need for the Option here. Instead of None, can you use Pubkey::default()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Rob was saying the same. I think it makes a little more sense for clarity sake. I'll explain and maybe you can suggest a better solution to the overall problem.

Cluster Info uses the pubkey from the node_info which gets passed in during construction. In fullnodes (most common production usecase), the node_info's pubkey is the same as the keypair that gets passed in optionally(via the new constructor I added).
In nearly all other uses (spy nodes, tests etc) cluster info just carrying around dummy nodes who's pubkeys are no longer associated with any keypair that cluster info can use for signature verification. If I init cluster's keypair to default, it will always fail verification during gossip. The same failure occurs if keypair isn't even initialized since the messages won't be signed and their signatures will not verify. So given that the end result is the same, I chose the Option to make it a little clearer why verification might be failing. Having a valid keypair makes it seem like the node's pubkey is the same as the keypair's but that's not necessary and that's the actual problem here. Cluster info can be constructed with nodes and not have a valid keypair to sign messages with.

Optionally, I can ignore the node's pubkey and just use the keypair always but then that becomes a special case for gossip where gossip messages are always signed with a random keypair whose pubkey doesn't tie into any of the other info blobs (nodes) in the system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sagar-solana have ClusterInfo::new(..) take a keypair and get rid of default?

Copy link
Contributor Author

@sagar-solana sagar-solana Nov 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aeyakovenko, yup I can do that but it doesn't solve the problem that most users of cluster info just give it a random node whose keypair is discarded. What keypair do I pass in those cases? In our primary usecase (fullnode's instance of cluster which gets passed to ncp) this isn't an issue but in a whole bunch of places we don't have a keypair to pass in.

Unless its totally fine to expect users of cluster_info to pass in random keypairs knowing that gossip will fail in that cluster_info since that's what will happen with how I have it right now if cluster_info is constructed without a keypair. In that case I'd be more than happy to get rid of default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it looking like a ton of work to refactor such that all tests are using legit keypairs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much. Most tests don't even care about having gossip run so it would be sort of unnecessary as I see it right now. For now I've done what you asked and removed the option but left in the other constructor. All Fullnodes should work correctly with gossip verification. If anything else tries to run gossip the peers will drop their messages (like spynodes shouldn't be gossiping anyway, only listening) so that actually works out well.

src/cluster_info.rs Outdated Show resolved Hide resolved
@sagar-solana sagar-solana added the noCI Suppress CI on this Pull Request label Nov 29, 2018
@@ -159,6 +164,24 @@ impl ContactInfo {
pub fn is_valid_address(addr: &SocketAddr) -> bool {
(addr.port() != 0) && Self::is_valid_ip(addr.ip())
}
pub fn get_sign_data(&self) -> Vec<u8> {
Copy link
Member

@aeyakovenko aeyakovenko Nov 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sagar-solana can you break this out into a singe signed serialized structure? @garious i am not a huge fan of this pattern from Transaction, because of how error prone it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Transaction should have a TransactionData struct that includes everything but its signatures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright I can. I'll end up sticking signature into the Protocol enums instead of into the individual crdsvalues. That way I can just use default serialization for the values. Sounds good? Or should I do what @garious suggested with a data inner struct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sagar-solana the protocol enums wont work because when you receive a list of CrdsValues you need to verify the signature of each one. The protocol messages are the p2p messages, but the values themselves propagate p2N

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's right! Okay then I need to use the inner data structs to avoid manually serializing. Thanks for pointing that out. Got carried away trying to simplify...

@sagar-solana sagar-solana added CI Pull Request is ready to enter CI and removed noCI Suppress CI on this Pull Request work in progress This isn't quite right yet labels Nov 30, 2018
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label Nov 30, 2018
}

fn pubkey(&self) -> Pubkey;
fn get_sign_data(&self) -> Vec<u8>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming nit... getters don't have get_ on them in our conventions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't come up with that name. Txs already use it :P Suggestions? Just sign_data (could be confused with "please sign this data")?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be "signed_data()" ?

Copy link
Contributor Author

@sagar-solana sagar-solana Nov 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the is_signed state of this data isn't known. It's the data that gets signed and compared against for verification. It's not already signed. :(

Copy link
Contributor

@rob-solana rob-solana Nov 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signable_data() to match the trait?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, awesome. Thanks!

@@ -12,6 +12,9 @@ use crds_value::CrdsValue;
use solana_sdk::hash::Hash;
use solana_sdk::pubkey::Pubkey;

///The min size for bloom filters
pub const CRDS_GOSSIP_BLOOM_SIZE: usize = 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this in bits? does it make sense for it to be a multiple of 8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in bits. Num bits to use is figured out using this value (usually network_size is passed in). See https://hur.st/bloomfilter/.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

- Assign default to keypair
- Added a Signable trait that all signable values
  can implement.
- CrdsValue is a Signable that contains Signables so
  it doesn't need to provide implementations for
  all the methods and can just override some defaults
- Spy nodes should be able to make pull requests
@sagar-solana
Copy link
Contributor Author

@aeyakovenko, if this looks ok to you, I'll merge.

/// The Pubkey of the intended node/destination for this message
pub destination: Pubkey,
/// The source Addr this message should have been received from
pub source: SocketAddr,
Copy link
Member

@aeyakovenko aeyakovenko Dec 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source shouldn't be necessary, and might not be possible to enforce because of NATs. I would add a wallclock to the message so it expires if its replayed at a later time or by a random node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't think of that. The wall clock's a good idea though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aeyakovenko, I picked 500ms as the timeout. Does that sound reasonable since prunes are p2p?

Copy link
Member

@aeyakovenko aeyakovenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500ms might be tight, but we need to tests all the constants in the real world. Could you add a counter for expired prune messages? (A separate pr to track behavior is fine)

@sagar-solana sagar-solana merged commit 34c3a0c into solana-labs:master Dec 1, 2018
@sagar-solana sagar-solana deleted the gossip_signing branch December 1, 2018 20:00
vkomenda pushed a commit to vkomenda/solana that referenced this pull request Aug 29, 2021
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 4.3.3 to 4.3.4.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Commits](microsoft/TypeScript@v4.3.3...v4.3.4)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
yihau pushed a commit to yihau/solana that referenced this pull request Jul 18, 2024
…port of solana-labs#1888) (solana-labs#1900) (solana-labs#1937)

Revert "v2.0: Refactor and additional metrics for cost tracking (backport of solana-labs#1888) (solana-labs#1900)"

This reverts commit 0aef62e.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants