Non persistent TSS signer and x25519 keypair #1216

Open
wants to merge 36 commits into
base: master

ameba23
Contributor

@ameba23 ameba23 commented Dec 13, 2024

entropy-tss has two keypairs which define its identity on the network: the TSS account used to sign extrinsics, and the x25519 encryption keypair. Since both are generated in a confidential virtual machine and their public keys are included in the attestation, authenticating with these keypairs proves that the attestation still holds.

However, because they are currently stored on disk, it is possible that the validator operator restarts the virtual machine, makes some modifications, and continues to use these keypairs without needing to make another attestation.

This PR makes the signing and x25519 keypairs non-persistent: they are generated afresh every time entropy-tss launches (except in development / testing, where test mnemonics are used). If the entropy-tss process is killed for whatever reason, it will not be able to continue to participate in the protocols until the validator operator updates the TSS details using the change_threshold_accounts extrinsic. I am hoping to find a way to automate this - see #1214

This PR has implications for our devops flow: the --setup-only command line option is no longer available. entropy-tss should be run only once, and the public keys retrieved using the /info HTTP route.

Closes #1203 by adding a boolean ready flag to the application state, which is set to true once the 'prerequisite checks' are complete. These checks now also include a check that the TSS account ID has been registered with the staking pallet, and the balance check is now mandatory (before, only a warning was logged if the TSS account had no funds). In a non-ready state, all the HTTP routes relating to the protocols will return an error.
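
To illustrate the ready flag described above, here is a minimal, std-only sketch. It is not the code from this PR: `AppState` and `UserErr` are cut-down stand-ins, and `make_ready`, `is_ready` and `handle_protocol_request` are hypothetical names chosen for the example.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the entropy-tss application state (assumption:
/// the real struct holds much more than this flag).
#[derive(Clone)]
pub struct AppState {
    ready: Arc<RwLock<bool>>,
}

/// Cut-down error type containing only the variant relevant here.
#[derive(Debug, PartialEq)]
pub enum UserErr {
    NotReady,
}

impl AppState {
    /// A freshly launched node starts in the non-ready state.
    pub fn new() -> Self {
        Self { ready: Arc::new(RwLock::new(false)) }
    }

    /// Hypothetical setter, called once all prerequisite checks have passed.
    pub fn make_ready(&self) {
        // A poisoned lock means another thread panicked while holding it.
        *self.ready.write().unwrap() = true;
    }

    pub fn is_ready(&self) -> bool {
        *self.ready.read().unwrap()
    }
}

/// Hypothetical protocol route handler: refuses to do any work until the
/// node is ready.
pub fn handle_protocol_request(state: &AppState) -> Result<&'static str, UserErr> {
    if !state.is_ready() {
        return Err(UserErr::NotReady);
    }
    Ok("request accepted")
}
```

Because the flag sits behind an `Arc`, the background task running the prerequisite checks and the HTTP handlers can share one `AppState` clone each and observe the same value.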

This also has implications for slashing, as a TSS node should not be in a non-ready state for too long. How long is acceptable depends a bit on whether we are able to automate the process of the node getting funded and calling change_threshold_accounts.

* master:
  Add TDX test network chainspec (#1204)
  Test CLI command to retrieve quote and change endpoint / TSS account in one command (#1198)
  Bump the patch-dependencies group with 2 updates (#1212)
  Bump thiserror from 2.0.4 to 2.0.6 in the patch-dependencies group (#1206)
  Downgrade parity-scale-codec as version we currently use has been yanked (#1205)
  Bump clap from 4.5.22 to 4.5.23 in the patch-dependencies group (#1202)
@ameba23 ameba23 marked this pull request as draft December 13, 2024 10:58
@ameba23 ameba23 changed the title Non persistant TSS signer and x25519 keypair Non persistent TSS signer and x25519 keypair Dec 16, 2024
@@ -101,26 +93,6 @@ use crate::{
validation::EncryptedSignedMessage,
};

#[tokio::test]
#[serial]
async fn test_get_signer_does_not_throw_err() {
Contributor Author

@ameba23 ameba23 Dec 17, 2024

I think this test is no longer needed as the process of getting a signer from app state is now infallible

}

/// Convenience function to get chain api and rpc
pub async fn get_api_rpc(
Contributor Author

Not related to this PR, but I thought that since I was here, why not.

Base automatically changed from peg/generate-mnemonic to master December 18, 2024 07:20
Ok(())
};

if let Err(error) = backoff::future::retry(backoff.clone(), balance_query).await {
Contributor Author

This balance check and the one below could maybe be combined into a state machine which makes both checks, but I think for now it's OK to make one after the other.

let backoff = backoff::ExponentialBackoff::default();
match backoff::future::retry(backoff, connect_to_substrate_node).await {
// Never give up trying to connect
let backoff = backoff::ExponentialBackoff { max_elapsed_time: None, ..Default::default() };
Contributor Author

@ameba23 ameba23 Dec 18, 2024

I've not looked at the backoff crate in too much detail, but I think setting max_elapsed_time: None means it will keep retrying indefinitely.
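
For illustration, the behaviour being relied on can be sketched without the backoff crate. The function below is a hypothetical std-only retry loop, not the crate's implementation: `retry_with_backoff` and its parameters are made up for this example. It mirrors the idea that an elapsed-time limit of `None` means the loop never gives up.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Retries `op` until it succeeds, doubling the delay after each failure
/// (capped at `max_interval`). With `max_elapsed_time: None` there is no
/// deadline, so the loop only ends on success.
pub fn retry_with_backoff<T, E>(
    initial_interval: Duration,
    max_interval: Duration,
    max_elapsed_time: Option<Duration>,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let start = Instant::now();
    let mut interval = initial_interval;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(error) => {
                // Give up only if a deadline was configured and has passed.
                if let Some(limit) = max_elapsed_time {
                    if start.elapsed() >= limit {
                        return Err(error);
                    }
                }
                sleep(interval);
                interval = (interval * 2).min(max_interval);
            }
        }
    }
}
```

With a `Some(..)` limit the last error is returned once the deadline has passed, which roughly corresponds to the previous default behaviour of eventually giving up on the chain connection.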

kv: &KvManager,
) -> Result<(PairSigner<EntropyConfig, sr25519::Pair>, StaticSecret), UserErr> {
let hkdf = get_hkdf(kv).await?;
pub fn get_signer_and_x25519_secret(
Contributor Author

This is now only used when a ValidatorName (eg: --alice) is given, or in tests.

@@ -179,6 +179,8 @@ pub enum UserErr {
TooFewSigners,
#[error("Non signer sent from relayer")]
IncorrectSigner,
#[error("Node has started fresh and not yet successfully set up")]
NotReady,
Contributor Author

If we want to react programmatically to this error (eg: for slashing) we could change the impl IntoResponse below to give a special status code if this variant is present.
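
A rough sketch of that idea, kept independent of the web framework: the enum below is a cut-down copy of `UserErr`, `status_code` is a hypothetical helper, and both the choice of 503 for `NotReady` and 500 for everything else are assumptions rather than what the existing impl does.

```rust
/// Cut-down copy of the error enum, with only the variants visible in the
/// diff hunk above.
#[derive(Debug, PartialEq)]
pub enum UserErr {
    TooFewSigners,
    IncorrectSigner,
    NotReady,
}

impl UserErr {
    /// Hypothetical helper that an `IntoResponse` impl could call to choose
    /// the HTTP status code for the response.
    pub fn status_code(&self) -> u16 {
        match self {
            // 503 Service Unavailable: the node is up but cannot yet serve
            // protocol requests, which a client can tell apart from other
            // failures.
            UserErr::NotReady => 503,
            // Assumption: every other variant maps to a generic server error.
            _ => 500,
        }
    }
}
```

A relayer or monitoring service could then treat repeated 503 responses from a node as evidence that it has stayed non-ready for too long.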

@ameba23 ameba23 marked this pull request as ready for review December 18, 2024 09:55
/// - Communication has been established with the chain node
/// - The TSS account is funded
/// - The TSS account is registered with the staking extension pallet
ready: Arc<RwLock<bool>>,
Contributor Author

This bool could maybe be replaced with an enum with the various states of readiness to make it easier to determine why the node is not ready: no connection to chain, no funds, or not registered with staking pallet.
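
One possible shape for such an enum, as a sketch only: `Readiness`, its variant names, and the helper methods are all hypothetical, and the order in which the checks are evaluated simply follows the order of the doc comment above.

```rust
/// Hypothetical replacement for the `ready` bool, recording why the node is
/// not ready.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Readiness {
    NoChainConnection,
    NotFunded,
    NotRegistered,
    Ready,
}

impl Readiness {
    /// Derives the state from the three prerequisite checks, reporting the
    /// first one that has not passed.
    pub fn from_checks(connected: bool, funded: bool, registered: bool) -> Self {
        if !connected {
            Readiness::NoChainConnection
        } else if !funded {
            Readiness::NotFunded
        } else if !registered {
            Readiness::NotRegistered
        } else {
            Readiness::Ready
        }
    }

    pub fn is_ready(self) -> bool {
        self == Readiness::Ready
    }

    /// Human-readable reason for not being ready, or `None` when ready.
    pub fn reason(self) -> Option<&'static str> {
        match self {
            Readiness::NoChainConnection => Some("no connection to chain"),
            Readiness::NotFunded => Some("TSS account has no funds"),
            Readiness::NotRegistered => Some("TSS account not registered with staking pallet"),
            Readiness::Ready => None,
        }
    }
}
```

The `reason` string could be included in the `NotReady` error response so that an operator can see which step is still outstanding.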

Successfully merging this pull request may close these issues.

entropy-tss /info route to get account ID cannot be used until a connection to chain has been made
1 participant