Creating a cryptolink claim from node to node (cross-signing claims) #213

Closed
joshuakarp opened this issue Jul 22, 2021 · 18 comments
Labels
development Standard development r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy security Security risk

Comments

@joshuakarp
Contributor

joshuakarp commented Jul 22, 2021

This will be resolved in !209 https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/209

Specification

!195 has changed the way that node -> node cryptolink claims will need to be created.

Recall that a claim (e.g. a cryptolink) has the following structure:

{
  signatures: [
    {
      signature: <base64 encoding of signature, signed on header + payload>,
      protected: <base64 encoding of serialized header (currently containing signing algorithm + node ID of signee)>
    },
    ...
  ],
  payload: <base64 encoding of serialized payload (see Claim type for payload contents)>
}

For a keynode, these claims are stored on the node's sigchain.

Therefore, suppose a node X wants to establish a cryptolink to node Y. Both nodes require a claim on their respective sigchain that states they have a cryptolink to the other.

However, we require that any claim from a node -> node be signed by both nodes. Therefore, node X will need to sign both its own claim, and Y's claim on its sigchain (and vice-versa).

This allows any arbitrary node to easily verify the claim. Suppose A receives the cryptolink claim from X, stating that it has a cryptolink to Y. A would verify the claim with X's public key, but importantly, A doesn't need to see the corresponding claim on Y's sigchain - it only needs to verify this claim on X's sigchain with Y's public key.
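The check that A performs can be sketched as follows. This is only an illustration under stated assumptions: it uses Node's built-in `crypto` module rather than the `jose` library, the `DoublySignedClaim` type and the `addSignature` and `verifyClaim` helpers are hypothetical names, and node IDs are plain strings. RS256 signatures are computed over `protected.payload`, as in a general JWS.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'crypto';

// Hypothetical shape mirroring the claim structure above (names are assumptions).
type SignatureEntry = { protected: string; signature: string };
type DoublySignedClaim = { payload: string; signatures: SignatureEntry[] };

const b64url = (text: string): string =>
  Buffer.from(text, 'utf8').toString('base64url');

// Appends an RS256 signature over `protected.payload` for the given node ID.
function addSignature(
  claim: DoublySignedClaim,
  kid: string,
  privateKey: KeyObject,
): void {
  const protectedHeader = b64url(JSON.stringify({ alg: 'RS256', kid }));
  const signingInput = Buffer.from(`${protectedHeader}.${claim.payload}`);
  const signature = sign('sha256', signingInput, privateKey).toString('base64url');
  claim.signatures.push({ protected: protectedHeader, signature });
}

// True only if every node ID in `publicKeys` has a valid signature on the claim.
function verifyClaim(
  claim: DoublySignedClaim,
  publicKeys: Record<string, KeyObject>,
): boolean {
  return Object.keys(publicKeys).every((kid) =>
    claim.signatures.some((entry) => {
      const header = JSON.parse(
        Buffer.from(entry.protected, 'base64url').toString('utf8'),
      );
      if (header.kid !== kid || header.alg !== 'RS256') return false;
      const signingInput = Buffer.from(`${entry.protected}.${claim.payload}`);
      return verify(
        'sha256',
        signingInput,
        publicKeys[kid],
        Buffer.from(entry.signature, 'base64url'),
      );
    }),
  );
}

// X's claim of a cryptolink to Y, signed by Y first and then by X.
const x = generateKeyPairSync('rsa', { modulusLength: 2048 });
const y = generateKeyPairSync('rsa', { modulusLength: 2048 });
const claim: DoublySignedClaim = {
  payload: b64url(JSON.stringify({ data: { type: 'node', node1: 'X', node2: 'Y' } })),
  signatures: [],
};
addSignature(claim, 'Y', y.privateKey);
addSignature(claim, 'X', x.privateKey);

// Node A only needs this one claim plus both public keys.
const verified = verifyClaim(claim, { X: x.publicKey, Y: y.publicKey });
```

Requiring a valid signature for every expected node ID, rather than any one of them, is what lets A trust the claim without fetching Y's sigchain.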

We need a procedure in place to synchronise the creation of these claims between nodes.

Additional context

Tasks

This will likely make use of the notifications domain. The following process (or some other variation of it) will need to be implemented:

To create a cryptolink from X to Y:

  1. X sends a "cryptolink request" of some sort to Y
  2. Y (at some point in time) accepts this request.
  3. X creates the claim, and transfers the 'unsigned' claim to Y
  4. Y checks the claim to see it's correct, signs it, and sends it back to X
  5. X checks the claim again and signs it itself.
  6. X adds this claim to its sigchain
  7. Repeat steps 3-6 in the reverse order to create a claim to store on Y's sigchain

See the following code snippet for a working prototype of this process:

import type { NodeId } from './src/nodes/types';

import * as sigchainUtils from './src/sigchain/utils';
import * as claimsUtils from './src/claims/utils';
import * as keysUtils from './src/keys/utils';
import { KeyManager } from './src/keys';
import { GeneralSign } from 'jose/jws/general/sign';
import canonicalize from 'canonicalize';
import { createPrivateKey } from 'crypto';

async function main() {

  // We want to establish a cryptolink connection between:
  // Node A <--> Node C

  // On Node A:
  const aKeyPair = await keysUtils.generateKeyPair(4096);
  const aKeyPairPem = keysUtils.keyPairToPem(aKeyPair);
  const aPrivateKey = createPrivateKey(aKeyPairPem.privateKey);
  const aHashPrev = 'hashA';
  const aSeq = 2;

  // On Node C:
  const cKeyPair = await keysUtils.generateKeyPair(4096);
  const cKeyPairPem = keysUtils.keyPairToPem(cKeyPair);
  const cPrivateKey = createPrivateKey(cKeyPairPem.privateKey);
  const cHashPrev = 'hashC';
  const cSeq = 5;

  // 1. On Node A:
  // Node A constructs a claim from A -> C
  const payload = {
    hPrev: aHashPrev,
    seq: aSeq,
    data: {
      type: 'node',
      node1: 'A' as NodeId,
      node2: 'C' as NodeId,
    },
    iat: Date.now(),
  };
  const canonicalizedPayload = canonicalize(payload);
  const byteEncoder = new TextEncoder();
  const claim = new GeneralSign(byteEncoder.encode(canonicalizedPayload));
  // Node A sends this claim over to Node C

  // 2. On Node C:
  // Node C should check the fields of the claim and make sure they're correct
  // Then, Node C signs the claim
  claim.addSignature(cPrivateKey).setProtectedHeader({ alg: 'RS256', kid: 'C' });
  // Node C sends this signed claim back to Node A

  // 3. On Node A:
  // Node A should check the fields of the signed claim and make sure they're correct
  // Then, Node A also signs the claim
  claim.addSignature(aPrivateKey).setProtectedHeader({ alg: 'RS256', kid: 'A' });
  // Finally, we perform the '.sign' operation
  const jws = await claim.sign();
  console.log(jws);
  // Node A stores this JWS claim on its sigchain
}

main();

This will require refactoring the createClaim utility function in claims/utils. See this todo note:

// TODO: Potentially need a createUnsignedClaim function that returns a
// claim = new GeneralSign(payload). This will be needed to sign the claim by
// both nodes, when creating a node -> node claim.

utils.test.ts should be extended to ensure multiple signatures can be added and subsequently verified (currently, they only test claims where 1 signature exists).

@joshuakarp joshuakarp added development Standard development security Security risk labels Jul 22, 2021
@CMCDragonkai CMCDragonkai added design Requires design and removed design Requires design labels Jul 26, 2021
@CMCDragonkai
Member

Adding @emmacasolin to this, as there is some crossover between the notifications work and this work.

You 2 should meet up to plan out and work out the kinks of the design of this system.

@CMCDragonkai
Member

CMCDragonkai commented Aug 8, 2021

Consider this the next priority after you finish up with the Notifications and Gestalts article @emmacasolin and when @joshuakarp finishes up with the R&D article, (while also reviewing my post-merge review of the nodes cli).

@joshuakarp
Contributor Author

To create a cryptolink from X to Y:

  1. X sends a "cryptolink request" (a notification with some expected string) to Y
  2. Y (at some point in time) reads this notification and can accept this request (if desired).
  3. If accepting, Y will need to send a notification back to X to say "I accept"
  4. X creates the claim, and transfers the 'unsigned' claim to Y
  5. Y checks the claim to see it's correct, signs it, and sends it back to X
  6. X checks the claim again and signs it itself.
  7. X adds this claim to its sigchain
  8. Repeat steps 4-7 in the reverse order to create a claim to store on Y's sigchain

@joshuakarp
Contributor Author

From chat with @emmacasolin , I envisage the above process should work out of the box with the notifications domain.

I think the only issue I see is steps 2 and 3. If Y reads the notification at some later date, but X is offline, obviously it can't send this "I accept" notification back. At the present time, I don't believe any kind of async messaging for notifications is possible https://en.wikipedia.org/wiki/Message_passing#Asynchronous_message_passing.

However, take a step back and think of the use case of establishing a claim between nodes. The 2 nodes are likely to be owned by the same entity/person, so I don't think it's an issue to assume they need to be online at the same time.

@CMCDragonkai
Member

CMCDragonkai commented Aug 13, 2021 via email

@CMCDragonkai
Member

Clarifying the steps. This is currently fully synchronous, no delay tolerance.

X -> sends notification (to start cross signing request) -> Y
X <- calls cross signing grpc request and sends its signed claim (intermediate) <- Y (use bidirectional stream and lock the sigchain on Y and X)
X -> responds with double signing the Y signed claim, and also bundles it with its own signed claim (intermediate) -> Y
X <- responds with double signing the X signed claim <- Y

So you have to use a bidirectional grpc stream because Y has to send a message to X, and X has to respond with a message and Y has to respond again.

@CMCDragonkai
Member

For the bidirectional grpc streaming see the grpc utils. And the tests associated. It shows how to use the async generator abstractions for reading and writing on a bidirectional stream. It's better than using stream API. It fits into the promise style.

So Y has to do write, read, write.

X has to do read, write, read.
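The write/read/write and read/write/read ordering can be sketched with a small in-memory stand-in for the stream. Everything here is an assumption for illustration: the `Channel` class replaces the actual grpc utils abstractions, and the claim strings are placeholders rather than real signed claims.

```typescript
// Minimal in-memory channel standing in for one direction of a bidirectional stream.
class Channel<T> {
  private queue: T[] = [];
  private waiters: Array<(value: T) => void> = [];

  write(value: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(value);
    else this.queue.push(value);
  }

  read(): Promise<T> {
    if (this.queue.length > 0) return Promise.resolve(this.queue.shift() as T);
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }
}

type Message = { from: 'X' | 'Y'; claims: string[] };

// Y initiates the call: write, read, write.
async function runY(
  toX: Channel<Message>,
  fromX: Channel<Message>,
  log: string[],
): Promise<void> {
  toX.write({ from: 'Y', claims: ['Y-claim signed by Y'] });
  const reply = await fromX.read();
  log.push(`Y read ${reply.claims.length} claims`);
  toX.write({ from: 'Y', claims: ['X-claim signed by X and Y'] });
}

// X serves the call: read, write, read.
async function runX(
  fromY: Channel<Message>,
  toY: Channel<Message>,
  log: string[],
): Promise<void> {
  const first = await fromY.read();
  log.push(`X read ${first.claims.length} claims`);
  toY.write({
    from: 'X',
    claims: ['Y-claim signed by Y and X', 'X-claim signed by X'],
  });
  const last = await fromY.read();
  log.push(`X read ${last.claims.length} claims`);
}

// Runs both sides concurrently and returns the order in which reads completed.
async function crossSign(): Promise<string[]> {
  const log: string[] = [];
  const yToX = new Channel<Message>();
  const xToY = new Channel<Message>();
  await Promise.all([runY(yToX, xToY, log), runX(yToX, xToY, log)]);
  return log;
}
```

In a real implementation each `read` would also validate the received claim before signing, and both sides would hold their sigchain lock for the duration of the exchange.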

You'll need to have a timeout applied to both sides, so that the transaction is completed within a certain amount of time. If there's no timeout, it's possible to lock the sigchain forever. Make sure that if any exception occurs, there's a finally block that unlocks the sigchain. If the connection breaks, you must make sure the sigchain gets unlocked.

@CMCDragonkai
Member

CMCDragonkai commented Aug 16, 2021

I know we have a connection timeout already inside the underlying network domain. However, I'm not sure how this timeout error propagates to the GRPC code. You want to test connection-breaking edge cases here. If an exception is raised at the network domain, how does it get caught on the GRPC calling side and on the serving side? They may need to catch the exception and handle something when the connection is broken. I suspect on the serving side, it will happen when you interact with the received object.

Basically we need to find out how GRPC connection errors are raised locally. We're not talking about the errors being serialised back to the client which is handled by the grpc utils. We're talking about what happens locally on each agent/node when the actual connection/network breaks.

You should be able to trigger a network break, by breaking the connection that grpc relies on at the network domain. Look for some way of destroying the connection object.

@CMCDragonkai
Member

CMCDragonkai commented Aug 16, 2021

From the GRPC client's perspective, you should either receive a connection interruption if a FIN packet was returned from the forward proxy, or get a timeout error if the forward proxy just destroys the socket.

@CMCDragonkai
Member

Imagine on the grpc client.

// lock resources before the try
try {
  await g.read();
} catch(e) {
  // check if the e is a connection error (timeout or closed error)
} finally {
  // unlock any resources
}

@CMCDragonkai
Member

@joshuakarp you also need permission handling here:

  1. When X sends the notification to Y, it needs to add a permission to Y's gestalt; internally you can call it GestaltInvite.
  2. This permission allows Y to call X.

This permission is similar to vault sharing permission, in that Y elects to be able to join X gestalt if they want to.

Once the cross signing completes, the X and Y gestalt is joined together, so that means the permission can be removed. There's no need to have a GestaltInvite permission added to your own gestalt since there's no need to invite your own gestalt. We already have permission union operations in the acl/utils.ts that can be updated to remove this permission.

Consider that you now need to have notification message types like: GestaltInvite or VaultShare.

Different notifications can be presented to the end user differently, and can require different data in their payloads. @tegefaulkes this will be relevant to the GUI.
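One way to model distinct notification message types is a discriminated union, sketched below. The field names (`senderNodeId`, `vaultName`) and the `describeNotification` helper are hypothetical and only illustrate how each type could carry its own payload and be displayed differently.

```typescript
// Hypothetical notification payloads; field names are assumptions for illustration.
type GestaltInvite = { type: 'GestaltInvite'; senderNodeId: string };
type VaultShare = { type: 'VaultShare'; senderNodeId: string; vaultName: string };
type NotificationData = GestaltInvite | VaultShare;

// Each notification type gets its own presentation to the end user.
function describeNotification(notification: NotificationData): string {
  switch (notification.type) {
    case 'GestaltInvite':
      return `Node ${notification.senderNodeId} invited you to join their gestalt`;
    case 'VaultShare':
      return `Node ${notification.senderNodeId} shared vault "${notification.vaultName}" with you`;
  }
}
```

Switching on the `type` field lets the compiler check that every notification type is handled when a new one is added.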

@CMCDragonkai
Member

@joshuakarp can you update this issue with the MR that you are working on that addresses this?

@joshuakarp
Contributor Author

@CMCDragonkai
Member

Because intermediate resources are created while the 2 agents sign each other's claims, we will need a deadline on each interaction step.

To simplify, you can have a single deadline for the entire interaction from step 2 onwards.

X -> sends notification (to start cross signing request) -> Y
STEP 2 BEGINS HERE:
X <- calls cross signing grpc request and sends its signed claim (intermediate) <- Y (use bidirectional stream and lock the sigchain on Y and X)
X -> responds with double signing the Y signed claim, and also bundles it with its own signed claim (intermediate) -> Y
X <- responds with double signing the X signed claim <- Y

Both X and Y would need a deadline starting from step 2.

To implement a deadline, there are 2 ways:

  1. Using the underlying GRPC deadline functionality - I've seen this parameter get used in unary calls, but I don't know how that applies to streams; it may not be relevant to streams.
  2. Using racing promises. I've already implemented this pattern in the networking domain. See the usage of timerStart and timerStop inside ForwardProxy and the Promise.race call in ConnectionForward. The usage is quite complex in the networking domain, but in the case of this node claims process, the entire usage of the Timer type (in src/types.ts) should be executed entirely inside the GRPCClientAgent of Y and agentService of X. However it is possible to consider the construction and destruction of the timer object in the imperative shell, and then use it in an optional way inside the nodes domain.

If the deadline fails it's an exception in the interaction. Upon this exception you need to throw the relevant error to the other side of the interaction, and then finish up and clean up any intermediate resources. Remember to UNLOCK your domains!
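The racing-promises option, combined with the lock and unlock requirement, can be sketched as below. This is a generic illustration and not the `Timer`, `timerStart`, or `timerStop` code from the networking domain; `DeadlineError`, `withDeadline`, `crossSignWithDeadline`, and the boolean `sigchain.locked` flag are hypothetical stand-ins.

```typescript
class DeadlineError extends Error {}

// Rejects with DeadlineError if `work` does not settle within `ms` milliseconds.
// The timer is always cleared so it cannot fire after the race is decided.
async function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new DeadlineError(`deadline of ${ms}ms exceeded`)),
      ms,
    );
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    clearTimeout(timer);
  }
}

// Stand-in for the sigchain lock; a real implementation would use a proper mutex.
const sigchain = { locked: false };

// Runs the whole interaction (from step 2 onwards) under a single deadline.
async function crossSignWithDeadline(
  interaction: () => Promise<string>,
  ms: number,
): Promise<string> {
  sigchain.locked = true; // lock resources before the try
  try {
    return await withDeadline(interaction(), ms);
  } finally {
    sigchain.locked = false; // unlock on success, error, or deadline
  }
}
```

Note that losing the race does not cancel the underlying work; a real implementation would still need to close the stream and report the error to the other side.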

@joshuakarp
Contributor Author

Deadlines will be implemented in a later MR.

@joshuakarp
Contributor Author

@CMCDragonkai
Member

@joshuakarp remember, if there are new issues that came out of that implementation, you should list them out there too (and cross link under additional context).

@joshuakarp
Contributor Author

Further issues stemming from this MR have been created:
