
WIP: Upgrading @matrixai/async-init, @matrixai/async-locks, @matrixai/db, @matrixai/errors, @matrixai/workers, Node.js and integrating @matrixai/resources #366

Closed
wants to merge 78 commits

Conversation


@CMCDragonkai CMCDragonkai commented Mar 28, 2022

Description

Several core libraries have been updated and so they need to be integrated into PK. Refer to MatrixAI/js-encryptedfs#63 to see how this was handled for EFS and repeat here.

  1. @matrixai/async-init - no major changes
  2. @matrixai/async-locks - all instances of Mutex should change to use Lock or RWLockWriter
  3. @matrixai/db - the DB now supports proper DB transactions and has introduced iterator, however no more sublevels (use KeyPath/LevelPath instead)
  4. @matrixai/errors - we make all of our errors extend AbstractError<T> and also provide static descriptions to all of them, as well as use the cause chain
  5. @matrixai/workers - no major changes here
  6. node.js - we should upgrade to Node 16 in order to integrate promise cancellation, as well as using Promise.any for ICE protocol
  7. @matrixai/resources - since the @matrixai/db no longer does any locking, the acquisition of the DBTransaction and Lock has to be done together with withF or withG

Along the way, we can explore how to deal with indexing #188 and #257, which should be easier now that the DB has root levels for data and transactions.

Locking changes

There were three important types of locks brought in by js-async-locks:

  1. Lock (this is the equivalent of Mutex which we were using previously)
  2. RWLockWriter (a write-preferring read/write lock)
  3. LockBox (a collection of related locks)

In most cases, we are removing locks in favour of using optimistic concurrency control (OCC). This means that most domain locks should be removed with a few exceptions:

  • Counters/serialised data - places where we have strict serialisation, such as counters and lexicographically ordered keys, require locking the DB in order to maintain this order. This applies to the keys used by the Discovery Queue in Discovery.ts, the NotificationIds and message count used by NotificationsManager.ts, the ClaimIds and sequence numbers used by Sigchain.ts, and if the set/ping node or refresh buckets queues introduced in Testnet Deployment #326 become persistent then we would need locking there as well. This could be achieved for these cases by introducing a LockBox to the affected domains and locking the relevant keys when we access the db, e.g. withF([this.db.transaction(), this.locks.lock(...lockRequests)], async ([tran]) => {}) where this.locks is a LockBox and ...lockRequests is an array of [KeyPath, Lock] (see EFS INodeManager.ts).
  • Object maps - in NodeConnectionManager and VaultManager we need to lock groups of objects; this can be done using a LockBox where we reference the same ID each time we want to lock a specific object.

Everywhere else we expect that conflicts will be rare, so we don't use locks in order to simplify our codebase. In the case of a conflict, we can either retry (if safe) or bubble up the error to the user.
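Where retrying is safe, the catch-and-retry pattern could be sketched as follows (a minimal sketch: the ErrorDBTransactionConflict class here is a stand-in mirroring the conflict error thrown by @matrixai/db, not the real import):

```typescript
// Stand-in for the conflict error thrown by @matrixai/db (assumption)
class ErrorDBTransactionConflict extends Error {}

/**
 * Runs `f` and retries up to `retries` times on a transaction conflict;
 * any other error bubbles up to the caller unchanged.
 */
async function retryOnConflict<T>(
  f: () => Promise<T>,
  retries: number = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await f();
    } catch (e) {
      if (e instanceof ErrorDBTransactionConflict && attempt < retries) {
        continue; // Safe to retry: re-run the whole transaction
      }
      throw e; // Not a conflict (or out of retries): bubble up
    }
  }
}
```

Only operations that are safe to re-run wholesale (i.e. the whole transaction body is re-executed) should be wrapped like this; everything else should let the conflict propagate to the user.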

Errors changes

The new js-errors allows us to bring in error chaining, along with more standardised JSON serialisation/deserialisation (for sending errors across gRPC). With this error chaining ability, there are now three ways that we can handle/propagate errors:

  1. Error re-raise/re-throw: This re-raises the same error; in this case it just means you couldn't handle the error.
try {
  throw new ErrorPolykey('something happened');
} catch (e) {
  throw e;
}
  2. Error override: This overrides the error with a whole new error. Use this when you are wrapping a library of internal errors and presenting transformed external errors to the user.
try {
  throw new ErrorPolykey('something happened');
} catch (e) {
  throw new ErrorPolykey2(e.message);
}
  3. Error chain: This raises a new error and chains the new error with the old error. Use this when your code represents an onion of different layers each capable of handling different errors.
try {
  throw new ErrorPolykey('something happened');
} catch (e) {
  throw new ErrorPolykey2(e.message, { cause: e });
}

In all places where we are catching one error and throwing a different error in its place, we should be using approach 3 (error chain). If we just want to bubble the original exception upwards then use approach 1 (re-raise/re-throw). Finally, if we want to hide the original error from the user (perhaps it contains irrelevant implementation details or could be confusing and thus requires additional context) we can use approach 2 (error override). There is a fourth approach that exists in Python for errors that occur as a direct result of handling another error, however, this does not exist in TypeScript (in such a case we would use approach 3). When using approach 2 (and in some cases approach 3) you may want to log out the original error in addition to throwing the new error.

JSON serialisation/deserialisation

When sending errors between agents/from client to agent we need to serialise errors (including the error chain if this exists). Then, on the receiving side, we need to be able to deserialise back into the original error types.

We are able to do this using JSON.stringify() (serialisation) and JSON.parse() (deserialisation). These methods allow us to pass in a replacer/reviver to aid with converting our error data structure, as well as being combined with toJSON() and fromJSON() utility methods on the error class itself. These are implemented on AbstractError from js-errors, however, we need to extend these to work with ErrorPolykey in order to handle the additional exitCode property. While toJSON() can simply call super.toJSON() and add in the extra field, fromJSON() needs to be completely reimplemented (although this can be copied from AbstractError for the most part). Similarly, the replacer and reviver can be based on the replacer and reviver used in the js-errors tests.
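A minimal sketch of what this extension could look like (the base class below is a self-contained stand-in for AbstractError from js-errors; only the serialisation-relevant parts are mirrored, and names suffixed with Sketch are assumptions):

```typescript
// Self-contained stand-in for AbstractError from @matrixai/errors
class AbstractErrorSketch extends Error {
  public static description: string = '';
  public data: object;
  constructor(message: string = '', data: object = {}) {
    super(message);
    this.data = data;
  }
  public toJSON(): any {
    return {
      type: this.constructor.name,
      message: this.message,
      data: this.data,
    };
  }
}

class ErrorPolykeySketch extends AbstractErrorSketch {
  public static description = 'Polykey error';
  // The extra property that AbstractError does not know about
  public exitCode: number = 1;
  public toJSON(): any {
    // toJSON can simply call super.toJSON() and add in the extra field
    return { ...super.toJSON(), exitCode: this.exitCode };
  }
  public static fromJSON(json: any): ErrorPolykeySketch {
    // fromJSON is reimplemented since the parent cannot restore exitCode
    const e = new this(json.message, json.data);
    e.exitCode = json.exitCode;
    return e;
  }
}
```

A round trip through JSON.stringify() and fromJSON() then preserves both the standard fields and exitCode.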

ErrorPolykeyRemote

Errors are propagated between agents and clients as follows:

toError: ServiceError -> ErrorPolykeyRemote                         fromError: Error -> ServerStatusResponse

                                                                         ┌─────────────────────────────┐
      ┌───────────────────────────────┐                                  │ NodeId 1 - Server           │
      │ Client                        │             Request              │                             │
      │                               │                                  │ ┌─────────────────────────┐ │
      │ Receives toError ServiceError ├──────────────────────────────────┼─► Service Handlers        │ │
      │ Throws ErrorPolykeyRemote     │                                  │ │                         │ │
      │ cause: ErrorX | ErrorUnknown  ◄──────────────────────────────────┼─┤ Sends fromError ErrorX  │ │
      │ data: { nodeID: 1, ... }      │                                  │ └─────────────────────────┘ │
      └───────────────────────────────┘      Response with error         │                             │
                                             JSON string in metadata     └─────────────────────────────┘
                                          error: { type: ErrorX, ... }

ErrorPolykeyRemote is constructed on the client side (not the server side) inside our gRPC toError() utility. After the received error is deserialised, it is wrapped as the cause property of a new ErrorPolykeyRemote, which should also contain the nodeId, host, and port of the agent that originally sent the error in its data property. In order to access this information, it needs to be passed through from wherever the client/agent method is called (this would be bin commands for the client service and domain methods for the agent service). The data can then be passed through to the promisify...() methods, which in turn call toError().
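A hedged sketch of that client-side wrapping (the class and function shapes here are illustrative assumptions; the real toError() utility and its reviver live in the gRPC utilities):

```typescript
// Illustrative stand-in for ErrorPolykeyRemote (shape is an assumption)
class ErrorPolykeyRemote extends Error {
  constructor(
    message: string,
    public cause: Error,
    public data: { nodeId: string; host: string; port: number },
  ) {
    super(message);
  }
}

function toError(
  serializedError: string,
  metadata: { nodeId: string; host: string; port: number },
): ErrorPolykeyRemote {
  // Deserialise the error sent by the remote agent (reviver logic elided)
  const remoteError: Error = Object.assign(
    new Error(),
    JSON.parse(serializedError),
  );
  // Wrap on the client side: the remote error becomes the cause, and the
  // nodeId/host/port of the sending agent are recorded in data
  return new ErrorPolykeyRemote(remoteError.message, remoteError, metadata);
}
```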

Testing

Now that we have an error chain, we need to adjust our tests to perform checks on it. In many cases where we were originally expecting some specific ErrorPolykey we will now receive an ErrorPolykeyRemote with the original error in its cause property. For simple cases like this it is enough to perform the existing checks on the cause property of the received error rather than the top-level error; however, this approach becomes more complicated for longer error chains. Additionally, we may want to perform checks on the top-level ErrorPolykeyRemote itself (such as checking the metadata for the sending agent).

In this case, it would be useful to create an expectation utility that allows one to perform checks on the entire error chain, from the top-level error to the final error in the chain. This could look something like this:

function expectErrorChain(thrownError: any, expectedErrors: Array<any>) {
  let checkingError = thrownError;
  for (const expectedError of expectedErrors) {
    // Check expected type first
    expect(checkingError).toBeInstanceOf(expectedError.constructor);
    // Match against every expected property
    expect(checkingError.message).toBe(expectedError.message);
    if (expectedError instanceof ErrorPolykey) {
      expect(checkingError.data).toEqual(expectedError.data);
      // ...
    }
    // Move along to the next in the cause chain
    checkingError = checkingError.cause;
  }
}

We could also think about using the toJSON() method on each error, allowing us to use jest's expect().toMatchObject() matcher rather than having to check every error property individually. We could also potentially include parameters to specify which properties of the error you do and don't want to check against.

Additional context

Database changes

Changes include, but are not limited to:

  1. Updating any transact wrappers by removing them or using withF, withG locking directly.
  2. In all cases where there is a conflict just throw it up the stack. We will expect to handle them within the handlers or look deeper into it later.
  3. ErrorDBTransactionConflict Error should never be seen by the user. We should catch and override it with a more descriptive error for the context.
  4. Transactions should be started within the handlers and passed all the way down to where they are needed. The idea is to make each handler atomic.
  5. Concurrency testing should be introduced, but only after SI transactions have been implemented.
  6. All usage of DB should be updated to use the new API. This means removing sublevels and utilising LevelPath and KeyPaths instead.
  7. All usage of db streams should be replaced with the db iterator.
  8. All instances of db.put, db.get and db.del should be using transactions via tran.put/get/del

This applies to all domains that make use of the DB, or domains that depend on others that make use of the DB. The goal here is to make any event, starting from the handlers, atomic.

There are limitations to this, however. Since a transaction can fail if there are overlapping edits between transactions, we can't really include changes to the DB that will commonly (or always) conflict. Examples of this are counters and commonly updated fields. So far this has been seen in:

  1. NotificationsManager - makes use of a counter, so any transactions that add or remove a notification WILL conflict. Reads also update metadata, so concurrently reading the same message WILL conflict.
  2. More to follow?

In some cases we will need to make use of locking along with a transaction. A good example of this is in the NotificationsManager, where we are locking the counter update. When this is the case we need to take extra care with the locking: unless the lock wraps the whole transaction it is still possible to conflict on the transaction, so we can't compose operations that rely on this locking with larger transactions.

An example of this problem:

start tran1
start tran2

start lock
	tran1 update counter
end lock

start lock
	tran2 update counter
end lock

end tran1
end tran2 // conflict!

To avoid this, tran2 has to start after tran1 ends. We need to force serialisation of the transactions here. As a result, we can't compose with a transaction outside of the lock.
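This forced serialisation can be sketched with minimal stand-ins for withF (from @matrixai/resources) and Lock (from @matrixai/async-locks); these are assumptions mirroring the shape of the real APIs, showing the ordering only. Acquiring the lock before the transaction means the lock is released only after the transaction commits, so tran2 cannot begin until tran1 has ended:

```typescript
// Minimal stand-ins (assumptions, not the real @matrixai APIs)
type ResourceRelease = () => Promise<void>;
type ResourceAcquire<T> = () => Promise<[ResourceRelease, T]>;

async function withF<R>(
  acquires: Array<ResourceAcquire<any>>,
  f: (resources: any[]) => Promise<R>,
): Promise<R> {
  const releases: ResourceRelease[] = [];
  const resources: any[] = [];
  try {
    for (const acquire of acquires) {
      const [release, resource] = await acquire();
      // Releases run in reverse order: the lock (acquired first) is
      // released last, after the transaction has committed
      releases.unshift(release);
      resources.push(resource);
    }
    return await f(resources);
  } finally {
    for (const release of releases) await release();
  }
}

class Lock {
  protected tail: Promise<void> = Promise.resolve();
  public acquire: ResourceAcquire<Lock> = async () => {
    let release!: () => void;
    const prev = this.tail;
    this.tail = new Promise<void>((r) => (release = r));
    await prev;
    return [async () => release(), this];
  };
}

// A fake transaction acquirer that records begin/commit events
const events: string[] = [];
function transaction(name: string): ResourceAcquire<string> {
  return async () => {
    events.push(`${name} begin`);
    return [async () => void events.push(`${name} commit`), name];
  };
}

const lock = new Lock();
// Lock first, then transaction: tran2 cannot begin until tran1 commits
const main = Promise.all([
  withF([lock.acquire, transaction('tran1')], async () => {}),
  withF([lock.acquire, transaction('tran2')], async () => {}),
]);
```

With this ordering the events come out fully serialised, which is exactly what the counter updates need; the trade-off is that the lock-plus-transaction unit cannot then be nested inside some larger transaction.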

This means that some operations or domains can't be composed with larger transactions. It has yet to be seen whether this will cause an issue, since more testing is required to confirm any problem. I suppose this means we can't mix pessimistic and optimistic transactions. So far it seems it will be a problem with the following domains.

  1. Vaults domain - Lifecycle and editing of vaults relies heavily on locks.
  2. NotificationsManager - Locking is needed for the counter and preventing other conflicts.

Node.js changes

After upgrading to Node v16 we will be able to bring in some new features.

  1. Promise.any - we are currently using Promise.all in our pingNode() method in NodeConnectionManager (this is being done in Testnet Deployment #326) but this should change to using Promise.any. This is because Promise.all waits for every promise to resolve/reject, however we only care about whichever finishes first and we want to cancel the rest. This change can be made in Testnet Deployment #326 after this PR is merged.
  2. AggregateError - this error is emitted by Promise.any if all of the given promises are rejected. In our case this would mean that we were unable to ping the desired node via direct connection or signalling (and eventually relaying once this is implemented), so we may want to catch this and re-throw some other error to represent this. We will also need to add this into our error serialisation/deserialisation.
  3. AbortController - there are a number of places that have been identified in Asynchronous Promise Cancellation with Cancellable Promises, AbortController and Generic Timer #297 where we could use the new AbortController/AbortSignal. This can be done in a separate PR when rebasing Testnet Deployment #326.

Issues Fixed

Tasks

  • 1. Upgrade all exceptions to use AbstractError and use cause chain and static descriptions
  • 2. Remove all sublevel object maintenance
  • 3. Update all usages of db.get and db.put and db.del to use KeyPath
  • 4. Remove all uses of batch and domain locking and instead migrate to using DBTransaction and the withF and withG from @matrixai/resources
  • 5. Replace all uses of createReadStream, createKeyStream and createValueStream with db.iterator
  • 6. All domain methods that involve DB state should now take a tran?: DBTransaction optional parameter in the last parameter to allow one to compose a transaction context
  • 7. If possible, get EFS to use the new DB as well, which can be done by making DB take a "prefix", thus recreating sublevels, but only for this use case. This would mean both EFS and PK use the same DB, with EFS completely controlling a lower level.
  • 8. See where RWLock can be used instead when doing concurrency control on the DBTransaction to raise the isolation level to avoid non-repeatable reads, or phantom reads or lost-updates.
  • 9. Upgrade to Node 16 (follow procedure from Upgrading to node 16.x LTS (and pkgs.nix upgrade to 21.11 or later) TypeScript-Demo-Lib#37)
  • 10. Integrate @matrixai/async-locks to use the Lock class instead of Mutex from async-mutex
  • 11. Use @matrixai/resources withF and withG to replace transactions
  • 12. Clean up left over commentary from migration
  • 13. Update nodeConnectionManager and VaultManager to use lockbox for the object maps.
  • 14. Update NodeConnectionManager withConnF to properly use new resources API features. This includes the ResourceAcquire making use of error handling.

Tasks are divided by domain for conversion. Assignments are as follows. Domains labelled 'last' depend on a lot of other domains and will be divided later.

  • acl - Emma
  • agent - last
  • bootstrap - last
  • client - last
  • discovery - Brian
  • gestalts - Emma
  • identities - Emma
  • nodes - Emma, depends on sigchain
  • notifications - Brian
  • sessions - Emma
  • sigchain - Brian
  • vaults - Brian
  • polykeyAgent - last

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

TBD spec:

Error changes:

Testing for an error with a cause.

  1. Pending changes from JSON serialisation/deserialisation from the new AbstractError
  2. ErrorPolykeyRemote needs to have remote node metadata
  3. Work out how ErrorPolykeyRemote is propagated from agent to agent to client.
  4. Create new expectation utilities that can standardise how we expect things to fail and the checks on those exceptions, as well as the cause property
  5. Bring in the standards of exception override vs exception chain into the spec

DB changes:

domains - Each domain used to have a transact or withTransaction wrapper. If the wrapper doesn't actually need any additional context and is just calling withF and withG directly, avoid it and replace it with direct usage of the withF and withG utilities. We should identify domains that require additional context; in those cases we continue to use a withTransaction wrapper.

The transaction, and its utilisation of SI transactions:

withF([db.transaction()], async ([tran]) => {
    await tran.put(k1, ...);
    await tran.put(k2, ...);
});

withF([db.transaction()], async ([tran]) => {
    await tran.put(k2, ...);
    await tran.put(k3, ...);
});
ErrorDBTransactionConflict
  .data => [k1, k2, k3]

In all cases, you just throw this exception up, and propagate it all the way to the client.

The pk client however will change the message to be more specific to the user's action.

Retrying this can be identified and sprinkled into the codebase afterwards.

  1. Work out a way to write a user-friendly message when a ErrorDBTransactionConflict occurs in the PolykeyClient (at the CLI)

  2. Identify where automatic retries can occur, and make those catch the ErrorDBTransactionConflict and resubmit the transaction

  3. Identify write-skews, this is where we must use solutions for dealing with write skews => locking and materialising the write conflict

  4. Identify serialisation requirements - counter updates, and where PCC is demanded as part of the API, in those situations use the LockBox or lock

Handlers need to now start the transaction. For agent and client.

We have tran?: DBTransaction where it allows public methods to create their own transaction context.

But this is only for debugging and convenience.

During production, the transaction context is meant to be setup at the handler level.

This means that each handler is its own atomic operation.

By default they start out at the handler level, this will make it easier for us to identify our atomic operations and to compare them.

Tests - introduce concurrency tests; consider some new concurrency expectation combinators that allow us to more easily say one of these results has to be X, and the rest have to be Y.

Concurrency tests can be introduced domain by domain - use Promise.allSettled.

We can however focus our concurrency testing at the handler level, because we expect that to be the unit of atomic operations.

All ops removed, changed to using KeyPath and LevelPath.

In the iterator:

  • For tran.iterator you will always have the key no matter what because it is being used for matching between the overlay and underlying db data.
  • The keyAsBuffer and valueAsBuffer options are by default true; if you change them to false, it will decode the two and return them as decoded values. If you need both, use dbUtils.deserialize().
  • More docs for tran.iterator
  • Eliminate all subleveldown, we are no longer using that (it can be brought back later; it would only be useful to share a db between EFS and the PK DB)
  • Eliminate all streams
withF([db.transaction()], async ([tran]) => {
    await tran.put('k1', 'adfdsf');
    const i = tran.iterator();
    const r1 = await tran.get('k1');
    i.seek('k1');
    const r2 = await i.next();
    // r1 !== r2 (possible)
    // with SI r1 === r2
});

Just think that the iterator has the same snapshot as the original transaction.

Don't start writing concurrency tests until we have SI transactions. Serial tests are still fine.

Nodev16 Changes:

  1. Promise.any - for ICE protocol for the DC and signalling
  2. AggregateError - it's in our serialisation and deserialisation, but it's only emitted by Promise.any
  3. AbortController - separate PR (when rebasing testnet deployment) - identify where these may be used to replace polling

Locking Changes:

All async locks and async-mutex usage is to be replaced with js-async-locks.

In the js-async-locks:

  1. Lock - mutex
  2. RWLockWriter - read/write
  3. LockBox - locking collection

Convert object-map pattern to using LockBox by referencing the same ID.

Use INodeManager in EFS as a guide.
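A toy stand-in illustrating the per-ID locking idea (the real LockBox in @matrixai/async-locks returns ResourceAcquires and differs in API; this sketch only mirrors the serialisation-per-ID behaviour):

```typescript
// Toy per-ID lock map: operations on the same ID queue behind each other,
// while different IDs run concurrently. Entries are never cleaned up here,
// which the real LockBox handles properly.
class LockBoxSketch {
  protected tails: Map<string, Promise<void>> = new Map();
  public async withLock<R>(id: string, f: () => Promise<R>): Promise<R> {
    const prev = this.tails.get(id) ?? Promise.resolve();
    let release!: () => void;
    this.tails.set(id, new Promise<void>((r) => (release = r)));
    await prev;
    try {
      return await f();
    } finally {
      release();
    }
  }
}
```

In NodeConnectionManager or VaultManager the ID would be the NodeId or VaultId, so all operations against the same object are serialised while the object maps themselves stay lock-free.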

@CMCDragonkai
Member Author

CMCDragonkai commented Mar 28, 2022

RWLocks are now in https://github.com/MatrixAI/js-async-locks. Locking tests have moved there too. These can now be used together here. Note that we pretty much only use RWLockWriter, as it is now write-preferring, to differentiate it from the read-preferring RWLockReader.

Standard mutex usage is also Lock.

Why use these?

Because they fit the ResourceAcquire type.

Now anything that needs to acquire a single lock is just a matter of doing:

const lock = new Lock();
withF([lock.acquire], async ([lock]) => { ... });

The lock.acquire methods are arrow functions, so they won't lose their this context. This was important! Any usage of withF or withG needs to deal with this.

Note that DB did a bit differently:

withF([db.transaction()], ...);

In this case, it returned the ResourceAcquire.

In the locks case, it is just acquire or acquireRead or acquireWrite. Although in our domains, we may further provide abstractions especially in terms of collections of locks that enable us to lock specific keys or ranges of keys (possibly using ES6 symbols) to ensure that we can do concurrency control beyond read-committed transactions.

Furthermore, the withG has some special behaviour: https://stackoverflow.com/questions/71645327/typescript-doesnt-infer-tnext-unknown-for-asyncgenerator-function-expressi. @tegefaulkes this could be one of the problems you faced, but there's a simple fix: either define the generator function on another line or explicitly type the TNext.

So in PK, we would bring in @matrixai/resources and @matrixai/async-locks to replace both locking and with contexts.

The @matrixai/async-init also can use @matrixai/async-locks too, but not super important.

@CMCDragonkai
Member Author

MatrixAI/js-db#13 allows us now to use levels arbitrarily, no need to worry about reserved characters anymore. No more base encoding of arbitrary level parts. Key parts always had the ability to have any bytes.

@CMCDragonkai
Member Author

CMCDragonkai commented Mar 31, 2022

Example of fixing the sameNodePerm with DBTransaction:

  @ready(new aclErrors.ErrorACLNotRunning())
  public async sameNodePerm(
    nodeId1: NodeId,
    nodeId2: NodeId,
    tran?: DBTransaction,
  ): Promise<boolean> {
    const nodeId1Path = [...this.aclNodesDbPath, nodeId1.toBuffer()] as unknown as KeyPath;
    const nodeId2Path = [...this.aclNodesDbPath, nodeId2.toBuffer()] as unknown as KeyPath;
    if (tran == null) {
      return withF(
        [
          this.db.transaction(),
          this.locks.lockRead(
            dbUtils.keyPathToKey(nodeId1Path).toString('binary'),
            dbUtils.keyPathToKey(nodeId2Path).toString('binary')
          ),
        ],
        async ([tran]) => this.sameNodePerm(nodeId1, nodeId2, tran)
      );
    }
    const permId1 = await tran.get(nodeId1Path, true);
    const permId2 = await tran.get(nodeId2Path, true);
    if (permId1 != null && permId2 != null) {
      return IdInternal.fromBuffer(permId1).equals(IdInternal.fromBuffer(permId2));
    }
    return false;
  }

Furthermore, the integration of timeouts into js-async-locks also impacts the API of js-async-locks, so it's best to merge that before going ahead with the current API of @matrixai/async-locks. MatrixAI/js-async-locks#3

@CMCDragonkai
Member Author

#369 discovered that withConnF isn't using the exception parameter for ResourceRelease. Since this PR brings in js-resources, we will need to remember to do that here.

@CMCDragonkai
Member Author

3.6.3 of js-db builds in the canary check. We should catch the ErrorDBKey exception upon DB.createDB and rethrow as a PK error indicating incorrect key. We technically have something like this already in a number of places.

@CMCDragonkai
Member Author

TS demo lib has now been upgraded to v16. You should be able to bring it in here in this PR. Make sure it's a separate commit.

@CMCDragonkai
Member Author

@emmacasolin after v16 upgrades, please refer to MatrixAI/js-encryptedfs#63 in order to know how to start upgrading all of our dependencies. Start with the easiest ones like js-errors then to the harder ones like js-resources. We should meet and discuss further once you get there.

I may be upgrading our transaction system further based on lessons from EFS.

@emmacasolin emmacasolin changed the title WIP: Integration of DB Transactions for all Domains WIP: Upgrading @matrixai/async-init, @matrixai/async-locks, @matrixai/db, @matrixai/errors, @matrixai/workers, Node.js and integrating @matrixai/resources Apr 26, 2022
Updated pkgs.nix to a5774e76bb8c3145eac524be62375c937143b80c
Updated node2nix to 1.11.0 and loading directly from GitHub
Updated pkg to 5.6.0 and using pkg-fetch 3.3 base binaries
Updated to TypeScript 4.5+
Updated @types/node to 16.11.7
Updated node-gyp-build to 4.4.0
Updated typedoc to 0.22.15, the .nojekyll generation is automatic
Changed to target ES2021 as node 16 supports it
Bin executable duct-tape in default.nix because node2nix no longer generates bin executables
Updated pkg builds hashes to v16.14.2 and specified target node range
@emmacasolin
Contributor

I've brought in the Node 16 upgrade here. Nix shell is working fine along with npm commands (e.g. lint), however, anything involving a build is failing due to the existing WIP changes here since there are errors. Will need to check the builds at some point before this PR is merged.

@CMCDragonkai
Member Author

Certain kinds of mocks, such as import { mocked } from 'ts-jest/utils';, are now deprecated. We should look into replacing them, maybe with jest.spyOn?

The import path should now be import { mocked } from 'ts-jest'. That should work.

@CMCDragonkai
Member Author

I think I should remove the option to compose transactions with the NotificationsManager methods. We should be taking a pessimistic approach with it. I'll have to look over it again.

The NodeManager needs some of the methods updated with the transactions. I'll fix that as well.

As explained in our chat, there's no problem with mixing PCC and OCC together. During any serialised counter update, add a PCC lock there for that specific key. This will serialise updates to the counter. If other conflicts occur due to overlapping write-sets, an ErrorDBTransactionConflict can be thrown, but this is by design. SI always ends up with potentially unnecessary conflicts; it is a trade-off made to ensure programming consistency is easier. The workaround is to sprinkle PCC where necessary, but that specialisation can be done after we set up the foundation of OCC.

@emmacasolin
Contributor

No you don't need to do that. You want to add metadata at the toError which is called on the client side. The client side has all the knowledge about where it is calling the remote node. You need to pass the GRPC client information to the toError call. This can be done without doing anything to fromError.

So is the nodeId of the GRPCClientClient/GRPCClientAgent the nodeId of the remote node? In that case it's easy to pass that information through for most calls, except for duplex streams where generatorDuplex is called in the service handlers as well. So we still need to acquire the nodeId from somewhere for those.

@CMCDragonkai
Member Author

CMCDragonkai commented May 19, 2022

No you don't need to do that. You want to add metadata at the toError which is called on the client side. The client side has all the knowledge about where it is calling the remote node. You need to pass the GRPC client information to the toError call. This can be done without doing anything to fromError.

So is the nodeId of the GRPCClientClient/GRPCClientAgent the nodeId of the remote node? In that case it's easy to pass that information through for most calls, except for duplex streams where generatorDuplex is called in the service handlers as well. So we still need to acquire the nodeId from somewhere for those.

The nodeId property of GRPCClientClient and GRPCClientAgent is always the NodeID of what the client is connecting to. That's the purpose of this property.

For generatorDuplex, use promisifyDuplexStreamCall. Inject the this.nodeId there as an additional parameter.

@CMCDragonkai
Member Author

CMCDragonkai commented May 19, 2022

@emmacasolin to make it a bit more generic, instead of passing just the nodeId: NodeId, you can do something like:

function promisifyUnaryCall<T>(
  client: Client,
  f: (...args: any[]) => ClientUnaryCall,
  clientMetadata?: POJO
): (...args: any[]) => PromiseUnaryCall<T> {

This would allow you to pass any arbitrary data into the ErrorPolykeyRemote.

It's a generic POJO and optional too. Alternatively, don't make it optional so that it's always enforced.

If we're changing the API and making it a mandatory parameter, then change it to be:

function promisifyUnaryCall<T>(
  client: Client,
  clientMetadata: POJO,
  f: (...args: any[]) => ClientUnaryCall,
): (...args: any[]) => PromiseUnaryCall<T> {

@emmacasolin
Contributor

For generatorDuplex, use promisifyDuplexStreamCall. Inject the this.nodeId there as an additional parameter.

The place I'm referring to is here for example

function nodesCrossSignClaim({
  db,
  keyManager,
  nodeManager,
  sigchain,
  logger,
}: {
  db: DB;
  keyManager: KeyManager;
  nodeManager: NodeManager;
  sigchain: Sigchain;
  logger: Logger;
}) {
  return async (
    call: grpc.ServerDuplexStream<nodesPB.CrossSign, nodesPB.CrossSign>,
  ) => {
    const genClaims = grpcUtils.generatorDuplex(call, true);
    try {
      ...

generatorDuplex calls generatorReadable, which calls toError(), so the nodeId needs to be passed through from the handler.

@CMCDragonkai
Member Author

For generatorDuplex, use promisifyDuplexStreamCall. Inject the this.nodeId there as an additional parameter.

The place I'm referring to is here for example

function nodesCrossSignClaim({
  db,
  keyManager,
  nodeManager,
  sigchain,
  logger,
}: {
  db: DB;
  keyManager: KeyManager;
  nodeManager: NodeManager;
  sigchain: Sigchain;
  logger: Logger;
}) {
  return async (
    call: grpc.ServerDuplexStream<nodesPB.CrossSign, nodesPB.CrossSign>,
  ) => {
    const genClaims = grpcUtils.generatorDuplex(call, true);
    try {
      ...

generatorDuplex calls generatorReadable, which calls toError(), so the nodeId needs to be passed through from the handler.

Well, because all of the promisify* utilities ultimately call the generator* utilities, when you parameterise the promisify* utilities to include the new client metadata, you have to do the same for the generator* utilities. You're just passing information down from the top of the call hierarchy.

Also, I realised that clientMetadata may be confusing; perhaps connectionMetadata? Or just metadata could work too.
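The pass-down idea can be sketched as follows. The names toError, generatorReadable, and promisifyUnaryCall come from the discussion above, but the bodies here are simplified stand-ins, not the actual grpc-js plumbing in grpcUtils:

```typescript
// Minimal sketch: parameterise the promisify* layer with connection
// metadata and thread it down to the generator* layer, so toError can
// attach it to any error it produces.

type POJO = Record<string, any>;

// Bottom of the call hierarchy: attach the metadata to the error.
function toError(err: Error, metadata: POJO): Error {
  return Object.assign(err, { metadata });
}

// Middle layer: closes over the metadata it received from above.
function generatorReadable(metadata: POJO) {
  return (err: Error) => toError(err, metadata);
}

// Top layer: the only place callers supply the metadata; everything
// below just threads it through.
function promisifyUnaryCall<T>(
  metadata: POJO,
  f: (cb: (err: Error | null, value?: T) => void) => void,
): () => Promise<T> {
  const wrapError = generatorReadable(metadata);
  return () =>
    new Promise<T>((resolve, reject) => {
      f((err, value) => {
        if (err != null) reject(wrapError(err));
        else resolve(value as T);
      });
    });
}
```

A failing call then rejects with an error already carrying the nodeId, host, and port, without each handler plumbing those values through manually.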

@emmacasolin
Contributor

Well, because all of the promisify* utilities ultimately call the generator* utilities, when you parameterise the promisify* utilities to include the new client metadata, you have to do the same for the generator* utilities. You're just passing information down from the top of the call hierarchy.

Yes that's what I'm saying. But from my previous comment:

  1. In order for the service handlers to have access to the nodeId, host, and port, they would need to be injected with both the keyManager and proxy (they can't be injected with these values (specifically the nodeId) directly since the keyManager is not yet started when the client/agent services are created in PolykeyAgent).
  2. There's also the issue of the nodeId/host/port needing to be defined outside of the try-catch handlers of the service handlers, since we need these values inside the catch block. This means that if there are any errors while we are trying to acquire the values they won't be handled correctly.

Ignoring the host/port since that can be retrieved from call.getPeer()

@CMCDragonkai
Member Author

CMCDragonkai commented May 19, 2022

Well, because all of the promisify* utilities ultimately call the generator* utilities, when you parameterise the promisify* utilities to include the new client metadata, you have to do the same for the generator* utilities. You're just passing information down from the top of the call hierarchy.

Yes that's what I'm saying. But from my previous comment:

  1. In order for the service handlers to have access to the nodeId, host, and port, they would need to be injected with both the keyManager and proxy (they can't be injected with these values (specifically the nodeId) directly since the keyManager is not yet started when the client/agent services are created in PolykeyAgent).
  2. There's also the issue of the nodeId/host/port needing to be defined outside of the try-catch handlers of the service handlers, since we need these values inside the catch block. This means that if there are any errors while we are trying to acquire the values they won't be handled correctly.

Ignoring the host/port since that can be retrieved from call.getPeer()

Well, now that we allow generic POJO data, you can just write mock data. It doesn't have to be a real node ID. This is for a testing situation.

As for acquiring the values in the catch handler: fetching these values cannot throw exceptions. They are static values that are set by the time you have started the connection.

Even if ClientMetadata had a specific nodeId: NodeId, then you can easily mock one with IdInternal.from<NodeId>(...).
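For instance, a test could mock the metadata like this. The node ID string below is fake and purely illustrative; in the real codebase a typed NodeId would be built with IdInternal.from<NodeId>(...) from @matrixai/id, but since the metadata parameter is a generic POJO, plain values are enough for a test:

```typescript
// Illustrative only: mock connection metadata for a test. A plain
// string stands in for a NodeId to keep this sketch self-contained.

type POJO = Record<string, any>;

const mockMetadata: POJO = {
  nodeId: 'vmockmockmockmockmockmockmockmock', // fake, not a real encoded NodeId
  host: '127.0.0.1',
  port: 1234,
};
```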

@CMCDragonkai
Member Author

What's the reason to encode the NodeId? Why not leave it as a normal NodeId when setting it in the data?

@emmacasolin
Contributor

What's the reason to encode the NodeId? Why not leave it as a normal NodeId when setting it in the data?

I'm adding it into the metadata of the ServiceError that gets passed into toError, and if I leave it as a NodeId then it loses its type during serialisation/deserialisation. I could pass the metadata as a POJO instead, as a second argument to toError? That way it could stay as a NodeId. That would preserve the types of the host/port and any other data as well, so I might do that.

@CMCDragonkai
Member Author

CMCDragonkai commented May 20, 2022

Yes, pass it as a second argument. Don't mutate the types; ensure separation of responsibilities here.
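A sketch of what this might look like. The signature below is the proposed shape, not the current grpcUtils.toError, and RemoteError is only a stand-in name for whatever remote-error class toError actually constructs:

```typescript
// Sketch: pass the metadata as a separate POJO argument to toError so
// values like a NodeId keep their runtime types, instead of serialising
// them into the gRPC trailing metadata (where everything comes back as
// a string).

type POJO = Record<string, any>;

class RemoteError extends Error {
  constructor(
    message: string,
    public readonly metadata: POJO,
  ) {
    super(message);
  }
}

function toError(
  serviceError: { message: string },
  metadata: POJO = {},
): RemoteError {
  // The metadata is carried as-is; there's no JSON round-trip, so the
  // types of nodeId, host, port, etc. survive.
  return new RemoteError(serviceError.message, metadata);
}
```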

@CMCDragonkai
Member Author

CMCDragonkai commented May 20, 2022

@tegefaulkes

The usual suspects of things breaking due to node v16 are:

  1. network
  2. nodes
  3. vaults

Because vaults rely on nodes, and nodes rely on the network. Focus on fixing the network domain first, then investigate the problems with nodes (at the same time migrating to using the LockBox).

Finally, the vaults domain requires a LockBox over the object map. Also try pinning the isogit package version to what we have in package.json on the master branch, in case isogit changes its internals; we manipulate a bunch of isogit's internals to get the cloning system working.

…ors properly

Making use of the updated API, passing the error provided to the resourceRelease.
@tegefaulkes tegefaulkes deleted the dbtran branch May 20, 2022 08:45
@CMCDragonkai CMCDragonkai restored the dbtran branch May 20, 2022 08:47
@CMCDragonkai CMCDragonkai reopened this May 20, 2022
@CMCDragonkai
Member Author

The dbtran branch was renamed to feature-dep-upgrades due to security requirements. This PR's information has to be copied to the new PR.

@CMCDragonkai
Member Author

We also need to reset the links in ZenHub.

@tegefaulkes
Contributor

New PR at #374. Work from there from now on.
