Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logic back to try to reconnect to the app websocket if socket is closed #279

Merged
merged 15 commits into from
Oct 28, 2024

Conversation

matthme
Copy link
Contributor

@matthme matthme commented Oct 14, 2024

Addresses #271

@matthme matthme requested review from ThetaSinner and jost-s October 14, 2024 21:45
Copy link
Contributor

@jost-s jost-s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes, please. LGTM, thanks!

src/api/app/websocket.ts Outdated Show resolved Hide resolved
src/api/client.ts Outdated Show resolved Hide resolved
test/e2e/common.ts Outdated Show resolved Hide resolved
// typescript forgets in this promise scope that `this.url` is not undefined
const socket = new IsoWebSocket(this.url as URL, this.options);
this.socket = socket;
socket.onerror = (errorEvent) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this tell the difference between an invalid token and other kinds of error?

It seems like the user experience wouldn't be great if requests always result in a reconnect that fails when a temporary token was in use. If we can tell the token is invalid here, it would be good to clear it so that the client stops trying to reconnect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an invalid token what happens is that you get the "Client disconnected with pending requests" error and this onError callback here is never called in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually understand where the "client disconnected with pending requests" occurs...but if I log something in that onError callback it's not getting logged when it fails due to an invalid token.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Well Holochain won't send back anything to tell the client that the connection is going to be closed, it'll just close the connection from its end. Where is that error thrown, can/should it be caught?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that his happens. It comes in as a close event on the websocket with code 1005 which means "No status received". It's indistinguishable from another unexpected close event which usually is 1005 or 1006. For this reconnect we could register another close handler that unsets the token, but it's not 100 % certain that the close was due to an invalid token.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a specific InvalidTokenError now which only happens on failed automatic websocket reconnect. If the reconnect fails, the stored token is deleted. No risk of this happening in a loop, because the reconnects are only attempted as part of a request being made by the client consumer.

test/e2e/index.ts Outdated Show resolved Hide resolved
@@ -97,7 +97,8 @@ export const withConductor =
};

export const installAppAndDna = async (
adminPort: number
adminPort: number,
nonExpiringToken?: boolean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to call this parameter singleUseToken to stay close to the conductor API naming?

Copy link
Contributor Author

@matthme matthme Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then have it be true by default? We can do that yes. It's single-use and expiring...I could also make two parameters for expiry and single use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to take two separate parameters now to be aligned with the conductor API naming: d6a7db7

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works yes, just expose the API with defaults. I was thinking that single use could have a timeout and when it's false then it could have no expiry. I wouldn't be opposed to the client exposing the API that way because those are the two expected configurations really

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it's probably useful to let people just have access to these

);

test(
"client fails to reconnect to websocket if closed before making a zome call and does not end up in a loop",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite testing what I was concerned about with looping. It's repeatedly reconnecting with an invalid token that I was concerned about. If the caller isn't careful about which error is which then they might treat this error as retryable and keep trying to reconnect. I'd prefer if this test could make two zome calls and the second one got a different error message back asking the caller to provide new credentials

Copy link
Contributor Author

@matthme matthme Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, configuring the message to ask for new credentials doesn't work I think due to the "client closed with pending requests" error that doesn't even let us define the error message here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I can look into handling this close event better in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm not sure whether we should conflate this with this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no way of modifying the error in a way to ask the user to provide a valid token. Any zome call with an invalid token will now lead to the following error:

ClientClosedWithPendingRequests: client closed with pending requests - close event code: 1005, request id: 1

And this seems to be a result of how the conductor handles things and cannot be affected at the js side as I understand it. This error however is distinguishable by a caller from a "normal" zome call error if required so I don't think there's more we can do here without touching the conductor behavior.

@jost-s jost-s requested a review from ThetaSinner October 17, 2024 20:11
@@ -142,18 +63,22 @@ export class WsClient extends Emittery {
static connect(url: URL, options?: WsClientOptions) {
return new Promise<WsClient>((resolve, reject) => {
const socket = new IsoWebSocket(url, options);
socket.onerror = (errorEvent) => {
socket.addEventListener("error", (errorEvent) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addEventListener used throughout now, with one-off listeners for all calls returning a promise.

// Message just needs to be sent first, no need to wait for a response or even require a flush
resolve(null);
// Wait before resolving in case authentication fails.
setTimeout(() => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If authentication fails, it'll still return a resolve. Waiting here prevents that. It's not a universal fool-proof way, ideally a ping can be sent after authenticating or Holochain returns an error or closes the websocket with a specific error code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, thanks for polishing this 👍 . This idea did not occur to me.

@@ -233,6 +200,144 @@ export class WsClient extends Emittery {
this.index += 1;
}

private registerMessageListener(socket: IsoWebSocket) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unchanged, just separated into a function and moved within the file.

this.socket.addEventListener(
"open",
async (_) => {
const encodedMsg = encode({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have loved to call authenticate again here, but that's so convoluted with the sendHandler parameter and the requesterTransformer and _request and promiseTimeout, it's head-wrecking. Would be brilliant to simplify that part of the client.

@@ -422,6 +422,28 @@ test(
})
);

test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test unhappy websocket authentication path.

"error should be of type HolochainError"
);
assert(error instanceof HolochainError);
t.equal(error.name, "InvalidTokenError", "expected an InvalidTokenError");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the future - create an enum with all possible error kinds #281

@@ -96,6 +96,7 @@ export const promiseTimeout = (
return res(a);
})
.catch((e) => {
clearTimeout(id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without clearing the timeout, failed requests were waiting for the duration of the timeout (60 seconds) before returning.

@@ -77,6 +82,7 @@ export class AppWebsocket implements AppClient {
CallZomeResponseGeneric<Uint8Array>,
CallZomeResponse
>;
private readonly appAuthenticationToken: AppAuthenticationToken;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is very high risk, but I don't feel comfortable not pointing this out. Storing this token may expose it to an injected script. If it is exfiltrated then it is re-usable elsewhere.

I much prefer the single-use tokens but I understand that this is being done for UX.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, and it's similar to the zome call signing credentials. They're equally stored in JavaScript memory for convenience, which compromises security to a degree.

const invalidTokenCloseListener = (
closeEvent: IsoWebSocket.CloseEvent
) => {
reject(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're rejecting because of an invalid token, should this.authenticationToken not be cleared here?

Copy link
Contributor

@jost-s jost-s Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, absolutely! d3d8412

t.assert(error instanceof HolochainError);
assert(error instanceof HolochainError);
t.equal(error.name, "InvalidTokenError", "expected InvalidTokenError");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If at this point you call it again, I'd expect to get a "token is required", rather than "invalid token error"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only calls once. What are you referring to by "you call it again"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now this is the wrong test to ask this question. I'm looking for a test where we

  • connect with a valid token that is single use,
  • lose the connection then try to make a zome call, but now have an invalid token,
  • so we get a invalid token error
  • then try to make another zome call and we should get a "no token" error or similar because the reconnect can't be attempted

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what the test below with the comment "Websocket reconnection has failed and subsequent calls should just return" is doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, exactly. The subsequent call is attempted after not handling the invalid token error. In a real app, the invalid token error must be handled and a new token be requested before attempting another call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, okay. I'm caught up now and my original concern with this change has been addressed. Thank you :)

@jost-s jost-s enabled auto-merge (squash) October 28, 2024 19:31
@jost-s jost-s merged commit d6b8893 into main Oct 28, 2024
11 checks passed
@jost-s jost-s deleted the fix/try-reconnect-if-closed branch October 28, 2024 19:35
matthme added a commit that referenced this pull request Oct 29, 2024
jost-s pushed a commit that referenced this pull request Oct 29, 2024
* backport #279

* updated changelog

* workflow back to main branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants