-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race between untrusted operations and events websocket listener #12994
Comments
This makes sense. Although I am wondering why the |
There is a secret set in the request header or operation metadata which is manually verified in the corresponding I looked into handling the secret in the events endpoints as well, but I couldn't figure out an elegant solution for verifying it. |
@masnax ok in that case im not following what you mean by:
|
The client function which spawns the racy event listener is CreateImage. It sets up an http.Request with the secret in its header. Performing the request spawns an operation. The request that spawns the operation bypasses the the normal auth process of comparing the request peer certs to the locally trusted certificates, and instead verifies the secret, which was sent to that server beforehand.
|
So what I'm suggesting is to not leave spawning the listener until after the operation is created and we call |
@masnax ok lets have a look at a PR that does this when you get a chance. Thanks |
Filtering events by what the user can see, ala lxc/incus#292 would be helpful here. @markylaing says
|
An issue causing intermittent test failures in the clustering image refresh tests was fixed by e7b3c92
The problem was a race where an operation would complete after setting up the events websocket listener, but before the listener is added to the event handler, and so we would block while waiting for the operation complete event. The fix works by reducing the time between the creation of the listener and adding it to the handler.
While this works, I think it is a bit obscure and overly couples the client with the order of operations on the server. It made more sense to me to just ensure we set up the client and all its listeners before we make any request to create an operation, which those listeners are meant to track. While testing this over here: #12885 I found a separate race that is connected to the one solved above, resulting in this error: https://github.com/canonical/lxd/actions/runs/7920353180/job/21623203927?pr=12885#step:13:15409
This happens if we run
lxc image copy localhost:${fingerprint} lxd2: --mode=push
to copy an image across 2 remotes.Internally, we make the following
POST /1.0/images/{fingerprint}/export
tolocalhost
here. From this endpoint we make aCreateImage
request tolxd2
without a client keypair. The request is untrusted since it's run from the server, but when we addlxd2
as a remote, it only trusts the client, not the server.The race comes in here when
CreateImage
performs the request which spawns an operation onlxd2
. Around this time, it also sets up an event listener which requires trust. If the operation completes before we set up the listener then everything works fine, but if the listener manages to start up first, then the request will fail with anunauthorized
error.I think a simple solution in this case would be to just ensure that we don't spawn an events listener if the client hasn't been assigned a keypair. We can infer from this that the request we are trying to make is going to be untrusted, so we shouldn't try to separately connect to a trusted endpoint.
The text was updated successfully, but these errors were encountered: