Skip to content

API client authentication and token granting #1194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jun 29, 2022
Merged

API client authentication and token granting #1194

merged 20 commits into from
Jun 29, 2022

Conversation

plotnick
Copy link
Contributor

@plotnick plotnick commented Jun 10, 2022

Support OAuth 2.0 Device Authorization Grant for client (e.g., CLI) authentication. Remaining TODOs:

Dependents:

@plotnick plotnick marked this pull request as ready for review June 23, 2022 19:31
@plotnick plotnick requested review from davepacheco and smklein June 23, 2022 19:32
@davepacheco davepacheco mentioned this pull request Jun 23, 2022
69 tasks
Copy link
Collaborator

@davepacheco davepacheco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an impressive piece of work! I have a bunch of comments below but most are aimed at clarity and error handling. Thanks for doing this!

-- a token is granted.
-- TODO-security: We should not grant a token more than once per record.
CREATE TABLE omicron.public.client_authentication (
client_id UUID NOT NULL,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have a Big Theory Statement-style comment about the client id, device code, user code, etc. It doesn't have to be here -- it could be in the Rust code somewhere. (edit: I suggested putting it in nexus/src/app/client_api.rs)

If there's something like the Beer Drinker's Guide to SAML for the Oauth device flow, that might work too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, basic flow outline added to app/device_auth.rs in c9839ba.

time_created TIMESTAMPTZ NOT NULL
);

-- Matches the primary key on client authentication records.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add: "this is critical for ensuring that no more than one token is ever created for a client authentication attempt"? Like we talked about my fear here is that someone won't realize what's load-bearing here (e.g., that if they change this index in some ways, they might break that at-most-one-for-the-other-table property)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixed in 139c47f. Further schema revisions to be taken as a follow-up (#1285).

@@ -0,0 +1,75 @@
// This Source Code Form is subject to the terms of the Mozilla Public
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we talked about this a bit but I do find the naming here a little confusing. We have lots of clients and lots of tokens, and browsers are clients that authenticate with tokens too. Might this be better called "oauth_device.rs" with oauth_device_authenticate() etc? I might apply that to the tables too.

I realize this great fodder for bikeshedding and we can always change it later instead but I figured I'd ask.

edit: see also the comment in datastore.rs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. c9839ba renames client_apidevice_auth, which I think is clearer. Function and table names updated as well.

use omicron_common::api::external::{CreateResult, LookupResult};
use uuid::Uuid;

impl super::Nexus {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above this block (or part of the module-level doc comment) might be a good place to give an overview of the oauth device flow and how these functions are used. It may also be nice to have a sentence or two for each function saying which part of the flow it's used in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, done in c9839ba.

@@ -111,6 +113,12 @@ pub enum SchemeResult {
Failed(Reason),
}

/// A context that can look up a Silo user's Silo.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of renaming this to SiloUserGetSilo or something? (I know I overuse "context" but this feels like less of a "context" than it was when it was in spoof.rs.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in bd8da5f (SiloUserSilo).

.expect_status(Some(StatusCode::BAD_REQUEST))
.execute()
.await
.expect("client_id required to start client authentication flow");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you're trying to check the error message here, but this won't do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was just poorly worded. Fixed in bbab057.

},
};

let model = nexus.client_authenticate(&opctx, params.client_id).await?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to do the right thing in terms of the error response form? Say the database is down. It seems like this will generate an Err(Error::ServiceUnavailable), which will get turned into a dropshot::HttpError, which I think we're not supposed to be returning from this function?

If that's true, even if we fix it here, it seems easy to accidentally re-introduce that bug. I wonder if we should have the body of these handlers call a Rust function that returns only Response<Body> (no Result). We could document why -- we need to format all the responses ourselves to ensure compliance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent catch, thank you. I'd like to take this as a follow-up, recorded as #1286.

&ClientAuthentication::from_model(model, host),
)
};
// TODO: instrumentation doesn't work because we use `Response<Body>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't the instrumentation work with that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent question; I don't actually know! I copied the comment from console_api.rs, but will investigate as part of #1286.

@plotnick plotnick merged commit e4126a1 into main Jun 29, 2022
@plotnick plotnick deleted the client-authn branch June 29, 2022 00:17
Copy link
Contributor

@ahl ahl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple of questions as I try to integrate these changes into the generated client.

#[endpoint {
method = POST,
path = "/device/token",
content_type = "application/x-www-form-urlencoded",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm late to this (sorry) but I thought endpoints using application/x-www-form-urlencoded were not going to appear in the OpenAPI spec for omicron. Did I misunderstand?

If our client is the consumer of this, why isn't it using application/json?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have strong opinions on whether these should appear in the spec or not; we can set unpublished = true if you think that's a good idea. Personally I think they should appear, but don't necessarily need full body descriptions if that's not easy. The TS generator, for instance, just skips non-application/json endpoints, which I think is reasonable, since they'll be used only by OAuth client libraries, not directly by our client.

The reason for not using application/json is that the OAuth protocol requires application/x-www-form-urlencoded request bodies, and I would like us not to reinvent that particular wheel. OAuth clients exist for every popular language & framework, and the idea is that we can use off-the-shelf, well-tested client implementations instead of rolling our own n times. That is, we could implement and document our own not-quite-compatible protocol, but I don't see any concrete advantage to doing so. OAuth is not a great protocol, but it's been widely analyzed and implemented, and I think if we're not going to use it, we should have good reasons why we think our own version would be better, more secure, etc. This PR implements only the Device Authorization Grant flow, but we can layer more things on it, such as short-lived access tokens with long-lived refresh tokens. RFD 275 attempts to provide some of this rationale, and is probably the right place for a serious discussion of this issue.

Finally, I suppose it's fair to ask why we should write our own OAuth server but not the client libraries. To this I would respond that (1) we only need one server implementation, whereas for the clients we'd need n clients with specialized token, error, retry, and timeout handling, and (2) to the best of my understanding, we want there to be a single source of truth server-side (CockroachDB) and a single piece of software (Nexus) that handles all client requests and talks to that database, provides authz for it, etc. I did look into off-the-shelf OAuth servers, but none of them seemed suitable for our needs. OTOH, the client libraries are readily available and easy to integrate.

method = POST,
path = "/device/auth",
content_type = "application/x-www-form-urlencoded",
tags = ["hidden"], // "token"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems important for OAuth, but can we set unpublished = true since I don't think we expect other consumers to use this?

@plotnick plotnick mentioned this pull request Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants