feat: implement RFC 8628 #3851

nsklikas · 2024-09-30T06:56:04Z

Implements the Device Authorization Grant to enable authentication for headless machines (see https://datatracker.ietf.org/doc/html/rfc8628)

Related issue(s)

Implements RFC 8628.

This PR is based on the work done on #3252, by @supercairos and @BuzzBumbleBee. That PR was based on an older version of Hydra and was missing some features/tests.

We have prepared a spec that describes our design and implementation. We have tried to mimic the existing logic in Hydra and to avoid changes that would disrupt existing workflows.

Checklist

I have read the contributing guidelines.
I have referenced an issue containing the design document if my change introduces a new feature.
I am following the contributing code guidelines.
I have read the security policy.
I confirm that this pull request does not address a security vulnerability. If this pull request addresses a security vulnerability, I confirm that I got the approval (please contact security@ory.sh) from the maintainers to push the changes.
I have added tests that prove my fix is effective or that my feature works.
I have added or changed the documentation.

Further Comments

Notes:

The current implementation has been manually tested only against the in-memory and Postgres databases. The automated tests pass on all databases.
Fosite is installed from our fork to ease testing. Once the relevant PR in fosite is merged, we will update go.mod.

Testing

To test this you need to build the hydra image:

make docker

This will create an image with the name: oryd/hydra:latest-sqlite

To run the flow you can use our UI, from https://github.com/canonical/identity-platform-login-ui/tree/hydra-device-test:

git clone git@github.com:canonical/identity-platform-login-ui.git -b hydra-device-test
cd identity-platform-login-ui/
# The image name is hard-coded in the docker-compose file
docker compose up --remove-orphans --force-recreate -d

Create a client for Hydra:

docker exec -it identity-platform-login-ui-hydra-1 hydra create client   --endpoint http://localhost:4445   --grant-type authorization_code,refresh_token,urn:ietf:params:oauth:grant-type:device_code --scope openid,offline_access,email,profile --token-endpoint-auth-method client_secret_post

Use that client to perform the device flow:

docker exec -it identity-platform-login-ui-hydra-1 hydra perform device-code --client-id <client-id> --client-secret <client-secret> -e http://localhost:4444 --scope openid,offline_access,email,profile

The user for logging in is:

username: test@example.com
password: test
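
For readers who want to see what the device does while the user logs in, here is a minimal sketch of the polling loop from RFC 8628, section 3.5. It is not the code behind `hydra perform device-code`; the function name `poll_for_token` and the injected `request_token` callable (which stands in for an HTTP POST to the token endpoint) are assumptions made for illustration.

```python
import time

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


class DeviceFlowError(Exception):
    """Raised when the authorization server ends the flow with an error."""


def poll_for_token(request_token, device_code, client_id,
                   interval=5, expires_in=600, sleep=time.sleep):
    """Poll until the user approves, denies, or the device code expires.

    request_token is a hypothetical callable that takes the form fields
    of a token request and returns the decoded JSON response as a dict.
    """
    waited = 0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        response = request_token({
            "grant_type": DEVICE_GRANT,
            "device_code": device_code,
            "client_id": client_id,
        })
        error = response.get("error")
        if error is None:
            return response  # success: the token response
        if error == "authorization_pending":
            continue  # the user has not finished yet, keep polling
        if error == "slow_down":
            interval += 5  # RFC 8628: back off by 5 seconds
            continue
        # access_denied, expired_token, or any other error is terminal
        raise DeviceFlowError(error)
    raise DeviceFlowError("expired_token")
```

Injecting `request_token` and `sleep` keeps the sketch free of network calls, so the back-off behaviour can be checked without a running Hydra.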

CLAassistant · 2024-09-30T06:56:11Z

All committers have signed the CLA.

supercairos · 2024-10-16T15:00:51Z

You kept the user_code & device_code in separate tables?
I thought they could be merged with the flow table, but that might be tricky to do (lots of SQL constraints to manage).

Otherwise, it's great work! Would love to see this land into Hydra as IMO it's a much needed feature :)

zepatrik · 2024-10-30T16:15:18Z

One thing I am struggling to understand is the device_challenge parameter. On one side, I cannot find any mention of that in the RFC, on the other side I don't see how it would be available in all variants of the device flow. Sure, when using the verification_uri_complete from the device authorization response, it is easy to set additional query parameters. However, the device auth flow also has to work when only the user code is entered into a generic website, like on https://youtube.com/tv/activate

I just noticed the design doc you linked, should have taken a look there first.
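
The two entry points discussed above can be sketched as follows: one URI where the user types the code by hand, and one with the code already embedded. This is only an illustration; the `user_code` query parameter name follows the example in RFC 8628 but is not mandated by it, and the helper names and base URL are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def verification_uris(base_uri, user_code):
    """Build the plain and the pre-filled verification URIs."""
    query = urlencode({"user_code": user_code})
    return {
        "verification_uri": base_uri,
        "verification_uri_complete": f"{base_uri}?{query}",
    }


def user_code_from_request(url, typed_code=None):
    """Prefer a code embedded in the URL, else fall back to a typed one."""
    codes = parse_qs(urlsplit(url).query).get("user_code")
    if codes:
        return codes[0]
    return typed_code
```

In both cases the user code is the only piece of flow information the verification page can rely on, which is the point made in the comment above.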

zepatrik · 2024-10-31T08:34:34Z

OK, I think I now cleared up the confusion I had about the device_challenge.
IMO it is not really necessary because it would also work to send the user straight to the UI implementation, and let the /admin/oauth2/auth/requests/device/accept create the flow. The relevant information at this point is always in the user-code, and only the UI provides that. However, I see how it fits better into the existing code and architecture, so I do like the proposal 👍

nsklikas · 2024-11-01T10:18:18Z

Apparently the tests are broken because of https://github.com/ory/fosite/pull/827/files#diff-b92270a81f4021a9cdf52dfcfaeac9b66254471b85fd5ef4101acdbad02e4296R161, not sure if it's a bug or if the hydra tests need to be updated. @zepatrik any tip on how to fix this?

zepatrik · 2024-11-05T08:57:27Z

@nsklikas I have fixed the upstream fosite issue.

I also thought about the overall flow a bit more, and I think this flow is a bit better wrt database strain (writes and storage). Also it is less complex from the bigger picture, but I agree that it might be more complex to implement. Did you consider something similar, and what are your thoughts? I would especially prefer the user and device codes to be in one table, so we can rely on the database to ensure the link between the two.
The big difference is that we would create the flow already when the device initializes everything, and then reuse it for the user as soon as we have the user code.

Also happy to discuss synchronously.

nsklikas · 2024-11-06T14:26:49Z

We considered merging the 2 tables (I think it was also discussed in the older PR). Merging them would complicate the schema (you would need 2 expiration periods, 2 active fields, more indexes, etc.) and we would have to create new logic to handle calls to this table (currently we re-use the logic that is used for all the other tokens). That is not a blocker, but it would require a decently sized refactor of this PR. We would also have to merge the 2 fosite APIs (DeviceCodeStorage and UserCodeStorage), again not a big blocker. AFAICT by merging the 2 tables we would make one less read to the database (we wouldn't need to fetch the device_code in performOAuth2DeviceVerificationFlow) and one less write (we could invalidate the user_code and mark the device_code as ready to be used in a single query). The main drawback of the current approach is that the 2 tables (user_code and device_code) are not directly linked; instead we use the requestID to connect them. We decided not to go that route because we thought the performance benefit does not outweigh having a uniform experience with the existing fosite APIs and hydra database, but I can see the value of changing this.

About persisting the flow table to the database at the beginning of the flow, we would be doing one less redirect (we could send the user directly to the login UI, they wouldn't need to go through Hydra). But we would have to:

1. persist the flow to the database every time a device starts a flow (1 write)
2. fetch the flow every time a user_code is accepted (we wouldn't update it in the db, because we want to allow the user to restart the flow)
3. update the database constraints to handle the device flow status

This was something we considered, but we decided that we wouldn't be gaining much for these extra calls/changes. What is the reason you think we should change this?

I would rather we keep the design as it is, because we think that these changes wouldn't improve the current flow much (I understand that, depending on the load, merging the 2 tables can offer considerably improved performance) and they would complicate the implementation. Ideally we would not change the design of the current PR (unless you think that there is something wrong with it), to get something going and to avoid getting lost in the many changes that it introduces. We can always iterate on it in subsequent PRs, BUT I understand that if we want to change the database schema (by merging the tables), it would be best to do it as early as possible to avoid having to create a migration plan.

nsklikas · 2024-11-06T14:40:42Z

> About persisting the flow table to the database at the beginning of the flow, we would be doing one less redirect (we could send the user directly to the login UI, they wouldn't need to go through Hydra). But we would have to:
>
> 1. persist the flow to the database every time a device starts a flow (1 write)
> 2. fetch the flow every time a user_code is accepted (we wouldn't update it on the db, because we want to allow the user to restart the flow)
> 3. update the database constraints to handle the device flow status
>
> This was something we considered, but decided that we wouldn't be gaining much for these extra call/changes. What is the reason you think we should change this?

I now realize that we wouldn't be avoiding the first redirect, as we want to set up CSRF protection. I don't think I see the value of making this change (referring only to writing the flow to the database when creating the user code).

zepatrik · 2024-11-06T15:31:36Z

Thanks for revisiting, I was just adding some follow-up clarifications.

TL;DR we would like to do the refactor to have one table for the codes, everything else looks good.

We had even more discussions also with @alnr and basically came to these conclusions:

We need to write the device & user codes into a table when creating them, mainly to avoid collisions. Ideally this would be one table (hereafter device_auth_codes) that is only used for this purpose. The table should have the PK (nid, device_code_sig) and a secondary unique index (nid, user_code_sig). This table will be polled by the device using the device code.
The flow should not be persisted in the beginning by the device, but used only by the user browser and persisted after successful completion.
Once the user code is used, we should mark it as such and release it by setting user_code_sig=null. We can then include the device code signature in the flow. The main reason here is that we can make sure the code is only used once.
However, this is not a strong opinion and just what we thought would be the better option. Making the code reusable for error cases would probably reduce some friction in the UX. Open to discuss.
There seems to be no value in adding the device_verifier. We propose to remove that. The existing CSRF token should be sufficient to ensure that a flow was completed in the same browser it was started. The reason here is so that we can persist the flow state. Makes sense now.
The "accept user code API" should be an admin API (as it is now), for flexibility and consistency reasons.

Overall, the refactor to use only one table should be worth it right away. We can also help out with the refactor if necessary.
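
To make the one-table proposal concrete, here is a small sketch using SQLite in place of the real database. It assumes only the three columns named above (`nid`, `device_code_sig`, `user_code_sig`); the table layout, index name, and helper functions are illustrative and are not Hydra's actual schema or persister code. It shows both properties under discussion: the database rejects colliding codes, and a user code can be redeemed only once because it is released by setting `user_code_sig` to NULL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE device_auth_codes (
    nid             TEXT NOT NULL,
    device_code_sig TEXT NOT NULL,
    user_code_sig   TEXT,
    PRIMARY KEY (nid, device_code_sig)
);
CREATE UNIQUE INDEX device_auth_codes_user_code_idx
    ON device_auth_codes (nid, user_code_sig);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def create_codes(db, nid, device_sig, user_sig):
    """Insert a code pair; return False if either code collides."""
    try:
        with db:
            db.execute(
                "INSERT INTO device_auth_codes VALUES (?, ?, ?)",
                (nid, device_sig, user_sig),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def redeem_user_code(db, nid, user_sig):
    """Release the user code and return the linked device code signature.

    Returns None when the code is unknown or was already redeemed.
    """
    with db:
        row = db.execute(
            "SELECT device_code_sig FROM device_auth_codes "
            "WHERE nid = ? AND user_code_sig = ?",
            (nid, user_sig),
        ).fetchone()
        if row is None:
            return None
        db.execute(
            "UPDATE device_auth_codes SET user_code_sig = NULL "
            "WHERE nid = ? AND device_code_sig = ?",
            (nid, row[0]),
        )
        return row[0]
```

Because NULL values do not conflict in a unique index, any number of redeemed rows can coexist, and a released user code becomes available for a new flow.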

nsklikas · 2024-11-07T14:54:29Z

Thanks for the quick reply.

Just to be clear, afaict refactoring the database affects both fosite and hydra. In fosite we should merge the DeviceCodeStorage and UserCodeStorage APIs into one. In hydra we should first agree on a table schema and then do the changes needed.

Regarding the unique index, I am not sure if this is the right approach. We can have a PK with (nid, device_code_sig). But on the user_code we can't have an index that involves the nid. The reason for this is that when the user accepts the user_code we have no other information about the flow: we don't know who initiated it or what parameters were used (in my mind the user_code must have a one-to-one mapping to the original request). If we want to support having an index like the one proposed, we would need to have a different user_code verification URL per nid. E.g. for nid=42 the verification URL would be https://hydra.example.com/oauth2/device/verify/42 and for nid=17 the verification URL would be https://hydra.example.com/oauth2/device/verify/17. I have to say that I don't really like this, especially taking into account that the user may have to manually type the URL into their browser.

IMO the migration should look something like the following:

CREATE TABLE IF NOT EXISTS hydra_oauth2_device_auth_codes (
    device_code_signature VARCHAR(255) NOT NULL,
    user_code_signature   VARCHAR(255) NOT NULL,
    request_id            VARCHAR(40)  NOT NULL,
    requested_at          TIMESTAMP    NOT NULL DEFAULT NOW(),
    client_id             VARCHAR(255) NOT NULL,
    scope                 TEXT         NOT NULL,
    granted_scope         TEXT         NOT NULL,
    form_data             TEXT         NOT NULL,
    session_data          TEXT         NOT NULL,
    subject               VARCHAR(255) NOT NULL DEFAULT '',
    device_code_active    BOOL         NOT NULL DEFAULT true,
    user_code_state       SMALLINT     NOT NULL DEFAULT 0,
    requested_audience    TEXT         NULL DEFAULT '',
    granted_audience      TEXT         NULL DEFAULT '',
    challenge_id          VARCHAR(40)  NULL,
    -- device_code and user_code share the same lifespan
    expires_at            TIMESTAMP    NULL,
    nid                   UUID         NOT NULL,

    FOREIGN KEY (client_id, nid) REFERENCES hydra_client(id, nid) ON DELETE CASCADE,
    FOREIGN KEY (nid) REFERENCES networks(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    PRIMARY KEY (device_code_signature, nid)
);

CREATE INDEX hydra_oauth2_device_auth_codes_request_id_idx ON hydra_oauth2_device_auth_codes (request_id, nid);
CREATE INDEX hydra_oauth2_device_auth_codes_client_id_idx ON hydra_oauth2_device_auth_codes (client_id, nid);
CREATE INDEX hydra_oauth2_device_auth_codes_challenge_id_idx ON hydra_oauth2_device_auth_codes (challenge_id);
CREATE INDEX hydra_oauth2_device_auth_codes_user_code_signature_idx ON hydra_oauth2_device_auth_codes (user_code_signature);

(Disclaimer: I haven't tested this, so there may be some issues with it)

The reason there is a user_code_state field is that the user_code has 3 states, that is active(0), used(1) and revoked(2). Alternatively we could create 2 columns user_code_active and browser_flow_completed, but I think that having a single column is better.

To optimize the user_code_signature index, we could create a unique partial index (this should work on postgres and cockroach, we probably can find a workaround for mysql):

CREATE UNIQUE INDEX hydra_oauth2_device_auth_codes_user_code_signature_idx ON hydra_oauth2_device_auth_codes (user_code_signature) WHERE user_code_state = 0;

This should be a big performance improvement on the current db queries.
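
As a rough check of how such a partial unique index behaves, here is a sketch that uses SQLite as a stand-in (it also supports partial indexes). The reduced table, the index name, and the helper functions are hypothetical and cover only the columns needed for the illustration; the state values follow the active(0), used(1), revoked(2) convention described above.

```python
import sqlite3

ACTIVE, USED, REVOKED = 0, 1, 2

SCHEMA = """
CREATE TABLE device_auth_codes (
    device_code_signature TEXT NOT NULL PRIMARY KEY,
    user_code_signature   TEXT NOT NULL,
    user_code_state       INTEGER NOT NULL DEFAULT 0
);
CREATE UNIQUE INDEX device_auth_codes_active_user_code_idx
    ON device_auth_codes (user_code_signature)
    WHERE user_code_state = 0;
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def insert_codes(db, device_sig, user_sig):
    """Insert an active code pair; return False on a uniqueness conflict."""
    try:
        with db:
            db.execute(
                "INSERT INTO device_auth_codes "
                "(device_code_signature, user_code_signature) VALUES (?, ?)",
                (device_sig, user_sig),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def mark_user_code(db, user_sig, state):
    """Move the active row for user_sig to another state.

    Returns the number of rows changed (0 if no active row exists).
    """
    with db:
        cur = db.execute(
            "UPDATE device_auth_codes SET user_code_state = ? "
            "WHERE user_code_signature = ? AND user_code_state = ?",
            (state, user_sig, ACTIVE),
        )
        return cur.rowcount
```

Only active rows are covered by the index, so uniqueness is enforced among codes that can still be redeemed while used or revoked rows neither grow the index nor block reuse of the same signature.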

@zepatrik I suggest the following as a way forward:

1. If you agree with what I described for fosite, I start making the changes in fosite (shouldn't be much work). The hydra PR is blocked by the changes in fosite, so IMO this should be top priority.
2. We try to reach a consensus on what the table schema should be. If this proves hard to do over comments/chat, we can set up a call.
3. I start making the changes in Hydra. Unless something unexpected comes up, I think it should be straightforward and I should be able to handle it on my own. If I face something blocking or it turns out that I need to refactor more parts of the code, I will ask for your help/contribution.

zepatrik · 2024-11-08T15:28:07Z

I agree with the dependency here. I think now would be a good time to set up a call and discuss the schema in detail. I'll reach out via email.

zepatrik · 2024-11-11T16:36:22Z

We agreed on the proposed table schema. This is the refined and detailed version of the flowchart:

[Figure: flowchart of the refined device authorization flow; image not preserved]

supercairos · 2024-11-14T14:00:34Z

> We agreed on the proposed table schema. This is the refined and detailed version of the flowchart:

At the end of the User-Browser login flow, it's good to redirect the user to a "post device login" web page so the user knows they can safely close the browser tab and that the device has been authenticated successfully.

I don't see this mentioned. Is this handled?

zepatrik · 2024-11-14T14:37:00Z

> At the end of the User-Browser login flow, it's good to redirect the user to a "post device login" web page so the user knows they can safely close the browser tab and that the device has been authenticated successfully.

Yes, this is part of the "redirect to URL from accept consent response". There will be a config setting for that page.

nsklikas · 2024-11-18T15:28:02Z

I updated the database schema and refactored the logic a little. I tested it on postgres with ~1M rows and it looks like all queries are using the indexes:

@zepatrik please have another look when you can

nsklikas requested review from aeneasr, hperl and alnr as code owners September 30, 2024 06:56

nsklikas changed the title ~~Implement RFC 8628~~ feat: Implement RFC 8628 Sep 30, 2024

nsklikas changed the title ~~feat: Implement RFC 8628~~ feat: implement RFC 8628 Sep 30, 2024

nsklikas mentioned this pull request Sep 30, 2024

Implement RFC 8628 ory/fosite#826

bateller approved these changes Oct 16, 2024

nsklikas force-pushed the canonical-master branch 2 times, most recently from 87c6315 to 349743e Compare October 18, 2024 13:42

nsklikas force-pushed the canonical-master branch from 349743e to e0b066f Compare October 25, 2024 12:03

nsklikas force-pushed the canonical-master branch 3 times, most recently from 20555bd to 140f75e Compare November 1, 2024 09:44

nsklikas force-pushed the canonical-master branch from 140f75e to b7767f9 Compare November 6, 2024 14:23

zepatrik reviewed Nov 12, 2024

nsklikas mentioned this pull request Nov 12, 2024

refactor: refactor storage API canonical/fosite#26

nsklikas and others added 28 commits November 18, 2024 17:12

refactor: move logic to updateSessionWithRequest method

73b9543

fix: rename device auth endpoint handler

da682ad

feat: add device user verification handler

ec9a15d

fix: implement device user verification logic

e47ca85

feat: update flow

6a11a7c

fix: add post device auth handler

21517e9

feat: add consent handler for accepting a user_code

bd8fcf7

chore: add post_device_done to config schema

fb38b6a

chore: add e2e tests

b3a8c62

feat: token request handling for device flow

7d0930f

chore: update config

d8a86da

fix: fix the OIDC token and refresh token issue for device flow

4f7bd9f

fix: update OpenID Connect session after user consent

a074d6f

fix: add GetDeviceCodeSessionByRequestID method

7d6d6f3

fix: return client_id to post_device page

799e7bd

fix: update existing device session

f88bcc2

Instead of updating the device session, we were overwriting it, causing existing session info created by fosite to be lost.

fix: update tests

efe9fd0

fix: add device auth endpoint in discovery metadata

83ac5e4

fix: make device grant lifetimes configurable

fa7d3c4

test: update sql fixtures

2e70195

fix: perform device flow from CLI

35dd1c5

fix: wrap db calls in transaction

3dd0859

chore: fix license

d0cfa42

chore: update sdk

d3ef058

fix: duplicate user_code update

25c6d1f

refactor: merge user and device code tables

0f9025c

fix: create openid session when log in succeeds

93e49bf

refactor: update device session persistence logic

0e55f5c

nsklikas force-pushed the canonical-master branch from b82cbb8 to 0e55f5c Compare November 18, 2024 15:13
