Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No longer allow users to set the client ID #2911

Closed
3 of 6 tasks
aeneasr opened this issue Jan 3, 2022 · 51 comments · Fixed by #3628
Closed
3 of 6 tasks

No longer allow users to set the client ID #2911

aeneasr opened this issue Jan 3, 2022 · 51 comments · Fixed by #3628
Labels
breaking change Changes behavior in a breaking manner. feat New feature or request.
Milestone

Comments

@aeneasr
Copy link
Member

aeneasr commented Jan 3, 2022

Preflight checklist

Describe your problem

It should no longer be possible to set the Client ID.

Describe your ideal solution

The Client ID should be generated by the server.

Workarounds or alternatives

None

Version

master

Additional Context

No response

@aeneasr aeneasr added the feat New feature or request. label Jan 3, 2022
@aeneasr aeneasr added this to the v2.0 milestone Jan 3, 2022
@aeneasr aeneasr added the breaking change Changes behavior in a breaking manner. label Jan 3, 2022
@eric-poitras
Copy link

What is the rationale behind this ?

@aeneasr
Copy link
Member Author

aeneasr commented Jan 7, 2022

It’s according to spec

@tsoestini
Copy link

Is there some context for understanding this?
A link to what is changing?
A link to the spec?
Who is the user?
I don't allow my users to set the client id, but it's important that I be able to set it. I can't imagine an OAuth spec would rule that out.

@mig5
Copy link
Contributor

mig5 commented Jan 20, 2022

As far as I can see, the oAuth2.0 standard doesn't make any statement about whether the client id can be manually defined or not.

https://www.rfc-editor.org/rfc/rfc6749#section-2.2

The authorization server issues the registered client a client
identifier -- a unique string representing the registration
information provided by the client. The client identifier is not a
secret; it is exposed to the resource owner and MUST NOT be used
alone for client authentication. The client identifier is unique to
the authorization server.

The client identifier string size is left undefined by this
specification. The client should avoid making assumptions about the
identifier size. The authorization server SHOULD document the size
of any identifier it issues.

On the other hand, OAuth.com's document https://www.oauth.com/oauth2-servers/client-registration/client-id-secret/ says

The client_id is a public identifier for apps. Even though it’s public, it’s best that it isn’t guessable by third parties, so many implementations use something like a 32-character hex string. If the client ID is guessable, it makes it slightly easier to craft phishing attacks against arbitrary applications. It must also be unique across all clients that the authorization server handles.

At first I was dismayed about this idea of forcing uncustomizable client ids in Hydra, but the above paragraph has kind of turned me around. From a security and human error point of view, a random client id is stronger.

And it's true that other oAuth OPs like Github use random strings in this way.

I kind of agree with both sides. Perhaps, @aeneasr , rather than force this change, you could make it optional, even if you default to this new behavior. Allowing people to set some sort of CLIENT_ID_IS_CUSTOMIZABLE boolean config flag/env var? Or simply accept the --client-id flag/field when creating it, but generating a random one if not provided.

@tsoestini
Copy link

What is this "phishing attack" that can be carried out "against arbitrary applications"???

Phishing is a social engineering attack. It needs to be mounted against people. You can't phish an application.

Why should client_id ever be something that a human looks at and makes decisions upon so much that it can form the basis for a phishing attack?

Moreover, even if a human did make a decision based on it, how would that result in the attacker getting more information?

@mig5
Copy link
Contributor

mig5 commented Jan 20, 2022

I was just quoting an article. I didn't write it. I agree that's not phishing, the point they are making is that both client id and secret are needed to authenticate - if both are totally random instead of one potentially more-easily-guessable, that's technically more secure. (against a brute force attack - not as a phishing attack)

Anyway, not here to get emotional or argue (it's not my words anyway :) ) turning off notifications now, but interested to hear what happens in Hydra 2.0

@tsoestini
Copy link

Appreciate your context, it helps a lot. Wasn't trying to argue with you.

I just want to make sure the questions get put on the record so we can think critically about the decisions getting made.

It's quite possible there's an attack I'm missing, and, if so, I'd love to learn about it, primarily to make sure there's nothing else going wrong with how I'm going about things than just the client_id.

I very much agree with the OAuth spec section you quoted. I think it is important that the authorization server issues the client an id, that it not be secret, and that it not be used alone for identification. I also agree with your point that the spec leaves open how the authorization server assigns that ID.

@tsoestini
Copy link

I just came across this page: https://www.ory.sh/hydra/docs/next/guides/openid-connect-dynamic-client-registration/

I want to say that I support that for dynamic client registration that it should not be possible for the developer to specify their client id during registration.

My prior comments were under the assumption that this issue was about the REST API ( https://www.ory.sh/hydra/docs/reference/api/#operation/createOAuth2Client ) for creating clients on the admin (non-public) interface.

The REST API admin interface is used by our security administrators, who need to be able to define all the details of a client.

@aeneasr
Copy link
Member Author

aeneasr commented Jan 22, 2022

We will deprecate setting custom client IDs on the Admin API as well. It is already disabled on OpenID Connect Dynamic Client Registration.

The problem with letting people choose IDs is that it can lead to serious scalability issues if the ID generation mechanism is not cryptographically secure ("true" random) but e.g. sequential (1,2,3,4).

The Client ID is used everywhere and is always in need to be looked up. Depending on the data set, this can have very serious performance issues. See also: https://segment.com/blog/the-million-dollar-eng-problem/

Thus, we'll just disallow this and set the client ID to a UUID. Legacy clients will still be able to use the legacy ID but new clients won't.

@aeneasr
Copy link
Member Author

aeneasr commented Jan 22, 2022

And sorry for being so short on my last reply, I was in a hurry. Thank you for expanding on the context. So tl;dr two reasons:

  1. Performance with large number of clients
  2. The specification disallows this (for good reasons) on public registration

@tsoestini
Copy link

Okay, thanks for the context, and good to know legacy clients won't be affected. I will figure out a way to change our assumptions for new clients, so that they can be assigned random UUIDs.

As for the performance concerns, I definitely see the DynamoDB hot shard problem. Wrangled with that one myself quite a few times.

Just by way of warning: I think that simply disallowing client ID choice will not be enough to solve the DynamoDB hot shard problem.

Of particular note, in the article, Segment didn't solve their DynamoDB hot shard problem by disallowing users from picking their own user id. They solved their hot shard problem by
(a) blocking "a daily automated test against our production API that resulted in a burst of hundreds of thousands of events attached to a single userID" and
(b) "blocking any new discovered badly behaved keys."

To explain this point further, consider the actual usage pattern among our oauth clients:

  • Client 1: 55% of all requests
  • Client 2: 43% of all requests
  • Client 3: 1% of all requests
  • All other clients combined: the remaining 1% of all requests

It doesn't matter whether Clients 1 and 2 have an administrator-set id or a random UUID, whatever shards Clients 1 and 2 end up on will be hot shards.

In fact, it might be better if the administrator can set their ids. At least that way the administrator could ensure that Client 1 and 2 end up on different shards, instead of just hoping that they do.

The actual solution, though would probably be one of the following:

(1) If most access to client data is reads, then reads of client data could be passed through a Least-Recently-Used read cache. The top three clients would virtually always be cache hits, with the remaining clients competing for the other cache slots and accounting for nearly all of the cache misses, which would be okay, because those are spread across a large number of client ids, which would be spread out over the shards, so the hot shard issue would be avoided.

(2) If there are frequent client-specific writes (I'm not sure if hydra has that problem, depending on how tokens are stored?), then there probably needs to be an administrator tuning setting where specific heavy-use client-specific data (tokens?) can be split off into their own designated shards, with the ability to assign several shards to a single such client.

@aeneasr
Copy link
Member Author

aeneasr commented Jan 23, 2022

Thank you for putting this in context! It’s unlikely that external parties will be able to properly choose shard IDs in my experience. We do however plan to have shard IDs in the DB to allow choosing hot spots for admins. The primary client IDs will still not be able to be chosen by users as most clients are not sophisticated enough to understand the inner workings of the underlying database store. This feature is primarily to help us integrate Ory Hydra into Ory Cloud and run it at scale for millions pf consumers :)

@willbengtson
Copy link

Would you be able to support setting a prefix for the client ID to still allow the randomness as you desire but not being able to control the entire value, but have an ability to detect the client ID in code?

@arnolf
Copy link

arnolf commented May 16, 2022

In our platform, we synchronize oauth2 clients from a config file with a script using Hydra admin API.
It allows us to automate setup new dev/qlf environment quickly.
We need to set a specific client_id and client_secret.

Moreover, we take decision in our code based on client_id.

Can't we have a flag to enable/disable this breaking change ? Or maybe you could use a consistent hash to derivate you technical id from client_id (something like mongodb hashed sharding strategy) ?

@aeneasr
Copy link
Member Author

aeneasr commented May 16, 2022

I understand your use case - having predictable IDs is important. We might add the option to specify the ID (only UUID v4 will be allowed) on client creation if there are no collisions. However, having arbitrary text as client IDs will no longer be possible

@arnolf
Copy link

arnolf commented May 16, 2022

Ok, allow to specify the client id with an UUID format will answer to our needs to create new env.

However, to keep things simple, when we have to make decision in our code, i still think it would be better to have an human readable ID, or fields metadata on which we can based our decision.

Maybe, a great way to address this second usecase, would be to give the possibility to expose the metadata client object inside the introspect response and include it also in jwt. This will give flexibility.

@willbengtson
Copy link

client secret is the only real "secret". As long as client IDs are unique is there an issue with whether it is a UUID v4 or something else. We have the goal to be able to detect OIDC client ID and secret in source code which makes having a prefix of known pattern valuable.

@aeneasr
Copy link
Member Author

aeneasr commented Jun 14, 2022

client secret is the only real "secret". As long as client IDs are unique is there an issue with whether it is a UUID v4 or something else. We have the goal to be able to detect OIDC client ID and secret in source code which makes having a prefix of known pattern valuable.

I think prefixing is a good idea. In that case, it would apply to the secret though. A client ID should be considered public knowledge and there is no point in scanning for it!

You will be able to set client secrets in the future, so prefixing this is something that can be solved by you (for example by making sure that all calls to the /clients endpoint and enforcing a prefixed secret).

aeneasr added a commit that referenced this issue Jun 14, 2022
BREAKING CHANGE: It is no longer possible to set arbitrary client IDs. Instead, IDs have to be UUID V4 from now on. Clients created before the 2.0 release will continue to work with their legacy IDs.

Closes #2911
aeneasr added a commit that referenced this issue Jun 14, 2022
BREAKING CHANGE: It is no longer possible to set arbitrary client IDs. Instead, IDs have to be UUID V4 from now on. Clients created before the 2.0 release will continue to work with their legacy IDs.

Closes #2911
@aeneasr
Copy link
Member Author

aeneasr commented Jun 14, 2022

a3a9e64

@aeneasr aeneasr closed this as completed Jun 14, 2022
aeneasr added a commit that referenced this issue Jun 15, 2022
BREAKING CHANGE: It is no longer possible to set arbitrary client IDs. Instead, OAuth2 Client IDs must be valid UUID V4. Clients created before the 2.0 release will continue to work with their legacy IDs.

Closes #2911
aeneasr added a commit that referenced this issue Jun 15, 2022
BREAKING CHANGE: It is no longer possible to set arbitrary client IDs. Instead, OAuth2 Client IDs must be valid UUID V4. Clients created before the 2.0 release will continue to work with their legacy IDs.

Closes #2911
@aeneasr
Copy link
Member Author

aeneasr commented Jun 15, 2022

Not yet mergable, reopening

@aeneasr aeneasr reopened this Jun 15, 2022
sezanzeb added a commit to sezanzeb/docs that referenced this issue Mar 8, 2023
Setting the client id is not possible anymore (ory/hydra#2911), but the --id option was still in the docs.
@ileixe
Copy link

ileixe commented May 24, 2023

Does Oauth2Client CRD not yet support this change? https://github.com/ory/hydra-maester/blob/master/controllers/oauth2client_controller.go#L394

I'm failed to create OAuth2Client via maester for both of (1) client id missing is not allowed from the above code, (2) hydra does not accept client_id specified from this change.

I wonder what's current status.

(Edit) I found it myself: ory/hydra-maester#117

@A-Helberg
Copy link

... But for the developer doing repeated deployments of hydra as part of a larger solution, it is a big problem.

As has been mentioned, being able to set the client id is crucial for testing.
The flow for us is the same as ericb-summit mentioned.
Artifacts are built and then tested.

Not knowing the client-id at compile time means we would have to have a pre configured / running instance of hydra.
Having a single instance means that developers are scared to make changes, as they have no way to test them ahead of time (obviously it's possible, but too much effort for most people to get involved)
and their change will directly impact all testing. So a single test instance won't work for us.

We've tried to do the same (shared testing instance) with hosted providers such as (auth0 and okta).
For the reasons mentioned above, we are looking to move away from them and to ory hydry.

The best solution I can think of, is to inject the sql into the hydra DB with known client ids.
Do you foresee any pitfalls in this?

@pflipp
Copy link

pflipp commented May 30, 2023

I ended with the following:

  1. In our login/consent provider i create a new client in the before script of each test which interacts with hydra.
  2. In client applications i use oidc providers which are designed for testing (it also saves me bootstraping the login provider for testing).
  3. For test environments (monkey / QA ..) i use the injection of the sql (create the client with the api as in v1 and exec an update to switch the returned client id to the wellknown id). I really wouln't recomment this, but as this part of the infrastructure isn't critical it will not be a big thing if it breaks some day.

I really understand hydra not allowing to set internal id's, but maybe separating the external client_id from the internal id in the storage layer would be a way which allows hydra enterprise scale and letting us choose the client id? Maybe for v3?

@springroll12
Copy link

We are evaluating the Ory stack and this is a major problem. As others have mentioned not being able to specify the client id (or secret for that matter) at deploy time is a major hurdle for testing and ephemeral environments. It should also be possible to store client information in a secrets store (e.g. Hashicorp Vault) so that this information can be securely distributed to internal clients. If I understand this change correctly, we cannot do either without calling the hydra api to first generate the credentials.

I also disagree with the spec here that human-readable client identifiers are a security risk. Creating a client-id that is "more unguessable" is just security by obscurity. The entropy to prevent guessing should all be in the secret.

Could the performance problem not have been solved with surrogate identifiers? (e.g. uuid column, name column). I suppose that is what the metadata fields are for, but it is much more difficult to provision a new client repeatably this way.

@aeneasr
Copy link
Member Author

aeneasr commented Jun 9, 2023

Could the performance problem not have been solved with surrogate identifiers? (e.g. uuid column, name column). I suppose that is what the metadata fields are for, but it is much more difficult to provision a new client repeatably this way.

No, because you still need to look up the Client ID on every request. There are also other solutions to this. For example, you can add predicatbility to your tests by loading a database backup with an existing client which then has an immutable ID as it already exists!

@WilliamsDL
Copy link

Adding myself here... We absolutely need the ability to specify client id so that we can migrate out clients from our legacy system to Hydra without regenerating client id and secret. Is my solution really to have to downgrade to an old version of Hydra to accomplish the data migration, and then upgrade after? Why would you remove the ability to specify the client id? It's very important for data migration tasks.

@aaw20904
Copy link

How can I integrate Ory Hydra if the one can`t generating automatically user_ID? I will nod using this ID in cookie - I will use it in my client back-end side to access to the database.

@dexter4life
Copy link

If no client_id then it won't work with apache superset. Any other application I can use for superset apart from the keycloak?

@miran248
Copy link

oryd/hydra-maester:v0.0.28 still requires client_id, which, when specified, fails due to the It is no longer possible to set an OAuth2 Client ID as a user. The system will generate a unique ID for you., thrown by oryd/hydra:v2.1.1.
Will try to create the client via command line but that won't fly longterm.

@aeneasr
Copy link
Member Author

aeneasr commented Jul 31, 2023

We are walking back on this decision, reopening :)

@aeneasr aeneasr reopened this Jul 31, 2023
@ileixe
Copy link

ileixe commented Aug 1, 2023

Is there any available options now (e.g. incoming PR)? We're downgrading hydra when client_id is required, and it's absolutely painful 😇

@gdsmith
Copy link

gdsmith commented Aug 1, 2023

Which part is being walked back? Will we be able to set any client id or is it still going to have to be UUID based?

@aeneasr
Copy link
Member Author

aeneasr commented Aug 1, 2023

Is there any available options now (e.g. incoming PR)? We're downgrading hydra when client_id is required, and it's absolutely painful 😇

You can manually set the legacy ID in the database if you want.

Which part is being walked back? Will we be able to set any client id or is it still going to have to be UUID based?

Most likely it will be a varchar type. We'll see how we can deal with hotspots when the issue arises in our system

@mig5
Copy link
Contributor

mig5 commented Sep 19, 2023

Thank you so much for listening to the community - it is a really big deal. I am looking forward to finally upgrading to Hydra 2 with this change.

@mig5
Copy link
Contributor

mig5 commented Feb 20, 2024

The 2.2.0 release notes say that this ability to set the Client ID when creating a client, has been reinstated.

Yet running create client via the hydra v2.2.0 Docker image does not allow the --id option and it's not listed in the available options.

How do we set the client?

@mig5
Copy link
Contributor

mig5 commented Feb 20, 2024

Looks to me like you didn't add an --id option back in to the CLI at https://github.com/ory/hydra/blob/v2.2.0/cmd/cmd_create_client.go#L19-L47 ? :/

@mig5
Copy link
Contributor

mig5 commented Feb 27, 2024

@aeneasr just pinging you on the above two comments ^ is my understanding right? Will you make a release that reintroduces the ability to set --id from the CLI tool, or is it intended that it can only be done via the API (which I haven't tried yet, but assume it's possible that way)

@guptaaashutosh
Copy link

guptaaashutosh commented Jun 7, 2024

The 2.2.0 release notes say that this ability to set the Client ID when creating a client, has been reinstated.

Yet running create client via the hydra v2.2.0 Docker image does not allow the --id option and it's not listed in the available options.

How do we set the client?

I am getting the same problem.

@hughpv
Copy link

hughpv commented Jun 19, 2024

@mig5 @guptaaashutosh It looks like it's been merged, but not released yet.

#3725

@hughpv
Copy link

hughpv commented Jun 19, 2024

For what it's worth -- and I'm only using this in a test environment where I need well-known client IDs -- it looks like I can just update the id field in the hydra_client table after creating with CLI (my backend is MySQL).

Well... I should qualify that to say I tried getting a token one time and it worked...

Clearly it's an ugly workaround, but until we get proper CLI support back, it will have to do.

@guptaaashutosh
Copy link

The 2.2.0 release notes say that this ability to set the Client ID when creating a client, has been reinstated.
Yet running create client via the hydra v2.2.0 Docker image does not allow the --id option and it's not listed in the available options.
How do we set the client?

I am getting the same problem.

@aeneasr

@mig5
Copy link
Contributor

mig5 commented Jun 19, 2024

For what it's worth, I can confirm that it's possible to set the ID when creating the client when doing so via a script that interacts with the Admin API (e.g not via the CLI tool).

If anyone is using PHP and needs help, I can share an example.

@hughpv
Copy link

hughpv commented Jun 19, 2024

Thanks @mig5, I was just running into this because the CLI doesn't let you specify "access_token_strategy": "jwt" either. The request I'm building looks basically like this so far:

#!/usr/bin/env bash
curl 'http://localhost:4445/admin/clients' \
--header 'Content-Type: application/json' \
--data '{
  "client_id": "*** whatever ***",
  "client_secret": "*** whatever ***",
  "client_name": "*** whatever ***",
  "grant_types": [
    "client_credentials"
  ],
  "response_types": [
    "code"
  ],
  "scope": "*** whatever ***",
  "audience": [
    "http://api.my-app.localhost/"
  ],
  "subject_type": "public",
  "access_token_strategy": "jwt",
  "token_endpoint_auth_method": "client_secret_post"
}'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes behavior in a breaking manner. feat New feature or request.
Projects
None yet
Development

Successfully merging a pull request may close this issue.