Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to filter/sort the list of clients #1485

Closed
ngrigoriev opened this issue Jul 3, 2019 · 26 comments
Closed

Add a way to filter/sort the list of clients #1485

ngrigoriev opened this issue Jul 3, 2019 · 26 comments
Assignees
Labels
feat New feature or request.
Milestone

Comments

@ngrigoriev
Copy link

I believe it would be a good idea to be able to specify the filter for the client ID list (List OAuth 2.0 Clients method)

Thus, I feel that sooner or later we will need to be able to search for the client by the client ID. At least for the reasons of maintenance. I expect that the client IDs in our multi-tenant system would have some sort of structure (URN-like probably) and at least the ability to search them by a prefix or mask would be great.

It could be also useful to do filtering based on:

  • scopes
  • redirect URIs
  • owner
  • client name
  • client URI
  • audience

And where is filtering, there is sorting - right? :)

It would be also important to add a way to get the shorter version of the client object in the response - through a filter or just "verbose=true|false".

Solution description

It could be:

  • OData-like filter on the client ID or even some other
  • custom simple query format

The reason why I like OData is because, I believe, it is possible to convert it to SQL query where clause relatively easily.

Alternatives

I do not think there are any alternatives except looking directly in Hydra's database, which is something I would like to avoid.

Additional context

I am not sure how practical is to implement it. For runtime, obviously, GET with known ID is just enough. Any sort of developer portal that integrates with Hydra should also be happy about CRUD interface already available. My concern is really about the maintenance.

I am sure this kind of feature will be more important if one day Hydra supports custom client attributes as part of the client data.

@aeneasr
Copy link
Member

aeneasr commented Jul 11, 2019

Yeah I think that's generally a good idea. We're not entirely sure yet how querying should look like. I think there are some (sort of) standards for querying REST APIs. However, I also like the idea of maybe adding a GraphQL API?

@ngrigoriev
Copy link
Author

I think GrapQL's query capabilities are overkill for Hydra's relatively simple data model. This is why I was thinking that OData may be more appropriate - OData is simpler and does, I think, what is needed. It is also an old standard, rooted in OASIS (although it does smell Microsoft ;) ). There are many OData parsers out there and there are some that compile OData query and build SQL.

But even OData - in order to implement it, one needs additional indexes in most of the cases. Which may be problematic for non-relational data stores. And those are very appropriate for using with the OAuth/OIDC server. This may potentially limit how far you can go with sorting and filtering.

Also, I think that gRPC interface may be much more useful for Hydra than anything else. Apart from the public endpoints, of course, the admin APIs are most likely to be used by the backend software. And gRPC is probably the simplest thing to use today. Not to mention that many Hydra Admin API operations do map well to RPC in general. But probably you want to support both gRPC and REST (with automatic transcoding?).

So, considering all this I would probably try to come up with a simple limited query support based on OData - something similar to what Azure did for many of their APIs (like https://docs.microsoft.com/en-ca/previous-versions/azure/ad/graph/howto/azure-ad-graph-api-supported-queries-filters-and-paging-options). This approach maps well to both gRPC and REST queries, I think. And it should be easy to implement and it may be even conditional on the actual storage implementation. E.g. some queries may be refused if they require the filtering/sorting the storage cannot support.

@ysaakpr
Copy link

ysaakpr commented May 12, 2020

Any update on this. Instead of OData or GraphQL, very simple filters will help at least to narrow down the result sets. As of today, we are loading all the clients to frontend and doing client-side filter, now the number of clients have drastically increased, we are seeing load time for this.

regular equality filters will be an immediate help in this on fields like

  1. client id
  2. client name (contains check)
  3. owner
  4. metadata (if possible to do json key equality search)

Waiting for graphql/OData will be pretty heavy for this, since the number of fields where one have to query will be usually limited for clients. Can we conclude on this feature set. I could help to roll this up as quickly possible.

@ysaakpr
Copy link

ysaakpr commented May 12, 2020

Found that there couple of stores which cannot support this feature, especially in-memory stores.
We may exclude this filter and only provide this filter only for SQL Stores currently.

@aeneasr
Copy link
Member

aeneasr commented May 14, 2020

I think GraphQL is still preferred. It should also not be that hard to do I guess? Or do you not want to deal with GraphQL here?

@ysaakpr
Copy link

ysaakpr commented May 15, 2020

GraphQL makes the existing APIs not backwards compatible, and unless there is a plan to migrate all admin APIs (Which is not part of OAuth specs) to move to GraphQL, I believe just adding graphQL to list clients calls won't make sense

@aeneasr
Copy link
Member

aeneasr commented May 16, 2020

No, it would be an additional endpoint!

@ngrigoriev
Copy link
Author

+1 for additional endpoint or "clients v2" one maybe.

@ysaakpr
Copy link

ysaakpr commented May 20, 2020

@aeneasr yeah, we can go with graphql and introduce a new endpoint, which eventually can be used to serve all other admin apis.

Do you have any preferred GraphQL library, I will opt for https://github.com/graphql-go/graphql

@ysaakpr
Copy link

ysaakpr commented May 20, 2020

@aeneasr I have another concern on using GraphQL, though graphQL is very good for many API definitions, its not good for dynamic response types. For example
(graphql/graphql-spec#101) Maps of arbitrary keys are not supported in GraphQL,

In our case, let's take an example of metadata in the client, which is a map object with any arbitrary data. Which is actually difficult to model using GraphQL and have to consider as a string datatype.

Any thoughts on this ?

@ysaakpr
Copy link

ysaakpr commented May 20, 2020

We always have an option to bring up new API with GraphQL or OData at any time, but I believe that needs a concrete discussion beyond this thread to conclude. Meantime why don't we think about enhancing the existing endpoint with more filters added to it? no braking changes will be added to the existing endpoint and for the current use case, we can unblock by that.

Considering this ticket is almost a year old, a go-to option will be a gradual enhancement. I could give you a PR if we can think to enhance the existing API to support a few more additional filters. Which less than a days effort.

@aeneasr
Copy link
Member

aeneasr commented May 20, 2020

I just checked for some guidance on designing REST filter/search and found several approaches:

  • JSON in query: /api/rest/v1/products?search={"enabled":[{"operator":"=","value":true}],"completeness":[{"operator":">","value":70,"scope":"ecommerce"}]} (found here)
  • RHS Colon:/app/items?finished_at=ge:15:30&finished_at=lt:16:00 (found here)
  • this discussion which concluded in no consensus
  • LHS Brackets: 'price[gte]=10&price[lte]=100 found here
  • Search Query Param (GitHub/Google way): GET /items?q=title:red chair AND price:[10 TO 100]

As you can see, there are many, many approaches to search and we need one consistent search strategy across the ecosystem because not every product should have it's own approach to search/filter.

That's why GraphQL is intriguing. I am aware that searching for unstructured data is hard/impossible with GraphQL so yes, Metadata would not be searchable.

We always have an option to bring up new API with GraphQL or OData at any time, but I believe that needs a concrete discussion beyond this thread to conclude. Meantime why don't we think about enhancing the existing endpoint with more filters added to it? no braking changes will be added to the existing endpoint and for the current use case, we can unblock by that.

We pride ourselves with building good solutions for hard problems. Hacking something together that we'll deprecate in half a year is not an approach we take to product. We do understand that not being able to do this is frustrating, but we do need time to figure this out correctly.

@ysaakpr
Copy link

ysaakpr commented May 20, 2020

I definitely agree with you on this. There are no defined standards on REST for filtering. And GraphQL gonna definitely make it as a good option for it filtering.

As I mentioned, if you are aware of the drawback of GraphQL, that's well and good.
Do you have any plan in mind on how can we go ahead with GraphQL

  1. Library, base code for GraphQL implementation
  2. What all are the APIs we need to target for the first version of GraphQL API,

@aeneasr
Copy link
Member

aeneasr commented May 20, 2020

It's been a while since I've implemented a GraphQL API (I found it actually quite frustrating back then). Our priorities for new features are also pinned to ORY Kratos and ORY Oathkeeper right now so our team will not be able to work on this in the next months, but we do welcome contributions.

We're also rolling out support services for enterprises which will help us hire more people to work on more things and help customers prioritize features like that. If contributing is not an option it might be interesting to talk to Jared.

@jaredpreston
Copy link

Happy to have a chat with you about the different options we have for customers looking for support above and beyond the self service. Send me a dm on Slack - if you have not already, then go and join the ORY Slack chat and community over at https://slack.ory.sh/

@matthewhartstonge
Copy link

matthewhartstonge commented May 27, 2020

We have used the lessons learned from Google around API design when it comes to filtering/searching, of course they still have the generic filter param for stringy queries 😂

For example, based on the OP for clients:

  • scopes
  • redirect URIs
  • owner
  • client name
  • client URI
  • audience
GET /clients?scopes=["groups", "admin"]
    &uri="https://developers.exact-match.example.com"
    &owner="jensen huang"
    &audiences=["https://spa.exact-match.example.com"]
    &name="nvidi"
    // example range
    &start_created_at=2020-05-23T15:44:59Z
    &end_created_at=2020-05-23T16:44:59Z

Things that are known, should be enforced to be correct, for example scopes, then each param just becomes additive where clauses i.e. client must contain groups && admin.

If fuzzy search is required, i.e. client names, you could do that with a "like" based query or decide on if you want to support full text search/wildcard parsing.

One big benefit I find by using query params is that you can avoid going down the road of implementing an "OryQL" unless absolutely needed in future.

@aeneasr
Copy link
Member

aeneasr commented May 27, 2020

Thank you for the insightful comment! I thought about this a bit more and I do think that GraphQL is overkill. Most people probably want to filter per owner or per name or whatever. The problem with the example above is of course that we can't do OR queries but maybe that's just the limitation we have to live with? I do however see more issues e.g. around ranges or around LIKE queries.

Isn't there like one good spec (IETF or FANG) that addresses this in a sane way?

@matthewhartstonge
Copy link

matthewhartstonge commented May 28, 2020

Yeah, design for OR is where it gets tricky, there doesn't seem to be any specs/guides for that use case that I have seen around the place.
I mean, following on from fosite arguments you could use a suffix of _has_one for OR?

clients?scopes_has_one=["groups", "admin"]

Personally, I prefer the cleanness of the dot/period approach, but does mean potentially validating against using "one of" the query params for a single field.

clients?scopes=["groups", "admin"] // <- base field query should always imply AND
clients?scopes.or=["groups", "admin"]
clients?scopes.like="admin"

But then, feelings doth make no IETF/FANG specification.

@aeneasr
Copy link
Member

aeneasr commented May 28, 2020

Hm yeah that's really tricky. This issue has priority for me now but I do want a clean solution. I somewhat feel that GraphQL isn't the right tool here. Honestly, I don't want to write a GraphQL query just to filter per owner...

@aeneasr aeneasr added the feat New feature or request. label May 28, 2020
@aeneasr aeneasr added this to the v1.6.0 milestone May 28, 2020
@aeneasr aeneasr self-assigned this May 28, 2020
@ngrigoriev
Copy link
Author

Maybe a silly thought. If I wanted to have a perfect search, I would actually love to be able to search on the custom client properties (e.g. something in "metadata"). So, it is impossible to satisfy everyone's needs, right?

Maybe what can be done instead, creating a separate table hooked 1:1 to hydra_client with the structure defined by the configuration. The configuration would map any (supported) client parameter to a field in this table. Using JSON path expressions, for example. So, even "metadata.my.personal.field" -> "my_field:string" would work. And this is the table the new API will use to make queries and do filtering/sorting. Yes, it would not be possible to alter it in runtime but it would be possible to, say, change the mapping and rebuild this table if needed.

@aeneasr
Copy link
Member

aeneasr commented May 30, 2020

I don't think we really want to solve search with Hydra :/ This would probably be better suited when hooking up ElasticSearch with Hydra, right?

I think that filtering stuff is still pretty important. Maybe we'll just go with one of the REST standard patterns of e.g. gt: / le:� etc

@matheusmatos
Copy link

Hi guys, I'm new to Ory ecosystem and I'm still looking around to learn more about all products.

Probably, the most important field to filter is the owner, it would help on "listing your clients" when creating the developer portal.

We could implement that adding a new param in GET {HYDRA_ADMIN}/clients?owner=string.

IMHO, more complex filtering must be better handled by a third tool with some indexing, like ElasticSearch. I think that it's out of scope for Hydra. (it's just an opinion).

Also, indexing the information is relatively easy with hooks after create/update in Hydra Admin.

What you think? Thanks!

@pike1212
Copy link
Contributor

pike1212 commented Jun 3, 2021

Is anyone working on this issue? We would like a way to lookup clients by name? Something like GET {HYDRA_ADMIN}/clients?client_name=string. I can look into creating a PR.

@aeneasr
Copy link
Member

aeneasr commented Jun 11, 2021

There's currently no one working on this. A big problem with query lookups like these would be that we need to full-text index all of these fields. Also, we need to define a query strategy for all of ory. How do "AND"s work? How do "OR"s work? So far we haven't found a good solution.

So before going with a PR I think we need to figure this out first.

@pike1212
Copy link
Contributor

Can we differentiate between a more complex filtering and a simple strategy of ?client_name=string&owner=string? The only way to do this currently is to query and page through all clients then filter on the client side which is not ideal and will result in a lot of overhead for our use case.

@aeneasr
Copy link
Member

aeneasr commented Jul 12, 2021

One option would be to apply AND filtering and have, in the future, an optional ?q query string which would allow for searching. So client_name=foo&owner=bar would only return clients for owner bar and clients with name foo. This appears, as far as I can see, to be the best practice around this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request.
Projects
None yet
Development

No branches or pull requests

7 participants