Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query param flag for retrieving collections by curator name and visibility in GET /collections index endpoint #2793

Closed
danieljhegeman opened this issue Jun 29, 2022 · 29 comments
Assignees

Comments

@danieljhegeman
Copy link
Contributor

danieljhegeman commented Jun 29, 2022

August 8: There was agreement to remove this issue from the curation API MVP. It's not required for the 3.0.0 migration and can be considered in parallel with the symmetrical curation UX request..


Update: See curation-api. access_type MUST also be removed from responses and the Swagger API.


see #2083 (comment)

Add a query param MyCollections flag (or similar) that causes GET /collections to return only Collections which are owned by the curator making the call.

@brianraymor @jahilton should visibility be specified? Or should the usage of this flag default to returning all of a curator's Collections, both public and private? **edit**: see comment below

@danieljhegeman
Copy link
Contributor Author

@brianraymor I'm seeing a comment on the other issue from you said

Use GET /collections?MyCollections which is familiar to the curators from the UX - which returns nothing without an access token and is specifically for curators since all other consumers only have "READ" access.

I'm assuming this means then that we should include both public and private Collections in the response (since it's mimicking the UI endpoint that curators are familiar with, which displays both).

@brianraymor
Copy link

brianraymor commented Jun 29, 2022

I'm assuming this means then that we should include both public and private Collections in the response (since it's mimicking the UI endpoint that curators are familiar with, which displays both).

That is correct.

To be more precise, it should return all public collections and all authorized private collections.

@danieljhegeman
Copy link
Contributor Author

To be more precise, it should return all public collections and all authorized private collections.

Wait, this is different than what is returned when visiting My Collections on the UX: the UX shows public collections but only the public collections which belong to the user (and of course it shows the private collections as well).

@brianraymor
Copy link

Doh. Writing too fast.

Yes - all collections private or public that I'm authorized to see. Lead curators see everything. Other curators see only their collections.

@danieljhegeman
Copy link
Contributor Author

danieljhegeman commented Jun 29, 2022

I'm still unclear on whether that means a non-super curator should get back, in addition to their private Collections...

all public Collections

OR

just their own public Collections.

😅

@brianraymor
Copy link

brianraymor commented Jun 29, 2022

just their own public collections - MyCollections like a filter collections by curator.

@brianraymor @jahilton should visibility be specified? Or should the usage of this flag default to returning all of a curator's Collections, both public and private? edit: see comment below

That's an interesting question for @jahilton since he has recently requested filtering in the MyCollections UX. See Curator can filter "My collections" by curator and status.

@jahilton
Copy link
Collaborator

If I'm understanding the Q correctly...
Querying MyCollections via the API, I would expect to see the same results that I see at https://cellxgene.cziscience.com/my-collections, so both public and private.

@danieljhegeman
Copy link
Contributor Author

Yes, both public and private, but insofar as the public Collections are concerned, I presume it should only be the public Collections which the curator owns/has write access for. For super curators this is all public Collections. For non-super curators this is just their own public Collections.

@jahilton
Copy link
Collaborator

it should only be the public Collections which the curator owns/has write access for

👍

@brianraymor
Copy link

Returning to my question -

@jahilton - would you prefer this to be an endpoint (rather than a query parameter) with similar filters to what you've requested in the UX?

GET /MyCollections?curator="X"&visibility="REVISION"

or some such ?

@danieljhegeman
Copy link
Contributor Author

Good question, Brian.

@jahilton
Copy link
Collaborator

jahilton commented Jul 1, 2022

@brianraymor thanks for clarifying. Yes, I would prefer that endpoint with the similar filters to specify.

@danieljhegeman
Copy link
Contributor Author

Re the related issue of filtering by curator in the UI

@Bento007 if we were to do this via the API we would have to figure out how we would implement an ability for the curator to specify which curator they want to retrieve collections for. This makes me think that, despite my comments on the linked issue, we'd actually prefer to have this be a UI feature since it would drastically simplify the problem of implementing 'text search -> single Auth0 user', which is a whole different can of worms (correct me if I'm wrong).

@danieljhegeman danieljhegeman self-assigned this Jul 7, 2022
@danieljhegeman danieljhegeman removed their assignment Jul 22, 2022
@Bento007 Bento007 self-assigned this Aug 8, 2022
@danieljhegeman
Copy link
Contributor Author

danieljhegeman commented Aug 9, 2022

@jahilton @brianraymor is it considered acceptable if the curator submits a query param "curator_name='Daniel'" or something like that, and then we return all collections that contain "Daniel", case-insensitive, in the collection's curator_name? And we'd be ok with a query string "curator_name='daniel hegeman'" failing to return collections for which the curator_name is daniel j hegeman (note middle initial)? cc @Bento007

@jahilton
Copy link
Collaborator

jahilton commented Aug 9, 2022

I expect to query all Collections in the data corpus, and browsing those, I will see the curator names.
I would like to take any one of those curator names (as returned by the API) & plug it into my query param.
I expect it to fail (or return 0 results) if the value I put in my query param does not exactly match one of the curator names in the data corpus.
I don't know if I would specifically object to the partial string search behavior as I don't anticipate running into it. But it is unexpected behavior.

@brianraymor
Copy link

Per my comment from yesterday's very long slack thread::

I don't see why I would need search when it's simple to return a list of curators for an API user to choose from. Here's the current set:

{'Batuhan Cakir',
'Erica Marie Rutherford',
'Jason Hilton',
'Jennifer Yu-Sheng Chien',
'Kirtana Veeraraghavan',
'Rachel Schwartz',
'Wei Kheng Teh'}

For example:
GET /curator_names -> returns the list above
Pick a name and:
GET /collections?curator_name="Rachel Schwartz"&visibility="PRIVATE"

or per Jason's remarks - extract one out of previous query.

@brianraymor brianraymor changed the title Add query param flag for retrieving only collections owned by the curator for GET /collections index endpoint Add query param flag for retrieving collections by curator name and visibility in GET /collections index endpoint Aug 9, 2022
@danieljhegeman
Copy link
Contributor Author

danieljhegeman commented Aug 9, 2022

Yep, this sounds like a reasonable plan. I had been thinking we wanted a more sophisticated functionality for a single call. But this works if it's what you want. Thank you guys for the comments!

@brianraymor
Copy link

Adding comments from the linked PR:

This API is targeted at super curators. Currently, they are authorized to see all collections, so a filter is proposed as a convenience.

The example below assume that a valid curator name is passed as the parameter.

Here's how I imagine the experience:

User

  1. GET /collections returns all the public collections.
  2. GET /collections?curator_name="John Smith" returns all the public collections curated by John Smith. The use in this scenario is doubtful.
  3. GET /collections?visibility="PRIVATE" returns UNAUTHORIZED.
  4. GET /collections?visibility="PRIVATE"&curator_name="John Smith" returns UNAUTHORIZED.

Curator

  1. GET /collections returns all the public collections.
  2. GET /collections?curator_name="John Smith" returns all the public collections curated by John Smith. Perhaps the curator is looking for best practices from another curator.
  3. GET /collections?visibility="PRIVATE" returns all the curator's private collections.
  4. GET /collections?visibility="PRIVATE"&curator_name="John Smith" returns UNAUTHORIZED if the curator is not "John Smith". Otherwise, it's equivalent to 3. The use in this scenario is doubtful.

Super Curator

  1. GET /collections returns all the public collections.
  2. GET /collections?curator_name="John Smith" returns all the public collections curated by John Smith.
  3. GET /collections?visibility="PRIVATE" returns all private collections in the corpus.
  4. GET /collections?visibility="PRIVATE"&curator_name="John Smith" returns all private collections in the corpus curated by John Smith.

@danieljhegeman
Copy link
Contributor Author

@Bento007 so, according to the above, we're getting rid of self?

@brianraymor
Copy link

brianraymor commented Aug 24, 2022

I moved this back to Ready for Staging until Jason and I can review and test.

@Bento007
Copy link
Contributor

This was decided in slack that self would be removed because only super curators will be using this query parameter. They have permission to query for their name already.

@brianraymor
Copy link

Thanks for clarifying. The swagger example ("self") made me wonder.

Screen Shot 2022-08-24 at 9.36.38 AM.png

@brianraymor
Copy link

brianraymor commented Aug 24, 2022

@Bento007 - initial testing is looking good, but I have some questions.

  1. If I'm a user or curator, is the expected status code 400 (Bad Request)? If so, then this is not documented in the swagger. I thought we were being more consistent about forbidden operations?
  2. If I pass a non-existent curator name as the parameter ("Mister Peanut"), then it appears that the operation succeeds and returns 0 collections. Is that by design? If so, @jahilton - any concerns?

@Bento007
Copy link
Contributor

I'll change it to forbidden.

I would only be concerned if "Mister Peanut" had more than 0 collections, and we returned 0. Otherwise, that is by design.

@brianraymor
Copy link

There's a slight difference in status codes returned:

User -> 401
Curator -> 400

@jahilton
Copy link
Collaborator

If I pass a non-existent curator name as the parameter ("Mister Peanut"), then it appears that the operation succeeds and returns 0 collections.

That would be my expectation

@brianraymor
Copy link

@Bento007 - the results look good for super curator. I can filter collections by curators for both public and private collections. Once the status code is addressed, then this is ready for production.

@Bento007
Copy link
Contributor

@brianraymor what request did you send that returned a 400?

@brianraymor
Copy link

Hmm. Earlier, I was seeing it only for the curator role notebook. I did a restart and clear output on the kernel. Now, I'm seeing 401(s) for both curator and user.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants