Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The description for GET /curation/collections needs more detail #2083

Closed
brianraymor opened this issue Mar 9, 2022 · 28 comments
Closed

The description for GET /curation/collections needs more detail #2083

brianraymor opened this issue Mar 9, 2022 · 28 comments
Assignees
Labels
design Design work

Comments

@brianraymor
Copy link

brianraymor commented Mar 9, 2022

https://app.swaggerhub.com/apis/danieljhegeman/Curation/1#/default/get_curation_collections

Fetch all Collections metadata. Authorization required for private collections.

It neither fetches all collections nor all collections metadata. Perhaps something more like:

  1. When the visibility parameter is unspecified or set to "PUBLIC", a list of all public collections is returned. The Authorization header is not required. If a collection in the list has been deleted, then it is annotated with tombstone set to True.
  2. When the visibility parameter is set to "PRIVATE", a list of all private collections that the user is authorized to access is returned. The Authorization header is required. If a collection in the list is a private revision of a public collection, then it is annotated with revision_of.
  3. Each collection in the list contains a subset of collection metadata as a preview.
  4. To retrieve full collection metadata for a specific collection, GET /curation/collections/{collection_uuid} must be used.

(danieljhegeman note: despite lengthy discussion below, these rules still stand as stated)

@brianraymor brianraymor added the design Design work label Mar 9, 2022
@danieljhegeman
Copy link
Contributor

danieljhegeman commented Mar 9, 2022

Each collection in the list contains a subset of collection metadata as a preview.

what do you mean by subset? Are you intending to contrast that with what is returned for a single collection with GET /curation/collections/{collection_id} (which should be the same)?

To retrieve full collection metadata for a specific collection, GET /curation/collections/{collection_uuid} must be used.

I should've kept reading before commenting! Ha. Is this a reference to the fact that the metadata for the associated datasets is different? "incomplete" for the former and "complete" for the latter?

@brianraymor
Copy link
Author

Is this a reference to the fact that the metadata for the associated datasets is different?

Yes. There is a preview of the dataset metadata which is part of the collection metadata.

@brianraymor
Copy link
Author

brianraymor commented Mar 9, 2022

Quick assessment:

GET collections GET collection
access_type x x
contact_email x x
contact_name x x
created_at x x
curator_name x x
datasets/assay x x
datasets/batch_condition x
datasets/cell_count x
datasets/cell_type x
datasets/curator_tag x x
datasets/dataset_assets x
datasets/explorer_url x
datasets/development_stage x
datasets/disease x x
datasets/ethnicity x
datasets/id x x
datasets/is_primary_data x
datasets/mean_genes_per_cell x
datasets/name x
datasets/organism x x
datasets/revised_at x
datasets/revision x
datasets/schema_version x
datasets/sex x
datasets/tissue x x
datasets/tombstone x x
datasets/X_approximate_distribution x
datasets/X_normalization x
description x x
id x x
links x x
name x x
published_at x x
publisher_metadata x x
revised_at x x
revision_of x x
tombstone x x
visibility x x

@danieljhegeman
Copy link
Contributor

@brianraymor I've updated the description (and the Swagger API doc).

@danieljhegeman danieljhegeman self-assigned this Jun 17, 2022
@danieljhegeman
Copy link
Contributor

@brianraymor can you confirm that you still wish for only a curator's public Collections to be returned in the case where they provide authentication but do not specify a visibility? The alternative being that both public and private Collections would be returned.

When the visibility parameter is unspecified or set to "PUBLIC", a list of all public collections is returned.

cc @Bento007

@brianraymor
Copy link
Author

Requested that @jahilton confirm agreement.

@brianraymor
Copy link
Author

I proposed something different above. Emphasis added:

Fetch all Collections metadata. Authorization required for private collections.

  1. When the visibility parameter is unspecified or set to "PUBLIC", a list of all public collections is returned. The Authorization header is not required. If a collection in the list has been deleted, then it is annotated with tombstone set to True.

Public collections do not require authorization because they're accessible to everyone.

@jahilton
Copy link
Collaborator

Agree w/ ☝️

@danieljhegeman
Copy link
Contributor

danieljhegeman commented Jun 19, 2022

@brianraymor I accidentally caused the focus to be on "all public collections" vs "just the curator's public collections" because I worded my question poorly. My mistake!

I meant for the focus to be on the following: when a curator does not specify a visibility query parameter, and they DO provide authentication, do we want to return

  • all public collections

OR

  • all public collections AND all private collections for which the curator is authorized

?

I know that you have said the first option, i.e. that we should, be default, return all public collections when the visibility is not set. But I was wondering if we should return all authorized collections instead, meaning both public and private.

I just want explicit confirmation of what you want for this requirement 🙏 👍

@danieljhegeman
Copy link
Contributor

Also, separate clarification @brianraymor - above you've listed the following two attributes to be returned in the full dataset metadata response:

datasets/X_approximate_distribution
datasets/X_normalization

Do you actually want/need these columns to be X_... or can they be x_... (as they are in the database)?

@danieljhegeman
Copy link
Contributor

Also, the dataset_assets child attribute of datasets seems redundant in its name: shouldn't it just be assets?

datasets/assets rather than datasets/dataset_assets

@danieljhegeman
Copy link
Contributor

danieljhegeman commented Jun 21, 2022

One more issue: for the Collection-level attribute access_type, which will be included in both the GET /v1/collections and GET /v1/collections/{collection_uuid}, I was thinking we might want to return None if the user does not include credentials in their request. That way, if they see "READ" as their access level, they will know "ah, yes, I included my access token and so I know my access has been evaluated correctly", but if they see None, then they will be reminded "Ah, I didn't include my access token in my request, so if I want to know my access type, I should resend the request with my access token".

Otherwise, a curator could see "READ" for a Collection that they actually have permission to write to, and they might forget/not realize that they forgot to include their access token in the request.

Alternatively, instead of None, we could return N/A or UNKNOWN or (No credentials provided) or...?

@brianraymor
Copy link
Author

brianraymor commented Jun 22, 2022

RE questions from comment

datasets/X_approximate_distribution
datasets/X_normalization

Do you actually want/need these columns to be X_... or can they be x_... (as they are in the database)?

I wonder why Leslie(?) lowercased them since there was no requirement to do so. These fields are referenced in the schema as X_ so I would like them to be returned as I specified.

I'm also struck out X_normalization in the table above because we plan to deprecate this field in Q3.

@brianraymor
Copy link
Author

RE questions from comment

datasets/assets rather than datasets/dataset_assets

I prefer your datasets/assets to eliminate the redundancy. And we're planning to rename a similar case - dataset-deployments.

@brianraymor
Copy link
Author

brianraymor commented Jun 22, 2022

RE questions from comment

@jahilton -can you review the linked comment and share your preferences?

Personally, I question the existence of access_type for introspection.

@jahilton
Copy link
Collaborator

I don't have a good handle on our use case for looking at access_type (and I anticipate having all access to everything, so especially not helpful). But from the options in that comment, I would think if you're trying to communicate that I didn't provide credentials then explicitly state No credentials provided

@brianraymor
Copy link
Author

I agree that No credentials provided offers more clarity than None.

@danieljhegeman - Is this the proposed experience?

  1. I do not include my access token in the GET /collections request. That seems odd in production code, but I'll allow it.
  2. Only public collections are returned with their access_type set to "No credentials provided".
  3. Distress. Where are my private collections and private revisions of public collections? Bang on the side of the API. What's the matter with this thing!?
  4. And then I notice the access_type values in the response and achieve enlightenment?
  5. or - I'm oblivious and decide to revise one of the public collections. It fails. Hopefully, there's helpful error message in the response that shows me the error of my ways?

I seems to me that access_type is only valuable when an access token is present. Then a curator can filter for access_type == "WRITE" as the long way round to My Collections which suggests that we should simply provide a convenience endpoint?

@danieljhegeman
Copy link
Contributor

I seems to me that access_type is only valuable when an access token is present. Then a curator can filter for access_type == "WRITE" as the long way round to My Collections which suggests that we should simply provide a convenience endpoint?

I think the write/right 😉 way to do this would be to have an access_type query param option for the GET /collections/{collection_uuid} endpoint.

@danieljhegeman
Copy link
Contributor

Distress. Where are my private collections and private revisions of public collections? Bang on the side of the API. What's the matter with this thing!? And then I notice the access_type values in the response and achieve enlightenment?

In that case, yes. It's not a panacea but it seems better than seeing "READ" under the exact same circumstances. What do you think, especially combined with the suggestion I made directly above for adding access_type as a query param?

I'm oblivious and decide to revise one of the public collections. It fails. Hopefully, there's helpful error message in the response that shows me the error of my ways?

Yes

@brianraymor
Copy link
Author

I think the write/right 😉 way to do this would be to have an access_type query param option for the GET /collections/{collection_uuid} endpoint.

Did you mean GET /collections endpoint?

@danieljhegeman
Copy link
Contributor

Thank you, yes, my mistake. GET /collections

@brianraymor
Copy link
Author

Would the only options be either "READ" or "WRITE"?

If so, would it be easier to:

  1. Deprecate access_type
  2. Use GET /collections?MyCollections which is familiar to the curators from the UX - which returns nothing without an access token and is specifically for curators since all other consumers only have "READ" access.

@danieljhegeman
Copy link
Contributor

sure, that seems functionally equivalent and, as you say, easier for curators. @Bento007 thoughts?

One minor alteration: it would return an error if the MyCollections flag is passed but without an access token. The only time the response would be empty is if the curator provides an access token but doesn't happen to have any Collections.

@brianraymor
Copy link
Author

it would return an error if the MyCollections flag is passed but without an access token

Even better.

@danieljhegeman
Copy link
Contributor

As long as Jason is ok with it. Can you get his approval? Specifically, the deprecation of access_type.

@jahilton
Copy link
Collaborator

Better yet @danieljhegeman , you can get his approval.
👍

@danieljhegeman
Copy link
Contributor

Haha @jahilton I lacked the ability to tag you in issues until the last couple of days, when Ambrose added you to the repo! 😅 😅

@danieljhegeman
Copy link
Contributor

Added the MyCollections query param flag as an issue in Fit & Finish

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Design work
Projects
None yet
Development

No branches or pull requests

3 participants