added /service-info endpoint to htsget openapi specification #493
This PR aims to incorporate the GA4GH service info specification into htsget. Service info will enable htsget services to inform clients about what features they support. There are 3 main aspects to the proposed htsget
The proposed spec can be viewed in Swagger Editor by pasting the new OpenAPI YAML into the editor. |
This is fantastic @jb-adams! I've just added your version on SwaggerHub where I originally developed the previous specs so that people can interact with it directly (and also compare with previous releases): https://app.swaggerhub.com/apis/brainkod/htsget/1.3.0 As for the content of the PR itself, LGTM ;) |
@jb-adams following your lead here I've just added (as a PR onto your branch which is in turn PRed here) a markdown description of service-info. First, rough draft, needs work for sure. https://github.com/jb-adams/hts-specs/pull/1/files |
This draft implements this at a The question of what endpoint to implement a service-info response at has been discussed at some length on the htsget mailing list. This question would need to be resolved and that resolution reflected in this PR before htsget service-info can progress. |
|
I pushed some changes following the May 13th discussion:
|
I also updated https://github.com/jb-adams/hts-specs/pull/1/files although with some intentional divergences; in particular, wondering if we can indeed get away with one response schema description even though there are slight differences between them for reads vs variants. In particular, the question will arise, does the schema have to capture with full fidelity, the constraint that |
@mlin, I hear you. Would it make sense to keep those parameters common to all endpoints and return nothing or null when and if they don't make much sense for a particular file format?... we then have a more uniform, predictable interface, IMHO. Also discussed/covered in PR #495 (comment). |
@mlin @jmarshall @brainstorm @daviesrob To clarify from the last htsget meeting, we decided on tackling and closing these issues relating to the htsget spec sequentially:
@mlin is this correct? I think the first order of business is to resolve the divergences between the OpenAPI and MD incorporating |
@mlin 's markdown changes and my OpenAPI changes have been consolidated and merged. What is the next step? |
I think we're in good shape for merging this, pending any further comments from others |
|
Previously in #493 (comment) I wrote:
This has still not been resolved. As service-info is not for service discovery, it seems to me that we would have the option of implementing service-info at URLs that did not conflict with our IDs, e.g. |
Re what "artifact" should look like, summarizing prior side discussion between @jb-adams and myself. First the discrepancy was noted,
Then I wrote along similar lines as @jmarshall above,
And for now we snapped the markdown to the distinct artifacts approach, but certainly open for further discussion:
|
Because the spec states that the default
Why is it silly? The canonical values need to be registered somewhere, either in 1 attribute and 1 registry, or across 2 attributes and 2 registries. Why should we invent a separate htsget
My understanding of
Noted. My understanding is that since service info is a new feature, that constitutes a minor version bump.
Noted |
At heart, because it is not normalised (in the relational database sense).
A registry is what you have when you have multiple groups collaborating in a shared namespace. If we have a separate htsget So if we invent a third htsget datatype, the choice is between going cap in hand to TASC and asking for The other aspect of this is: what exactly is a service-info artifact:
type: string
description: 'Name of the API or GA4GH specification implemented. […]'
example: 'beacon'
The name of the GA4GH specification implemented here is We can also be guided by what artifact values other groups have listed in TASC's registry.
I guess the real question a client hopes to answer by looking at an artifact value is “can I speak this protocol”. Regardless of whether an endpoint is returning reads or variants, our basic request is the same (“basic” meaning when datatype-specific fields like |
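To make the "can I speak this protocol" check concrete, here is a minimal Python sketch of what a client might do. It assumes the service-info response carries a `type` object with `group`, `artifact`, and `version`, as in the htsget.md example in this PR, and that the datatype sits under a nested `htsget.datatype` key. That nesting and the function name `can_speak_htsget` are assumptions made for illustration, not something settled in this thread.

```python
from typing import Any, Mapping, Optional


def can_speak_htsget(info: Mapping[str, Any],
                     wanted_datatype: Optional[str] = None) -> bool:
    """Return True if a service-info document describes an htsget service.

    Assumed layout (hypothetical beyond the `type` object):
      {"type": {"group": ..., "artifact": ..., "version": ...},
       "htsget": {"datatype": "reads" or "variants"}}
    """
    service_type = info.get("type") or {}
    if service_type.get("group") != "org.ga4gh":
        return False
    if service_type.get("artifact") != "htsget":
        return False
    if wanted_datatype is None:
        # The caller only asked about the protocol, not the datatype.
        return True
    # A missing datatype is treated as unknown, so no match is claimed.
    datatype = (info.get("htsget") or {}).get("datatype")
    return datatype == wanted_datatype


example = {
    "type": {"group": "org.ga4gh", "artifact": "htsget", "version": "1.2.0"},
    "htsget": {"datatype": "reads"},
}
print(can_speak_htsget(example))
print(can_speak_htsget(example, "variants"))
```

With a single `htsget` artifact the protocol check and the datatype check stay separate, which is the shape of the "one artifact, two datatypes" option discussed here.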
pub/htsget-openapi.yaml
ServiceInfo:
  type: object
  '$ref': https://raw.githubusercontent.com/ga4gh-discovery/ga4gh-service-info/develop/service-info.yaml#/components/schemas/Service
- '$ref': https://raw.githubusercontent.com/ga4gh-discovery/ga4gh-service-info/develop/service-info.yaml#/components/schemas/Service
+ '$ref': https://raw.githubusercontent.com/ga4gh-discovery/ga4gh-service-info/v1.0.0/service-info.yaml#/components/schemas/Service
I completely agree. IMHO the best way to encourage data providers to fill in these service-info fields is by not providing a default. |
Ok, I'm good with this.
Perhaps relating to the question of I brought this up to the Discovery Networks group. I can paraphrase the conversation in greater detail later, but the prevailing sentiment is that it will be very difficult to link htsget into a federated network (via a service registry) if the endpoints are not prescriptive. All other GA4GH API specs have prescriptive endpoints; therefore, by knowing the base URL and the |
Pushed new commit to address the following:
Made this replacement in both yaml and markdown
Removed the default assumption, stating that clients cannot assume
Removed the default assumptions, stating that clients cannot assume Bumped reference URL to Service Info YAML from
I've added a "GA4GH Service Registry" section in the Markdown (below GA4GH service-info), outlining what has been tentatively discussed in this group about how to register htsget services within a registry. It outlines that |
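On removing the default assumptions: one way a client could honour "cannot assume" is to treat an absent field as unknown rather than as true or false. The Python sketch below is a hedged illustration only; the `htsget` section name, the helper `capability`, and the key `exampleFlag` are hypothetical and are not taken from the spec text.

```python
from typing import Any, Mapping, Optional


def capability(info: Mapping[str, Any], key: str) -> Optional[bool]:
    """Tri-state lookup for a boolean service-info field.

    Returns True or False only when the server states the value explicitly,
    and None when the field (or its enclosing section) is absent.
    """
    section = info.get("htsget")  # hypothetical section name
    if not isinstance(section, dict):
        return None
    value = section.get(key)
    return value if isinstance(value, bool) else None


stated = {"htsget": {"exampleFlag": False}}  # "exampleFlag" is a made-up key
silent = {"htsget": {}}

print(capability(stated, "exampleFlag"))
print(capability(silent, "exampleFlag"))
```

A caller can then branch on `None` explicitly, for example by falling back to conservative behaviour, instead of silently assuming a default.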
I agree with @jb-adams's analysis and the feedback from the discovery group: having concrete endpoints is simpler, and if other endpoints come up later I don't see much overhead in defining them explicitly. |
@brainstorm We had some back-and-forth on that point above & gravitated toward the one artifact, two datatype (for now) approach. I'm a bit reluctant to reverse at this point, but let us know if you feel strongly about it (I don't think anyone was forcefully arguing one way or the other). Thx |
htsget.md
The [GA4GH Service Registry API specification](https://github.com/ga4gh-discovery/ga4gh-service-registry) allows information about GA4GH-compliant web services, including htsget services, to be aggregated into registries and made available via a standard API. The following considerations SHOULD be followed when registering htsget services within a service registry.

* Endpoints for different htsget data types should be registered as separate entities within the registry. If an htsget service provides both `reads` and `variants` data, both endpoints should be registered.
* The `url` property should reference the API's `service-info` endpoint for a single data type (i.e. an htsget reads API registration should provide the URL to the reads `service-info` endpoint, an htsget variants API registration should provide the URL to the variants `service-info` endpoint). Clients should be able to assume that by replacing the URL's "service-info" string with an object id, they will hit the corresponding `reads`/`variants` endpoint.
Are we happy with this assumption? I might be missing something but:
- If we're reserving `/service-info` so that it's not allowed as a sample ID, then it seems it would be fine to leave that off instead of making the client snip it.
- If we're saying we could use this to declare the service info endpoint as something else like `/service-info2` so that the server can still have a sample ID=`service-info`, we should specifically advise that the client is supposed to strip whatever the last URL component is (and on the whole it seems unfortunate to make this an htsget-specific caveat).
My opinion is still that the first approach is a simple and basically acceptable way forward. If in future we motivate a change to service registry so that it admits a query string for service-info then I think we could relax the reservation constraint at that time.
(comments welcome from all, tagging @jmarshall @brainstorm @jb-adams)
I'm +1 for the first approach. Registering the base URL (i.e. without "service-info") is much more in line with the service info spec and the networks team's plans for federating services of mixed type. Requiring that "service-info" be reserved for this endpoint is simple and will lead to better interoperability.
I totally second Jeremy.
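The first approach (register the base URL and reserve `service-info` as an ID) can be sketched from the client side as follows. This is a rough Python illustration under stated assumptions: the helper names and the `https://htsget.example.org/reads` base URL are hypothetical, and the only rule taken from this thread is that `service-info` may not be used as a record ID.

```python
from urllib.parse import quote

RESERVED_ID = "service-info"  # reserved so it never collides with a record ID


def service_info_url(base_url: str) -> str:
    """Build the service-info URL from a registered base URL."""
    return base_url.rstrip("/") + "/" + RESERVED_ID


def record_url(base_url: str, record_id: str) -> str:
    """Build the URL for a reads/variants record under the same base URL."""
    if record_id == RESERVED_ID:
        raise ValueError("'service-info' is reserved and cannot be used as an ID")
    # quote() keeps '/' unescaped by default, so path-like IDs stay intact.
    return base_url.rstrip("/") + "/" + quote(record_id)


base = "https://htsget.example.org/reads"  # hypothetical registered base URL
print(service_info_url(base))
print(record_url(base, "NA12878"))
```

Because the registry stores the base URL, the client never has to strip a trailing path component; both URLs are built by appending to the same string.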
- collapsed htsgetReadsServiceInfo and htsgetVariantsServiceInfo into htsgetServiceInfo
…sEffective - removed explicit statement saying it has no effect in variants endpoint
Co-authored-by: John Marshall <John.W.Marshall@glasgow.ac.uk>
…stry considerations - removed 'htsget-reads' and 'htsget-variants', instead using a 'datatype' parameter to indicate 'reads' or 'variants' - removed default assumptions when service-info parameters not provided
Following the Oct 27th htsget meeting, we closed #495. The above commits involved rebasing |
e3f8ced to 93c598e
Thanks @jb-adams, I'm 👍 to merging this |
htsget.md
"type": {
  "group": "org.ga4gh",
  "artifact": "htsget",
  "version": "1.2.0"
Will this (and the second one a dozen lines below) need to be bumped to 1.2.1?
I believe I got all references to the htsget spec version and changed them to |
Htsget implemented service-info in PR samtools/hts-specs#493.
addition of `/service-info` endpoint to htsget OpenAPI specification