Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Placeholder for ids is not consistent between services #39

Closed
metaschell opened this issue Feb 18, 2020 · 10 comments · Fixed by #85
Closed

Placeholder for ids is not consistent between services #39

metaschell opened this issue Feb 18, 2020 · 10 comments · Fixed by #85
Labels
API design Where we discuss any changes we want to introduce in the API

Comments

@metaschell
Copy link

The placeholders used to embed an entity id in the service URLs defined in the manifest differ between the services. These should be consistent, ideally choosing {{id}} over ${id} to avoid pitfalls in clients/language interpreting $ as replacement character within strings (which is kind of the point here as well, but still using this without explicitly providing the character, e.g. in a bash with CURL might lead to subtle and hard to spot errors...)

The preview service (section 5.1) uses {{id}} as placeholder in the service description url:
https://reconciliation-api.github.io/specs/latest/#preview-metadata

The flyout service (sections 6.1 and 6.5) uses ${id} as placeholder in the service description url:
https://reconciliation-api.github.io/specs/latest/#suggest-metadata
https://reconciliation-api.github.io/specs/latest/#flyout-services

This also affects the Manifest JSON schema defined in section A.1
https://reconciliation-api.github.io/specs/latest/#manifest-schema

@wetneb
Copy link
Member

wetneb commented Feb 18, 2020

That's also something I have found annoying. It would be great to unify this.

Also, we should make it clearer how URL escaping works in these patterns. If an id contains special characters, are they going to be escaped before being inserted in the template?

@metaschell
Copy link
Author

Escaping should be defined, especially when returning IRIs/URIs as IDs.

We are looking to implement the Reconciliation API with linked data, so the ID is always an IRI. Or rather the local name of an IRI as I understand it, as the namespace is declared in the service manifest? Which is a bit of a problem when searching through data from mixed sources which might have mixed/different namespaces...

@wetneb
Copy link
Member

wetneb commented Feb 18, 2020

If you are using entire IRIs as IDs, then you should be able to use templates without any prefix or suffix - that relies on the assumption that ids are not escaped when inserted in templates.

The identifierSpace defined in the manifest does not have any concrete influence on how ids are understood. You can just use any URI that describes your "identifier space" - it does not need to be a namespace in the sense of RDF.

On top of my head I don't remember exactly what OpenRefine currently does concerning escaping. (It's probably not worth following - we just need to think about what the specs should be in an ideal world and then update OpenRefine to follow that.)

@metaschell
Copy link
Author

That sounds like a good approach :-)

@wetneb wetneb added the API design Where we discuss any changes we want to introduce in the API label Feb 18, 2020
@wetneb
Copy link
Member

wetneb commented Apr 16, 2021

In OpenRefine we are considering introducing escaping for the ids, so that would potentially break @metaschell's use case of using full URIs as ids:
OpenRefine/OpenRefine#3730
Can we think of a solution? Ideally we should agree on something here, update the specs, and then OpenRefine would follow.

@fsteeg
Copy link
Member

fsteeg commented Apr 23, 2021

I think the approach to treat the {{id}} as a URI component as implemented in OpenRefine/OpenRefine#3731 is good. It works with the basic use case described by @metaschell in #39 (comment):

We are looking to implement the Reconciliation API with linked data, so the ID is always an IRI. Or rather the local name of an IRI as I understand it, as the namespace is declared in the service manifest?

This is also what we do in our GND service, where we use the identifierSpace https://d-nb.info/gnd/ and IDs like 118624822 to create linked data URIs like https://d-nb.info/gnd/118624822. With the URI component encoding from OpenRefine/OpenRefine#3731 this should also work for hash-based linked data URIs like https://d-nb.info/standards/elementset/gnd#AuthorityResource.

For the second part of #39 (comment) (and what @wetneb described in OpenRefine/OpenRefine#3730 (comment)):

Which is a bit of a problem when searching through data from mixed sources which might have mixed/different namespaces...

I think actually supporting full URIs as IDs is conceptually quite a different approach from what is specified so far, in particular regarding the identifierSpace. Would that be empty in a service where IDs are full URIs? The identifierSpace is quite central in data extension (https://reconciliation-api.github.io/specs/latest/#data-extension-responses):

If the property values are entities, their identifiers MUST be in the service's identifier space.

This allows the client to offer further extension on extended entities.

I also assumed that services with a common identifierSpace could provide interoperability. E.g. when working with the GND, I could try data extension with any service using the https://d-nb.info/gnd/ identifierSpace (if there were multiple in the future).

Basically my understanding of the spec is that a single reconciliation service is responsible for a single identifierSpace. So reconciliation in different/mixed namespaces would require different reconciliation services.

I'm not saying this is the only right way, and that using full URIs as IDs is something we can't do, but I think that's probably a separate, and larger issue. For the current spec, I think using {{id}} consistently and defining it as something that will be URL component encoded (as in OpenRefine/OpenRefine#3731) is good. I can look into suggesting a spec update if you agree @wetneb.

fsteeg added a commit that referenced this issue Jun 15, 2021
Update obsoleted URI syntax RFC (2396 -> 3986)

See #39
@fsteeg
Copy link
Member

fsteeg commented Jun 15, 2021

I've opened #71 for the encoding part of this issue, which can be resolved without changing the template syntax. We can then focus on the actual syntax consistency issue of {{id}} vs. ${id} vs. {id} (the latter is used in OpenAPI, which was brought up by @edgardmarx on the mailing list), and its implications, in particular how to handle breaking changes in the spec.

wetneb pushed a commit that referenced this issue Jul 20, 2021
Update obsoleted URI syntax RFC (2396 -> 3986)

See #39
@Robsteranium
Copy link

Perhaps the syntax could follow URI Template (RFC6570)? Of course the whole specification isn't required here but it at least provides a standard to work from. The expectations about escaping and interpolation are clear and well-documented. You can do stuff like {+id} if the id is already a URI.

@wetneb
Copy link
Member

wetneb commented Aug 19, 2021

Hey @Robsteranium that sounds like a really really good idea!

@wetneb
Copy link
Member

wetneb commented Mar 28, 2022

I realized last week that the update in #71 is incompatible with the OpenCorporates endpoint. They use the following view URL template: https://opencorporates.com{{id}}, and the service uses identifiers such as /companies/de/U1104_HRB22512, which then gives you URL such as https://opencorporates.com/companies/de/U1104_HRB22512. With the escaping we have introduced, the / characters in the ids get escaped, which breaks the URLs.

I think we need to follow @Robsteranium's suggestion here, because I feel uncomfortable with breaking such a use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API design Where we discuss any changes we want to introduce in the API
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants