Hyper-schema: Alternative forms for listing link relationships #124

Closed
awwright opened this issue Nov 3, 2016 · 25 comments

@awwright
Member

awwright commented Nov 3, 2016

For a variety of cases, it's often simpler to use:

{
"links": { "self": { "href": "/document/{id}" } }
}

or:

{
"links": { "stylesheet": [ { "href": "style.css" }, { "href": "base.css" } ] }
}

HAL and other JSON-based hypermedia formats follow a similar convention.

@handrews
Contributor

handrews commented Nov 3, 2016

I'm definitely in favor of this. I always have to translate the links into something like this inside code anyway because searching a list for the right rel value is really awkward.
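For illustration, the in-code translation described above might look like the following minimal Python sketch. It assumes every LDO carries a single string "rel"; the function name index_by_rel is hypothetical and does not come from any Hyper-Schema library.

```python
from collections import defaultdict


def index_by_rel(links):
    """Group an array of LDOs by their "rel" value for direct lookup."""
    index = defaultdict(list)
    for ldo in links:
        # Assumption: every LDO has exactly one string-valued "rel".
        index[ldo["rel"]].append(ldo)
    return dict(index)


links = [
    {"rel": "self", "href": "/document/{id}"},
    {"rel": "stylesheet", "href": "style.css"},
    {"rel": "stylesheet", "href": "base.css"},
]
by_rel = index_by_rel(links)
```

Each value is a list even when only one LDO has that relation, so callers never need to distinguish a single link from several.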

@slurmulon

I only see one minor downside to this, and it's consistency in a different sense: when links can only be an Array, you don't have to check for special cases (such as whether it is an Object instead). In general this should be handled by the user's Hyper-Schema library that parses LDOs, so the user should not have to worry about it at all, but I honestly don't know of a Hyper-Schema library that does this (for instance, I had to write my own LDO parser because I couldn't find a Hyper-Schema library for looking up or parsing LDOs).

@slurmulon

My minor concern also applies to rel - in one case it's the LDO's key, in another case it will be defined in rel. This makes writing parsers more difficult and bug prone, but I think it's just a consideration that people should be aware of. Until a more robust JSON Hyper-Schema consumer library that provides a simple API for finding and resolving LDOs is available, the user will always have to consider these complications themselves.

@handrews
Contributor

handrews commented Nov 3, 2016

My minor concern also applies to rel - in one case it's the LDO's key, in another case it will be defined in rel. This makes writing parsers more difficult and bug prone

No, rel is always the key. It's never in the LDO as written out. The only cases to handle are whether there is one LDO attached to rel (the value is an object) or several (the value is an array of LDO objects). This can be normalized as the hyper-schema is read in.
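The normalization step mentioned here could be sketched roughly as below, assuming the proposed rel-keyed "links" object in which a value is either one LDO or an array of LDOs. The name normalize_links is hypothetical.

```python
def normalize_links(links):
    """Return a copy of a rel-keyed "links" object whose values are all lists."""
    normalized = {}
    for rel, value in links.items():
        # A lone LDO (a dict) becomes a one-element list; a list is copied.
        normalized[rel] = [value] if isinstance(value, dict) else list(value)
    return normalized


proposed = {
    "self": {"href": "/document/{id}"},
    "stylesheet": [{"href": "style.css"}, {"href": "base.css"}],
}
normalized = normalize_links(proposed)
```

After this pass, the object-or-array check happens once at load time rather than at every lookup.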

@handrews
Contributor

Having given this some thought and looked at other media types more, I think I am reluctantly against this.

  • RFC 5988 allows multiple space-separated "rel"s per link. While we could use keys with space-separated URI references, that's much more awkward than allowing a JSON array for a "rel" keyword in the LDO
  • URI templating for "rel" as requested in #56 is a really useful feature that would be much easier to manage if "rel" is kept separate (this allows us to potentially use a "relVars" in the manner of "hrefVars" and "baseVars" from Eliminate template pre-processing #142 if it is accepted)

It's easy enough to internally parse things into a dict, but, at least based on RFC 5988, link-to-rel is a many-to-many relationship, and the current LDO format is better for that.

@Relequestual
Member

Sorry. LOD?

@handrews
Contributor

handrews commented Dec 1, 2016

@Relequestual LDO == Link Description Object, a.k.a. the elements of the "links" array.

@Relequestual
Member

Right. I understand what an LDO is now. yay.

I'm struggling to work out what problem this issue is raising or trying to solve. Could the problem be rephrased with a clear problem statement, with an explanation of the suggested fix please?

@handrews
Contributor

@Relequestual The "links" keyword takes an array of LDOs because some link relation types allow a greater cardinality than 1. For instance, you may have many links of relation "stylesheet" because you are applying styles from multiple sources. You may have many links of relation "type" because your resource instantiates multiple types. etc.

At first glance, an array looks difficult to work with. If the link relation name is the important thing about finding which link to use, why not just make it an object keyed by link relation name? I think everyone starts out looking at the "links" keyword asking that- I know I and my colleagues on my last project did.

We were thinking of link types that were designed to appear only once, or if they did appear multiple times, each link was anchored to a separate point in the JSON representation (for example, representing a collection as an array of ids, and associating an "item" link with each id- there are many "item" links, but they just need to be specified once in the schema).

So I feel strongly that while using link relation names as a key seems nice at first, it is a trap and we should stick with the existing list approach. Otherwise, we need to do annoying things like have each link relation name take either an LDO or a list of LDOs (annoying to process) or make each link name map to a list of LDOs, even if that list often only has one LDO in it (annoying on general principle).

Does that help?

@Relequestual
Member

OK. Thanks for the explanation.

Unless we can get some reasoning from the other projects as to why they chose to do it that way, and why they consider it preferable to our current approach, I'm against making this change. Following the crowd is sometimes the way to go, but only if there is a valid reason beyond simply following for following's sake.

@handrews
Contributor

I've since come around to the array approach. It more closely matches formats such as HTTP's link header.

The main argument I can see in favor of the object approach is that most of the situations involving multiple links with the same relation type are handled by URI templating for us.

However, if an application wanted to generate an LDO list at runtime that is already fully resolved, then having it be in list format would make it easy to produce such a thing.

I'd like to make a decision on this one way or another, as it's a pretty fundamental thing for parsers to work with.

@awwright or anyone else still want to advocate for this change?

@handrews handrews added this to the draft-07 (wright-*-02) milestone Sep 14, 2017
@dlax
Member

dlax commented Sep 14, 2017

I've since come around to the array approach. It more closely matches formats such as HTTP's link header.

On the other hand, HTTP headers could not be represented as objects so I'm not sure how far the comparison can go.

The array model always looked strange to me, probably because other hypermedia formats use an object as mentioned above (interestingly, in JSON Home "links" is an object). I don't know the reason for this in other formats, but do we know the reason behind our array model? I guess one reason is that rel is not required, but this will likely change (#393).

@handrews
Contributor

rel was required up to and including draft-04, so that's not the reason.

@handrews
Contributor

I'm not dead set against changing this, but as @Relequestual notes it shouldn't just be because it seems a bit more convenient at casual glance. Sadly, I don't think anyone still here was involved in that decision.

@philsturgeon @geemus @dret any thoughts on this one?

I really think that if we do make it an object, then the value of each property needs to be an array to avoid the annoying "is this an object or array" check while working with the structure in memory. Is that really better? Maybe it is, I'm persuadable.

But since I feel like draft-07 will be the first really-close-to-feature-complete version of Hyper-Schema, I'd like to get this nailed down one way or the other. Hopefully we will start seeing implementations now, and I don't want to thrash on the fundamental data structure.

@dret
Contributor

dret commented Sep 14, 2017

this is too far into specific details for me to contribute to. but we had similar discussions when resolving dret/I-D#73. the fun fact underneath is that even though the HTTP Link syntax allows multiple rels, the actual model defines this as being a shortcut for defining multiple links (mnot/I-D#245). this might be something to take into account when working on your model and syntax (which you should make sure to keep neatly separate so that you always know what you're discussing).
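The model-versus-syntax distinction can be illustrated with a small Python sketch: a serialized link whose "rel" holds several space-separated relation types is treated as shorthand for one link per relation type. The dict shape and the name expand_rels are assumptions made for the example; this is not Hyper-Schema behavior.

```python
def expand_rels(link):
    """Split one serialized link with space-separated rels into one link per rel."""
    rels = link["rel"].split()
    # Each resulting link is a copy of the original with a single "rel".
    return [{**link, "rel": rel} for rel in rels]


serialized = {"rel": "related author", "href": "/foos"}
expanded = expand_rels(serialized)
```

Seen this way, the multi-valued "rel" exists only in the syntax, while the model contains nothing but single-relation links.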

@handrews
Contributor

Thanks, @dret! That all makes sense. It also ties into #350 (how to handle multiple rel values) and #319 (allow $ref for link description objects). One of the main reasons for #319 was as a way to attach different rel values to the same link description object (LDO).

Having links be an object mapping a relation to an LDO optimizes 1:1 relation:templated link. If it's an object mapping a relation to a list of LDOs, we support 1:*. But *:1 and *:* are not well-supported. Arrays can't be object keys in JSON, and splitting on whitespace in a structured format like JSON is unappealing.

However, if you can $ref an LDO into the array value, then you can $ref the same LDO under multiple link relation keys, so there's your *:* support. And it doesn't require figuring out an allOf equivalent for LDOs either (which is where #319 got bogged down). So that's a plus.

I think the biggest minus is that LDOs were clearly designed to be usable outside of JSON Hyper-Schema, which is why there is a separate links.json meta-schema for an LDO in isolation. That would no longer be as usable in that form, as the rel, which is a critical piece, would be missing. I suppose we could address that by changing links.json to be the meta-schema for the entire object, with keys and values. There is a nice feeling about each LDO being completely self-contained on its own, though. Hmm....

@geemus
Collaborator

geemus commented Sep 15, 2017

@handrews The one-link-per-rel case does seem more common, and referencing by rel keys is easier than selecting across rel values for all objects in an array. It is a big change for the sake of convenience, which may or may not be good. It does seem like a good idea to say that if it changed from an array of objects to an object, it should be an object of arrays (instead of having to do the tedious and easy-to-forget object-or-array logic). We definitely locally enforce "just always array so that things are easier" in our usage where there are other object-or-array cases.

TLDR: this change sounds nice. I fear the result is a bit sugary/magical though and is perhaps not consistent with other parts of the schema which tend to be rather explicit. I can definitely imagine it being easier to use/reason about though, so I'm torn.

@handrews
Contributor

... so I'm torn.

Yeah, me too.

If we don't reach a fairly clear consensus on changing this I'm probably inclined to leave it as-is.

If we want LDOs to each be self-contained, then they should stay individual objects in an array with a rel field. If we're OK with the minimum unit of working with LDOs being an object with a link relation key and the rest of the information in the value array of objects, then we can change.

The current way definitely feels nicer for working with a single link, and reasoning about the individual LDO as an RFC 5988bis-style link serialization format.

@handrews
Contributor

handrews commented Sep 18, 2017

I'd really like to decide this for draft-07. Does anyone want to advocate for making the change? @awwright? @Anthropic?

One thing that might help is to look at how a minimal single link, or multiple links, would look.


For a single, minimal link:

{"rel": "related", "href": "/foos"}

vs

{"related": [{"href": "/foos"}]}

For two links with the same relation:

[
    {"rel": "related", "href": "/foos"},
    {"rel": "related", "href": "/bars"}
]

vs

{
    "related": [
        {"href": "/foos"},
        {"href": "/bars"}
    ]
}

Adding a third link with a different relation:

[
    {"rel": "related", "href": "/foos"},
    {"rel": "related", "href": "/bars"},
    {"rel": "author", "href": "/people/{authorId}"}
]

vs

{
    "related": [
        {"href": "/foos"},
        {"href": "/bars"}
    ],
    "author": [
        {"href": "/people/{authorId}"}
    ]
}

@Relequestual
Member

As a number of people are indifferent about this change, it feels like wasted effort to continue discussing it until someone can come up with a clear reason why it's "better". It may look slightly cleaner, but I'm more interested in the effect it would have on implementations, and there would be some.

For now I'd be inclined to close the issue. If anyone can strongly argue why it would be an improvement, please do so, and we can re-open or create a new issue.

@handrews
Contributor

@Relequestual that is a really good point about implementations, and has made me think things through again a little differently. I'm going to line up the pros and cons that I see and make a (possibly overly complicated) strawman proposal:

"links" as an array of flat LDO objects

PROS

  • the links# meta-schema describes a single LDO as a flat, usable-as-stand-alone object
  • this sort of object is easy to move around in memory without losing the link relation type
  • at least one other JSON link serialization, linksets, uses this approach (with a multi-valued array for rel), and serialization of HTTP link headers ends up looking more like this, per JFV
  • inertia :-)

CONS

"links" as an object keyed by relation type, with array-of-LDO values

PROS

  • it's more common in other JSON serializations: JSON Home, JSON HAL, JSON API, OpenAPI
  • it enables re-use of the same link with different relation types using $ref without needing allOf (see Allow "$ref" for LDOs #319 for why allOf is challenging in this context)
  • it enforces requiring a relation type more strictly than marking rel as required

CONS

  • other types of re-use (such as re-using the same href+hrefRequired+hrefSchema combination) are no easier than they are now
  • it would be easy for the relation to become detached from the in-memory LDO
  • The links# meta-schema would presumably need to be for the entire object in order to include the relation type, which is fine for sets of links, but slightly awkward for a single link

Strawman proposal

We could try a best of both worlds approach, optimizing for hyper-schema author convenience but with guidance on implementation best practices. This would look something like:

  • "links" would be an object keyed by the relation type, with array values
  • It would be RECOMMENDED that implementations store the link relation type inside of the in-memory LDO, no matter how links are looked up. Note that this is not unlike what implementations are likely to do today, which is build a hash object that looks like the proposed "links"-as-object format for easy lookup.
  • For this reason rel would be reserved in the LDO (the object inside the array values). Schema authors MAY include rel, but if it differs from the relation in the object key, implementations MUST raise an error.
  • the stand-alone links# meta-schema would still include rel as a required field, and just be for a single LDO.
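As a rough illustration of the strawman, the Python sketch below loads a rel-keyed "links" object into self-contained in-memory LDOs, copying the key into each LDO as "rel" and rejecting an explicit "rel" that disagrees with its key. The name load_links and the choice of ValueError are assumptions; the proposal only says implementations MUST raise an error.

```python
def load_links(links):
    """Flatten a rel-keyed "links" object into self-contained in-memory LDOs."""
    ldos = []
    for rel, values in links.items():
        for ldo in values:
            # An explicit "rel" is allowed but must agree with the object key.
            if "rel" in ldo and ldo["rel"] != rel:
                raise ValueError(
                    f"rel {ldo['rel']!r} conflicts with key {rel!r}"
                )
            # Store the relation type inside the in-memory LDO.
            ldos.append({**ldo, "rel": rel})
    return ldos


schema_links = {
    "related": [{"href": "/foos"}, {"href": "/bars"}],
    "author": [{"href": "/people/{authorId}"}],
}
in_memory = load_links(schema_links)
```

The result has the same shape as the current array form, so each LDO can be passed around without losing its relation type.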

Thoughts? I'm still not sure what I support, but I'm trying to figure out what factors should go into the decision. I agree with @Relequestual that if no one finds an argument for the change compelling, we should leave it as-is.

@awwright awwright self-assigned this Sep 19, 2017
@dlax
Member

dlax commented Sep 20, 2017

@handrews I like your strawman proposal and I'd thus be in favor of a change.

@handrews
Contributor

I've encountered some other arguments for stand-alone links elsewhere, and we are already changing a lot of things in hyper-schema right now. I am also increasingly leaning against changing this. I would like to defer it for consideration once we have people looking at hyper-schema more seriously, which hopefully this draft will accomplish.

@handrews handrews modified the milestones: draft-07 (wright-*-02), draft-future Oct 16, 2017
@handrews
Contributor

Coming back to this, I'm still inclined to stay with the array. While I can't find the right issue comment right now, the Web of Things group ended up going with an array after initially considering an object. And they have a lot more people with a lot more hypermedia experience contributing to their spec than we do :-)

The biggest problem that I see with the array is that it's impossible to reliably address links with a JSON Pointer, should we want to do so. Position-based indexing is inherently fragile when the position has no semantic meaning. In such situations, people often re-order the LDOs for whatever aesthetic reasons (e.g. documentation presentation).
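The fragility of position-based addressing can be shown with a short Python sketch. The resolver below is a simplified JSON Pointer (RFC 6901) implementation written for this example (it ignores the "-" token and does no error handling), and the schema contents are illustrative.

```python
def resolve_pointer(document, pointer):
    """Resolve a JSON Pointer (RFC 6901) against nested dicts and lists."""
    target = document
    for token in pointer.split("/")[1:]:
        # Undo the RFC 6901 escapes: "~1" is "/", then "~0" is "~".
        token = token.replace("~1", "/").replace("~0", "~")
        target = target[int(token)] if isinstance(target, list) else target[token]
    return target


schema = {
    "links": [
        {"rel": "related", "href": "/foos"},
        {"rel": "author", "href": "/people/{authorId}"},
    ]
}
before = resolve_pointer(schema, "/links/0")["rel"]

# Re-ordering the LDOs (say, for documentation) silently retargets the pointer.
schema["links"].reverse()
after = resolve_pointer(schema, "/links/0")["rel"]
```

The same pointer, "/links/0", identifies the "related" link before the re-ordering and the "author" link afterwards, with no error to signal the change.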

On the other hand, having a defined (if not semantically meaningful) order is very useful for documentation presentation.

An alternative to JSON Pointer addressing could be to extend the ability to declare a plain-name fragment to links. I'm not sure how this would need to work with the core spec and media type definition. I'd be reluctant to allow $id itself in the LDO, as the current expectation is that anything with an $id is a schema object.

Setting aside the mechanism (we can hammer that out if we actually think this is a good idea), being able to give an LDO a plain-name fragment identifier would allow it to be addressed independent of location within the document. Potential use cases for this include:

  • Re-use of complete or partial LDOs (Allow "$ref" for LDOs #319)
  • Along the lines of the OpenAPI Link Object (and Operation ID) some sort of vocabulary that identifies workflows (follow this link, then that link, etc.)

Those use cases aren't strong enough to justify this so far, but I'm putting it out there to hopefully spur some thought.

@handrews
Contributor

It's been more than six months since I said I was leaning against changing this, and nearly a month since my last comment, which has much more to do with issue #319 ($ref for LDOs) and maybe issue #350 anyway.

No one has emerged to champion a clear alternative, so I'm closing this out. We will stick with the array.

7 participants