RFC 6906: Also additional structural constraints? #94
On 2018-01-03 05:12, Ruben Verborgh wrote:
RFC 6906 states:
For the purpose of this specification, a profile can be
described as additional semantics that can be used to process a
resource representation, such as constraints, conventions,
extensions, or any other aspects that do not alter the basic media
type semantics.
May I suggest "additional semantics and/or structural constraints"?
i think i agree, but i am wondering why you're just mentioning
structural constraints? to me a profile might also just constrain values
in some way, leaving the underlying media type's structural properties
as they are.
also, the semantics are just a (non-formalized) side-effect of the
constraints, right? what about this:
"additional semantics (represented through structural and/or value-based
constraints)"
|
Because structural constraints are not necessarily “additional semantics”. I.e., we could imagine creating a profile that imposes certain structural constraints on a JSON file, but no semantics for them. Such a profile would not fall under the definition of “a profile can be described as additional semantics”. |
On 2018-01-06 01:14, Ruben Verborgh wrote:
i am wondering why you're just mentioning structural constraints?
Because structural constraints are not necessarily “additional
semantics”. I.e., we could imagine creating a profile that imposes
certain structural constraints on a JSON file, but no semantics for
them. Such a profile would not fall under the definition of “a profile
can be described as additional semantics”.
that's a good point. but aren't the semantics just the (intentional)
"side-effect" of the constraints? i am mostly thinking that what defines
a profile are the constraints (that i have to adhere to when using that
profile). what these "mean" is what is important for users of
the profile, but the profile really is just the messenger.
so i am wondering whether it actually would be more accurate to define
profiles by their virtue of adding constraints, and then explaining that
those constraints most often are put into place to back semantics that
are based on those constraints being adhered to.
|
Might be, but not necessarily. Just knowing that data has a certain shape is already useful.
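Agree on a profile defining constraints. Here are (non-final) definitions we came up with in the context of the DXWG working group:
* A media type is a set of syntactic constraints, structural constraints, and/or semantic interpretations that can be used to serialize information content.
* A profile is a set of structural constraints and/or semantic interpretations that can apply to information content in addition to constraints and interpretations mandated by a media type.
So we basically identified three groups of constraints (syntax / structure / semantics), of which media types can influence all three, but profiles only the latter two. |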
On 2018-01-08 02:18, Ruben Verborgh wrote:
but aren't the semantics just the (intentional) "side-effect" of the
constraints?
Might be, but not necessarily. Just knowing that data has a certain
shape is already useful.
true. which is why i am suggesting to focus on the constraints, and then
just treat the semantics as the usual explanation of why people are
doing that.
so i am wondering whether it actually would be more accurate to define
profiles by their virtue of adding constraints
Agree on a profile defining constraints. Here are (non-final)
definitions we came up with in the context of the DXWG working group:
* A *media type* is a set of syntactic constraints, structural
constraints, and/or semantic interpretations that can be used to
serialize information content.
that's kind of odd. by definition, syntax is about structure, so at
least for the textbook definitions of these two words it seems odd to
separate those two. what definitions were used to come up with these two
as distinct categories?
https://en.wikipedia.org/wiki/Syntax
* A *profile* is a set of structural constraints and/or semantic
interpretations that can apply to information content in addition to
constraints and interpretations mandated by a media type.
sounds reasonable (minus the general syntax/structure oddity). i am
struggling to understand how a profile could be defined/useful when it's
just semantics, but without making that tangible via any constraints.
then it seems a profile isn't actionable in any shape or form and i fail
to see its utility as a "formal" profile.
|
Syntax: JSON, XML, … Structure: document with these specific elements. For instance, HAL has a JSON syntax with the extra structural constraints of having _links and _embedded elements. |
On 2018-01-08 23:01, Ruben Verborgh wrote:
that's kind of odd. by definition, syntax is about structure, so at
least for the textbook definitions of these two words it seems odd to
separate those two. what definitions were used to come up with these two
as distinct categories?
Syntax: JSON, XML, …
Structure: document with these specific elements
For instance, HAL has a JSON syntax with the extra structural
constraints of having |_links| and |_embedded| elements.
that just seems a rather non-standard use of long established
terminology. syntax defines structure (that's pretty much all it does),
it defines how to use the symbols used in those languages to represent
the structured data that they represent.
it might be a bit easier for others to tune into all of this if it used
words more in the way they are commonly used. HAL uses JSON syntax and
structure and imposes additional constraints on top of that, but that's
very different from saying that syntax and structure are different things.
(i just looked into the DXWG profile-related pages and was amazed by the
fact that they don't even mention RFC 6906. i tend to agree with
@handrews on all of this: at least acknowledge what's out there instead
of reinventing a new terminology world. or if you do, maybe it's easier
for all involved to pick new terms.)
|
Low-level structure, yes. The fact that keys in a JSON document are strings. I'd say that the fact that JSON keys are delimited by " is a syntactical constraint, and the fact that HAL requires a _links element is a structural constraint. The difference is important because regular JSON documents, HAL, JSON-LD, etc. can all be parsed by a JSON parser. So they share the same syntactic constraints. On top of that, HAL and JSON-LD have additional structural constraints.
I'm open to a more accurate naming. But to me this is the crucial thing in profiles: the media type defines what parser to use, and the profile defines what structural and semantic assumptions you're allowed to make. So I want to distinguish somehow in a meaningful way.
They're just drafts of how individuals in the group think profiles should be defined.
Obviously RFC 6906 will have a place in the end result. |
On 2018-01-08 23:20, Ruben Verborgh wrote:
syntax defines structure (that's pretty much all it does),
Low-level structure, yes. The fact that keys in a JSON document are strings.
well, whatever it takes to define the media type. exactly that, and not
more. that's what syntax is all about.
I'd say that the fact that JSON keys are delimited by |"| is a
syntactical constraint, and the fact that HAL requires a |_links|
element is a structural constraint.
apparently that's what you say. all i want to point out is that this is
a rather specific way of using established terminology, and it may not
help if you want others to understand things. syntax is structure. that's
what syntax is all about.
The difference is important because regular JSON documents, HAL,
JSON-LD, etc. can all be parsed by a JSON parser. So they share the same
syntactic constraints. On top of that, HAL and JSON-LD have additional
structural constraints.
they have additional constraints, yes. the term "structural" here
implies a precision and a distinction that doesn't exist. you could
easily have cases with no "structural" constraints and just value ones.
those would be equally valid examples.
anyway. this discussion (and thanks for it!) makes me confident that
profiles should very strictly talk about being constraints, without
qualifying that in any further way.
|
I guess the only thing I wanted to do was to say that “profiles should not require clients to use a different parser”. So (only) the media type determines the parser. Coming back to the same example: HAL and JSON-LD both use a JSON parser, but then make some additional assumptions about the shape of the resulting in-memory representation. (So they can be considered profiles on top of JSON instead of media types; and had the technology been available, at least HAL shouldn't have required its own media type). JSON-LD will not throw a "syntax error" on a valid JSON document, even if that document is invalid JSON-LD. I don't mind using other terminology. I understand we're in disagreement regarding syntax/structure (and I'm willing to change).
So are we also in disagreement about the parser being a meaningful distinction? So that there is some difference between the kind of constraints that differentiates If there's a difference, and I think there is, I'm looking for a term to indicate that. If there is no difference, what distinguishes a profile from a media type? |
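The parser distinction discussed above can be made concrete with a minimal Python sketch. It assumes, following the simplification used in this thread, that a HAL-like profile requires a top-level `_links` object. The names `ProfileViolation`, `check_hal_like`, and `classify` are hypothetical and do not come from any HAL or JSON-LD library.

```python
import json


class ProfileViolation(Exception):
    """Raised when parsed data breaks a profile constraint (hypothetical name)."""


def check_hal_like(model):
    # Assumption taken from the thread: the profile requires a top-level
    # "_links" object. This is a stand-in, not the full HAL rules.
    if not isinstance(model, dict) or not isinstance(model.get("_links"), dict):
        raise ProfileViolation("missing top-level _links object")
    return model


def classify(text):
    """Return 'syntax error', 'profile violation', or 'ok' for a JSON text."""
    try:
        model = json.loads(text)  # stage 1: the media type selects the parser
    except json.JSONDecodeError:
        return "syntax error"
    try:
        check_hal_like(model)  # stage 2: the profile constrains the parsed model
    except ProfileViolation:
        return "profile violation"
    return "ok"
```

Under these assumptions, a document that is valid JSON but lacks `_links` is reported as a profile violation and never as a syntax error, which is the distinction the comment is after.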
On 2018-01-09 00:01, Ruben Verborgh wrote:
If there's a difference, and I think there is, I'm looking for a term to
indicate that. If there is no difference, what distinguishes a profile
from a media type?
that's an easy one. media types are created out of thin air and are
self-contained. profiles are always based on a media type.
|
Is HAL a profile or a media type? JSON-LD? RDF/XML? |
Also, can profiles apply to only a single media type then? |
On 2018-01-09 00:07, Ruben Verborgh wrote:
Is HAL a profile or a media type? JSON-LD? RDF/XML?
clearly these are media types, as they define themselves as such.
|
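"clearly these are media types, as they define themselves as such" conflicts with "media types are created out of thin air and are self-contained"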
|
On 2018-01-09 00:08, Ruben Verborgh wrote:
Also, can profiles apply to only a single media type then?
https://tools.ietf.org/html/rfc6906#section-3
"While this specification
associates profiles with resource representations, creators and users
of profiles MAY define and manage them in a way that allows them to
be used across media types; thus, they could be associated with a
resource, independent of their representations (i.e., using the same
profile URI for different media types). However, such a design is
outside of the scope of this specification, and clients SHOULD treat
profiles as being associated with a resource representation."
|
On 2018-01-09 00:19, Ruben Verborgh wrote:
clearly these are media types, as they define themselves as such.
conflicts with
media types are created out of thin air and are
self-contained
not really. whether media types include or transclude things they may be
built on is a pure technicality. you could easily define all of these in
a way that has none of the dependencies that you probably refer to.
|
Okay, then it's clear that we have different starting points. I don't see such transclusion as a technicality: for me, a media type is associated with a parser. So if something does not require a different parser, then it shouldn't be a media type. That's the reason why I think profiles are useful: you can add additional assumptions that can be made after the parsing stage. The benefit of seeing HAL and JSON-LD as profiles of JSON, is that they can be combined (an argument I've discussed here). That is, one can perfectly imagine a JSON document that both adheres to the HAL constraints and the JSON-LD constraints—but using MIME types for these two instead of profiles, prevents a client from using that. I hope this also shows why it's important for me to distinguish between constraints that affect parsing (which I referred to as “syntax”) and others (which I—perhaps inaccurately—referred to as “structure”). However, there's something more fundamental:
With that definition, any profile that is tied to one specific media type, could equally be considered a media type itself, given that it then transcludes the first media type.
I have an issue with that phrasing: it's not because a profile is used across media types that this profile is necessarily associated with the resource. Given media types X, Y and profiles A, B, I might be able to represent a resource as X+A, X+B, Y+A, Y+B. |
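The argument that profiles can be combined, while two distinct MIME types cannot, might be sketched as follows. This is only an illustration under stated assumptions: the two checks are heavily simplified stand-ins for the HAL and JSON-LD constraints, and the names `PROFILE_CHECKS` and `satisfied_profiles` are hypothetical.

```python
import json

# Hypothetical, heavily simplified stand-ins for profile constraints.
# Real HAL and JSON-LD rules are far richer than these one-line checks.
PROFILE_CHECKS = {
    "hal-like": lambda m: isinstance(m, dict) and isinstance(m.get("_links"), dict),
    "json-ld-like": lambda m: isinstance(m, dict) and "@context" in m,
}


def satisfied_profiles(text):
    """Parse once as JSON, then report every profile whose check passes."""
    model = json.loads(text)
    return sorted(name for name, check in PROFILE_CHECKS.items() if check(model))
```

A single JSON document carrying both `_links` and `@context` would satisfy both checks at once, whereas a response labelled with exactly one media type can only announce one of the two.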
On 2018-01-09 00:41, Ruben Verborgh wrote:
Okay, then it's clear that we have different starting points. I don't
see such transclusion as a technicality: for me, a media type is
associated with a parser. So if something does not require a different
parser, then it shouldn't be a media type. That's the reason why I think
profiles are useful: you can add additional assumptions that can be made
after the parsing stage.
what's a "parser" for you? you could argue that atom shouldn't be a
media type because an XML parser is all you need? or you could argue
because you'd actually want a feed to be parsed into feed-level
structures, meaning that you need a parser? again, i think you're
implying a precision/distinction here that doesn't exist (at the level
of clarity you seem to be after).
i've certainly seen people doing both: processing feeds as XML and then
writing their own XPaths assuming that the XML *is* a feed. or
processing feeds with an integrated package that consumes raw XML and
spits out some "feed DOM" that already addresses some of the
peculiarities of feeds, such as how to handle/derive "author" info.
for podcasts for example you would have three levels of parsing/models:
first parse the XML to get the feed. that gives you an XML model. then
you can interpret the feed structures to get to a feed model. and then
you can interpret the podcast structures to get to a podcast model. how
all of this is implemented is opaque. it's what pretty much always goes
on: different levels of abstraction layered on top of each other.
profiles just say that there's an additional level, that's all there is
to it.
The benefit of seeing HAL and JSON-LD as profiles of JSON, is that they
can be combined (an argument I've discussed here
<https://ruben.verborgh.org/articles/fine-grained-content-negotiation/#possible-but-inadequate-workarounds-p-1>).
That is, one can perfectly imagine a JSON document that both adheres to
the HAL constraints and the JSON-LD constraints—but using MIME types for
these two instead of profiles, prevents a client from using that.
but they *are* media types, so there's little you can do.
I hope this also shows why it's important for me to distinguish between
constraints that affect parsing (which I referred to as “syntax”) and
others (which I—perhaps inaccurately—referred to as “structure”).
i still don't get that. i know that you want things to be clear-cut, but
i cannot see a way how to see things that way without redefining what's
out there already, and implying dualities that aren't quite as clear.
for example, you could have an "XML profile" that said attributes always
must use quotes (and not apostrophes). that implies a specific parser
(feature) and has no structural implications (given your definition of
structure). wouldn't that be an acceptable profile?
I have an issue with that phrasing: it's not because a profile is used
across media types that this profile is necessarily associated with the
resource. Given media types X, Y and profiles A, B, I might be able to
represent a resource as X+A, X+B, Y+A, Y+B.
nothing in RFC 6906 keeps you from doing that.
|
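The three levels described for podcasts in the comment above (XML model, feed model, podcast model) can be sketched in Python with the standard library. This is a rough illustration, not how any particular feed library works: it assumes an Atom feed in which episodes are entries carrying a `rel="enclosure"` link, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_xml(text):
    """Level 1: turn the serialized text into a generic XML model."""
    return ET.fromstring(text)


def feed_model(root):
    """Level 2: interpret the XML model as an Atom feed (list of entries)."""
    if root.tag != ATOM + "feed":
        raise ValueError("not an Atom feed")
    return [
        {
            "title": entry.findtext(ATOM + "title", default=""),
            "links": [dict(link.attrib) for link in entry.findall(ATOM + "link")],
        }
        for entry in root.findall(ATOM + "entry")
    ]


def podcast_model(entries):
    """Level 3: interpret feed entries as episodes.

    Assumption for this sketch: an episode is any entry with a
    rel="enclosure" link.
    """
    episodes = []
    for entry in entries:
        for link in entry["links"]:
            if link.get("rel") == "enclosure":
                episodes.append({"title": entry["title"], "audio": link.get("href")})
    return episodes
```

Whether level 2 and level 3 count as "parsers" or as conversions between in-memory models is exactly the point under dispute here; the sketch only shows that the layers can be written separately.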
Something that processes a representation's stream of bytes into a higher-level model.
Indeed. An Atom document is an XML document conforming to the (to be defined) Atom profile.
But that wouldn't be a parser of the representation sent by the server. It would be a convertor from an XML in-memory model to a list of feeds.
Can you point me to one Atom implementation that doesn't parse the document as XML first? A HAL parser that doesn't parse JSON first? A JSON-LD parser that doesn't parse JSON first?
Yes, and to me that lowest level is the document type, such as XML or JSON. They have common parsers (as in "convertors from bytes to in-memory objects"). All the higher levels are profiles; they do not operate on the bytes in the representation.
I don't intend to fix the past, but rather to make it easier and more flexible to define new things in the future. So if a new HAL 2.0 comes up, that it can be defined as a profile on top of JSON,
It would not be a profile to me, but a media type. Hence my definition of media type as a set of [byte-level] syntactic, [model-level] structural, and semantic constraints, and a profile as only [model-level] structural and semantic constraints but not [byte-level] syntax.
Indeed, but my comment is that the phrasing seems to imply that, when profiles are used across media types, they are associated with the resource instead of the representation. I suggest to change the phrasing, as this is not necessarily the case. Plus, this point is still open:
|
On 2018-01-09 11:13, Ruben Verborgh wrote:
what's a "parser" for you?
Something that processes a representation's stream of bytes into a
higher-level model.
there often are layered higher-level models. given this definition a
parser can also parse bytes into a "feed DOM".
you could argue that atom shouldn't be a
media type because an XML parser is all you need?
Indeed. An Atom document is an XML document conforming to the (to be
defined) Atom profile.
Unfortunately, it's defined differently because profiles didn't exist at
the time. Yet all Atom libraries first parse the regular XML document,
and then only start applying the specific Atom structural and semantic
constraints.
this is not how RFC 6906 defines profiles. you may want to change
reality to this, but (a) reality is different and hard to change, and
(b) this would be some non-6906 profile concept to be used for this.
or you could argue
because you'd actually want a feed to be parsed into feed-level
structures, meaning that you need a parser?
But that wouldn't be a parser of the representation sent by the server.
It would be a convertor from an XML in-memory model to a list of feeds.
maybe. who are we to decide how bits-on-the-wire get parsed into
application models?
different levels of abstraction layered on top of each other.
Yes, and to me that lowest level is the document type, such as XML or
JSON.
wouldn't the lowest level for both be unicode? i'd hope that few XML or
JSON parsers implement unicode from scratch. but i don't know and i
don't have to know.
i cannot see a way how to see things that way without redefining what's
out there already
I don't intend to fix the past, but rather to make it easier and more
flexible to define new things in the future.
So if a new HAL 2.0 comes up, that it can be defined as a profile on top
of JSON,
again, that would be for a non-6906 profile concept.
for example, you could have an "XML profile" that said attributes always
must use quotes (and not apostrophes). that implies a specific parser
(feature) and has no structural implications (given your definition of
structure). wouldn't that be an acceptable profile?
It would not be a profile to me, but a media type. Hence my definition
of media type as a set of [byte-level] syntactic, [model-level]
structural, and semantic constraints, and a profile as only
[model-level] structural and semantic constraints but not [byte-level]
syntax.
seems like our profile concepts are diametrically opposed.
nothing in RFC 6906 keeps you from doing that.
Indeed, but my comment is that the phrasing /seems/ to imply that, when
profiles are used across media types, they are associated with the
resource instead of the representation. I suggest to change the
phrasing, as this is not necessarily the case.
ok, can you maybe raise an issue for that or submit a PR? i think in
most places the text is pretty clear that profiles constrain
representations.
Plus, this point is still open:
whether media types include or transclude things they may be
built on is a pure technicality.
With that definition, any profile that is tied to one specific
media type, could equally be considered a media type itself,
given that it then transcludes the first media type.
very true. you could take any profile and turn it into a media type,
severing its connections with its foundation. but then you cannot
conveniently treat a podcast as a feed anymore, which is why the profile
concept fragments the landscape a little less.
|
Alright, thanks for the discussion, @dret. I've learned that we indeed have something different in mind. The good thing is that I don't see an incompatibility with the phrasing as it currently is in RFC 6906, so I'll keep an eye on that in the future as well.
RFC 6906 does not define a profile at the moment, and the text is compatible with the notion of a profile I propose (and I'm happy with that).
That's a charset matter, and a separate concern with a separate header.
Why? It is not incompatible with anything in 6906.
Done in #95.
Then this is the main reason why that concept of a profile is not of any use to me.
…which is why I'd want future HAL, Atom, etc. all to be profiles. |
On 2018-01-10 02:38, Ruben Verborgh wrote:
this is not how RFC 6906 defines profiles
RFC 6906 does not define a profile at the moment, and the text is
compatible with the notion of a profile I propose (and I'm happy with that).
you keep saying that and i don't understand why. you're hunting for
something i've seen people calling "schema" or "type": an added layer of
abstraction, a model on top of some generic metamodel structure.
but i see that RFC 6906 is not clear enough. i'll try to change that to
make sure things are easier to understand.
you could take any profile and turn it into a media type,
severing its connections with its foundation.
Then this is the main reason why that concept of a profile is not of any
use to me.
The attraction of my notion of profiles is precisely that they offer
something a media type cannot.
Might need another name though then. Perhaps /features/ (as in here
<https://arxiv.org/pdf/1609.07108v2.pdf>).
hmmmm.... i have a really hard time imagining how your alternative
notion of a profile would be any different regarding this aspect. people
could easily ignore it and keep minting media types, and there would be
little you could do about it (other than disliking it).
keep in mind that the main motivation for RFC 6906 was to make media
types easier to reuse and refine, so that people don't have to create
media types and can create and use profiles instead. but that doesn't
mean anybody can keep them from doing that, if they feel like doing it.
|
trying to make it as clear as possible that a profile is not a schema: it refines a schema (the one of the media type), but doesn't add a completely new abstraction layer.
Part a) "does not define a profile" is because RFC 6906 says "For the purpose of this specification, a profile can be described as…" but never "a profile is". Note that this is not changed by 4efda97, whose commit message says "trying to make it as clear as possible that a profile is not a schema" but the actual RFC text does not state that fact. It says "an easy way to conceptualize profiles is […]", but that does not conclusively say whether or not a profile can be a schema. The clearest way IMHO is to write "a schema is not a profile". I'm not trying to be pedantic here, but either RFC 6906 should use exact wording to say "a profile is" and "a profile is not", or many interpretations (including mine) will be compatible. If the latter is on purpose, fine (and actually my preference), but then we should not assume a strict definition of a profile based on RFC 6906.
A schema seems to imply something much more strict to me. Profiles can be really light constraints.
Obviously. But the situation now is that people cannot do profiles at all (in the way we need it, with multiple profiles per resource, conneg etc.), so are forced to keep minting media types. I just want to offer an alternative, but I can't and won't force anybody.
Yeah, but the only distinction between a profile (based on a media type) and a media type then seems to be just whether somebody decides to call it a profile or a media type, especially given that you consider transclusion in a media type definition a technicality. Then it also seems a technicality whether we define something as a profile or a media type, really. Nonetheless, this main motivation is something we share, so it is in a sense strange that we seem to have arrived at very different conclusions from it. I seem to be more radical in that everything that is JSON (XML) should for me, in an ideal future, just have a media type of application/json (application/xml), no subtypes required. Instead, the response indicates compliance with one or multiple profiles, which allows the client to make additional assumptions about the shape and semantics of that JSON. This recognizes the fact that all processors of JSON (XML) subtypes indeed start with a JSON (XML) parser, which I do not consider a technicality since I have not heard about a single exception. A secondary motivation for me is that the overwhelming majority of application/json API responses are underspecified: clients make many more assumptions than only application/json. Profiles can make these assumptions explicit, without having to resort to specific media types such as application/vnd.my+json that have no formal relation to application/json. Instead, they are marked as application/json tagged with profile/a and profile/b, which tells the client "use a JSON parser" and "you can make additional assumptions a and b". |
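One way a client could read "application/json tagged with profile/a and profile/b" is sketched below. It assumes the profiles are announced through `Link` header entries with `rel="profile"`, the link relation that RFC 6906 registers. The example URIs and the function name `describe_response` are hypothetical, and the regular expression is a deliberate simplification rather than a complete Link header parser.

```python
import re

# Simplified matcher for entries such as: <uri>; rel="profile"
# Assumption: this does not cover the full Link header grammar.
LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def describe_response(content_type, link_header):
    """Split a response into 'which parser' and 'which extra assumptions'."""
    # The media type (without parameters) tells the client which parser to use.
    media_type = content_type.split(";")[0].strip().lower()
    # Each rel="profile" link names one set of additional assumptions.
    profiles = [
        uri
        for uri, rel in LINK_ENTRY.findall(link_header)
        if "profile" in rel.split()
    ]
    return media_type, profiles
```

With a `Content-Type` of `application/json` and two profile links, the client would learn "use a JSON parser" from the first value and "assumptions a and b hold" from the second.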
I'm really confused by this (and not just here and with you, @RubenVerborgh, I've encountered it from others at the JSON Schema project and elsewhere). We have the RFC 6906 author telling us the intent of the RFC. And admitting that it needs clarification and working on the clarification. And I agree that having the language be more definitive would help and reduce the tendency of people to re-interpret this RFC however they please. But given the intended definition, if we don't find his definition of "profile" useful because we need a somewhat similar but ultimately different concept or behavior, why are we trying to tell him what "profile" means? Why not just make up our own link relation / media type parameter / http preference that does what we want? That is why I am proposing a "schema" relation/profile/preference. @RubenVerborgh I do think you bring up really interesting points about "primary" media types vs structured suffixes vs profiles vs schemas. Which I need to think more on as I just woke up and the caffeine hasn't entirely kicked in yet. I think I like the distinctions you are proposing, whether they work with the "profile" terminology or need a new name. |
I wasn't—just trying to understand 😄
I always try to reuse first. And I still can, if the phrasing of 6906 doesn't fundamentally change.
Schema is too narrow, I think.
I like that notion of "primary"!
More at https://ruben.verborgh.org/articles/fine-grained-content-negotiation/ if you like. Open to other terminology! |
On 2018-01-11 08:42, Henry Andrews wrote:
I cannot find a single sentence in 6906 that contradicts my
interpretation.
I'm really confused by this (and not just here and with you,
@RubenVerborgh <https://github.com/rubenverborgh>, I've encountered it
from others at the JSON Schema project and elsewhere).
thanks for channeling my confusion/frustration, @handrews.
We have the RFC 6906 author telling us the intent of the RFC. And
admitting that it needs clarification and working on the clarification.
And I agree that having the language be more definitive would help and
reduce the tendency of people to re-interpret this RFC however they please.
the latest commits should be pretty clear. they say that it's not ok to
add a new abstraction layer with a profile, and that it's only ok to
incrementally add to an existing one. i have a hard time seeing what's
hard to understand there.
4efda97#diff-6023bdc7ab5f5743f9447d322b3846f4
But given the intended definition, if we don't find his definition of
"profile" useful because we need a somewhat similar but ultimately
different concept or behavior, why are we trying to tell him what
"profile" means? Why not just make up our own link relation / media type
parameter / http preference that does what we want? That is why I am
proposing a "schema" relation/profile/preference.
that makes sense to me, if you want to signal schemas. @RubenVerborgh's
vision seems a bit nebulous so far: some feature that's adding complete
new abstraction layers, but it's not a schema. then how does one know
how anything is represented?
@RubenVerborgh <https://github.com/rubenverborgh> I do think you bring
up really interesting points about "primary" media types vs structured
suffixes vs profiles vs schemas. Which I need to think more on as I just
woke up and the caffeine hasn't entirely kicked in yet. I think I like
the distinctions you are proposing, whether they work with the "profile"
terminology or need a new name.
i'd be more than happy to help with whatever else may crystallize. we
may have a good opportunity here with the rewrite of "profile" and some
momentum behind something that maybe could be made nicely complementary
instead of competing.
|
On 2018-01-11 09:57, Ruben Verborgh wrote:
why are we trying to tell him what "profile" means?
I wasn't—just trying to understand 😄
Conclusion so far: we apparently mean something different, even though
6906 doesn't state so.
i think i am simply giving up here. the draft is as clear as i can
possibly make it in saying that it's not intended to be used for
establishing new abstraction layers.
Why not just make up our own link relation / media type parameter /
http preference that does what we want?
I always try to reuse first. And I still can, if the phrasing of 6906
doesn't fundamentally change.
that would be an odd interpretation of "reuse", after the discussions
we've had so far.
|
Truth is, you never know whether a text is clear until you ask others. How about we ask a couple of experts to explain, based on the current text, their understanding of a profile? We could even make this very simple with a set of yes/no questions. If they understand, we can conclude the text is clear.
For one, when is something an "abstraction layer" and when isn't it? But I also don't see how that statement changes anything we have discussed above.
Not necessarily a schema—it can be. “My” profile is any set of (high-level) structural or semantic constraints. Let me clear up the nebula by making this very concrete. Quick fictitious examples of profiles:
Note how multiple profiles can apply to the same resource. For instance, both schema-org-book and main-title could apply to a JSON-LD document. For real-world examples, imagine that things such as Atom and HAL had been defined as profiles rather than new MIME types. The case of HAL is especially interesting here. Current situation
Proposed situation (I know we can't change the past, but it's more of a “what if HAL were invented after profiles” thing for illustrative purposes)
So the client will see this as a JSON document, that has the HAL structural properties and semantics ( Moreover, both can be reused independently of each other.
Yes, and I honestly don't think we're that far. We have different ideas of what a profile should be, but if it is not specified too strictly (as is the case now), it works for both. |
i'd be more than happy to ask others, if that is what it takes to resolve this issue. feel free to reach out and see what we get in response! |
@dret regarding:
the link you supplied is adding this sentence:
I love abstract stuff. I prefer abstract descriptions. As you may have noticed over at JSON Schema, every time someone demands a concrete example I wail and gnash my teeth and bemoan that no one likes or understands my example anyway.
That said, I really cannot wrap my head around this at all. A profile is either an augmentation or refinement of a media type? Can it be both? At that point is there even much restriction to it at all? You also bring in the word "schema" which has proven confusing in this context as well. The whole thing is kind of circular and when I try to dig into it there's just not a lot of there there for me. (somewhere, someone who has struggled with my completely abstract ramblings is laughing their head off right now.)
As much as I hate to be That Guy, I think an example is in order. And perhaps more importantly, a set of counter-examples. There is value in nebulous definitions, and sometimes the easiest way to achieve that is to set some clear markers and say "these are concepts that often come up that are firmly outside of the definition."
There are three words for potentially similar concepts floating around here, all of which at least could be some sort of refinement or augmentation of an existing media type:
* Structured suffix media types
* Profiles
* Schemas
Can we put some boundaries around what is appropriate for each? Rather than go back and read prior definitions, I'm going to write down my current intuition off the top of my head. It will likely be hilariously misguided, but a lot of the confusion around RFC 6906 is that people read it, develop an intuition that they don't see contradicted (as @RubenVerborgh noted) and run with it. The attitude is that everything that is not forbidden is allowed.
I'm going to stick to JSON just because I have more options to reference there that I understand pretty well.
And on the topic of this likely being misguided, @dret I am not trying to impose any of this as a definition for RFC 6906bis. I just want to reset things with another starting point that comes from someone's intuition rather than the needs of another project that is hunting around for a usable concept.
Structured Suffix Media Types
A structured suffix allows you to work directly with media type-based content negotiation. They're rather heavyweight to get into the standards tree, but the vendor tree is more accessible. They feel like the most coarse-grained solution, even though some (like |application/problem+json|, |application/json-patch+json|) have very specific purposes and structure. But others are very general, adding a broad concept (hyperlinking with |application/hal+json|, semantic identification with |application/ld+json|).
Structured suffixes make the most sense to me for non-substitutable alternatives
These use cases involve selecting different ways of achieving the same goal. Adding hypermedia with
Similarly |application/merge-patch+json| vs |application/json-patch+json| for two different ways to express how to edit another JSON document. They each have advantages, they are used for the same purpose, but they are not compatible with each other.
It's a little harder for me to fit |application/problem+json| into this view, as I'm not aware of other error-reporting systems. I think the reason that it feels right as a structured suffix is that it occupies a very generic role in hypermedia system communication.
In fact all of these examples do, as does |application/ld+json|. A full-featured system needs to be able to express application semantics, send editing instructions, report errors, and include hyperlinks. Hmm... I like this concept even if I'm not confident it is sufficient or even accurate.
Schemas
To me, schemas are the most specific concept. Just as structured suffix media types can be very specific (
But if I want to express the concept of a DNS record as represented in a REST API, that's definitely not right as a structured suffix media type. The GitHub API notwithstanding, it's far too specific. Unlike
So I feel that something that identifies a document as representing a specific concept in a specific way is a schema. Schema-described things do not occupy generic roles in communication, they are descriptions and identifications of what sort of things are being communicated.
Profiles
So where does that leave us with profiles? I feel like they are kind of in the middle, although I am not all that confident that my view is shared within this conversation :-)
Things that feel like profiles to me are things like the expired I-D for a canonicalized form of JSON (that @dret might have used as an example somewhere recently? I've lost track). Or I-JSON, which I know @dret has mentioned and even uses the word "profile" in its description.
Both of these profile candidates allow all interoperable uses of JSON, and just avoid problematic or confusing but syntactically correct documents. That's different from both playing a generic role in communication and from identifying concrete sets of things being communicated. These are refinements on how the document is structured to allow for more assumptions to be made during processing.
It wouldn't make sense to make new media types for canonicalized JSON or I-JSON. They don't add any semantics, they just restrict the syntax to something tidier, and remove ambiguous / non-interoperable / undefined semantics.
I mentioned that I'd come back to JSON Schema meta-schemas. I can see them as schemas, but I can also see them as profiles of |application/schema+json| because JSON Schemas ignore what they don't understand. A meta schema allows you to start understanding parts of a JSON Schema document while continuing to ignore those parts that are unrecognizable. I'm not quite sure where I'm going with this paragraph. I think I've surprised myself by saying that schemas are not profiles, but maybe meta-schemas are?
Perhaps this is a good place to stop. It's getting late-ish here and I've rambled my way into a corner. I hope that even if all of these ideas and proposed roles and definitions are completely off base, that by reacting to them we can start to put some boundaries around these concepts somehow. |
On 2018-01-23 08:02, Henry Andrews wrote:
That said, I really cannot wrap my head around this at all. A profile is
either an augmentation or refinement of a media type? Can it be both?
to me these are kind of the same things, so yes. podcasts add new
fields, so you might say they "augment" feeds. whatever it is that
happens, it doesn't define a new thing. it adds to what's there.
At
that point is there even much restriction to it at all? You also bring
in the word "schema" which has proven confusing in this context as well.
yup, true. but people seem to want to see it here.
As much as I hate to be That Guy, I think an example is in order. And
perhaps more importantly, a set of counter-examples. There is value in
nebulous definitions, and sometimes the easiest way to achieve that is
to set some clear markers and say "these are concepts that often come up
that are firmly outside of the definition."
example: feed and podcast, where a podcast simply is a special kind of
feed, and thus each podcast *is* a feed.
counter-example: XML and atom: atom adds a layer on top of XML, and you
cannot meaningfully say "atom *is* XML". it is *represented via XML*,
but when you work with atom what matters are atom abstractions and not
XML abstractions anymore.
There are three words for potentially similar concepts floating around
here, all of which at least /could/ be some sort of refinement or
augmentation of an existing media type:
* Structured suffix media types
well, that seems to be squarely in the "added layer" camp: by saying
application/atom+xml, you make it clear that while the representation is
XML, the actual application-level type is atom.
* Profiles
that's the other thing, for which it seems there still needs to be a
better description than augmentation/refinement. what would work best
for you for something that doesn't add a layer of abstraction, but
instead adds to one that's already there?
* Schemas
that to most is a way to validate a document. it's a more
mechanical construct in the sense that for example, for any document
type there might even be multiple schemas, either in terms of various
aspects (DSDL), or in terms of schema strictness (HTML loose/strict).
Can we put some boundaries around what is appropriate for each? Rather
than go back and read prior definitions, I'm going to write down my
current intuition off the top of my head. It will likely be hilariously
misguided, but a lot of the confusion around RFC 6906 is that people
read it, develop an intuition that they don't see contradicted (as
@RubenVerborgh <https://github.com/rubenverborgh> noted) and run with
it. The attitude is that everything that is not forbidden is allowed.
to some extent that's unavoidable. whatever you're doing, it will become
somebody's nail they're hammering on, because that's what they see. but i
agree that the particular schema/type hammer may be something that
should be mentioned as not being the right thing to use here.
Structured Suffix Media Types
A structured suffix allows you to work directly with media type-based
content negotiation. They're rather heavyweight to get into the
standards tree, but the vendor tree is more accessible. They feel like
the most coarse-grained solution, even though some (like
|application/problem+json|, |application/json-patch+json|) have very
specific purposes and structure. But others are very general, adding a
broad concept (hyperlinking with |application/hal+json|, semantic
identification with |application/ld+json|).
hm. to me the most important thing to mention here is that in this
model, you're always minting new media types. the structured suffix is
just a model to make your design layers a bit more transparent, but to
be honest i have never seen anybody actually building machinery around this,
other than being happy about the fact that the name is a little bit more
descriptive than a completely opaque identifier.
Structured suffixes make the most sense to me for non-substitutable
alternatives
maybe that's because they are proper media types?
Similarly |application/merge-patch+json| vs
|application/json-patch+json| for two different ways to express how to
edit another JSON document. They each have advantages, they are used for
the same purpose, but they are not compatible with each other.
yes, because they are different media types. they happen to share the
same representation foundation, but that's just interesting to see and
of no practical value.
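The "same purpose, not compatible" point can be made concrete with a small sketch. It assumes a merge-patch routine written in the style of RFC 7396 and a deliberately tiny, hypothetical subset of JSON Patch (top-level `add`, `replace` and `remove` only, no pointer escaping); the function names and sample documents are made up for illustration and are not taken from any library.

```python
def apply_merge_patch(target, patch):
    """Apply a JSON Merge Patch document (in the style of RFC 7396)."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for name, value in patch.items():
        if value is None:
            # null means "remove this member".
            result.pop(name, None)
        else:
            result[name] = apply_merge_patch(result.get(name), value)
    return result


def apply_json_patch_subset(target, operations):
    """Apply a tiny, illustrative subset of JSON Patch operations.

    Only top-level members are handled; this is not a full RFC 6902
    implementation.
    """
    result = dict(target)
    for operation in operations:
        name = operation["path"].lstrip("/")
        if operation["op"] == "remove":
            del result[name]
        elif operation["op"] in ("add", "replace"):
            result[name] = operation["value"]
        else:
            raise ValueError("unsupported operation: " + operation["op"])
    return result


original = {"title": "Draft", "draft": True}

# The same edit, expressed in two formats that share a JSON foundation.
merge_patch = {"title": "Final", "draft": None}
json_patch = [
    {"op": "replace", "path": "/title", "value": "Final"},
    {"op": "remove", "path": "/draft"},
]

via_merge = apply_merge_patch(original, merge_patch)
via_json_patch = apply_json_patch_subset(original, json_patch)
```

Both routes yield the same edited document, but handing the JSON Patch array to the merge-patch routine would simply replace the whole document with that array. Both inputs are JSON, yet a processor still has to know which media type it was given.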
It's a little harder for me to fit |application/problem+json| into this
view, as I'm not aware of other error-reporting systems. I think the
reason that it feels right as a structured suffix is that it occupies a
very generic role in hypermedia system communication.
the only reason it has one is because it's a json format. there also is
application/problem+xml
(https://tools.ietf.org/html/rfc7807#section-6.2) which is an XML
variant. the media type names simply provide an indication that they are
the same model (abstraction layer) on top of different representations.
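As a rough illustration of the naming convention described here, the following sketch splits a media type name into its top-level type, base subtype and structured suffix. The helper name `split_media_type` is hypothetical, and the parsing is intentionally simplistic (parameters are just discarded).

```python
def split_media_type(media_type):
    """Split 'type/subtype+suffix; params' into (type, subtype, suffix).

    The suffix is None when the subtype has no '+suffix' part.
    """
    essence = media_type.split(";", 1)[0].strip().lower()
    top_level, _, subtype = essence.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        return top_level, subtype, None
    return top_level, base, suffix


json_variant = split_media_type("application/problem+json")
xml_variant = split_media_type("application/problem+xml; charset=utf-8")

# Same model name, different underlying representation.
same_model = json_variant[:2] == xml_variant[:2]
same_syntax = json_variant[2] == xml_variant[2]
```

Under these assumptions `same_model` is true and `same_syntax` is false, which is all the suffix really tells a reader: the name hints at the layering, nothing more.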
In fact all of these examples do, as does |application/ld+json|. A
full-featured system needs to be able to express application semantics,
send editing instructions, report errors, and include hyperlinks. Hmm...
I like this concept even if I'm not confident it is sufficient or even
accurate.
to me all these discussions imply a precision and rigor in the media
type system that is not there at all. people chase the dream that
everything is well-defined and well-related in a completely
machine-processable way, and that's never been the case and i am
certainly not holding my breath.
Schemas
So I feel that something that identifies a document as representing a
specific concept in a specific way is a schema. Schema-described things
do not occupy generic roles in communication, they are descriptions and
identifications of what sort of things are being communicated.
a schema is an implementation vehicle. there can be many schemas for one
media type. again, assuming that a schema is a complete description or a
semantically complete model is very far from how things are, and very
likely it will remain like this. again i am not holding my breath.
Profiles
So where does that leave us with profiles? I feel like they are kind of
in the middle, although I am not all that confident that my view is
shared within this conversation :-)
they are identifiers. they *identify* the fact that a feed claims to be
a podcast. they do not link to a schema. they do not link to anything.
their mere presence is all there is. in that regard they are like media
types, which also are pure identifiers.
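A minimal sketch of "profiles as pure identifiers", assuming the profile URIs travel in a `profile` media type parameter (one of the ways a profile can be signalled, for media types that define such a parameter). The podcast URI and the helper names are hypothetical, and the parameter parsing is simplified: it would break on URIs that contain a semicolon.

```python
PODCAST_PROFILE = "https://example.org/profiles/podcast"  # hypothetical URI


def profile_uris(content_type):
    """Return the URIs listed in the `profile` parameter, if any."""
    _, *params = content_type.split(";")
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "profile":
            # The value may be a quoted, space-separated list of URIs.
            return value.strip().strip('"').split()
    return []


def claims_profile(content_type, uri):
    """Check for a profile by plain string comparison.

    The URI is never dereferenced: its presence is the whole signal.
    """
    return uri in profile_uris(content_type)


header = 'application/atom+xml; profile="https://example.org/profiles/podcast"'
is_podcast = claims_profile(header, PODCAST_PROFILE)
```

The check never fetches anything and never validates the document. A client that does not recognise the URI just ignores it and keeps processing the feed as a feed.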
Things that feel like profiles to me are things like the expired I-D for
a canonicalized form of JSON
<https://tools.ietf.org/html/draft-staykov-hu-json-canonical-form-00>
(that @dret <https://github.com/dret> might have used as an example
somewhere recently? I've lost track). Or I-JSON
<https://tools.ietf.org/html/rfc7493>, which I know @dret
<https://github.com/dret> has mentioned and even uses the word "profile"
in its description.
yes, these things could identify themselves as profiles if they minted a
profile URI and then used that as a signal.
Both of these profile candidates allow all interoperable uses of JSON,
and just avoid problematic or confusing but syntactically correct
documents. That's different from both playing a generic role in
communication and from identifying concrete sets of things being
communicated. These are refinements on how the document is structured to
allow for more assumptions to be made during processing.
exactly.
It wouldn't make sense to make new media types for canonicalized JSON or
I-JSON. They don't add any semantics, they just restrict the syntax to
something tidier, and remove ambiguous / non-interoperable / undefined
semantics.
well, you could very well mint media types. but that would probably
defeat the purpose of saying that "I-JSON is still JSON, but a specific
way of using it". a profile is a lightweight way of expressing that,
without incurring the heavyweight change of "media type identity".
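To illustrate "still JSON, but a specific way of using it", here is a sketch that checks just one of the constraints I-JSON places on JSON, namely unique object member names. It is not a complete I-JSON validator, and the function names are made up for this example.

```python
import json


def reject_duplicate_names(pairs):
    """object_pairs_hook that refuses objects with repeated member names."""
    names = [name for name, _ in pairs]
    if len(names) != len(set(names)):
        raise ValueError("duplicate member name in object")
    return dict(pairs)


def is_json(text):
    """True if the text parses as JSON at all."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


def has_unique_member_names(text):
    """True if the text is JSON and no object repeats a member name."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicate_names)
        return True
    except ValueError:
        return False


tidy = '{"title": "a", "tags": ["x", "y"]}'
ambiguous = '{"title": "a", "title": "b"}'
```

Both strings parse as JSON, but only `tidy` passes the stricter check. Every document that passes remains an ordinary JSON document, so nothing about the media type needs to change.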
I mentioned that I'd come back to JSON Schema meta-schemas. I can see
them as schemas, but I can also see them as profiles of
|application/schema+json| because JSON Schemas ignore what they don't
understand. A meta schema allows you to start understanding parts of a
JSON Schema document while continuing to ignore those parts that are
unrecognizable. I'm not quite sure where I'm going with this paragraph.
I think I've surprised myself by saying that schemas are not profiles,
but maybe meta-schemas are?
i need to understand more about schemas and meta-schemas to be able to
comment on this. if a meta-schema *is* a schema, then maybe they indeed
are profiles. but i would have to dig deeper.
Perhaps this is a good place to stop. It's getting late-ish here and
I've rambled my way into a corner. I hope that even if all of these
ideas and proposed roles and definitions are completely off base, that
by reacting to them we can start to put some boundaries around these
concepts somehow.
it's all good, and thanks for spending the time. i think all of this
could much better be discussed f2f, it seems like these long threads
help little to get us closer to a shared understanding. but let's keep
trying, and thanks for doing it!
|
I have created the following form: https://goo.gl/forms/Ql9rvnYigQvHXWZE3 Any edits or suggestions? If not, I propose to send it on the mailing lists. |
On 2018-02-16 11:24, Ruben Verborgh wrote:
Any edits or suggestions?
If not, I propose to send it on the mailing lists.
feel free to do so. i just hope we'll get constructive comments,
otherwise we'll still be where we are now.
|
Good point, I'll ask for suggestions as well. |
Unfortunately, the survey only attracted 4 responses so far. I have attached That said, these results seem to confirm my suspicions about the text being not fully clear. For instance, 1 person did not know whether there is a difference between media types and profiles according to the RFC, and only 1 person found the RFC to explicitly state what that difference is. I know you were not just looking for feedback on clarity, but also for constructive feedback—and there are suggestions in the responses. The most constructive feedback that I can give is to write in the RFC directly and literally what you mean. As an example, I refer to my earlier message that the commit message of 4efda97 is crystal clear ("a profile is not a schema"), whereas this direct phrasing is nowhere to be found in the actually committed diff. |