PUT RDFa, then GET turtle #195
Comments
A more tricky case is |
@RubenVerborgh - I'd suggest To my mind, When the target resource is actually a Graph in an RDF-store, any RDF type might be acceptable as a When the target is a materialized document in a specific serialization (which might include non-RDF data, such as comments in a Turtle file, or the HTML portion of RDFa), only the same serialization, and probably generally only line-based DIFFs, should be acceptable as a |
@TallTed I agree, the handling of comments is a problem, as far as PATCHes and translating between RDF serializations. |
Does it mean that text/html sometimes gets handled as RDFSource and sometimes as Non-RDFSource? Should plain HTML with RDF embedded in script tags also get handled as RDFSource then? |
"Just when I thought I was out, they pull me back in." If a server implementation can't return a Turtle representation of a requested URI, e.g. one expressing a WebID Profile Document or an LDP RDFSource, then the implementation is non-conforming with respect to the conformance criteria of those specs. This seems to happen when, for instance, HTML, SVG, or MathML documents enter a system but can't be called back out successfully. It has more to do with content negotiation failing than with the actual serialisation that may be on the filesystem for a given implementation (which is irrelevant anyway). From the LDP perspective, RDFa in host languages is an RDFSource (see also http://www.w3.org/TR/ldp/#dfn-linked-data-platform-rdf-source -> https://www.w3.org/TR/rdf11-concepts/#dfn-rdf-source , "any web document that has an RDF-bearing representation may be considered an RDF source"). Aside: LDP non-normatively comments about "non-RDF formats like HTML", but the context is important. When LDP Rec came out, RDFa Rec didn't exist. What LDP spec most likely meant was HTML without RDFa. LDP's normative usage of RDF sources as per RDF 1.1 holds, and so actually there are no conflicts here. Given that, I would punt this issue over to an implementation and say that "you're doing it wrong" (tm). Although I don't think it is necessary to couple PUT RDFa and GET Turtle (or other) as a conformance criterion, I am interested in understanding whether it is needed, and for RDFa in a host language. Got further thoughts on that? I prefer to move the discussion on modification/PATCH of RDFa-based resources to a separate issue (as it is outside the scope of this issue). It is super interesting and challenging. For now: PATCH is done on the "resource identified by the Request-URI". So, as I see it, Linked Data Patch Format can be extended and be more comprehensive in what and how resources can be modified. 
In the case here, that's having PATCH with (X)(HT)ML payloads incorporated into the Solid specs alongside SPARQL Update. Just as SPARQL Update is intended to operate on the Request-URI, we may have to frame something like XML-Patch (or another approach) to do so as well, as opposed to operating strictly on XML documents. It needs further investigation, starting with whether something like XML-Patch is suitable to begin with. In the end, how the PATCH was done should not matter to the underlying RDF graph. If the frame we are operating in is about the RDF graph, the spec doesn't need to - as far as I can see at the moment - preserve the structure of (X)(HT)ML, only ensure that the underlying RDF (encoded in RDFa in the host language) is preserved. That is, an implementation may want to track or be capable of returning the intended or ideal document structure including the RDFa, or come up with its own generic pattern, e.g. RDFa in an HTML table or unordered list, or in an SVG g etc. Alternatively, respond with 4xx. |
Indeed, and this is fine. We cannot expect servers to convert arbitrary representations into RDF. We have to draw the line somewhere, or implementing a server will be so complex that no one will do it. I'm not saying where that line should be, but requiring Turtle conversion for JSON-LD, HTML5+RDFa, XML+RDFa, N-Quads, TriG, RDF/XML, and whatever we can think of would be a big requirement. So as far as I am concerned, I would spec that:
I could support a SHOULD, but likely not a MUST. |
Thinking about this purely from an implementation perspective, I find the feature very interesting, but there are also a lot of complications. I would definitely like to support it, but there are all sorts of edge cases that would need to be considered. @csarven and I have discussed specifically how this could work, and I would like to experiment with those. If the specification uses Just as a note, LDP requires that a conforming server support (at a minimum) both Turtle and JSON-LD. |
The context of the paragraph was RDFSource, not any arbitrary representation into RDF. The spec doesn't need to over-specify. If anything, if a server accepts text/html, image/svg+xml, I would suggest that it MUST be able to serve it back out as text/turtle or application/ld+json. This is not particularly different than accepting application/ld+json and returning text/turtle when requested, and that's already within LDP. |
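The rule proposed in this comment (accept an RDF-bearing type, then be able to serve it back as Turtle or JSON-LD) can be sketched as a content negotiation decision. This is only an illustration under stated assumptions: the function names, the `RDF_BEARING` set, and the simplified `Accept` parsing are hypothetical and not taken from any Solid or LDP implementation, and the actual RDFa-to-Turtle conversion is left out.

```python
# Hypothetical sketch: the sets and function names below are assumptions
# made for illustration, not part of LDP, Solid, or any server codebase.
RDF_BEARING = {"text/turtle", "application/ld+json", "text/html", "image/svg+xml"}
SERVABLE_RDF = ("text/turtle", "application/ld+json")


def parse_accept(header):
    """Return media types from an Accept header, highest q-value first."""
    entries = []
    for index, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((-q, index, media_type))
    # Entries with q=0 mean "not acceptable" and are dropped.
    return [m for neg_q, _, m in sorted(entries) if neg_q < 0]


def negotiate(stored_type, accept_header):
    """Pick the response media type, or None (a 406 Not Acceptable case).

    Partial wildcards such as text/* are not handled in this sketch.
    """
    for wanted in parse_accept(accept_header):
        if wanted in ("*/*", stored_type):
            return stored_type
        if stored_type in RDF_BEARING and wanted in SERVABLE_RDF:
            return wanted  # the server would convert the graph here
    return None
```

Under these assumptions, `negotiate("text/html", "text/turtle")` selects Turtle, while `negotiate("image/png", "text/turtle")` yields `None`, which is the point where a server would answer 406.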
It is different. Parsing HTML5 is significantly more complex. And currently
JSON-LD isn’t even a MUST (in all fairness, nothing is yet).
|
Parsing complexity may be a factor to check and help decide whether a content type is part of the ecosystem, but it is not the only or the most important factor. We can't categorically dismiss a content type based on the parsing complexity either, especially when each has fairly well-defined areas of use, which IMO should drive decision making. JSON-LD is a MUST in Linked Data Platform, Web Annotation Protocol, Linked Data Notifications, ActivityPub (if you want the LD take on it),.. off the top of my head from recent specs, as well as some contemporary publishing practices (like JSON-LD in HTML.. but that's a weird case in and of itself and we can come back to that discussion later). JSON-LD ought to be a MUST for applicable Solid specs, otherwise Solid will distance itself from what plays relatively well out there. Similarly, RDFa in common host languages is important if "content" publishing means anything for the Solid ecosystem. If RDFa - parsing at the very least - is not a MUST, there is not much point in incorporating HTML, the XML family and so forth. We could dismiss RDFa because it is "too complex", but in reality that only shifts the complexity to another location in the system. Again, if a server is willing to accept content types that can potentially include RDFa, why would it do consumers a disservice by not making the underlying graph available serialised as Turtle, JSON-LD..? The Solid specs are not mandating the implementation of RDF parsers; only their use. If we want to use https://www.w3.org/TR/html-design-principles/#priority-of-constituencies as a rough guideline to orient ourselves (as stated in Culture), this is an excellent time to apply it, otherwise we can remove that effective immediately. |
That’s not an applicable argument, because a Solid server is a file server that will accept any content type for storage. The question is thus not about acceptance, but rather about what conversions MUST be supported. Some of these can perfectly be optional; the client can do them if needed.
I agree with the principle, but it unfortunately can be used as an argument for adding many features, so we need to take more factors into account. Ultimately, it’s about the experience the user gets, not about whether something should be in the server or the client. |
That's part of it, but not strictly the only thing a Solid server does. Obviously conversions and modifications happen from the perspective of resources identified at requested URIs. Needless to say, conversion only happens if 1) it is possible and 2) we want it to. 1 is indeed possible, and I see no particular reason it can/should/must not happen when it is perfectly meaningful and useful to do so. Again, if the conversion possibility to JSON-LD is not a MUST for a Solid server, then it is not an LDP conforming server. Any client application that is able to talk to LDP (and the list of specs I mentioned above) will not be able to do so via JSON-LD. If there is a perfectly useful graph in HTML, why prevent its reuse? How do you envision human and machine-readable documents to be published and consumed on the Web? Dependency on JavaScript, requires DOM handling under a GUI? Archivable? How accessible exactly? Did anyone actually experiment with that under the Solid paradigm? Well, FWIW, I have. Publishing "web pages" - I'm being as generic as I can be on this one - is what drives the Web today. What assurance (if any) is there of a reasonably smooth transition from what exists today to what's arguably semantically richer, without bringing a minimal step through RDFa onboard for some use cases? Aside: I feel that I need to yet again make the case for "why RDFa".. and this is a discussion that I've been personally involved in since at least 2015 in the context of Solid. And, after all these years, I have yet to see FWIW anything vaguely equivalent to Web authoring/publishing/consuming applications (along the lines of dokieli - and that's only an argument for a starting point, not how everything should be for Web publishing). If the alternative approaches are great and possible, where are they? I'm not interested in the hypotheticals here. Until that happens, I am arguing from the point of actual publishing and implementation experience... 
and maybe that can be classified under the "experience the user gets". The use case of this issue is that someone with an existing website (let's say non-Solid-based - whatever that means) can't successfully migrate to a Solid-based server unless drastic maneuvers are taken in order to participate. Why? Because what I'm hearing is that HTML(+RDFa) is "too complex" and that a Solid server decides to not conform to LDP.. and possibly kick itself out of interoping with stuff... for starters. |
I'd like to emphasise that so far the discussion has revolved around whether conversion from RDFa to Turtle should happen on a server. Another, not mutually exclusive, option is that clients/applications MUST be able to parse RDFa. The end result is the same in that an RDF graph gets passed around. In order to make use of the original document structure (if available) to some degree - thereby being lossless - I think applications being able to parse RDFa is more important than a server converting to something else - thereby being lossy. There is a different set of advantages to RDFa parsing in either direction, as you can see. From my own experience, I think both should be in place. |
No straw man arguments please 🙂 The argument put forward is:
Conformance to LDP is another matter entirely. The inverse is not true: the spec says that "LDP servers MUST provide an RDF representation for LDP-RSs", but not that resources created through any RDF representation must be LDP-RSs.
I don't think anyone needs to be convinced about that; we understand the utility of RDFa. |
Strawman? You literally said "significantly more complex"!
Pardon me but I find this line of argument narrow and we need to have a holistic look at the situation. I've given several explanations as to why that's narrow (from distancing itself from interop with other things out there to constituencies..) but you are still dismissing the reasons put forward and want to carry on with your original argument. I'm happy to come back to this discussion if you like by first agreeing on Use Cases and then we can see which arguments to dismiss them are actually stronger. Or maybe just explain how you think arbitrary information fragments in human-readable/actionable content can be better handled than non-RDFa. Outline the complexity of that alternative approach and compare. Convince me that is not reinventing or patch work and having more utility than what's available to us right now out of the box. As I've already mentioned, it shifts the complexity elsewhere, doesn't eliminate it.
Hairsplitting. There is no strong reason as to why a system needs to ignore HTML+RDFa, JSON-LD.. as perfectly normal "candidates" for an RDFSource. The minimal assumption and understanding is that, they are in fact RDFSources. It takes a stronger argument to suggest that they are not. So, I'm all ears on that. Moreover, we are not arguing about any arbitrary formats being classified as an RDFSource, but only the obvious candidates. "Complexity" doesn't hold because the development and use of an RDF parser by a machine just simply doesn't weigh more than the utility. Why should it? I would add that there are two "complexity" arguments being passed around. One is about the actual parsing of the format and the other is about having a "simpler" agreement on what a server and/or a client should be capable of.
Because interop. Not going to interop through MAYs, and SHOULDs can be ignored as they are not fundamental to interop when it comes down to it. Either or both the server or the client has to MUST, otherwise I fail to see the point in continuing this discussion.. or bothering to build "Web publishing" tools, or in failing to take advantage of a class of applications that are built on other specs which are relatively at our fingertips. If the requirement is optional/weak, implementations are not expected to do it, and so it defeats the purpose of saying anything about it in the first place. I think a reasonable position for a server and a client to be in is: if a server or a client accepts application/ld+json, text/html, image/svg+xml,.. they MUST be able to handle it as an RDFSource (or as something RDF-bearing as per RDF 1.1 Concepts). I think that actually addresses your concern on "whether all Solid servers..". Given the condition it sets (ie. the if), then the answer is no, not all. If a server/client has no use or interest in working with them, they can ignore the requirement. For everyone else, I think that's a fair expectation and an application of the principle of least astonishment. |
I wrote that as a reply to a point; it was never the general argument, which was about the burden of implementing a high number of conversions.
Because I think we are being dragged into details here, losing track of the high-level argument.
Because it hasn't really been refuted yet.
Top of my head, I can think about these:
This is a very heavy MUST, so heavy that I fear it will simply not happen.
Apparently it does: in the entire world, there exists exactly one RDFa parser that is proven to be compatible with the spec—and we had to write it ourselves. So I theoretically fully agree with the users over implementer part; in practice, I see that implementers don't bother, so users aren't helped.
Ignoring seems to contradict MUST?
So not really hairsplitting, because I cannot conclude a MUST reject from that.
Not necessarily. Some clients, like the Web publishing tools you mention, can choose to be compatible. No need to make this a MUST for all clients. |
@csarven - Please consider that "RDF Source" might include all manner of future document formats. Your argument suggests that you think that a server shipped today should magically support translation of all those future serializations into Turtle. I submit that it is not reasonable to require any server -- LDP, Solid, or otherwise -- to translate every document format -- including those not yet invented! -- that might contain RDF (and hence be "an RDF source") into Turtle or JSON-LD or otherwise. And as one of those on the LDP WG who helped produce those specs, it was certainly not my intent to require that of LDP Servers. Translation between JSON-LD and Turtle was mandated, was not considered heavy-lifting, and was expected to be lossless as both serializations are "pure" RDF. Translation between those and other non-RDF serializations (including "RDF plus ..." or "... plus RDF" formats like RDFa) was not considered lossless, could be considered heavy-lifting, and was not mandated. "Sidecar" methods of attaching RDF to non-RDF formats (like JPG, PNG, etc.) do mandate that both JSON-LD and Turtle be supported, but again, this is about the "pure RDF" sidecar documents, not about the JPG or whatever. It's great if the server can do these things, and that would surely be a major mark in its favor in a competitive marketplace, but I do not think it should be mandated (so, not a MUST), and I do not think it should be considered a significant failure not to do so (so, not even a SHOULD). |
Strawman. I explicitly mentioned the precondition. That is, if one accepts a format that's highly likely - for all intents and purposes - to be RDF-bearing, it should be parsed/transformed/manipulated/whathaveyou as RDF. I fail to see how you went from that to Sarven said "all things, anything under the sun, and for eternity to work magically" and then argue against that. If your server doesn't want to work with dirty complicated O(n!^n!^n!..) RDFa or JSON-LD or whatever, fine, don't even bother to take it in. Full stop. Why? Because they are deemed to be "RDF-bearing" documents. No one will come after you for refusing to accept them. But, just don't take it and then say I can't give the graph back to you in a different format because I don't want to... Yes, of course it's within the server's right to do that, but that's just being a jerkServer. Now, one could always argue edge cases where documents pretend to be RDFaInSomething or JSON-LD but have no triples in sight. Big deal. The parser will halt or return an empty resource - whatever the finer details are for those formats. Do I need to stress the point on the importance of having a server that can communicate with applications which are coming from other nearby specs? Do you seriously want to tell them to go away because there is some hypothetical vacuumed world that's preferable for Solid? Why on Earth would you not want existing applications implementing WAP, AP, LDN... to be able to work with a Solid server pretty much out of the box?! |
@csarven is making a very specific point that follows directly from the LDP specification. If a client attempts to create an |
@acoburn What is the source statement for that in the LDP spec? Couldn't find it earlier.
So |
@csarven: Take a walk. You sound like you would get angry. Not easy to discuss while emotionally loaded. Okay? Regarding this discussion here...
If they can be turned from one format into another without losing information... Can we leave non-RDF formats out of scope for now? |
It would still be good to have the explicit list though, perhaps in table form with MUST/SHOULD/MAY? Then we can understand the actual scope. Would my suggested 8-point list cover everything?
No one was arguing for that either… we're making things hard for ourselves here 🙂 I see two distinct questions:
For 1), I'd like to see the list. For 2), my answer is MUST NOT. |
@RubenVerborgh the LDP specification describes this in section 5.2.3.4:
That covers the case where an explicit link header (defining an LDP interaction model) is provided by the client and where that interaction model is The case where a client provides no Link header (and defines no interaction model) is left undefined in the LDP spec. In the case of Trellis (given no link header), there are some heuristics based on inspecting the content type of the incoming HTTP body: if it's a type of RDF that can be parsed, treat it as RDF; otherwise, treat it as a NonRDFSource. A well-behaved LDP client will be explicit about the resource type it tries to create and it should provide that header, but realistically (and to support the widest array of HTTP clients) some well-defined heuristic seems reasonable, even if it's non-normative text. |
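The kind of heuristic described in this comment could look roughly like the sketch below. It is not Trellis code: the function names, the `PARSEABLE_RDF` set, and the decision to honour an explicit `rel="type"` link before falling back to the `Content-Type` are assumptions made for illustration. Only the `http://www.w3.org/ns/ldp#` namespace comes from LDP itself, and containers are ignored here.

```python
# Hypothetical sketch of a "Link header first, Content-Type second" heuristic.
LDP = "http://www.w3.org/ns/ldp#"
PARSEABLE_RDF = {"text/turtle", "application/ld+json"}  # assumed server capability


def link_types(link_header):
    """Extract the target URIs of rel="type" links from a Link header value.

    Naive parsing: splits on commas, so targets containing commas are not
    supported in this sketch.
    """
    types = []
    for part in link_header.split(","):
        target, _, params = part.strip().partition(";")
        target = target.strip()
        if not (target.startswith("<") and target.endswith(">")):
            continue
        rels = [p.strip().replace(" ", "") for p in params.split(";")]
        if any(r in ('rel="type"', "rel=type") for r in rels):
            types.append(target[1:-1])
    return types


def interaction_model(link_header, content_type):
    """Return "RDFSource" or "NonRDFSource" for an incoming request."""
    types = link_types(link_header or "")
    # An explicit interaction model from the client wins.
    for model in ("NonRDFSource", "RDFSource"):
        if LDP + model in types:
            return model
    # Otherwise (no Link header, or only ldp:Resource) inspect the body type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return "RDFSource" if media_type in PARSEABLE_RDF else "NonRDFSource"
```

With this sketch, a Turtle body sent with an explicit `ldp#NonRDFSource` link is stored as a plain file, while the same body sent without a Link header is treated as RDF.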
Thanks @acoburn. It means that the client can find out whether an RDF interpretation is supported by including the header, or just treat the Solid server as a file store when it wants to. @csarven What are your thoughts about this? I especially like the latter, given that I like to think about a Solid server as "a file store, with special functionalities for RDF files". Some file types would then be a MUST, others a SHOULD or MAY. |
Typically, the This is a mechanism that allows clients to discover what formats are supported by the server. |
Myeah, but wouldn't that be An aspect of the Solid philosophy is to see it as a standard file server (Apache) with special support for RDF around it (LDP) and then access control (WebACL). This was the notion I got at least, so less LDP-centric. |
What I have chosen to do with Trellis is this: explicitly name the RDF formats that are supported, but then also include |
@RubenVerborgh I think the Link header mechanism is sufficiently expressive. And, yes, OPTIONS w/ Accept-Post is a great way to nail it down. Background: I vaguely remember (read: my memory may be completely failing right now) that Gold and NSS only acknowledged ldp:Resource and Container. So, we just went along with that and worked it through the Content-Type. I do remember trying to send RDFSource with dokieli, but it would get rejected. So, we defaulted to just sending ldp:Resource with whatever Content-Type (depending on what's being sent and the specs involved, something like: text/html, text/turtle, or application/ld+json). What I would like to see is an interaction along these lines:
|
As an aside: I don't think LDP defines an |
@acoburn I agree, the specs should. I think we have already experimented with that at least in NSS .. again, can't remember now what happened to the issues (but they are there) and whether the server adheres to it. Same goes for Accept-Patch, but there wasn't anything besides application/sparql-update to work with anyway - so low priority. |
It looks like Trellis doesn't include the wildcard type in the |
I haven't fully thought this through but wanted to throw it out there nevertheless: do we need something like Accept-Link to help disambiguate the (Non)RDFSource instead of working through the wildcard in Accept-Post? That way, servers can be specific about their capability/promise and clients can decide if/how to follow-up. Is it adding more complication? Edit: If Accept-P* mixes media types, Accept-Link doesn't help to distinguish which (Non)RDFSource corresponds to the media types. So, scratch that idea. |
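For the Accept-Post discussion in the last few comments, a client-side check might be sketched as follows. The function name and the three return labels are hypothetical. Reading an explicitly listed type as "handled as RDF" and a wildcard match as "stored as-is" is only one interpretation of the thread, not something LDP or Solid specifies.

```python
# Hypothetical client-side helper: the name and return labels are assumptions.
def accepts_post(accept_post_header, media_type):
    """Classify a media type against an Accept-Post header value.

    Returns "listed" if the type is named explicitly, "wildcard" if it is
    only covered by */* or type/*, and "no" otherwise.
    """
    offered = [p.split(";")[0].strip().lower() for p in accept_post_header.split(",")]
    offered = [o for o in offered if o]
    wanted = media_type.lower()
    if wanted in offered:
        return "listed"
    main_type = wanted.split("/")[0]
    if "*/*" in offered or f"{main_type}/*" in offered:
        return "wildcard"
    return "no"
```

A client could, for example, send `OPTIONS` to a container, pass the returned `Accept-Post` value to this function, and only expect Turtle or JSON-LD back for types reported as "listed". As the Edit above notes, this still does not tell the client which (Non)RDFSource model the server will actually apply.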
After storing an RDFa document it should be retrievable as Turtle or JSON-LD.
Two use cases: