zero current implementations of Turtle/TriG/N-Triples/N-Quads functionality are spec compliant, because none can "ensure that malignant strings may not be used to mislead the reader" — there's just no way to do so! #11

Closed
TallTed opened this issue Jan 31, 2023 · 9 comments · Fixed by #16
Labels
spec:enhancement Change to enhance the spec without affecting conformance (class 2) –see also spec:editorial

Comments

@TallTed
Member

TallTed commented Jan 31, 2023

Originally posted by @TallTed in #7 (comment)

    Application rendering strings retrieved from untrusted RDF documents must ensure
    that malignant strings may not be used to mislead the reader.

@TallTed 18 hours ago

I don't believe this requirement can be satisfied by any applications available today, if ever.

As things stand, humans must take care to load varyingly trustworthy RDF into appropriately partitioned data stores (typically, using named graphs), and to execute queries across "all data in the store" only when recognizing that results may include malicious, erroneous, or otherwise undesirable statements (whether triples or quads).

Such undesirable results are not the responsibility of any software in the stack.
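
The partitioning described above can be sketched in a few lines. This is a toy illustration only, not any real store's API: the `QuadStore` class, the `urn:example:` graph names, and the `trusted` flag are all hypothetical, and triples are plain tuples of strings.

```python
from collections import defaultdict


class QuadStore:
    """Toy in-memory quad store: triples grouped by named graph (hypothetical API)."""

    def __init__(self):
        self._graphs = defaultdict(set)  # graph name -> set of (s, p, o) tuples
        self._trusted = set()            # graph names the operator chose to trust

    def load(self, graph, triples, trusted=False):
        """Add triples to a named graph; the human operator decides the trust flag."""
        self._graphs[graph].update(triples)
        if trusted:
            self._trusted.add(graph)

    def match(self, predicate, trusted_only=True):
        """Yield (graph, s, p, o) for triples with the given predicate."""
        for graph, triples in self._graphs.items():
            if trusted_only and graph not in self._trusted:
                continue  # skip data from graphs that were not marked trusted
            for s, p, o in sorted(triples):
                if p == predicate:
                    yield (graph, s, p, o)


store = QuadStore()
store.load("urn:example:curated",
           {("ex:bank", "rdfs:label", "Example Bank")}, trusted=True)
store.load("urn:example:crawled",
           {("ex:bank", "rdfs:label", "Examp1e Bank (official)")})

# Default query touches only trusted graphs.
trusted_labels = [o for _, _, _, o in store.match("rdfs:label")]
# Querying "all data in the store" also returns the look-alike label.
all_labels = [o for _, _, _, o in store.match("rdfs:label", trusted_only=False)]
```

Nothing in the sketch judges whether a label is misleading; it only keeps track of where each statement came from, which is the point being made in the comment.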

@gkellogg 18 hours ago

It's taken right out of the Turtle IANA Considerations section

    Turtle can express data which is presented to the user, for example, RDF Schema labels. Application rendering strings retrieved from untrusted Turtle documents must ensure that malignant strings may not be used to mislead the reader. The security considerations in the media type registration for XML ([RFC3023] section 10) provide additional guidance around the expression of arbitrary data and markup.

I didn't want to try to tread new ground here, but ReSpec demands that specs have such a section, and this seemed to apply generally to RDF serializations. I also don't want to revisit the IANA registrations of Turtle/TriG/N-Triples/N-Quads which already have such considerations, and it would be inconsistent to say something different.

@TallTed now

Wow. As I understand things, that means that zero current implementations of Turtle/TriG/N-Triples/N-Quads functionality are spec compliant, because none can "ensure that malignant strings may not be used to mislead the reader" — there's just no way to do so!

@afs
Contributor

afs commented Jan 31, 2023

I don't think it says that.

It is in a section "Security considerations". It is something applications (not parsers) should be aware of. The word "compliance" is not mentioned.

#7 (comment)

@TallTed
Member Author

TallTed commented Jan 31, 2023

@afs — Do you consider "parsers" to be disjoint from "applications"? I consider "parsers" to be a subtype of "applications"...

This section may well be informative, not normative, in all cases; that's great, it means that existing apps are no less compliant by not "[ensuring] that malignant strings may not be used to mislead the reader".

Still, that sentence is quite likely to mislead many readers. It definitely over-promises the capabilities of all "Application rendering strings retrieved from untrusted RDF documents", and I strongly feel that it should be changed, in all locations.

@afs
Contributor

afs commented Jan 31, 2023

If you have improvements, great.

The title of this issue introduces the word "compliant" - not the spec text.

Parsers are not string-rendering applications and are often library code. They do not know the end-intent. What is safe in one situation is not safe in another.

"Considerations" are important advice, such as the Turtle text. Read in context, such as
"Applications interpreting data expressed in Turtle should address the security issues of ...",
I find it clear.

s/Application rendering strings/Applications rendering strings/ would be better and I consider it editorial.

@TallTed
Member Author

TallTed commented Feb 1, 2023

    s/Application rendering strings/Applications rendering strings/

There's no change, there, to be better.

    What is safe in one situation is not safe in another.

Yes, that's so. Tradeoffs are necessary all the time. This does not change the MUST in the spec text, which may not have been intended as RFC2119 defines, but I guarantee that I am not the first, and won't be the last, to read it as I have.

Again —

    Application rendering strings retrieved from untrusted RDF documents must ensure that malignant strings may not be used to mislead the reader.

Far better would be to put the onus of determining trust where it belongs — on the reader, based in part on the source — and to say something like —

    Readers should be appropriately cautious when working with strings retrieved from RDF documents from potentially untrustworthy sources, not to treat them as guaranteed to be safe nor true.

@afs
Contributor

afs commented Feb 1, 2023

I don't understand who "the reader" is.

Framed as widely as your suggestion, it might be better in RDF Concepts.

We can look at other specs such as HTML and XML to see what they say.

@gkellogg
Member

gkellogg commented Feb 1, 2023

Actually, I think #11 will do; I'll update the section with an issue reference.

@TallTed
Member Author

TallTed commented Feb 1, 2023

    I don't understand who "the reader" is.

As I see it, "the reader" (which I took from the original phrasing) is the agent — whether social agent, software agent, or otherwise — that consumes the RDF documents and/or any of the strings (which I read as "literals", whether typed or untyped, though this might well include URLs/URIs/IRIs) they contain.

(If you disagree with my interpretation, or even if you agree, you're pretty damn knowledgeable in this arena, so I'd expect you to understand the intended meaning of a worthwhile warning without any trouble. Some rephrasing, whether mine above or some further revision, seems called for.)

    Framed as widely as your suggestion, it might be better in RDF Concepts.

I don't immediately disagree. Whether it should only be in RDF Concepts is another question to be considered.

@gkellogg
Member

On the subject of general security considerations, and of particular interest for N-Triples and N-Quads canonicalization, is the ability for RDF Literals to include un-escaped control characters, which may obfuscate the content in presentation.

See w3c/rdf-n-quads#16
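
To make the control-character concern concrete, here is a minimal sketch of one way a rendering application might make such characters visible before display. It is an assumption-laden illustration, not the approach taken in w3c/rdf-n-quads#16: the function name `make_visible` is hypothetical, and escaping every character in Unicode categories `Cc` (control) and `Cf` (format) is a deliberately blunt choice that also escapes tabs, newlines, and the joiners used in some emoji sequences.

```python
import unicodedata


def make_visible(literal: str) -> str:
    """Replace control (Cc) and format (Cf) characters with visible hex escapes."""
    out = []
    for ch in literal:
        if unicodedata.category(ch) in ("Cc", "Cf"):
            cp = ord(ch)
            # Four hex digits for the BMP, eight for everything above it.
            out.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}")
        else:
            out.append(ch)
    return "".join(out)


# U+202E (RIGHT-TO-LEFT OVERRIDE) can make the tail of a string display reversed.
label = "invoice\u202Efdp.exe"
shown = make_visible(label)
```

Here `shown` is the string `invoice\u202Efdp.exe` with a literal backslash, so the override is exposed rather than applied. This reduces one class of obfuscation; it does not establish that a string is innocuous.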

@TallTed
Member Author

TallTed commented Feb 15, 2023

My concern is with the phrase "must ensure that malignant strings may not be used to mislead the reader."

Malignant strings may have many origins, many reasons for being presented to the reader; and there's virtually no computational way to ensure the innocuousness of any given string being presented to the reader.

This stricture is entirely unachievable, and it should be removed from all specs that currently contain it.

gkellogg self-assigned this Mar 16, 2023
gkellogg added the spec:enhancement Change to enhance the spec without affecting conformance (class 2) –see also spec:editorial label Mar 16, 2023
gkellogg added a commit that referenced this issue Mar 30, 2023
* Update security considerations based on work in N-Quads.

Fixes #11.

---------

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>