Various errata from 1.1 #2

gkellogg · 2023-01-21T21:31:40Z

gkellogg · 2023-02-14T06:46:16Z

In an email relevant to erratum 32, @kasei writes:

Within STRING_LITERAL_QUOTE, only the characters U+0022, U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR must not be used for characters that are allowed directly in STRING_LITERAL_QUOTE.

Does this really mean that control characters must be written directly without escaping or encoding (e.g. NULL, BELL, BACKSPACE, etc.)? While their use probably isn’t common in N-Triples documents, the idea of a canonical representation requiring these to be written directly strikes me as ill-advised, as it makes handling of this data more difficult (e.g. having to carefully handle NULL characters vs. NULL terminators, not being able to copy-paste data containing unprintable control characters, etc.).

The ECHAR range is '\' [tbnrf"'\], which doesn't really cover control characters; those would have needed to be represented using UCHAR, which has been explicitly prohibited. To be consistent, we would need to add UCHAR back and require it for control characters, but this may be going too far. This would presumably cover \u0000 through \u001F, exclusive of those covered by an allowed ECHAR.
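For reference, a rough sketch of which C0 control characters fall outside ECHAR and so would need UCHAR. The names `ECHAR_CODE_POINTS` and `controls_without_echar` are made up for illustration; the mapping is read off the ECHAR production quoted above.

```python
# ECHAR ::= '\' [tbnrf"'\] -- map each escape letter to the code point it denotes.
ECHAR_CODE_POINTS = {
    "t": 0x09, "b": 0x08, "n": 0x0A, "r": 0x0D, "f": 0x0C,
    '"': 0x22, "'": 0x27, "\\": 0x5C,
}

def controls_without_echar():
    """Return the code points in U+0000..U+001F that have no ECHAR form."""
    covered = set(ECHAR_CODE_POINTS.values())
    return [cp for cp in range(0x00, 0x20) if cp not in covered]

uchar_only = controls_without_echar()
print(len(uchar_only))                          # 27
print([f"\\u{cp:04X}" for cp in uchar_only])    # \u0000, \u0001, ... as UCHAR spellings
```

Only five of the 32 control characters (U+0008, U+0009, U+000A, U+000C, U+000D) have an ECHAR form, which leaves 27 that could only be escaped with UCHAR.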

I suggest we not consider this, unless there is a demonstrated need, as it was considered and resolved in 1.1.

afs · 2023-02-14T09:04:27Z

We should consider this.

The original motivation for a canonical form was simple processing by text tools, e.g. running a regex over an N-Triples line.
Now it is for RDF canonicalization and signing. Anything that can be used to confuse is a security issue.

The RDF 1.1 N-Triples text includes "Implementers are encouraged to produce this form.", so the format was not mandatory. We have some room for improvement.

A change to consider is:

Characters in the codepoint range U+0020 to U+10FFFF MUST NOT be represented by UCHAR.

Characters in the codepoint range U+0000 to U+001F MUST be represented by ECHAR or represented by UCHAR where ECHAR is not available.

All UCHAR would be better but we are where we are.

This is an outline to show something is possible - the text needs refining.
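A minimal sketch of how the two rules above might look in a serializer, offered as an illustration rather than proposed spec text. The names `ECHAR_FOR` and `escape_literal` are hypothetical. It assumes `\b` and `\f` count as "ECHAR available", and that `"` and `\` keep their ECHAR form because the STRING_LITERAL_QUOTE grammar does not allow them directly.

```python
# Assumed ECHAR table for canonical output (hypothetical; the final text may differ).
ECHAR_FOR = {
    0x08: "\\b", 0x09: "\\t", 0x0A: "\\n", 0x0C: "\\f", 0x0D: "\\r",
    0x22: '\\"', 0x5C: "\\\\",
}

def escape_literal(value: str) -> str:
    """Escape the lexical form of a literal following the outline above."""
    out = []
    for ch in value:
        cp = ord(ch)
        if cp in ECHAR_FOR:
            out.append(ECHAR_FOR[cp])        # ECHAR where one is available
        elif cp < 0x20:
            out.append(f"\\u{cp:04X}")       # UCHAR for the remaining controls
        else:
            out.append(ch)                   # U+0020 and above: written directly
    return "".join(out)

print(escape_literal("a\x00b\tc"))  # a\u0000b\tc
```

With this shape, no raw control character ever reaches the output, so NULL bytes and unprintable characters never have to be handled or copy-pasted directly.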

Process-wise:

I suggest creating an issue for this, labeling it security, and closing the erratum.
There should be something in the security section.

(We need a better way to track cross document concerns.)

kasei · 2023-02-14T16:36:42Z

I suggest we not consider this, unless there is a demonstrated need, as it was considered and resolved in 1.1.

@gkellogg: Do you have any pointers to the previous discussions? From outside the WG, the handling of the canonical form seemed a bit rushed, and I wasn't left with the feeling that it got a lot of consideration. I would like to look into the reasoning used during 1.1 that led to the decisions that were made.

gkellogg · 2023-02-14T16:48:47Z

The discussion in the RDF WG was before my time. Looking through the RDF WG mail archives doesn't provide much, either.

@ericprud was likely involved in the C14N discussions. But, @afs's points about security certainly make a case for revisiting this. @dlongley may have a view on the implications for https://github.com/w3c/rdf-canon, but I suspect that there won't be any tests that overlap with the problem areas.

See #2 (comment) for a suggested change to using ECHAR and UCHAR for canonical N-Quads/Triples.

gkellogg · 2023-02-14T22:17:43Z

Chatted with @ericprud on Skype. The main motivation for canonicalization in N-Triples was testing. The best approach is to create an issue specific to escaping in literals, and to note it as an issue in the C14N section and in a new Security Considerations section.

gkellogg mentioned this issue Jan 23, 2023

ambiguity about canonical N-Triples / N-Quads w3c/rdf-canon#66

Closed

domel mentioned this issue Feb 7, 2023

resolved issue 16 #9

Merged

gkellogg mentioned this issue Feb 14, 2023

Re-consider use of escapes in canonical N-Quads #16

Closed

gkellogg added the spec:substantive label Mar 22, 2023

gkellogg mentioned this issue Mar 22, 2023

Canonicalization #17

Merged

gkellogg added the propose closing label Apr 5, 2023

gkellogg closed this as completed Apr 15, 2023
