Canonicalization #17

gkellogg · 2023-02-14T23:38:59Z

Improve canonicalization section.
Separate out Security Considerations from media type and add a Privacy Considerations stub.
Reference issue #16 ("Re-consider use of escapes in canonical N-Quads") as a future direction for canonicalization.
Add prohibition on using a datatype IRI if the datatype is xsd:string when canonicalizing (see "ambiguity of canonical N-Triples", rdf-star-wg#9).

This will ultimately make its way into N-Triples, as well.

Note the separation of Security Considerations, and a phrase about the potential for unescaped characters to obfuscate a string presentation.

For #2.

domel

Keywords from RFC 2119 are in capital letters in this section but in lower case in other parts. I think this should be unified.

gkellogg · 2023-02-15T18:27:00Z

Keywords from RFC 2119 are in capital letters in this section but in lower case in other parts. I think this should be unified.

In non-normative sections the lower case "may" and "must" words are typically used so that they don't invoke RFC 2119; this would be meaningless in non-normative sections anyway, but it can be confusing.

Lower case versions are avoided in normative sections so they don't look like they are not, in fact, normative.

TallTed

Minor nits

spec/index.html

domel

After @TallTed's edits, looks good!

afs

Commentary as well as changes. Maybe we should have an issue for the canonical form, because RCH has a much stronger requirement on the canonical form.

spec/index.html

afs · 2023-02-16T08:48:53Z

If the goal of the canonical form is to be strictly canonical (for RCH and other signing and hashing uses), we should remove the text

  <p class="note">Even when not explicitly serializing
    canonical N-Quads, implementers are encouraged to produce this form.</p>

The choice of escape rules for the canonical form will likely be chosen for the strict canonical goal.

This is different to the original motivation for the canonical NT (text tools, specifically so a test suite can test the outcome of NT processing without itself being an NT processor). There, for example, raw control characters are more useful than escaped.

Change it to a paragraph giving the motivation: the canonical form of a quad is unique for a given choice of blank node labels, and a canonical document is unique except for the order of quads.

spec/index.html

TallTed

Small tweaks for clarity, and one key question that will likely lead to another small change.

spec/index.html

TallTed · 2023-02-20T14:42:22Z

spec/index.html

-  <p class="note">Even when not explicitly serializing
-    canonical N-Quads, implementers are encouraged to produce this form.</p>
+  <p class="note">A canonical form for N-Quads can be used to ensure
+    that the form of a quad is unique for a given choice of


"the form of a quad" doesn't seem right if my understanding is correct, namely that this form of N-Quads should reduce semantically identical but syntactically different quads to a single canonical quad...

"syntactically different" needs some work. We don't want to imply the writing of the RDF terms themselves is affected.

"presentation"?

With the proviso that this can't really be fully accomplished until #16 is also considered, the intention of this note is to limit the choice of how code points are represented in the resulting RDF term (a literal, in this case).

How about something like the following:

A canonical form of N-Quads can be used to ensure
that variations in the syntactic representation of terms
within that quad is determined; each code point
can be represented by only one of
UCHAR, ECHAR and unencoded character
where the relevant production allows for a choice in representation.

See updated note.
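
To illustrate the wording suggested above, here is a minimal sketch, assuming only the ECHAR and UCHAR escapes of the N-Quads grammar. The helper name `unescape` and its regular expression are hypothetical and come from neither the spec nor any library. It shows that several spellings of a string body decode to the same code points, which is the freedom a canonical form has to remove.

```python
import re

# ECHAR escapes allowed in the N-Quads grammar: \t \b \n \r \f \" \' \\
_ECHAR = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
          '"': '"', "'": "'", "\\": "\\"}

# Matches one UCHAR (\uXXXX or \UXXXXXXXX) or one ECHAR.
_ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([tbnrf"\'\\]))')


def unescape(body: str) -> str:
    """Decode the escapes inside a string literal body (illustrative only)."""
    def repl(match):
        if match.group(1):          # \uXXXX
            return chr(int(match.group(1), 16))
        if match.group(2):          # \UXXXXXXXX
            return chr(int(match.group(2), 16))
        return _ECHAR[match.group(3)]
    return _ESCAPE.sub(repl, body)


# Three spellings of a single TAB: UCHAR, ECHAR, and the unencoded character.
spellings = [r"\u0009", r"\t", "\t"]
decoded = {unescape(s) for s in spellings}
```

Here `decoded` holds a single value: all three spellings denote the same lexical form, so a canonical serializer has to commit to exactly one of them.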

spec/index.html

pfps · 2023-02-20T14:55:04Z

I don't believe that the working group has decided to take up this technical work.

afs · 2023-02-20T15:58:09Z

@pfps - yes and no. These non-editorial errata are hard to gauge.

There is an erratum which leads to https://lists.w3.org/Archives/Public/public-rdf-comments/2022Nov/0000.html

rdf-canon has a need for a more canonical "canonical form" which isn't editorial errata.

One thing that would help is for the WG to say which ones can proceed to the point of proposal, and which need WG discussion on whether to address at all and whether it is ready for a proposal.

The problem as I see it is that FPWDs, and also the continuous publishing of working drafts, do not distinguish "proposal" from statements that suggest intended direction. (I thought we were going to use feature branches but I may have misremembered.)

cc @rdfguy, @ktk

spec/index.html

gkellogg · 2023-02-21T16:30:01Z

This PR is languishing and as some of the changes to the Security Considerations, at least, are gating w3c/rdf-concepts#16 as well as other repos, I'd like to get consensus. If you've provided review comments previously, please either ask for changes or approve. Otherwise, I suggest we just merge and deal with any other changes in subsequent PRs. (Note #16 still to be considered).

pfps · 2023-02-21T16:51:47Z

My view is that this PR should not be merged until the WG has determined that it will take up the notion of a canonical form for N-quads.

gkellogg · 2023-02-21T17:00:43Z

@rdfguy and @ktk: I don't want to take time on Thursday's call. Perhaps, if either of you is on the Editor's call tomorrow, we can discuss how to move forward on N-Quads canonicalization, which is gating for the RDF Dataset and Canonicalization WG. It is also an erratum against N-Quads.

Next step would be to split the PR between Security Considerations and Canonicalization so at least the Security Considerations part can move forward. (Generally a good idea, but these PRs have a way of snowballing).

pfps · 2023-02-21T17:45:30Z

Can you describe how the gate you mention works? I don't see how publishing a WD will help.

TallTed

I think that this will look good, after #19 is merged. But they're working on the same large blocks of text, so I'm not sure.

gkellogg · 2023-02-21T19:35:34Z

RDF Canonicalization has an algorithm for creating canonical blank node identifiers which depends on using a canonical form of N-Quads. It’s described in w3.org/TR/rdf-canon. Note the issue markers on required updates to N-Quads.

gkellogg · 2023-02-21T19:40:13Z

@TallTed the changes should be disjoint now.

domel · 2023-02-21T20:50:51Z

What is the reason for the following sentence?

Literals with the datatype http://www.w3.org/2001/XMLSchema#string MUST NOT use the datatype IRI part of the literal, and are represented using only STRING_LITERAL_QUOTE.

Maybe, instead of an exception, let all (typed) literals have a datatype IRI (including string).

yamdan

I support this request, with a suggested change for a very minor typo.

spec/index.html

gkellogg · 2023-03-30T20:58:36Z

Forced merge after rebase.

gkellogg · 2023-03-30T21:06:17Z

It was Resolved in the 30-03-2023 meeting to work on C14N.

RESOLUTION: the WG will work on c14n

gkellogg_: c14n. done for some time. ready to go. related to erratum.
… RCH group depends on it.
… I do think all issues have been discussed.
… the only thing that didn't happen was that it needs discussion on "label"

(?)
… adding section to n-quads. minor changes from what was done in n-triples.
… more changes anticipated.
… that would be the subject of future PRs.

ora: there has been substantial discussion on issues page. see no reason not to move forward.
… anyone need any time to still look at this before it gets merged?

<pfps> I'm happy with the working group taking up canonicalization issues. I would like to see a resolution that the working group is going to support this.

ora: hearing no objections.

pfps: I would like to see a resolution that we can point back to that we decided to do this.
… it is a substantial change.

ora: you're saying we should make a resolution that we will work on c14n, then we can merge?

pfps: yes.

ora: any objections to that?

gkellogg_: make a proposed resolution.

<ora> PROPOSAL: the WG will work on c14n

<gkellogg_> +1

<ora> +1

<TallTed> +1

<ktk> +1

<afs> +1

+1

<pfps> +1

<AZ> +1

<doerthe> +1

gkellogg requested review from afs, domel and pchampin February 14, 2023 23:38

domel reviewed Feb 15, 2023

TallTed suggested changes Feb 15, 2023

domel self-requested a review February 15, 2023 22:18

domel approved these changes Feb 15, 2023

afs reviewed Feb 16, 2023

afs reviewed Feb 16, 2023

domel self-requested a review February 16, 2023 16:05

gkellogg commented Feb 16, 2023

TallTed reviewed Feb 17, 2023

gkellogg requested a review from afs February 17, 2023 21:27

TallTed suggested changes Feb 20, 2023

gkellogg mentioned this pull request Feb 20, 2023

Security considerations w3c/rdf-concepts#16

Merged

TallTed reviewed Feb 20, 2023

gkellogg added a commit that referenced this pull request Feb 21, 2023

Extract Security Considerations from PR #17.

a91857e

gkellogg mentioned this pull request Feb 21, 2023

Extract Security Considerations from Media Type #19

Merged

TallTed reviewed Feb 21, 2023

pchampin mentioned this pull request Mar 9, 2023

Canonical form - avoid redundancy w3c/rdf-n-triples#11

Closed

yamdan approved these changes Mar 15, 2023

gkellogg added the spec:substantive label Mar 16, 2023

pfps added the needs discussion label Mar 23, 2023

pfps removed the needs discussion label Mar 30, 2023

gkellogg and others added 20 commits March 30, 2023 13:44

Improve canonicalization section.

4b3f743

Separate out Security Considerations from media type and add a Privac…

62ed776

…y Considerations stub. Reference issue #16 as a future direction for canonicalization.

Add prohibition on using a datatype IRI if the datatype is xsd:string…

725e3f4

… when canonicalizing.

Apply suggestions from code review

082ad16

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Spelling

48a0978

Update change note on PN_CHARS_U to describe the change in blank node…

bab2478

… representation.

Remove issue marker related to blank node labels.

3d9dfb5

Apply suggestions from code review

3ac7927

Co-authored-by: Andy Seaborne <andy@apache.org>

White space updates.

9fce01a

Apply suggestions from code review

c8b9ac8

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Change note motivating the use of canonical N-Quads.

78c5498

Sync recent changes to w3c/rdf-concepts#16.

57972e2

Apply suggestions from code review

5d3955a

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Fix IRI term references.

b50aff6

Apply suggestions from code review

f2e518e

Update note motivating canonical N-Quads.

30e0365

Update spec/index.html

a9082d1

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Remove updates to security considerations and media type, now in a se…

b5e10c8

…parate PR.

Apply suggestions from code review

7665e24

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

Update spec/index.html

ef93071

Co-authored-by: Dan Yamamoto <yamdan@gmail.com>

gkellogg force-pushed the c14n-document branch from b2daa54 to ef93071 Compare March 30, 2023 20:58

gkellogg merged commit ac9ad81 into main Mar 30, 2023

gkellogg deleted the c14n-document branch March 30, 2023 21:08
