improve definition of Literal #162
base: main
</ul> | ||
<p>Comparison is performed using | ||
<p>Comparison of the [=lexical forms=] and of the [=datatype IRIs=] is performed using |
For the datatype IRIs, shouldn't this better be covered by IRI equality?
Fair point. I reused the existing language, which didn't mention IRI equality. The two are equivalent, because IRI equality is also based on string comparison, but referring to IRI equality would be clearer.
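To make the equivalence concrete, here is a minimal Python sketch. It assumes IRIs are held as plain `str` values, and the helper name `iri_equal` is hypothetical. As I read RDF Concepts, IRI equality is simple string comparison with no further normalization, so the helper is nothing more than `==`.

```python
def iri_equal(a: str, b: str) -> bool:
    """Simple string comparison: no case folding, no percent-decoding."""
    return a == b


XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Identical character sequences: equal.
same = iri_equal(XSD_STRING, "http://www.w3.org/2001/XMLSchema#string")

# Differs only in the case of the host name: not equal under simple comparison.
different_case = iri_equal(XSD_STRING, "http://WWW.w3.org/2001/XMLSchema#string")
```

So saying "string comparison" and saying "IRI equality" select the same pairs of datatype IRIs; the second wording just points at the existing definition.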
Minor points.
Co-authored-by: Olaf Hartig <olaf.hartig@liu.se> Co-authored-by: Gregg Kellogg <gregg@greggkellogg.net>
spec/index.html
Outdated
In RDF 1.1, `"chat"@fr` and `"chat"@FR` were representing two distinct terms, but implementations had license to replace one with the other (which most did). | ||
In RDF 1.2, they are now representing the exact same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. |
Reword: RDF 1.1 still exists:
In RDF 1.1, `"chat"@fr` and `"chat"@FR` were representing two distinct terms, but implementations had license to replace one with the other (which most did). | |
In RDF 1.2, they are now representing the exact same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. | |
In RDF 1.1, `"chat"@fr` and `"chat"@FR` represent two distinct terms, but implementations may replace one with the other (which many did). | |
In RDF 1.2, they represent the same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. |
Co-authored-by: Andy Seaborne <andy@apache.org>
A language tag is not a string. BCP 47 does not provide a good foundation for RDF language tags.
RDF Concepts could say that a language tag is a lowercase string that meets the requirements of BCP 47, or it could say that a language tag is a sequence of ASCII case-insensitive characters where any string constructed by taking one member of each equivalence set in sequence meets the requirements of BCP 47. ASCII case-insensitive characters are then equivalence sets of characters under the equivalence relation that treats two characters as equivalent if they are the same when converted to lower case using ASCII case conversion. The former is simpler, but the latter provides guidance on how to treat surface syntax language tags.
Saying that language tags are strings and then going on to define an equality over them is like saying that language tags are cats and then going on to say that two language tags are the same if they have the same colour - the right way here is to say either that language tag strings are cat colours or that they are equivalence classes of cats under the same-colour equivalence.
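A small Python sketch of how the two formulations relate, under the assumption that tags arrive as `str` values. The names `ascii_lower` and `same_language_tag` are hypothetical. The lowercase string is used as the canonical representative of each equivalence class, so comparing representatives is the same as comparing classes.

```python
import string

# Translation table covering only A-Z, so non-ASCII characters are left alone.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(tag: str) -> str:
    """ASCII case conversion: the canonical (lowercase) representative of a tag."""
    return tag.translate(_ASCII_LOWER)


def same_language_tag(a: str, b: str) -> bool:
    """Two surface-syntax tags are in the same class iff their representatives match."""
    return ascii_lower(a) == ascii_lower(b)
```

This does not check BCP 47 well-formedness; it only illustrates the equivalence relation being discussed.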
In what way does RDF Concepts not say that? A change might be saying that language tags are represented by strings conforming to RFC 5646. |
Not say what? |
What you describe. What is the concrete proposal (PR, or suggested change to this PR) for changing RDF Concepts? |
RDF Concepts says this, as far as I can tell:
RDF Concepts does not say either of these, as far as I can tell:
|
Not exactly. The text in this PR says
The goal is to convey the idea that RDF language tags are an abstraction of the string complying with BCP-47, without using such scary language. But ok, maybe it's too handwavy. I would be happy with changing the definition of language tags to lower-case BCP47-compliant strings (as proposed by @pfps). The 3rd paragraph of 3.4.1, in my opinion, explains clearly enough that concrete syntaxes and implementations are free to use the case they want (as long as they ignore it when comparing language tags). Note that I'm off for 1 week starting 1h ago, so this will not progress unless another editor takes custody of this PR. |
Some tweaks for clarity, grammar, and consistency.
@@ -733,125 +733,140 @@ <h3>Literals</h3> | |||
<p>Literals are used for values such as strings, numbers, and dates.</p> | |||
|
|||
<p>A <dfn data-local-lt="RDF literal">literal</dfn> in an <a>RDF graph</a> consists of | |||
two, three, or four elements, as follow:</p> | |||
two, three, or four elements, as follow.</p> |
This was intentionally a colon. A full-stop puts a bit too much break.
two, three, or four elements, as follow.</p> | |
two, three, or four elements, as follow:</p> |
to a <a>literal value</a>.</li> | ||
<li>If and only if the <a>datatype IRI</a> is | ||
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#langString</code> or | ||
<code>http://www.w3.org/1999/02/22-rdf-syntax-ns#dirLangString</code>, a | ||
non-empty <dfn>language tag</dfn> as defined by [[!BCP47]]. The | ||
language tag MUST be well-formed according to | ||
<a data-cite="bcp47#section-2.2.9">section 2.2.9</a> | ||
of [[!BCP47]], | ||
and MUST be treated consistently, that is, in a case insensitive manner. |
and MUST be treated consistently, that is, in a case insensitive manner. | |
and MUST be treated consistently in a case insensitive manner. |
a <dfn>base direction</dfn> that MUST be either<ul> | ||
<li>`ltr`, indicating that the initial text direction is set to left-to-right, or</li> | ||
<li>`rtl`, indicating that the initial text direction is set to right-to-left.</li> |
a <dfn>base direction</dfn> that MUST be either<ul> | |
<li>`ltr`, indicating that the initial text direction is set to left-to-right, or</li> | |
<li>`rtl`, indicating that the initial text direction is set to right-to-left.</li> | |
a <dfn>base direction</dfn> that MUST be one of the following:<ul> | |
<li>`ltr`, indicating that the initial text direction is set to left-to-right</li> | |
<li>`rtl`, indicating that the initial text direction is set to right-to-left</li> |
|
||
<p><dfn data-local-lt="term-equal">Literal term equality</dfn>: | ||
Two literals are term-equal (the same <a>RDF literal</a>) | ||
two literals are term-equal (the same <a>RDF term</a>) | ||
if and only if:</p> |
if and only if:</p> | |
if and only if the following are all true:</p> |
<li>the two <a>lexical forms</a> compare equal,</li> | ||
<li>the two <a>datatype IRIs</a> compare equal,</li> | ||
<li>the two <a>language tags</a> are either both absent, or both present and compare equal,</li> | ||
<li>the two <a>base directions</a> are either both absent, both `ltr`, or both `rtl`.</li> |
<li>the two <a>lexical forms</a> compare equal,</li> | |
<li>the two <a>datatype IRIs</a> compare equal,</li> | |
<li>the two <a>language tags</a> are either both absent, or both present and compare equal,</li> | |
<li>the two <a>base directions</a> are either both absent, both `ltr`, or both `rtl`.</li> | |
<li>The two <a>lexical forms</a> compare equal.</li> | |
<li>The two <a>datatype IRIs</a> compare equal.</li> | |
<li>The two <a>language tags</a> are either both absent, or both present and compare equal.</li> | |
<li>The two <a>base directions</a> are either both absent, both `ltr`, or both `rtl`.</li> |
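For readers following the thread, the four conditions could be sketched in Python roughly as below. This is an illustration only: the `Literal` class, its field names, and `term_equal` are hypothetical, language tags are compared case-insensitively as the PR text requires, and base directions are assumed to have been validated as `ltr` or `rtl` already.

```python
from dataclasses import dataclass
from typing import Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    datatype_iri: str
    language_tag: Optional[str] = None    # present only for langString / dirLangString
    base_direction: Optional[str] = None  # "ltr" or "rtl", only for dirLangString


def term_equal(a: Literal, b: Literal) -> bool:
    """True only when all four conditions in the list hold."""
    if a.lexical_form != b.lexical_form:
        return False
    if a.datatype_iri != b.datatype_iri:
        return False
    # Language tags: both absent, or both present and equal ignoring ASCII case.
    if a.language_tag is None or b.language_tag is None:
        if a.language_tag is not b.language_tag:
            return False
    elif a.language_tag.lower() != b.language_tag.lower():
        return False
    # Base directions: both absent, both "ltr", or both "rtl".
    return a.base_direction == b.base_direction
```

With this sketch, `Literal("chat", RDF + "langString", "fr")` and `Literal("chat", RDF + "langString", "FR")` are term-equal, while changing only the base direction from `ltr` to `rtl` makes two literals distinct.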
In RDF 1.1, `"chat"@fr` and `"chat"@FR` theoretically represent two distinct terms, but implementations may replace one with the other via some form of normalization. | ||
In RDF 1.2, they represent the exact same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. |
In RDF 1.1, `"chat"@fr` and `"chat"@FR` theoretically represent two distinct terms, but implementations may replace one with the other via some form of normalization. | |
In RDF 1.2, they represent the exact same literal, i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. | |
In RDF 1.1, `"chat"@fr` and `"chat"@FR` represent two distinct terms, | |
but implementations may replace either with the other via some form of normalization. | |
In RDF 1.2, they represent the exact same literal, | |
i.e., the case difference in the concrete syntax does not propagate into the abstract syntax. |
<li>If the literal is a <a>directional language-tagged string</a>, then the literal value is | ||
a tuple of its <a>lexical form</a>, its <a>language tag</a>, and its <a>base direction</a>, | ||
likewise in that order.</li> | ||
<li>If the literal's <a>datatype</a> is handled by an RDF implementation, |
<li>If the literal's <a>datatype</a> is handled by an RDF implementation, | |
<li>If the literal's <a>datatype</a> is handled by an RDF implementation, then one of the following applies: |
<li>if the literal's <a>lexical form</a> is in the <a>lexical space</a> | ||
of the <a>datatype</a>, then the literal value is the result of applying | ||
the <a>lexical-to-value mapping</a> of the datatype to the | ||
<a>lexical form</a>.</li> | ||
<li>otherwise, the literal is <dfn data-lt-no-plural>ill-typed</dfn> and no literal value can be | ||
associated with the literal. Such a case produces a semantic | ||
inconsistency but is not <em>syntactically</em> ill-formed. | ||
Implementations SHOULD accept [=ill-typed=] literals and produce RDF | ||
graphs from them. Implementations MAY produce warnings when | ||
encountering [=ill-typed=] literals.</li> |
<li>if the literal's <a>lexical form</a> is in the <a>lexical space</a> | |
of the <a>datatype</a>, then the literal value is the result of applying | |
the <a>lexical-to-value mapping</a> of the datatype to the | |
<a>lexical form</a>.</li> | |
<li>otherwise, the literal is <dfn data-lt-no-plural>ill-typed</dfn> and no literal value can be | |
associated with the literal. Such a case produces a semantic | |
inconsistency but is not <em>syntactically</em> ill-formed. | |
Implementations SHOULD accept [=ill-typed=] literals and produce RDF | |
graphs from them. Implementations MAY produce warnings when | |
encountering [=ill-typed=] literals.</li> | |
<li>If the literal's <a>lexical form</a> is in the <a>lexical space</a> | |
of the <a>datatype</a>, then the literal value is the result of applying | |
the <a>lexical-to-value mapping</a> of the datatype to the | |
<a>lexical form</a>.</li> | |
<li>Otherwise, the literal is <dfn data-lt-no-plural>ill-typed</dfn> and no literal value can be | |
associated with the literal. Such a case produces a semantic | |
inconsistency, but it is not <em>syntactically</em> ill-formed. | |
Implementations SHOULD accept [=ill-typed=] literals and produce RDF | |
graphs from them. Implementations MAY produce warnings when | |
encountering [=ill-typed=] literals.</li> |
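The two cases in this list item can be sketched as follows. This is a toy Python illustration, not a proposal for the spec: the registry `LEXICAL_TO_VALUE`, the function `literal_value`, and the status strings are all hypothetical, and only `xsd:boolean` is "handled" to keep it short.

```python
import warnings
from typing import Callable, Dict, Tuple

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical-to-value mapping of xsd:boolean (lexical space: true, false, 1, 0).
_BOOLEAN = {"true": True, "1": True, "false": False, "0": False}

# Hypothetical registry of the datatypes this toy implementation handles.
LEXICAL_TO_VALUE: Dict[str, Callable[[str], object]] = {
    XSD + "boolean": lambda lexical: _BOOLEAN[lexical],  # KeyError if out of space
}


def literal_value(lexical_form: str, datatype_iri: str) -> Tuple[str, object]:
    """Return (status, value); status is "ok", "ill-typed", or "unhandled"."""
    mapping = LEXICAL_TO_VALUE.get(datatype_iri)
    if mapping is None:
        return ("unhandled", None)
    try:
        return ("ok", mapping(lexical_form))
    except KeyError:
        # Ill-typed: accept the literal anyway, but optionally warn (MAY).
        warnings.warn(f"ill-typed literal: {lexical_form!r}^^<{datatype_iri}>")
        return ("ill-typed", None)
```

The ill-typed branch deliberately does not raise: the literal is still accepted into the graph, and it simply has no value associated with it.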
</ul> | ||
|
||
<p> | ||
Thus, two literals can have the same value |
Thus, two literals can have the same value | |
It follows from the above that two literals can have the same value |
</pre> | ||
|
||
<p>denote the same <a data-lt="literal value">value</a>, but are not the | ||
same literal <a>RDF terms</a> because their |
same literal <a>RDF terms</a> because their | |
same literal <a>RDF term</a> because their |
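The distinction between "same value" and "same term" can be illustrated with a short Python sketch. The pair of literals used here (lexical forms `1` and `01` typed as `xsd:integer`) is my own illustration, literals are modelled as hypothetical `(lexical form, datatype IRI)` tuples, and `value_of` is a stand-in for a real lexical-to-value mapping.

```python
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Two literals modelled as (lexical form, datatype IRI) pairs.
a = ("1", XSD_INTEGER)
b = ("01", XSD_INTEGER)


def value_of(literal):
    """Toy lexical-to-value mapping; only xsd:integer is supported."""
    lexical, datatype = literal
    if datatype != XSD_INTEGER:
        raise ValueError(f"unhandled datatype: {datatype}")
    # Note: int() is more permissive than the xsd:integer lexical space;
    # that is acceptable for this illustration only.
    return int(lexical)


same_term = a == b                        # lexical forms differ
same_value = value_of(a) == value_of(b)   # both map to the integer 1
```

Here `same_value` holds while `same_term` does not, which is exactly the situation the paragraph describes.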
This PR was motivated by the problem raised here, aiming to fix the definition of "literal term equality", but it ended up as a more involved refactoring of the definition of Literal.
Below is a summary of the changes