Adds RDF term equality definitions #161

Open
wants to merge 7 commits into
base: main

hartig commented Feb 25, 2025

Closes #154 by adding definitions of 'blank node equality', 'triple term equality', 'RDF term equality', and 'triple equality'. Additionally, this PR makes the definitions of graph comparison and dataset comparison more explicit by using these notions of equality.


Preview | Diff

@@ -925,8 +952,6 @@ <h3>Graph Comparison</h3>
the triple (|s|, |p|, |o|) is in |G| if and only if
the triple ( |M|(|s|), |M|(|p|), |M|(|o|) ) is in <var>G'</var>.</p>

<p>See also: <a>IRI equality</a>, <a>literal term equality</a>.</p>
Contributor Author

Note that I have removed this one because the cross-references to these definitions are integrated directly into the definition above now.

pchampin left a comment

The only change that I really "request" is the one in 'Graph Comparison', because (unless I'm missing something) it introduces an error.

The others are merely an expression of my preferences, but I can live without them.

spec/index.html Outdated
Comment on lines 943 to 944
<li>For every [=literal=] |lit|, |M|(|lit|) is a [=literal=] that is [=literal term equality|equal=] to |lit|.</li>
<li>For every [=IRI=] |iri|, |M|(|iri|) is an [=IRI=] that is [=IRI equality|equal=] to |iri|.</li>
Contributor

I could live with that, but I find it needlessly verbose and confusing. The point here is not to produce a new value that happens to be equal to the argument, the point is to return the argument itself...
I would slightly prefer to keep '=' here.

Contributor Author

> ... The point here is not to produce a new value that happens to be equal to the argument, the point is to return the argument itself...

In this case, it would actually be a better idea to replace the = by "is" (e.g., "M(iri) is iri" instead of "M(iri) = iri").

Notice, however, that this wording makes a difference for literals: Consider two literals, lit1 and lit2, which both have the same lexical form, both have rdf:langString as their datatype, and one of them has "EN" as its language tag whereas the other one has "en" instead. In this case, lit1 is not lit2, but they are equal according to literal term equality. So, if we say "M(lit) is lit" in this definition here, then M(lit1) cannot return lit2 but must return lit1; in contrast, if the definition says "M(lit) = lit" (and assuming = means literal term equality), then M(lit1) may also return lit2 (as an alternative to returning lit1).

I am not even sure which of these two cases we actually want.
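
The difference between "is" and "=" described above can be sketched in a few lines of Python. This is only an illustration: the `Literal` class and the two comparison functions are hypothetical names, and `literal_term_equal` encodes the assumed reading that language tags differing only in case count as equal.

```python
from dataclasses import dataclass
from typing import Optional

RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical model of a literal that keeps the language tag as written."""
    lexical_form: str
    datatype: str
    language: Optional[str] = None


def same_representation(a: Literal, b: Literal) -> bool:
    # Strict comparison: every component must match character for character.
    return a == b


def literal_term_equal(a: Literal, b: Literal) -> bool:
    # Assumed reading of literal term equality: language tags are
    # compared without regard to case, everything else exactly.
    if (a.language is None) != (b.language is None):
        return False
    same_lang = a.language is None or a.language.lower() == b.language.lower()
    return (
        a.lexical_form == b.lexical_form
        and a.datatype == b.datatype
        and same_lang
    )


lit1 = Literal("chat", RDF_LANGSTRING, "EN")
lit2 = Literal("chat", RDF_LANGSTRING, "en")

print(same_representation(lit1, lit2))  # strict comparison of the two records
print(literal_term_equal(lit1, lit2))   # case-insensitive on the language tag
```

Under the strict comparison, a mapping with "M(lit) is lit" can only return `lit1` for `lit1`; under the looser comparison, returning `lit2` would also be acceptable.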

Contributor

> In this case, it would actually be a better idea to replace the = by "is" (e.g., "M(iri) is iri" instead of "M(iri) = iri").

Yes, I like that.

> Notice, however, that this wording makes a difference for literals:

You gave me a lot to think about with this puzzle :) My conclusion (which I will explain in more detail in the main conversation of this PR) is that this is not (or should not be) an issue.

afs commented Feb 25, 2025

There isn't a preview/diff? Is this because the boilerplate was not included in the description?

afs left a comment

One suggestion, non-blocking.

hartig commented Feb 25, 2025

> There isn't a preview/diff? Is this because the boilerplate was not included in the description?

Strange. I have never put anything special in my PRs before. What would this boilerplate be?

Co-authored-by: Pierre-Antoine Champin <github-100614@champin.net>

afs commented Feb 25, 2025

> > There isn't a preview/diff? Is this because the boilerplate was not included in the description?
>
> Strange. I have never put anything special in my PRs before. What would this boilerplate be?

It should be added when the PR is created via the github UI.

I don't know if it can be retrospectively added.
For this PR, it isn't too hard to get the PR branch and look at that. But if text is altered in several places, or text moved around, I find it useful to see the diff.

#159 for example has ("edit" the description to see it):

<!--
    This comment and the below content is programmatically generated.
    You may add a comma-separated list of anchors you'd like a
    direct link to below (e.g. #idl-serializers, #idl-sequence):

    Don't remove this comment or modify anything below this line.
    If you don't want a preview generated for this pull request,
    just replace the whole of this comment's content by "no preview"
    and remove what's below.
-->
***
<a href="https://pr-preview.s3.amazonaws.com/w3c/rdf-concepts/pull/159.html" title="Last updated on Feb 14, 2025, 9:18 AM UTC (fcb12f0)">Preview</a> | <a href="https://pr-preview.s3.amazonaws.com/w3c/rdf-concepts/159/df7b9db...fcb12f0.html" title="Last updated on Feb 14, 2025, 9:18 AM UTC (fcb12f0)">Diff</a>

Co-authored-by: Pierre-Antoine Champin <github-100614@champin.net>

hartig commented Feb 25, 2025

> It should be added when the PR is created via the github UI.

That's what is strange. I did create the PR via the GitHub UI, exactly as I have done it for earlier PRs. But this time the link to the preview was not added.

Oh, wait, the only thing that I can imagine being the reason is that I edited the PR text a few minutes after having created the PR. Maybe that edit made a concurrently running auto-edit fail?

afs commented Feb 25, 2025

Could well be!

Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>

pchampin commented Feb 26, 2025

This PR got me thinking (and re-reading the specs) for a long time... but I think I finally put my finger on what bothers me.

My major issue is that it approaches "term equality" (for each category of terms) as if the abstract syntax had a notion of equality that was different from the notion of identity. And in fact, it does not. And after my long mulling, I'm very much convinced that it must not make such a distinction (see pathological example below).

Those "term equality" sections exist mostly to emphasize how variations in concrete syntaxes and internal representations are not relevant for the abstract syntax. But granted, the pre-existing sections (esp. the one about literals) were already not very clear about that. (I just noticed how the last sentence of the definition of language tag is baroque in that respect: "Two language tags are the same if they only differ by case." -- and I may very well be responsible for this wording...).

The sections added in this PR are going further in the direction of "distinct but equal", with wording such as "are considered equal".

I'll try to suggest changes to this PR to clarify all this, or possibly make a counter-proposal.


Pathological example showing that the distinction between identity and equality in the abstract syntax is detrimental:

Consider the following two graphs

# G1
:s :p1 "chat"@en-US.
:s :p2 "chat"@en-US.
# G2
:s :p1 "chat"@en-us.
:s :p2 "chat"@EN-US.

I would like to consider that they are isomorphic, right? Because all the objects are equal. But if we consider the literals to be distinct in the abstract syntax, then per our definition of isomorphism, they are not isomorphic. We would need a mapping M that maps "chat"@en-US sometimes to "chat"@en-us and sometimes to "chat"@EN-US!...

Note also that in RDF 1.1, the two graphs above are not isomorphic, nor are they simply-equivalent (but they are D-equivalent if D contains rdf:langString, and therefore RDF-equivalent).
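
The problem in this example can be made concrete with a small Python sketch. It rests on assumptions: triples are plain tuples, a literal is a hypothetical `(lexical form, language tag)` pair, and lowercasing the tag is just one way to model "tags that differ only by case are the same tag".

```python
# Literals are modelled as (lexical_form, language_tag) pairs,
# keeping the tag exactly as written in the concrete syntax.
G1 = {(":s", ":p1", ("chat", "en-US")), (":s", ":p2", ("chat", "en-US"))}
G2 = {(":s", ":p1", ("chat", "en-us")), (":s", ":p2", ("chat", "EN-US"))}


def required_images(g_from, g_to):
    """For each object in g_from, collect the objects that occupy the
    same (subject, predicate) position in g_to."""
    images = {}
    for s, p, o in g_from:
        for s2, p2, o2 in g_to:
            if (s, p) == (s2, p2):
                images.setdefault(o, set()).add(o2)
    return images


def normalize(graph):
    """Treat language tags that differ only by case as one tag
    (lowercasing is an arbitrary choice of representative)."""
    return {(s, p, (lex, tag.lower())) for (s, p, (lex, tag)) in graph}


# If the literals are kept distinct, a single literal of G1 would need
# two different images in G2, so no function M can do the job.
print(required_images(G1, G2))

# Once the case of the tag is no longer part of a literal's identity,
# both graphs are the same set of triples.
print(normalize(G1) == normalize(G2))
```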

afs commented Feb 26, 2025

Wording (not in this PR) that I found potentially weak:

"the two language tags (if any) compare equal"

what if one has a language and one does not? "abc" and "abc"@en. Only one language tag.
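
One way to make the "(if any)" case explicit is to compare optional tags as below. This is a minimal sketch with a hypothetical helper name, not wording proposed for the spec; it assumes that a missing tag only matches another missing tag.

```python
from typing import Optional


def language_tags_match(tag_a: Optional[str], tag_b: Optional[str]) -> bool:
    """Compare two optional language tags, ignoring case."""
    if tag_a is None and tag_b is None:
        return True   # neither literal has a language tag
    if tag_a is None or tag_b is None:
        return False  # only one literal has a language tag
    return tag_a.lower() == tag_b.lower()


print(language_tags_match(None, "en"))   # the "abc" vs "abc"@en case
print(language_tags_match("EN", "en"))
```

In RDF 1.1 the two literals in this example already differ in datatype (xsd:string versus rdf:langString), so the tag comparison is not the only thing that tells them apart.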

pchampin commented Feb 26, 2025

> Wording (not in this PR) that I found potentially weak:
>
> "the two language tags (if any) compare equal"
>
> what if one has a language and one does not? "abc" and "abc"@en. Only one language tag.

I'll prepare a PR to improve the section on literals, in the light of my remarks above. For the rest, I think the suggestions I just made on this PR are enough.

Co-authored-by: Pierre-Antoine Champin <github-100614@champin.net>

hartig commented Feb 26, 2025

@pchampin I applied your three edit suggestions, as I agree that the artificial distinction between equality and identity is confusing and useless. I will hold off, however, on changing the definition in the 'Graph Comparison' section until I have seen your PR for improving the part about literal equality. Regarding that planned PR, notice that the definition of literal term equality has already been changed by our WG. The tricky part is probably not improving the wording of the four bullet points but the paragraph that follows them. There is some explanation for this paragraph in the 'Changes' section (see the point that begins with: "Implementations were previously allowed to normalize ..."). Some parts of this came in with PRs #48, #59, #74, but the main one was PR #105, with the corresponding issue #100.

… use identity for the cases of IRIs and literals

hartig commented Feb 27, 2025

@pchampin Given your PR #162 with the improved definition of literal term equality, I have now pushed the remaining change to this PR here to change the definition in the 'Graph Comparison' section as per my proposal that you liked (i.e., replacing "M(lit) = lit" by "M(lit) is lit", and likewise for the case of IRIs). That should address the remaining point of your previous review of this PR.
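
For readers who want to see the resulting definition in action, here is a brute-force sketch of graph isomorphism in which M is a bijection on blank nodes and every IRI and literal is mapped to itself. It assumes terms are plain strings, that blank nodes are the strings starting with `_:`, and it ignores triple terms; the function names are hypothetical and the search is only practical for tiny graphs.

```python
from itertools import permutations


def is_blank(term):
    # Assumed convention for this sketch: blank nodes look like "_:label".
    return isinstance(term, str) and term.startswith("_:")


def blank_nodes(graph):
    return sorted({t for triple in graph for t in triple if is_blank(t)})


def isomorphic(g1, g2):
    """Try every bijection M between the blank nodes of g1 and g2.
    M(iri) is iri and M(lit) is lit, so only blank nodes are renamed."""
    b1, b2 = blank_nodes(g1), blank_nodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        m = dict(zip(b1, perm))
        mapped = {tuple(m.get(t, t) for t in triple) for triple in g1}
        if mapped == g2:
            return True
    return False


g1 = {("_:a", ":p", "_:b"), ("_:b", ":p", ":x")}
g2 = {("_:y", ":p", "_:z"), ("_:z", ":p", ":x")}
g3 = {("_:y", ":p", "_:z"), ("_:y", ":p", ":x")}

print(isomorphic(g1, g2))  # same shape, blank nodes renamed
print(isomorphic(g1, g3))  # different shape
```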

pchampin left a comment

Thanks a lot. I'm very happy with this PR, now, modulo what I believe to be a typo.

Co-authored-by: Pierre-Antoine Champin <github-100614@champin.net>

hartig commented Feb 28, 2025

@TallTed I applied your edit suggestions. Are you okay with this PR now?

the following are true:
<ul>
<li>|M|(|n|) is [=RDF term equality|equal=] to <var>n'</var>.</li>
<li>The triple (|s|, |p|, |o|) is in |G| if and only if
Member

See #161 (comment)

Suggested change
- <li>The triple (|s|, |p|, |o|) is in |G| if and only if
+ <li>The triple (|s|, |p|, |o|) is in |G| and

Successfully merging this pull request may close these issues.

RDF term equality definitions
5 participants