Question about URN notation #1

Open
ToshihikoMakita opened this issue Nov 23, 2014 · 11 comments

@ToshihikoMakita

After the DITA-OT Day held on Nov 20, 2014, I searched for a DITA specialization example written in RELAX NG and finally found this repository.

Seeing the specialization in topic.rng is very useful. But I have one question about the URN notation used in the following file of this specialization.

org.dita-community.doctypes / doctypes / topic / rng / catalog.xml

This catalog file contains the following URN notation:

<uri name="urn:pubid:dita-community.org:doctypes:dita:rng:topic.rng"
    uri="topic.rng"
 />
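
For readers less familiar with OASIS catalogs, an entry like this maps a name to a local resource. The following is a minimal Python sketch of that lookup, not how any real resolver is implemented. The enclosing catalog element and the helper name load_uri_map are assumptions added for illustration; only the uri entry itself comes from the file quoted above. Real resolvers also handle base URIs, chained catalogs, and name normalization, none of which is shown here.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalogs.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Assumption: a minimal wrapper around the single <uri> entry quoted above.
CATALOG_XML = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="urn:pubid:dita-community.org:doctypes:dita:rng:topic.rng"
       uri="topic.rng"/>
</catalog>"""


def load_uri_map(catalog_xml):
    """Return a dict mapping each <uri> entry's name to its uri attribute."""
    root = ET.fromstring(catalog_xml)
    return {
        entry.get("name"): entry.get("uri")
        for entry in root.iter("{%s}uri" % CATALOG_NS)
    }


uri_map = load_uri_map(CATALOG_XML)
print(uri_map.get("urn:pubid:dita-community.org:doctypes:dita:rng:topic.rng"))
```

The lookup here is a plain string match, so the resolver does not need to understand the NID at all for an entry like this to work.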

The notable feature of this notation is that it uses “pubid” as the NID. In general, a URN is defined by the following BNF notation:

<URN> ::= "urn:" <NID> ":" <NSS>

http://tools.ietf.org/html/rfc2141

But as far as I know, “pubid” is not listed as an NID in the Uniform Resource Names (URN) Namespaces registry.

http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xhtml
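
As a rough illustration of the point above, the following Python sketch splits a URN into NID and NSS according to the RFC 2141 shape and checks the NID against a small hand-picked set of registered namespaces. The set SOME_REGISTERED_NIDS and the helper split_urn are assumptions made for this example; the IANA registry remains the authoritative list, and the NSS is not validated here.

```python
import re

# RFC 2141: "urn:" <NID> ":" <NSS>. The NID is 1 to 32 letters, digits,
# or hyphens, and must not start with a hyphen.
URN_RE = re.compile(r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(.+)$", re.IGNORECASE)

# Illustrative subset only; consult the IANA registry for the real list.
SOME_REGISTERED_NIDS = {"ietf", "isbn", "oasis", "publicid", "uuid"}


def split_urn(urn):
    """Split a URN into (nid, nss); the NID is lowercased for comparison."""
    match = URN_RE.match(urn)
    if match is None:
        raise ValueError("not a syntactically valid URN: %r" % urn)
    return match.group(1).lower(), match.group(2)


nid, nss = split_urn("urn:pubid:dita-community.org:doctypes:dita:rng:topic.rng")
print(nid, nid in SOME_REGISTERED_NIDS)
print(nss)
```

The URN is syntactically well formed; the only open question is whether its NID is registered.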

Is “pubid” officially recognized as an NID?

As I’m new to RELAX NG specialization and URNs, please let me know if my understanding is mistaken.

Regards,

@drmacro
Contributor

drmacro commented Nov 23, 2014

Makita-san,

I had not been aware of the NID aspect of URNs (or if I ever was aware of it, I had forgotten it).

"pubid" is my private convention and is not a registered URN namespace. I didn't realize that I was violating URN rules (I had thought that whatever followed "urn:" was pretty much unconstrained).

However, in investigating this I did learn that "publicid" IS a registered namespace:

http://tools.ietf.org/html/rfc3151

I will start using "publicid" going forward.

If you read the RFC for "publicid" you will see that it defines rules for transcribing SGML-style public IDs into URN syntax. However, I can't find any requirement in the XML specification that requires public IDs to use the SGML syntax.

So I think that means I'm safe in using "publicid" as the URN namespace and then using any string (that otherwise conforms to the URN syntax rules) for the name.

Thank you for bringing this to my attention--it is difficult to keep up with all the various Internet standards and this detail had clearly escaped me.

Cheers,

Eliot

@drmacro
Contributor

drmacro commented Nov 23, 2014

It looks like I was wrong, at least in terms of how Norm Walsh interprets "publicid" handling: the Xerces resolver code treats URNs with an NID of "publicid" specially, translating the name back into public ID syntax before doing catalog resolution.

That means that "publicid" cannot be used for normal URN-syntax strings (e.g., "foo:bar:baz").
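
To make the consequence concrete, here is a hedged Python sketch of the reverse transcription described in RFC 3151 (URN back to public ID). It is not the resolver's actual code; the function name unwrap_publicid_urn and the table layout are assumptions for illustration.

```python
import re

# Reverse of the RFC 3151 transcription: URN characters -> public ID text.
_DECODE = {
    "+": " ", ":": "//", ";": "::",
    "%2B": "+", "%3A": ":", "%2F": "/", "%3B": ";",
    "%27": "'", "%3F": "?", "%23": "#", "%25": "%",
}
# Match escapes first so that "%3A" is not misread as "%", "3", "A".
_TOKEN = re.compile(r"%(?:2B|3A|2F|3B|27|3F|23|25)|[+:;]", re.IGNORECASE)


def unwrap_publicid_urn(urn):
    """Translate a urn:publicid: URN back into public ID syntax."""
    prefix = "urn:publicid:"
    if not urn.lower().startswith(prefix):
        raise ValueError("not a publicid URN: %r" % urn)
    nss = urn[len(prefix):]
    return _TOKEN.sub(lambda m: _DECODE[m.group(0).upper()], nss)


print(unwrap_publicid_urn("urn:publicid:foo:bar:baz"))
```

Under this transcription every colon in the name turns into "//", so a catalog lookup would be made against "foo//bar//baz" rather than the string as written.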

I will have to consider what this means. I may pursue registering "public" or "pubid" for use as an alternative to "publicid" where the only rule would be that the first colon-delimited token be the owner identifier, e.g., "example.org".

@robertnthomas

Perhaps registering a more general term, such as "doc", would be better, since the URNs often show up in systemId attributes within OASIS catalogs.

@drmacro
Contributor

drmacro commented Nov 24, 2014

I'm not sure "doc" would be right since the public IDs generally don't identify documents (in the general sense) but non-document things (DTD components, grammar components, etc.).

For now I'm using "X-pubid" since that is allowed by the RFCs.

@drmacro
Contributor

drmacro commented Nov 24, 2014

It's even more interesting than I realized because the "X-" convention has been formally deprecated:

http://tools.ietf.org/html/rfc6648

So I think I'm back to just using "pubid" as I was, but I'll attempt to use it consistently and consider registering it sooner rather than later.

@ToshihikoMakita
Author

Mr. Kimber,

Thank you for your detailed analysis.

> That means that "publicid" cannot be used for normal URN-syntax strings

Yes, I confirmed that the following two document type declarations are treated as equivalent when validating a topic against the DTD bundled with DITA-OT 1.8.5:

<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<!DOCTYPE topic PUBLIC "urn:publicid:-:OASIS:DTD+DITA+Topic:EN" "topic.dtd">
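
The equivalence of these two declarations follows from the transcription rules in RFC 3151. The following is a minimal Python sketch of the forward direction (public ID to URN), written for illustration only; the function name wrap_public_id is an assumption, not part of any resolver API.

```python
import re

# RFC 3151 transcription: public ID text -> URN characters.
_ENCODE = {
    "//": ":", "::": ";", " ": "+",
    "+": "%2B", ":": "%3A", "/": "%2F", ";": "%3B",
    "'": "%27", "?": "%3F", "#": "%23", "%": "%25",
}
# Two-character sequences are listed first so they win over single characters.
_TOKEN = re.compile(r"//|::|[ +:/;'?#%]")


def wrap_public_id(public_id):
    """Transcribe a public ID into a urn:publicid: URN."""
    # Public IDs are whitespace-normalized before transcription.
    normalized = " ".join(public_id.split())
    return "urn:publicid:" + _TOKEN.sub(lambda m: _ENCODE[m.group(0)], normalized)


print(wrap_public_id("-//OASIS//DTD DITA Topic//EN"))
```

Applied to the OASIS public ID in the first declaration, this yields the URN used in the second one.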

> It's even more interesting than I realized because the "X-" convention has been formally deprecated:

I didn't know there was such an RFC. In the past, when I tried to create a specialization with XML Schema, I used the following URN notation:

<uri name="urn:x-antennahouse:dita:xsd:topic.xsd" uri="ah_topic.xsd"/>

I’m still not sure what the best way is to create one's own URNs for XSD or RNG DITA specializations.

Regards,

@ToshihikoMakita
Author

RFC 6648 discusses the “X-” prefix in the context of “Application Protocols”, but RFC 3406 defines X-<NID> as “Experimental Namespaces”. Are the two really intended for the same purpose?

@drmacro
Contributor

drmacro commented Nov 24, 2014

I saw the reference to RFC 6648 specifically in the context of URNs and I think it does apply--the justification is primarily about avoiding the need to migrate from using "X-" to just "", which leads to interoperation issues and confusion.

The purpose of the NID field of URNs seems a little ambiguous to me: it could be seen as a general protocol indicator (as "publicid" is), governing names regardless of ownership, or it could be seen as an "owner identifier", which is how the "oasis" NID is used: it confers ownership of all such names to OASIS.

My personal analysis is that the NID is more appropriately used for protocol distinction, not ownership, so I think using something like "pubid" and then having the next field be the owner identifier is the best approach, e.g.:

urn:pubid:example.org:{whatever you want for a specific resource}

And that's the convention I've been using.

Thus I would recommend something like:

urn:pubid:antennahouse.com:dita:xsd:topic.xsd

I think the intent of the URN is made clear by the "pubid" NID, ownership is made clear by "antennahouse.com", and the rest follows the OASIS convention (which otherwise has no particular magic, it's just a unique string value).

It seems highly unlikely that anyone that is not Antenna House would accidentally use "pubid:antennahouse.com" in any URN, so I think uniqueness is effectively guaranteed.
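
The convention described in this comment can be sketched in Python as follows. This is only an illustration under stated assumptions: "pubid" is the unregistered private NID discussed in this thread, and the helpers make_pubid_urn and catalog_uri_entry are hypothetical names invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the unregistered, private-convention NID discussed in this thread.
NID = "pubid"


def make_pubid_urn(owner, *parts):
    """Build urn:pubid:<owner>:<part>:... with the owner as the first token."""
    if not owner or ":" in owner:
        raise ValueError("owner identifier must be a single colon-free token")
    if not parts or any(not part or ":" in part for part in parts):
        raise ValueError("each name part must be a non-empty colon-free token")
    return ":".join(["urn", NID, owner, *parts])


def catalog_uri_entry(name, uri):
    """Serialize an OASIS-catalog-style <uri> entry for the given mapping."""
    entry = ET.Element("uri", {"name": name, "uri": uri})
    return ET.tostring(entry, encoding="unicode")


urn = make_pubid_urn("antennahouse.com", "dita", "xsd", "topic.xsd")
print(urn)
print(catalog_uri_entry(urn, "ah_topic.xsd"))
```

Keeping the owner identifier as a mandatory first token is what carries the uniqueness argument; the remaining tokens are free-form.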

@robertnthomas

"pubid" will work, although including "id" in the string is somewhat redundant (like saying "PIN number"). Something like "public" or "vocabulary" seems like a better fit. However, I don't feel strongly about this, and I would be quite satisfied if you simply registered "pubid".

@drmacro
Contributor

drmacro commented Nov 24, 2014

The reason I prefer "pubid" over "public" is that "public" is very generic. I would have used "publicid" myself if it wasn't already taken.

Could also use "xmlpublic" or "xmlpubid" I guess to make it clear that the use domain is XML stuff. That might be a better name.

@robertnthomas

I have a slight preference for "xmlpublic". However, I am so used to thinking about this form of indirection as a "public identifier" that "xmlpubid" would work well for me.
