WP-dependent URIs of resources #18

Closed
murata2makoto opened this issue Aug 4, 2017 · 23 comments

@murata2makoto

We might want to give different URIs for a resource in the context of one WP and the same resource in
the context of a different WP. Such URIs can allow different "next primary resources" depending on context WPs.

@murata2makoto changed the title from "WP-dependent URIs of resouces" to "WP-dependent URIs of resources" Aug 4, 2017
@HadrienGardeur

I'm not sure I fully understand the point of this issue.

If the same resource is hosted at two different locations, then of course it will have two different URLs.

But if this is a request/requirement for arbitrarily minting a new URL for a resource (simply because it belongs to multiple WPs), then I strongly object to this idea.

IMO the context will either have to:

  • be guessed from discovery (link to a manifest)
  • or be provided separately (I'm pretty sure that the JSON-LD based model for Web Annotations can handle that, but I'll let @BigBlueHat chime in)

@murata2makoto
Author

murata2makoto commented Aug 5, 2017

> But if this is a request/requirement for arbitrarily minting a new URL for a resource (simply because it belongs to multiple WPs), then I strongly object to this idea.

I am suggesting (1) generating a WP-dependent URI from a manifest URI and a resource URI, and (2) obtaining the manifest URI from a WP-dependent URI.

Why do you object? Because you have a different mechanism in mind?
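
To make (1) and (2) concrete, here is a minimal sketch of one possible construction. It is only an illustration: the `resource` query parameter and both function names are hypothetical, nothing in this thread defines them, and the sketch assumes the manifest URI has no query string of its own.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical parameter name; not defined by any WP draft.
RESOURCE_PARAM = "resource"


def make_wp_dependent_uri(manifest_uri: str, resource_uri: str) -> str:
    """(1) Compose a single URI from a manifest URI and a resource URI."""
    parts = urlsplit(manifest_uri)
    # urlencode percent-encodes ":" and "/" in the resource URI.
    query = urlencode({RESOURCE_PARAM: resource_uri})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def split_wp_dependent_uri(wp_uri: str) -> Tuple[str, str]:
    """(2) Recover the manifest URI (and the resource URI) from the composite."""
    parts = urlsplit(wp_uri)
    resource_uri = parse_qs(parts.query)[RESOURCE_PARAM][0]
    manifest_uri = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return manifest_uri, resource_uri
```

With this construction, the same resource gets a distinct URI per manifest, and a client that does not understand the parameter would simply fetch the manifest.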

@mattgarrish
Member

Is this all that different from having a start_url for the manifest?

If start_url is just the first document to load, you have the context of the publication from the manifest link.

@mattgarrish
Member

mattgarrish commented Aug 5, 2017

Except it's hard-coded in the manifest... Never mind.

I was thinking it was a parameter you could use to dynamically load a first page.

@murata2makoto
Author

> Is this all that different from having a start_url for the manifest?

Yes, it is different.

To allow a resource to belong to multiple WPs and to allow existing
resources to be included in WPs, we cannot embed anything in resources.

@mattgarrish
Member

> we cannot embed anything in resources

Could you clarify what you mean by this statement? I'm not sure I understand what you're referring to. Are you trying to avoid link elements to manifests, data urls, iframes, something else?

I thought you were asking about referencing a primary resource within the context of the publication it belongs to, which could be useful. If you could reference the manifest with an arbitrary start_url, you could give an unambiguous context:

See <a href="http://www.example.com/journal/manifest.json?start_url=article.html">some article</a>.

It would be an interesting feature of a publication, but probably not critical technology. It wouldn't work well as a linking mechanism for user agents that aren't wp-aware, either.
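
As a rough sketch of how a WP-aware user agent might interpret a link like the one above: the function name is hypothetical, and treating `start_url` as a query parameter resolved relative to the manifest is the assumption being illustrated, not an existing feature.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urljoin, urlsplit, urlunsplit


def resolve_start_url(link: str) -> Tuple[str, Optional[str]]:
    """Split a manifest link into (manifest URI, resolved start resource)."""
    parts = urlsplit(link)
    manifest_uri = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    values = parse_qs(parts.query).get("start_url")
    if not values:
        # No override given: fall back to whatever the manifest itself says.
        return manifest_uri, None
    # Assumption: a relative start_url is resolved against the manifest URI.
    return manifest_uri, urljoin(manifest_uri, values[0])


manifest, start = resolve_start_url(
    "http://www.example.com/journal/manifest.json?start_url=article.html"
)
```

Here `manifest` gives the publication context and `start` names the primary resource to open first.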

If a resource identifies that it belongs to more than one publication, presumably the user agent should present the available options to the user to select from, so resources would only be completely context-less if they don't identify that they belong to a publication at all.

@murata2makoto
Author

> Could you clarify what you mean by this statement? I'm not sure I understand what you're referring to. Are you trying to avoid link elements to manifests, data urls, iframes, something else?

I am trying to avoid link elements to manifests.

> It would be an interesting feature of a publication, but probably not critical technology. It wouldn't work well as a linking mechanism for user agents that aren't wp-aware, either.

I think that it is critical, since users do not want to be asked to choose the next primary resource from the available options.

@mattgarrish
Member

It's interesting as far as referencing within the context of a specific publication goes, but how does it not quickly become impracticable?

For example, how does a search engine discover what publication a resource belongs to if there is no information encoded within the document?

And how does a user discover a resource belongs to a publication except through an authored link?

@murata2makoto
Author

murata2makoto commented Aug 6, 2017

First, although I am skeptical about embedded links, nothing stops using WP-dependent URIs as well as embedded links (or HTTP headers).

Second, even if we do not use embedded links, I think that we get a sensible outcome. Suppose that a primary resource X belongs to WP A and WP B. Then there will be three URIs: (1) standalone X, (2) X in the context of A, and (3) X in the context of B. If search engines find all three URIs, they will all appear in the results.

@HadrienGardeur

A number of points are being raised; I'm not sure if they're all relevant now (probably not).

1) Including links in resources that are present in multiple publications

I agree with @murata0204 that for resources that are part of multiple WPs, we can't expect links back to every single manifest.

This means that such a link cannot be the only way that we discover that a resource is included in a WP.

This also means that a link to a manifest should be a recommendation but not a requirement for resources in a WP.

2) Creating new URLs for "resource x in the context of y"

That said, I don't think that trying to contain such information in a URL is a good idea either.

CFI attempted the same thing by having a left-most part (going through an EPUB's spine) in addition to a right-most part (going through the document's tree), and it was confusing, likely to break whenever a publication is updated and barely supported by EPUB reading systems.

If we create a new URL scheme that doesn't fall back to the resource or isn't supported by a normal client (any UA that exists today), we're pretty much building something incompatible with the Web.

Instead of inventing a new URI scheme or URI template for dealing with that problem, I think that we should look into the cases where we'll need to link to a resource in a WP and figure out specific solutions.

For example with Web Annotations, it's possible to provide a scope for the annotation: https://www.w3.org/TR/annotation-model/#scope-of-a-resource

In the case of a WP, the scope could be the WP itself while the target is the resource:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno37",
  "type": "Annotation",
  "body": "http://example.org/note1",
  "target": {
    "source": "http://example.org/image1",
    "scope": "http://example.org/publication"
  }
}
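
A minimal sketch of how a client could read the publication context out of such an annotation. The function name is hypothetical; the only assumption about the model is that a target is either a plain IRI string or an object carrying `source` and, optionally, `scope`.

```python
import json
from typing import Optional, Tuple

annotation = json.loads("""
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno37",
  "type": "Annotation",
  "body": "http://example.org/note1",
  "target": {
    "source": "http://example.org/image1",
    "scope": "http://example.org/publication"
  }
}
""")


def target_in_context(anno: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (resource URI, scope URI); scope is None when no context is given."""
    target = anno["target"]
    if isinstance(target, str):
        # A bare IRI target carries no context at all.
        return target, None
    return target.get("source"), target.get("scope")


resource_uri, publication_uri = target_in_context(annotation)
```

The resource keeps its ordinary URL, and the publication context travels beside it as a second URL rather than inside it.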

@mattgarrish
Member

> we can't expect links back to every single manifest.

No argument here.

> This means that such a link cannot be the only way that we discover that a resource is included in a WP.

Agree. All I'm questioning is whether a new scheme is really going to obviate embedding. Of that, I'm sceptical. At most, it seems like another complement.

Being able to link to primary resources within the context of their specific publication does strike me as a useful feature to provide, but I'm not overly sold that sharing of primary resources will be a major feature of publications. Secondary resources, like style sheets and scripts and images, are more commonly shared, and are also less likely to be directly referenced.

@lrosenthol

lrosenthol commented Aug 6, 2017 via email

@murata2makoto
Author

@HadrienGardeur wrote:

> 2) Creating new URLs for "resource x in the context of y"
>
> That said, I don't think that trying to contain such information in a URL is a good idea either.
>
> CFI attempted the same thing by having a left-most part (going through an EPUB's spine) in addition to a right-most part (going through the document's tree), and it was confusing, likely to break whenever a publication is updated and barely supported by EPUB reading systems.

CFI uses fragment identifiers. By definition, everything referenceable by a CFI is a second-class citizen. I think that this is a big problem with CFI.

> If we create a new URL scheme that doesn't fall back to the resource or isn't supported by a normal client (any UA that exists today), we're pretty much building something incompatible with the Web.

It is true that a new URL scheme allows no fallback. The only thing we can do is to provide a different URI (context-independent URI).

I think that this is a hard problem. At this stage, we should probably create a list of possible solutions.

@HadrienGardeur suggested that a scope (or WP) URI should be used together with a resource URI. This is a possibility, although it implies that a single URI is not good enough for a WP resource.

Another possibility is to heavily use cookies so that user agents are aware of the current WP. Alas, this is also against the Web architecture.

@murata2makoto
Author

@lrosenthol

> While I agree that this is an interesting use case - it's a huge security
> problem as you have no single origin for the content against which to
> validate its authenticity.

Given one URI for a resource-in-WP, we will have one origin. But what is it? The answer depends on how you construct a resource-in-WP URI from a resource URI and a WP manifest URI.

@lrosenthol

lrosenthol commented Aug 7, 2017 via email

@murata2makoto
Author

> You are assuming that the single URI for a given resource is constructed
> based on the manifest URI - and that's not necessarily true if we wish to
> allow items from different origins to be valid inside a single (P)WP.

Even if a manifest URI and a resource URI are from different origins,
it is possible to create a resource-in-WP URI from the URIs. I
do not understand your comment.

@lrosenthol

lrosenthol commented Aug 7, 2017 via email

@murata2makoto
Author

> If they are coming from different origins, then you cannot "create one from
> them". They each have unique ones to start with.

I do not think so. We only have to escape some characters so that
domain names within resource URIs are not recognized as
domain names.
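
One hypothetical reading of this escaping idea, sketched below: percent-encode the whole resource URI and append it to the manifest URI as a single path segment. The function names and the path-segment layout are assumptions made only for illustration.

```python
from urllib.parse import quote, unquote, urlsplit


def embed_in_path(manifest_uri: str, resource_uri: str) -> str:
    """Append the fully escaped resource URI as one extra path segment."""
    # safe="" forces ":" and "/" to be percent-encoded as well.
    return manifest_uri.rstrip("/") + "/" + quote(resource_uri, safe="")


def origin(uri: str) -> str:
    """Scheme plus authority, as seen by a generic URI parser."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}"


manifest_uri = "http://a.example.com/wp/manifest.json"
resource_uri = "https://b.example.org/ch1.html"
combined = embed_in_path(manifest_uri, resource_uri)

# The escaped host name is only text inside a path segment.
combined_origin = origin(combined)
recovered = unquote(combined.rsplit("/", 1)[1])
```

Under this construction a generic parser sees only the manifest's host, so the composite URI takes the manifest's origin. Whether that is an acceptable origin for content actually served from elsewhere is the question being debated here, and the sketch does not settle it.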

@lrosenthol

lrosenthol commented Aug 7, 2017 via email

@TzviyaSiegman
Contributor

@murata0204 if I understand correctly, you are suggesting that every resource in a WP be identified with a URL that is not a fragment ID, and if a resource exists in more than one WP, the URL should change. Is that correct?
I think we need to first establish a mechanism for identifying resources, then discuss what happens if a resource appears in more than one WP.
We should also pay attention to https://www.w3.org/2001/tag/doc/distributed-content/.

@HadrienGardeur

@TzviyaSiegman @murata0204 and quite a few of the concerns expressed in that doc are relevant here:

  • URL fragmentation/pollution
  • Origin policy
  • Less canonical source referencing

@TzviyaSiegman
Contributor

This issue is superseded by #44

@iherman
Member

iherman commented Aug 29, 2017

See telco discussion on closure.
