Preserve code 104 for backward compatibility #85

Closed
aakilfernandes opened this issue Feb 4, 2019 · 8 comments

Comments

@aakilfernandes

104 is the ASCII code for the letter h, which is the first letter of http and https URLs.

I've been suggesting the use of multiaddr in EIPs, but the suggestions have mostly been rejected. The reason given is that multiaddr is not "stable". Of course, multiaddr will probably never be truly stable, but that's a discussion for another day.

All that said, preserving 104 would allow for wider multiaddr use in protocols which currently only allow http and https URLs.

@Stebalien
Member

This is similar to multiformats/multicodec#93.

Note: Multiaddr is for addressing endpoints not content.

@ntninja
Contributor

ntninja commented Feb 5, 2019

I suggest reserving all of 97–122 (lowercase ASCII letters) and 65–90 (uppercase ASCII letters), since the scheme of a URI, and by extension of a URL, may start with any of these characters (RFC 3986#3.1). Additionally, 47 (ASCII forward slash) should be reserved, since that's how text-encoded multiaddrs start. This way it should be possible to determine unambiguously whether a given string is a URI/URL, a binary multiaddr or a text multiaddr.
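As a rough illustration of the proposal, here is a minimal Python sketch of first-byte disambiguation. It assumes the letter codes and 47 are never assigned as multiaddr protocol codes, which is exactly what is being requested here and not something the multiaddr table guarantees. The function name `classify` is made up for this example.

```python
def classify(data: bytes) -> str:
    """Guess the kind of address from its first byte.

    Assumption: codes 65-90, 97-122 and 47 are reserved, i.e. never
    used as the leading protocol code of a binary multiaddr.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first == 47:  # "/" starts every text-encoded multiaddr
        return "text multiaddr"
    if 65 <= first <= 90 or 97 <= first <= 122:  # a URI scheme starts with a letter
        return "uri"
    return "binary multiaddr"


print(classify(b"https://example.com"))      # uri
print(classify(b"/ip4/127.0.0.1/tcp/5001"))  # text multiaddr
# Binary form of /ip4/127.0.0.1/tcp/5001: code 4 + address, code 6 + port.
print(classify(bytes([4, 127, 0, 0, 1, 6, 0x13, 0x89])))  # binary multiaddr
```

Without the reservation, the last branch is only a guess: a binary multiaddr whose first protocol code happened to be 104 would be reported as a URI.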

> Note: Multiaddr is for addressing endpoints not content.

Yes, but I know plenty of applications that use http(s)://<hostname>:<port>/<base> for compactly describing a protocol+hostname+port+endpoint_path quadruplet. And it's not just HTTP stuff: I know of at least 2 mail clients that use imap:// and smtp:// "URLs" internally for describing the server address.
Hence "it's a URL" does not imply "it (tries to) point to content".

@aakilfernandes
Author

Sorry, I'm a little lost on endpoint vs. content. Can someone explain?

But yes, in general I agree that preserving all ASCII letter codes would be a good step.

@ntninja
Contributor

ntninja commented Feb 6, 2019

@aakilfernandes:

> Sorry, I'm a little lost on endpoint vs. content. Can someone explain?

I'll try to explain using the IPFS HTTP API as an example:

When you want to make a request (such as /add) you need to know (1) the location of the server, that is the hostname or IP address, (2) the listening port and (3) the application path base/prefix. This third part tends to confuse people, since all mentions of HTTP paths in the scope of MultiAddr refer only to that HTTP path base component.

So what is it referring to?

While the base will often be / (i.e. no prefix) for standalone HTTP applications, this is usually not the case for API endpoints. The IPFS API server, for instance, exposes all of its methods below the path /api/v0, so an API request to /add has to use the HTTP path /api/v0/add instead. If you now put a reverse proxy in front of this API server that forwards all requests for /ipfs-api to your local API server at http://localhost:5001/api/v0, your base would be /ipfs-api when talking to that proxy, but you'd still reach the same API server (endpoint) through that path.

Note that in all of this we referred to the specific resource we wanted to interact with (the /add command) only by way of example. We could have used any of the many other commands, and where they are placed inside HTTP would not change the fact that we can group all of these commands into an abstract entity commonly referred to as the IPFS HTTP API. How exactly these groups are subdivided is up to interpretation. MultiAddr, however, will never refer to a specific resource like /api/v0/add, only to services (or endpoints) such as "the IPFS HTTP API server at Bob's place", not to the particular content they host.

Hopefully that makes it clearer what the HTTP paths are, and aren't, referring to in the context of MultiAddr.
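The split between endpoint (including its base path) and resource can be sketched in a few lines of Python. This is only an illustration: `request_url` is a made-up helper and `proxy.example` is a hypothetical reverse proxy host. Only the localhost address and the two base paths come from the explanation above.

```python
def request_url(endpoint: str, command: str) -> str:
    """Join an endpoint (scheme + host + port + base path) with a command path."""
    return endpoint.rstrip("/") + "/" + command.lstrip("/")


# Talking to the API server directly: the base is /api/v0.
direct = request_url("http://localhost:5001/api/v0", "/add")
# Talking through a (hypothetical) reverse proxy: the base is /ipfs-api.
proxied = request_url("http://proxy.example/ipfs-api", "/add")

print(direct)   # http://localhost:5001/api/v0/add
print(proxied)  # http://proxy.example/ipfs-api/add
```

In both calls only the first argument is something a multiaddr would describe. The command path is supplied by the application.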

@aakilfernandes
Author

Got it, that makes sense.

@Stebalien
Member

> I suggest reserving all of 97–122 (lowercase ASCII letters) and 65–90 (uppercase ASCII letters), since the scheme of a URI, and by extension of a URL, may start with any of these characters (RFC 3986#3.1).

Unfortunately, we can't do that. We only have 128 single-byte multicodecs available (and are already using some of those codes).
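For context on the 128 figure: multiformats codes are written as unsigned varints, which carry 7 payload bits per byte, so only codes 0–127 fit in a single byte. A minimal sketch of that encoding follows (`encode_uvarint` is an illustrative helper, not a library function), together with a count of how many single-byte codes the proposed reservation would take.

```python
def encode_uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("unsigned varints cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return bytes(out)


print(len(encode_uvarint(104)))  # 1
print(len(encode_uvarint(127)))  # 1
print(len(encode_uvarint(128)))  # 2

# Letters A-Z and a-z plus the forward slash, as proposed above.
reserved = len(range(65, 91)) + len(range(97, 123)) + 1
print(reserved)  # 53
```

Reserving every ASCII letter plus the slash would therefore set aside 53 of the 128 single-byte codes.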

47 (0x2f, the forward slash) is multiformats/multicodec#93 (which I completely agree with).

At the end of the day, I'm not convinced trying to disambiguate between multiaddr and URLs is useful. Users should use one or the other or disambiguate themselves.

> Yes, but I know plenty of applications that use http(s)://<hostname>:<port>/<base> for compactly describing a protocol+hostname+port+endpoint_path quadruplet. And it's not just HTTP stuff: I know of at least 2 mail clients that use imap:// and smtp:// "URLs" internally for describing the server address.
> Hence "it's a URL" does not imply "it (tries to) point to content".

Yep, I agree. I was just trying to get everyone on the same page. OP asked about EIP and HTTP, so I assumed the use case was content in web browsers. This came up when Swarm tried to use multiaddrs with EIP: #73. In the end, we went with a new thing entirely: a multicodec-prefixed path (multiformats/multicodec#104).

This is actually a problem with how we currently specify the "http" protocol. For example, the unix protocol actually accepts a path. Unfortunately, this means it has to be a terminal protocol, because it ends up consuming the rest of the path.
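To illustrate why a path-valued protocol has to come last, here is a toy Python parser for text multiaddrs. It is a sketch under simplifying assumptions, not the real multiaddr implementation: the `PROTOCOLS` table and the `parse` function are invented for this example and know about three protocols only.

```python
# Toy protocol table: True means the value is a path that consumes
# everything remaining in the address.
PROTOCOLS = {"ip4": False, "tcp": False, "unix": True}


def parse(text: str) -> list:
    """Split a text multiaddr into (protocol, value) pairs."""
    if not text.startswith("/"):
        raise ValueError("a text multiaddr must start with '/'")
    parts = text[1:].split("/")
    result = []
    i = 0
    while i < len(parts):
        name = parts[i]
        if name not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {name}")
        if PROTOCOLS[name]:
            # Path-valued protocol: nothing after it can be told apart
            # from the path, so it swallows the remainder and parsing stops.
            result.append((name, "/" + "/".join(parts[i + 1:])))
            break
        if i + 1 >= len(parts):
            raise ValueError(f"missing value for {name}")
        result.append((name, parts[i + 1]))
        i += 2
    return result


print(parse("/ip4/127.0.0.1/tcp/5001"))
# [('ip4', '127.0.0.1'), ('tcp', '5001')]
print(parse("/unix/var/run/app.sock/tcp/80"))
# [('unix', '/var/run/app.sock/tcp/80')]
```

In the second call the trailing /tcp/80 is absorbed into the unix path. An http protocol that accepted a path would be subject to the same constraint.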

I see you've found the discussion here: #63

@ntninja
Contributor

ntninja commented Mar 13, 2019

@Stebalien: So I guess this can be closed as WONTFIX, with the conclusion that disambiguating between binary multipaths and URL locators is an application-level affair, then?

@ghost

ghost commented Mar 16, 2019

> @Stebalien: So I guess this can be closed as WONTFIX, with the conclusion that disambiguating between binary multipaths and URL locators is an application-level affair, then?

I would say so too. If you're in a context where you could actually confuse URLs and multiaddrs, make the multiaddrs into URIs a la maddr:/ip4/127.0.0.1.

Or work with the string representation, which will always start with /. It seems a bit funky to me to have a context where you have URL strings but multiaddr byte arrays.
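A minimal sketch of the URI-wrapping idea in Python. The maddr: scheme is only the ad hoc example from this comment, not a registered scheme, and `to_uri` and `from_uri` are hypothetical helpers.

```python
MADDR_SCHEME = "maddr:"  # ad hoc scheme from the comment above, not registered


def to_uri(multiaddr_text: str) -> str:
    """Wrap a text multiaddr so it cannot be mistaken for another kind of URL."""
    if not multiaddr_text.startswith("/"):
        raise ValueError("a text multiaddr must start with '/'")
    return MADDR_SCHEME + multiaddr_text


def from_uri(uri: str) -> str:
    """Recover the text multiaddr from a maddr: URI."""
    if not uri.startswith(MADDR_SCHEME + "/"):
        raise ValueError("not a maddr: URI")
    return uri[len(MADDR_SCHEME):]


print(to_uri("/ip4/127.0.0.1"))          # maddr:/ip4/127.0.0.1
print(from_uri("maddr:/ip4/127.0.0.1"))  # /ip4/127.0.0.1
```

With this convention every value in the application is a URI string, so the scheme alone decides how to interpret it.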

@ghost ghost closed this as completed Mar 16, 2019

3 participants