Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPIP: Gateway _redirects File #290

Merged
merged 32 commits into from
Sep 23, 2022
Merged

IPIP: Gateway _redirects File #290

merged 32 commits into from
Sep 23, 2022

Conversation

justindotpub
Copy link
Contributor

@justindotpub justindotpub commented Jun 15, 2022

This is part of #257, working code in ipfs/kubo#8890

👀 Preview

For reference, working code in Kubo PR: ipfs/kubo#8890

@lidel lidel changed the title Gateway Redirects RFC and spec IPIP: Gateway _redirects File Jun 24, 2022
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for filling this!

Will circle back to this when we land gateway specs and IPIP process, but made a quick pass from IPIP perspective (figuring out what works, what not, and what we can simplify), so thank you for your patience.

Left some comments inline (but feel free to ignore/park until final IPIP template lands)

RFC/0000-gateway-redirects.md Outdated Show resolved Hide resolved
http-gateways/REDIRECTS_FILE.md Show resolved Hide resolved
RFC/0000-gateway-redirects.md Outdated Show resolved Hide resolved
RFC/0000-gateway-redirects.md Outdated Show resolved Hide resolved
RFC/0000-gateway-redirects.md Outdated Show resolved Hide resolved
@lidel lidel deleted the branch ipfs:main July 1, 2022 21:55
@lidel lidel closed this Jul 1, 2022
@justindotpub
Copy link
Contributor Author

@lidel out of curiosity, did you mean to close this?

@lidel
Copy link
Member

lidel commented Jul 2, 2022

@justincjohnson no, I never closed this, thanks for the ping 🙏

It looks like a bug in Github – your PR should be switched to main after I merged #290 , but instead it closed this PR. I'll reopen and fix it manually, sorry for the noise.

ps. I'll be going over IPIP reviews next week.

@lidel lidel reopened this Jul 2, 2022
@lidel lidel changed the base branch from feat/gateway-specs to main July 2, 2022 15:10
@justindotpub justindotpub marked this pull request as ready for review July 2, 2022 16:31
@justindotpub justindotpub requested a review from lidel July 2, 2022 16:31
@justindotpub justindotpub requested a review from a team as a code owner July 11, 2022 13:22
@lidel lidel self-assigned this Aug 4, 2022
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your angelic patience @justincjohnson ❤️

I did the first pass review of this along with your working code in ipfs/kubo#8890 – some questions / asks apply to both – details inline.

IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
http-gateways/REDIRECTS_FILE.md Outdated Show resolved Hide resolved
Comment on lines 63 to 64
- `301` - Permanent Redirect (default)
- `302` - Temporary Redirect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justincjohnson thoughts on also supporting these niche redirect codes?

The rationale is that the spec should state which codes are supported,
and implementation should error on unsupported ones.

By allowing all 301-308 variants here, we make it easier for people to move static website hosting to IPFS,
while keeping dynamic backends that process POST still around.

Suggested change
- `301` - Permanent Redirect (default)
- `302` - Temporary Redirect
- `301` - Permanent Redirect (default)
- `302` - Found (commonly used for Temporary Redirect)
- `303` - See Other (replacing PUT and POST with GET)
- `307` - Temporary Redirect (explicitly preserving body and HTTP method of original request)
- `308` - Permanent Redirect (explicitly preserving body and HTTP method of original request)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel I'm not sure this makes sense. If I understand correctly, these additional methods assume we will be receiving requests with PUT and POST and then redirect rules can redirect those requests to a separate dynamic backend for processing, using the same PUT or POST method. The problem I see with this is the gateway currently isn't handling PUT and POST requests. Any requests with PUT or POST result in an error that the gateway is read-only. So any redirect rules that are processed will only be for GET or HEAD requests.

Are you wanting these changes in anticipation of future rearchitecting of the gateway to support writes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justincjohnson no, this is purely to maximize interop with HTTP semantics.

Supporting these codes makes it easier for people who have a regular web2 website to move hosting of that website to IPFS. They can redirect legacy POST endpoints to a different hostname (307, 308), or turn them to GET (303) and show user-friendly error page.

By making the migration less painful, we allow for migrating bigger websites in stages.

Copy link
Contributor Author

@justindotpub justindotpub Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume I'm missing something here. 😅

307 and 308 redirect using the same method as the request, so to redirect to a web2 backend using those codes we are assuming the request came to us as a POST.

303 can redirect a POST to a GET, which again assumes the request came to us as a POST.

My point is that currently the gateway won't even serve these requests, thus they will never reach the redirect code.

$  curl -X POST http://localhost:8080/ipfs/$CID             
Method POST not allowed: read only access

But please do enlighten me. 🙏

Copy link
Member

@lidel lidel Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, but the behavior you described is a bug/limitation in Kubo implementation, not a technical limitation blocking us from having them in this spec.

fwiw prior art for _redirects implementations is that only 301 302 work everywhere:

  • Cloudflare Pages docs state they support 301, 302, 303, 307, 308
  • Netlify only supports 301 and 302, explicitly does not support 307, does not mention others

My suggestion is to include all HTTP redirect codes in the _redirects spec for completeness, and fix Kubo at some point, but I don't feel strongly about this: at the end of the day, docs, examples, most users will use 301 and 302 and don't even know other codes exist. They have a very niche utility.

Should we keep 303, 307, 308 in this spec?

@thibmeu thoughts from Cloudflare side?
@dignifiedquire thoughts from Iroh's gateway side?

(I'll bring this up during today's IPFS Implementers call)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. That makes sense to me. 👌

http-gateways/REDIRECTS_FILE.md Outdated Show resolved Hide resolved
http-gateways/REDIRECTS_FILE.md Outdated Show resolved Hide resolved
http-gateways/REDIRECTS_FILE.md Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
IPIP/0000-gateway-redirects.md Outdated Show resolved Hide resolved
@justindotpub
Copy link
Contributor Author

justindotpub commented Aug 11, 2022

@lidel I believe I've addressed all of your feedback on the spec and IPIP. In case it got lost in the resolved comments above, the car file fixture is attached again here.

redirects.car.zip

@lidel lidel self-requested a review August 18, 2022 21:19
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big fan of comprehensive text fixtures, that implementers can use in their tests, thank you!
I think it is ready for final reviews.

We were missing some folks at IPFS Implementers Sync call today, so pinging here:

@fabricedesre
Copy link

I have a couple of questions:

  1. Why not use a /.well-known/ url instead of _redirect ? There are many use cases that seem similar (see https://en.wikipedia.org/wiki/Well-known_URI)
  2. Is this applicable to gateways only? I guess we will expect native ipfs:// and ipns:// protocol handlers to behave the same in various contexts.

@bmann
Copy link

bmann commented Aug 19, 2022

@fabricedesre the way I think about this is it mirrors a Netlify or Cloudflare Pages deploy. So, in part, we are following cow paths that are already paved, so using _redirects and the literal same syntax that the two mentioned systems already use.

We'll see where other use cases end up, but the one to look at is PWAs / SPAs that need redirects support like this to be able to be deployed correctly and then viewed / function correctly across many IPFS contexts.

So yes, this is "gateways only" to support publish your app to IPFS but load in an HTTP browser context.

@fabricedesre
Copy link

I would argue that netlify / cloudflare made the wrong choice of location here, but that ship has sailed :)

Where I'm a bit more worried is focusing on a "HTTP browser context". Accessing IPFS through http is a temporary workaround until ipfs:// protocol handlers are available. Once this is the case (we have one in https://capyloon.org), this proposal adds a constraint to the code that resolves ipfs:// urls, turning it in some kind of application server to keep feature parity with "ipfs over http" urls. This is somewhat worrisome, depending on the implementation details.

@rvalle
Copy link

rvalle commented Aug 19, 2022

@fabricedesre I want to add my POV as an DAPP developer that has been waiting for this functionality for long time, and following this PR closely.

Bringing these comments at this stage, when the PR was open for months is slightly unfair to the effort of contributors. I am not saying that your points are not valid, but they could have been introduced earlier in the process.

For the part of IPFS fundamentals and HTTP workaround. I fully understand the IPFS design principles and see your point. However HTTP is not going anywhere anytime soon and coexistence and migration paths will be required for decades. To make a simile from the telco world SMS is alive and kicking in an industry that is already in its 6th generation.

From an DAPP developer point of view, the rewrite feature is a must. My use case is having multiple URIs been served from the same "text" (template), a core feature of internet's content architecture for which a migration path to IPFS will be required.

An ipfs:// protocol handlers alone would not fix the problem. And populating millions of CIDs for slightly different content (templating) would be as wasteful as creating an HTTP resource for each slightly different content, it is not going to happen.

Will there be an efficient, IPFS native, way to implement a substitute for content templating? how would an ipfs:// entrypoint to templated content look like? That is a very fair question to make, but would probably require research both at IPFS level then followed by DAPP application stack adoption. Not happening anytime soon.

All points to the fact that IPFS stack needs to offer a migration path, otherwise, why an HTTP gateway at all?

I think the contributor has made an exceptional effort in providing this feature, carefully integrating with the current state of art and minimizing the impact on the HTTP gateway functionality, respecting its "transitional" nature.

We'll see where other use cases end up,

This is a wise statement, for many of those use cases will need an IPFS native alternative.

@lidel
Copy link
Member

lidel commented Aug 19, 2022

Late feedback is ok, I've pinged folks on discussion forums for this very reason :)
There is a solid, real life problem raised by @fabricedesre: native ipfs:// in browsers like Brave.

Let's explore ramifications.

How _redirects leaks into browsers with native URIs

[..] redirects support like this to be able to be deployed correctly and then viewed / function correctly across many IPFS contexts

This is important.

Today, ipfs:// and ipns:// are backed by HTTP Gateway provided by Kubo, so _redirects will "just work".
But what happens when Brave decides to replace Kubo with built-in IPFS support? Or some other browser?

This is a real question with real ramifications: we want to have native ipfs:// in Brave Mobile next, and we can't bundle Kubo there. Based on initial discussions, most likely we will have block/car fetches from gateway in trustless mode, with UnixFS re-assembly and CID verification on the client.

Should Brave Mobile ignore _redirects and break websites, which work fine on Desktop version? I'd say that is a big no. User's expectation is to load the same website they see on gateway, with the same behaviors, but with additional integrity guarantees.

👉 This means _redirects, for pragmatic reasons, will be something that user agents focused on websites have to support on ipfs:// and ipns://, even if it got born as "HTTP interop thing".

Should we make any changes to this spec to account for native URIs?

I know additional bikeshed is frustrating, but we have some open questions to resolve now, and perhaps solve some long-standing problems around _redirects standard:

  • 👉 How to deal with native URIs?
    • I believe we should add a section to this spec. Clearly define behavior on native ipfs:// and ipns:// (one way or another)
      • Sounds like we have to support redirects in these contexts, does not seem like we can ignore _redirects there, but would like to hear arguments against.
  • 👉 Should we move _redirects to .well-known/.. URI?
    • pros
      • If ever is a time to do it, it is now.
      • UX is better:
        • it has a nice property of being clearly separate from the website content, makes it less awkward in ipfs:// and ipns:// URI contexts
        • removes expectation that lookup for _redirects resolution happens on every level of deeper directory trees (our spec is limited to root only, which may be confusing for people who don't believe in reading docs)
        • creates clear space for other things, e.g. - .well-known/ipfs/ could be used for IPFS-specific things, e.g. a custom directory listing template .well-known/ipfs/dir-index.html
    • cons
      • redirect paths are defined in a file which is one or two directories away from the content root
    • If we do this, when the specification is finalized I'd like to register it at IANA
      • Should it be IPFS-specific .well-known/ipfs/redirects, or a generic .well-known/redirects that other projects could adopt as an alternative location of _redirects files?
        • My preference would be to go with generic .well-known/redirects location, maximizing the utility of the introduced convention, and allowing for .well-known/headers to be added in the future.

I'd be ok either way, ipns://example.com/_redirects or ipns://example.com/.well-known/redirects, the latter is a bit more explicit hint for user agents accessing the data, making it more clear for implementers how to keep the same behavior in both HTTP Gateway and native URI contexts.
Web browsers and Web gateways will leverage it, other agents are free to ignore it.

Would appreciate a sanity check on this approach.

@justindotpub
Copy link
Contributor Author

justindotpub commented Aug 19, 2022

Thinking aloud here, but I'm open to any needed changes, so this isn't push back per se.

Are many developers familiar with well-known URIs and the .well-known/ folder?
I would argue that most aren't, and exposing it to them feels a bit jarring.

Are there other projects outside the IPFS ecosystem that desire to use a well known, shared format for redirect specifications?
I'm not familiar with any, but perhaps this is a known need that truly has demand?
If there are, a downside to being part of this standard is that IPFS implementations and software would need to be updated to keep up with changes to this standard instigated outside the IPFS ecosystem.
Also, given the constraints of IPFS, there may be features that make perfect sense for other web servers that don't for IPFS (e.g. force redirects, which we currently do not support for performance reasons).

Is there value in simply consolidating config into a single directory?
As mentioned, it makes clear that the config content isn't to be served.
If so, would .ipfs/ be better and less jarring for developers? To me this feels more developer friendly and consistent with how config works in other projects (e.g. .github/).

Are many developers likely to assume they can configure redirects at any level of a project?
In my opinion, no (can't even think of a modern counter example), and docs are there for a reason if anyone is confused.
In short, I don't feel like this should weigh heavily in a decision.

@justindotpub
Copy link
Contributor Author

Also to be clear, I appreciate @fabricedesre commenting on the implications for serving web content over ipfs:// and ipns://. I don't have much input there, and those challenges feel broader than this spec, but I'm listening.

@rvalle
Copy link

rvalle commented Aug 19, 2022

I don't know if this will very useful to you, but I am going to relate how I would expect this to work when developing a DAPP front-end meant to be served via IPFS:

It is normally regarded as good practice, i.e. the community is trying to convert development stacks so that the deliverable always works as relative content (there is no URL prefix kown)

So, a DAPP front-end should work when "mounted" on a domain with a _dnslink, but should also work when accessed via CID directly, like:

https://my-dapp.com works
https://ipfs.io/ipfs/ROOT_CID works also
ipfs://ROOT_CID works

So that the application is not dependent no how is accessed in the IPFS-verse, which makes it feel more "decentralized".

I am not too familiar with .well-known but I associate it with metadata relative to the serving domain. A quick look at wikipedia states that the prefix "/well-known" is used. that would imply that there can be only one /well-known per domain.

In my opinion _redirect medatadata should be maintained by the same person that maintains the content itself, as they are the ones that know which redirects or rewrites are required. There should not be any assumption of where it is going to be placed: i.e. should work in a "relative" fashion, working regardless of the level at which the content is finally placed.

For example, I can think, imagine I want my application to serve maps/region/region_name content, but all of them are going to be served by a script called map_reagion.js in that folder. Then I could rewrite anything under maps/region/* into map_region.js living under the maps/region folder. map_region.js will use the information on the URL ifself to workout what exactly to render (i.e. zoom in a SVG, for example). Thanks to this setup, I can distribute links pointing to particular regions across the application of third parties. then:

https://my_dapp.com/maps/reagion/US would work thanks to _dnslink
https://ipfs.io/ipfs/ROOT_CID/maps/region/US would work by generic content gateway access
ipfs://ROOT_CID/maps/region/US would also work

Of course It would be possible to call it ./well-known instead of _redirects, but I believe /well-known is meant to be at the root, and that would be strange for the generic content gateway access as the resource would be https://ipfs.io/ipfs/CID/.well-known/ i guess.

The case of _redirects is specified as living in the "publishing folder", whatever that means, seems more like ROOT_CID. or?

@justindotpub
Copy link
Contributor Author

justindotpub commented Aug 19, 2022

Thanks for sharing that background info @rvalle.

As mentioned in the spec, for security reasons...

This functionality will only be evaluated for Subdomain or DNSLink Gateways, to ensure that redirect paths are relative to the root CID hosted at the specified domain name.

So the _redirects file is always at the root, and so would .well-known or .ipfs or whatever if we decided to switch.

It seems to me that making web apps securely accessible from any (possibly nested) IPFS path or from ipfs:// and ipns:// URLs is outside the scope of this work, though it certainly makes sense to discuss potential impact.

Justin Johnson and others added 19 commits September 16, 2022 14:10
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
- Add max file size
- Update test fixture CID, due to new tests for max file size
lidel pushed a commit to ipfs/kubo that referenced this pull request Sep 23, 2022
cross referencing from DNSLink and  Subdomain specs,
making is clear it is about _redirects file.
@lidel
Copy link
Member

lidel commented Sep 23, 2022

Thank you @justincjohnson for driving this IPIP, and everyone for feedback ❤️

Quick summary:

  • This is the first real IPIP that passed improvement process, it has both spec and working implementation.
  • We've rattified this spec last week, during the bi-weekly IPFS Implementers call.
  • The first Implementation just landed in Kubo (feat(gateway): _redirects file support kubo#8890), and is scheduled to ship next week in Kubo 0.16 RC, allowing us to start using this in web contexts.
  • We've updated test fixtures to include learning from tests and fuzzing done in Kubo.
  • If anyone is interested, we are also preparing end-user docs in _redirects file support doc ipfs-docs#1275

Thanks again, and see you on future IPIPs! 🚢

@lidel lidel merged commit 16cc443 into ipfs:main Sep 23, 2022
@justindotpub justindotpub deleted the feat/gateway-redirects branch September 23, 2022 18:58
hacdias pushed a commit to ipfs/boxo that referenced this pull request Jan 27, 2023
hannahhoward pushed a commit to filecoin-project/kubo-api-client that referenced this pull request Jun 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Ratified
Development

Successfully merging this pull request may close these issues.

8 participants