Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add explicit support for subdomain gateways #439

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

2color
Copy link
Member

@2color 2color commented Feb 20, 2024

Title

The main goal of this PR is to explicitly support subdomain gateways to avoid getting redirects like we currently do from https://4everland.io which no longer supports path gateways.

Since the TrustlessGatewayBlockBrokerInit type has changed this is a breaking change

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

@2color 2color requested a review from a team as a code owner February 20, 2024 12:24
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loose idea: fetch supports disabling redirect via redirect: 'manual', I wonder if we could avoid hardcoding subdomain status entirely by doing a quick test somewhere in constructor to detect which of implicit or user-configured gateways redirects to subdomain (before using any of them):

const response = await fetch('https://ipfs.io/ipfs/bafkqaaa?format=raw', { redirect: 'manual' })
return response.redirected || response.type == "opaqueredirect" // was it a redirect?

The benefit of doing this is that this will avoid redirect for all other third-party subdomain gateway user may pass, and will also correctly handle mid-life changes like the 4everland.io one

@2color
Copy link
Member Author

2color commented Feb 21, 2024

@lidel I'm not thrilled about making HTTP calls to the gateways in the constructor just to test for redirection because it introduces a side-effect. But I can see the value in it. Especially for user passed gateways.

The other problem is that you have to pay a runtime cost of making an HTTP call every time it's instantiated, rather than configuring it correctly once.

When you say "avoid hardcoding subdomain status", do you mean maintain the logic differentiating between subdomain and path gateways but autodetect in the code by making an HTTP request instead of passing it through?

@achingbrain
Copy link
Member

Unless I'm missing something, looking at the responses from http delegated routers for get provs, this check might be something we need to do?

The peer schema for providers includes protocols like "transport-ipfs-gateway-http" but it doesn't tell you if it's a subdomain or a path gateway.

@2color
Copy link
Member Author

2color commented Feb 21, 2024

Unless I'm missing something, looking at the responses from http delegated routers for get provs, this check might be something we need to do?

I think it's even more complex than that either way and I'm not sure if it's in the scope of this PR. For example, https://delegated-ipfs.dev/routing/v1/providers/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4 returns two of these:

[     {
      "Addrs": [
        "/ip4/212.6.53.91/tcp/80/http"
      ],
      "ID": "12D3KooWHEzPJNmo4shWendFFrxDNttYf8DW4eLC7M2JzuXHC1hE",
      "Metadata": "oBIA",
      "Protocol": "transport-ipfs-gateway-http",
      "Schema": "unknown"
    }, 
    {
      "Addrs": [
        "/dns4/dag.w3s.link/tcp/443/https"
      ],
      "ID": "QmUA9D3H7HeCYsirB3KmPSvZh3dNXMZas6Lwgr4fv1HTTp",
      "Metadata": "oBIA",
      "Protocol": "transport-ipfs-gateway-http",
      "Schema": "unknown"
    }]

The first one isn't helpful because there's no TLS cert, but the second one isnt' really helpful either because it only supports cars:

curl -i -H "Accept: application/vnd.ipld.raw"  "https://dag.w3s.link/ipfs/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4
HTTP/2 406
date: Wed, 21 Feb 2024 17:34:46 GMT
content-type: text/plain;charset=UTF-8
content-length: 14
server: cloudflare
cf-ray: 8590bdf19a9344fe-TXL

not acceptable⏎
curl -H "Accept: application/vnd.ipld.car" -i  "https://dag.w3s.link/ipfs/bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4"
HTTP/2 200
date: Wed, 21 Feb 2024 17:35:40 GMT
content-type: application/vnd.ipld.car; version=1; order=undefined; dups=y
cf-ray: 8590bf3fbef84528-TXL
accept-ranges: none
access-control-allow-origin: *
cache-control: public, max-age=29030400, immutable
content-disposition: attachment; filename="bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4.car"; filename*=UTF-8''bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4.car
etag: W/"bafybeicklkqcnlvtiscr2hzkubjwnwjinvskffn4xorqeduft3wq7vm5u4.car"
vary: Accept, Accept-Encoding
access-control-allow-methods: GET
access-control-expose-headers: Content-Length
x-content-type-options: nosniff
x-freeway-version: 2.15.0
server: cloudflare

@SgtPooki SgtPooki changed the title feat!: add explicit support for subdomain gateways feat: add explicit support for subdomain gateways Feb 29, 2024
@alanshaw
Copy link
Member

Should be working now for raw blocks, sorry about that:

curl -H "Accept: application/vnd.ipld.raw" https://dag.w3s.link/ipfs/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi --output block.bin
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  116k  100  116k    0     0   153k      0 --:--:-- --:--:-- --:--:--  154k

Free free to ping me for dag.w3s.link problems. We set that gateway up very quickly for Saturn and they've never requested raw blocks.

@2color
Copy link
Member Author

2color commented Feb 29, 2024

Following the discussion in Helia-WG, it was brought up whether we want to detect whether a gateway is a subdomain or path at runtime (either leveraging the preflight request and this spec ipfs/specs#425 or using heuristics. I don't believe we made a decision, right @lidel ?

@achingbrain
Copy link
Member

We did not make a decision as such, but we did talk a bit about the constraints.

  1. Turns out the absence of transport-ipfs-gateway-http is not enough to use as a reason to assume that a given peer is not running a path/subdomain gateway
  2. Some CIDs cannot be used with a subdomain gateway (e.g. very long or in case-sensitive encoding)
  3. Gateways may redirect you to a subdomain if it's supported and the CID can be used this way

Given the above, I think there's still value in allowing the user to specify if a gateway is 100% absolutely for sure a subdomain gateway, but if not we should start by having them as a path gateway, examine the CID we are requesting, if it can be used in a subdomain and we receive a redirect to a subdomain URL we can flip that gateway into subdomain mode for future requests.

If the CID cannot be used in a subdomain we should treat it as a path gateway for this request*.

The preflight request should help here but AFAIK it's not available to browser app code. Can we query the cache for it and detect it that way?


* = What if we try convert it to a subdomain-compatible CID? E.g. v0 base58btc to v1 base36? Test for length, etc.

Comment on lines +108 to +112
if (this.subdomainResolution) {
gwUrl.hostname = `${cid.toString(base32)}.ipfs.${gwUrl.hostname}`
} else {
gwUrl.pathname = `/ipfs/${cid.toString()}`
}
Copy link
Member

@achingbrain achingbrain Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check here that the CID can actually be encoded as base32 (v0 will throw, I think?), and that it's not too long to be used as a DNS label.

If it's v0 we could convert to v1 first to be able to encode it in a case insensitive way? We'll get the same block data back but I don't know if this will cause other problems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's v0 we could convert to v1 first to be able to encode it in a case insensitive way? We'll get the same block data back but I don't know if this will cause other problems.

Right. What other problems that it may cause did you have in mind?

@2color
Copy link
Member Author

2color commented Mar 5, 2024

Given the above, I think there's still value in allowing the user to specify if a gateway is 100% absolutely for sure a subdomain gateway, but if not we should start by having them as a path gateway, examine the CID we are requesting, if it can be used in a subdomain and we receive a redirect to a subdomain URL we can flip that gateway into subdomain mode for future requests

I suppose we'd do this check when a given GatewayBroker requests a block for the first time to avoid an unnecessary request and side effects when instantiating. Do you agree with the broad strokes of this approach?


Also dropping this link where we recently implemented the conversion to subdomain resolution.

@achingbrain
Copy link
Member

Do you agree with the broad strokes of this approach?

Yes, sounds good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants