IPIP-293: Add /ipld Gateway Specs #293

RangerMauve · 2022-06-25T00:03:31Z

WIP: Drafting up specs for /ipld/ support in gateways

rvagg · 2022-06-27T06:08:49Z

nice

@lidel where do we stand with writable gateways? I don't believe we have writability enabled anywhere (yet) and there's been some debate about pulling the trigger on this. If we put POST and PATCH in the spec, what does that mean for implementations? Does that just become an optional part, and we could handle it in code by allowing the config to turn on writability?

RangerMauve · 2022-06-27T14:52:37Z

I'm actually gonna be working on the writable gateway spec this week based on the stuff we did in the Agregore IPFS Daemon. AgregoreWeb/agregore-ipfs-daemon#10

RangerMauve · 2022-06-27T14:53:53Z

I've been signaling writability support by returning different HTTP method names for HEAD requests (e.g. only show PUT if it's a writable gateway)
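For illustration, a client-side check for this kind of signaling might look like the sketch below. It assumes the method names arrive as a comma-separated `Allow` response header on the HEAD request; the comment above does not say which header carries them, so both the header and the helper name `advertisesWrites` are assumptions.

```javascript
// Hypothetical helper: decide whether a gateway advertises write support.
// Assumes the method names arrive as a comma-separated `Allow` header value.
function advertisesWrites (allowHeader) {
  if (!allowHeader) return false
  const methods = allowHeader
    .split(',')
    .map((method) => method.trim().toUpperCase())
    .filter(Boolean)
  // Any of these methods is taken to mean the gateway accepts writes.
  return ['PUT', 'POST', 'PATCH'].some((method) => methods.includes(method))
}

console.log(advertisesWrites('GET, HEAD, OPTIONS')) // false
console.log(advertisesWrites('GET, HEAD, PUT')) // true
```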

RangerMauve · 2022-07-01T21:58:16Z

Err, should I submit a new PR along with the required doc once this is more thoroughly fleshed out?

lidel · 2022-07-02T15:05:28Z

@RangerMauve no, this was a bug in GitHub (it closed PRs against my branch instead of rebasing them against main).
I'll fix it manually.

lidel · 2022-07-02T15:09:37Z

@rvagg The idea is to make writable gateways optional. We just want to specify the behavior if someone wants to implement it.

Potential users:

Public services such as Gateways or Pinning Services could decide to allow writes only if requests include Authorization: Bearer <token>
localhost gateway provided by things like Kubo, Brave, Agregore could have it enabled by default (tbd)
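A rough sketch of the first case, where writes require a bearer token and reads stay open. The function name, the set of write methods, and the `allowedTokens` set are all hypothetical; nothing here is taken from the spec draft.

```javascript
// Hypothetical policy check for an optional writable gateway.
// `allowedTokens` is an assumed Set of accepted bearer tokens.
const WRITE_METHODS = new Set(['POST', 'PUT', 'PATCH', 'DELETE'])

function isRequestAllowed (method, authorizationHeader, allowedTokens) {
  // Reads never need a token in this sketch.
  if (!WRITE_METHODS.has(method.toUpperCase())) return true
  // Writes need an `Authorization: Bearer <token>` header with a known token.
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader || '')
  return match !== null && allowedTokens.has(match[1])
}

const tokens = new Set(['example-token'])
console.log(isRequestAllowed('GET', undefined, tokens)) // true
console.log(isRequestAllowed('POST', undefined, tokens)) // false
console.log(isRequestAllowed('POST', 'Bearer example-token', tokens)) // true
```

A localhost gateway that enables writes by default would simply skip the token check.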

BigLep · 2022-07-19T22:31:30Z

2022-07-19 conversation: there are things from IPFS Thing that influence this:

Demonstrations by @aschmahmann
Discussions that were had with @mikeal

@RangerMauve wants to understand the WASM story more and how that would impact this.

softwareplumber · 2022-07-21T13:31:12Z

Couple of comments:-

I think the parameters block (enclosed by square braces) is essentially doing what Timbl's old Matrix URI idea on w3.org was trying to do. Matrix URIs are sparsely supported in some parts of the Web2 ecosystem (the Java JAX-RS API, I believe). Getting implementers of web toolkits (express, etc) to provide native support for IPFS URIs may be easier if we can 'invoke Timbl' rather than attempting to convince them a priori of the rightness of our approach.

More practically, there is an awful lot of infrastructure out there (proxies, reverse proxies, kubernetes ingress controllers, etc) which depends on cruddy regular-expression based parsing of URIs. Re-using '&' as a separator is an admirable idea from the point of view of code re-use but it risks breaking any bad regex that is relying on '&' appearing first in a query string. Moreover using '[' as a meaningful separator in the URI will make any implementer of said cruddy regex parsers cry into their beer as they try to navigate thickets of escape characters.

For these reasons I'd suggest that ';' as a separator, per the Matrix URI spec, is preferable.

aschmahmann · 2022-07-21T17:01:09Z

@softwareplumber IIUC the Matrix URI puts ; only at the end of a path and not in the middle and as a result looks an awful lot like query parameters ?a=b&c=d.... Part of the idea here is that including the escaped path information in the middle allows the paths to be easier to understand.

For example: /ipld/bafyroot/[ADL = HAMT]/entry1/field2/[ADL = FBL] can render the bytes for a picture, video, etc. Where HAMT and FBL are the abstract data layouts described in https://ipld.io/specs/advanced-data-layouts/.

aschmahmann · 2022-07-21T17:01:32Z

Noting that there have been proposals to use other signaling mechanisms than out-of-band including it in the path. I'd recommend those interested take a look at https://ipld.io/docs/advanced-data-layouts/signalling/ for some background (including following through to the naming and dynamic loading sections if you're interested). While none of the opinions on that website are "law" they may provide some useful context in either forming your own opinions or understanding those of others. If folks have other useful resources I'd drop them here as well.

RangerMauve · 2022-07-21T17:08:33Z

Btw, I did a talk last week at the IPFS thing and here are the slides from it: https://blog.mauve.moe/slides/ipld-gateway/#1

You can press p to open up the speaker notes for some of the stuff I said (will link to the recording here once it is published)

softwareplumber · 2022-07-21T17:49:47Z

Well, the JAX-RS spec *definitely* provides for matrix params within the path. I admit it's not so clear from the original Timbl musing, but the Java world is shot through with examples of matrix params used in the middle of the path (e.g. https://www.logicbig.com/tutorials/java-ee-tutorial/jax-rs/jaxrs-matrix-param.html). So the example would become: /ipld/bafyroot;ADL=HAMT/entry1/field2;ADL=FBL
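To make the shape of that path concrete, here is a minimal sketch of splitting it into segments with their matrix parameters. `parseMatrixPath` is a hypothetical helper written for this example, not the parser of JAX-RS or of any IPLD library.

```javascript
// Split a path into segments, each with its matrix parameters.
// Illustrative only; not the parser from any existing library.
function parseMatrixPath (path) {
  return path
    .split('/')
    .filter((segment) => segment.length > 0)
    .map((segment) => {
      // Everything before the first ';' is the segment name.
      const [name, ...pairs] = segment.split(';')
      const params = {}
      for (const pair of pairs) {
        if (pair === '') continue // tolerate a trailing ';'
        const index = pair.indexOf('=')
        const key = index === -1 ? pair : pair.slice(0, index)
        const value = index === -1 ? '' : pair.slice(index + 1)
        params[decodeURIComponent(key)] = decodeURIComponent(value)
      }
      return { name: decodeURIComponent(name), params }
    })
}

const segments = parseMatrixPath('/ipld/bafyroot;ADL=HAMT/entry1/field2;ADL=FBL')
console.log(segments[1]) // { name: 'bafyroot', params: { ADL: 'HAMT' } }
console.log(segments[3]) // { name: 'field2', params: { ADL: 'FBL' } }
```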

RangerMauve · 2022-07-27T16:42:54Z

I'm personally all for reusing an existing standard. One thing I like about this JAX thing is that it disambiguates which "side" the metadata goes on. e.g. ipld://cid/[foo=bar]example/ and ipld://cid/example[foo=bar]/ could serialize to the same thing, and could also lead to confusing situations like ipld://cid/[foo=bar]example[foo=baz]/.

With this syntax, we know for sure that the extra data goes *after* the segment name.

Using semicolons to separate bits means that we can't just dump the segment into URLSearchParams, but I think that's easy enough to work around.

We can then say that just ; needs to be escaped rather than [ and ] in path segments.
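One possible workaround, sketched under the assumption that a literal `;` inside a name or value is always percent-encoded as `%3B`: swap the remaining separators for `&` and hand the result to `URLSearchParams`. Both helper names are made up for this example.

```javascript
// Read the parameters of one path segment by reusing URLSearchParams.
function segmentParams (segment) {
  const start = segment.indexOf(';')
  if (start === -1) return new URLSearchParams()
  // A literal ';' is assumed to arrive as %3B, so every remaining ';'
  // is a separator and can be swapped for '&'.
  return new URLSearchParams(segment.slice(start + 1).split(';').join('&'))
}

// Build a segment, percent-encoding ';' (and other reserved characters)
// in the name, keys, and values.
function serializeSegment (name, params) {
  const pairs = Object.entries(params).map(
    ([key, value]) => `;${encodeURIComponent(key)}=${encodeURIComponent(value)}`
  )
  return encodeURIComponent(name) + pairs.join('')
}

console.log(segmentParams('field2;ADL=FBL').get('ADL')) // FBL
console.log(serializeSegment('a;b', { note: 'x;y' })) // a%3Bb;note=x%3By
```

Note that `encodeURIComponent` escapes more than just `;`, which is stricter than the rule proposed above but keeps the round trip unambiguous.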

Also, the thing about proxies seems like a good call since those things have caused issues in the past.

I'm not particularly married to [ ] so I'm down to spec paths with this JAX syntax instead. I'll reference the Matrix URI spec here: https://www.w3.org/DesignIssues/MatrixURIs.html

(I'll do some tests with existing URL parsers to see if they complain about it)

softwareplumber · 2022-07-31T15:17:30Z

@RangerMauve that's awesome; even if the idea doesn't work out I'm really pleased it's being considered.

If I can make one other suggestion, for future-proofing it might be an idea to somehow 'namespace' keywords like 'ADL' (perhaps prefixing with '$') and, maybe, reserve some kind of wildcard character in the spec. I have a gut feeling that eventually we're going to want a path-like syntax to represent selectors (or something that replaces selectors) and providing upward compatibility in the Gateway URI spec so that Gateway URIs are a subset of Selector URIs would be a good thing.

For example, I'm thinking that in the fullness of time a path like <CID>/folder/*;owner=alice might represent all the descendants of the 'folder' node with an attribute 'owner' equal to 'alice'. The path after the CID is of course a human readable representation of a simple selector; if we ever want a gateway to have this functionality it would make sense to reserve '*' and ensure that our operators (such as ADL) can be distinguished from attribute names.

RangerMauve · 2022-08-01T20:38:45Z

Hmm. Extra keywords in the path seem interesting, I wonder if it'd be stepping over some of the use cases of selectors, however.

One of the things I was thinking would be important is that the result for these IPLD URLs / operations should be either a new IPLD data model node, or a URL pointing to such a node. Would using extra wildcards have it return a list node?

Might be good to talk about it on the call.

softwareplumber · 2022-08-01T22:07:25Z

Yes, there is a crossover with selector use cases, but that's fine. What I'm basically saying is that building space in the URI spec so that it could eventually handle some of those use cases without having to be re-written can only be a good thing. ( :-J And, in the meantime, this maybe gives us a way to write down simple selectors that doesn't involve mind-destroying numbers of curly braces )

I agree that the question of exactly what a path with wildcards in it would return is a vexed one. The simple answer ('a list of nodes') may be wrong. But I think that's a design bridge that could be crossed if anyone ever wants to implement the feature. What I'm suggesting is more leaving space (literal namespace) for someone to implement it if they want to.

What call? I'm just a newb.

RangerMauve · 2022-08-12T18:52:14Z

Recording from the discussion we had about this at the IPLD thing last month is up here: https://www.youtube.com/watch?v=_uXKIEmJh3g

RangerMauve · 2022-08-18T21:07:09Z

I got some initial ipld:// protocol support into js-ipfs-fetch@4.2.0 😁 So far it supports a basic GET/POST which can do conversion between codecs at the protocol handler layer. https://github.com/RangerMauve/js-ipfs-fetch/#await-fetchipldcidexample-method-get-headers-accept-applicationjson

I've also put together a JS library for parsing and serializing ipld:// URLs with the new matrix parameters syntax for adding extra signaling for path segments

https://github.com/RangerMauve/js-ipld-url

I'm also gonna release it in the Agregore Browser for desktop to make it a bit easier to mess around with.

I got POST more figured out, along with some uses for the ?format parameter.

Next up, I wanna look into sketching up what the schema parameter could mean for path segments.

I'll also update the gateway spec with these new changes as they come. 😁

RangerMauve · 2022-09-13T04:00:51Z

I've got some code going in JavaScript which supports IPLD Schemas in path segment parameters.

RangerMauve/js-ipld-url-resolve#1

I'm feeling pretty comfortable with this one where I've got schema CIDs within the parameters as well as which type to interpret a node as. It will also apply types to any fields that get traversed via linking.

I'll likely need more tests for nested structs that contain Links, but so far so good. 😁

BigLep · 2022-09-13T22:25:40Z

2022-09-13 IPLD triage conversation on next steps:

Add test fixtures
Adding jsdoc typescript hints
Move relevant libraries into the ipld github org

lidel

Thank you @RangerMauve, did a quick pass with initial feedback.

lidel · 2022-10-12T13:07:45Z

http-gateways/IPLD_GATEWAY.md

+The `body` of the request shall be parsed according to the `Content-Type` as IPLD data via standard encodings.
+`/localhost/` is used to support `POST ipld://localhost/` for uploading IPLD data to local nodes in web browsers that support it.
+
+The response will contain an `ipld://{cid}/` URL pointing at your data.


Spec should remove any ambiguity:

Contain it where? (A) plain text in the response body? (B) a Location header?

What will be content-type of the response? text/plain ?

This is something I'd like to clarify with @fabricedesre since we had a bit of a disagreement.

Right now the precedent within Kubo and Agregore's protocol handlers is that there will be a 201 response with a Location header containing the URL as well as an empty body.

Fabrice was into having a 200 response and the URL inside the response body, which is something I was originally doing in Agregore, but switched when we started extending the writable gateway functionality in Kubo.

Ideally we should settle on the best course of action here during Lisbon. 😅

Ideally, I'd like to use this to inform all the other protocol handlers too.
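Until one convention is settled on, a client could tolerate both. The sketch below is only that: `uploadedUrl` is a hypothetical helper, and `headers` is assumed to be anything with a `get(name)` method, such as a fetch `Headers` object.

```javascript
// Extract the ipld:// URL of newly uploaded data from an upload response.
// `headers` is anything with a get(name) method, e.g. a fetch Headers object.
function uploadedUrl (status, headers, bodyText) {
  // Convention 1: 201 Created, URL in the Location header, empty body.
  if (status === 201) {
    const location = headers.get('Location')
    if (location) return location
  }
  // Convention 2: 200 OK, URL as plain text in the response body.
  const text = (bodyText || '').trim()
  if (status === 200 && text.startsWith('ipld://')) return text
  throw new Error(`Unexpected upload response (status ${status})`)
}
```

With fetch, this would be called as `uploadedUrl(response.status, response.headers, await response.text())`.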

lidel · 2022-10-12T13:10:05Z

http-gateways/IPLD_GATEWAY.md

+The response will contain an `ipld://{cid}/` URL pointing at your data.
+
+<!--
+TODO: Only allow `/localhost/`? Get rid of `/localhost` from the spec if light clients with protocol handlers don't matter/


Do we have use cases where things other than localhost could be used in the future?
e.g. do we want to support POST to IPNS identifier?

I've been using POST to ipfs://localhost, or a PUT ipfs://cid/ as well as POST ipns://key to update CIDs, or PUT ipns://key in the Agregore IPFS Daemon Spec


lidel · 2022-10-12T13:17:18Z

http-gateways/IPLD_GATEWAY.md

+For `/ipld/{cid}/*` paths, the `Accept` header is used to indicate the encoding that should be used to return the data.
+This means that data initially encoded as `dag-json` will be transcoded to `dag-cbor` if the `application/vnd.ipld.dag-cbor` Accept header is used.
+
+- `application/json`: Interpret in the same way as `application/vnd.ipld.dag-json`.


What if data is a valid JSON (and not DAG-JSON) added to ipfs with json codec (and not dag-json)?
Parsing it as dag-json will error, even tho it is a valid JSON.

@hacdias and I discussed this edge case and ended up with requirement to check codec from CID, and if it is json, use generic JSON codec instead of dag-json.

Interesting. Do you have text written up somewhere that I can copy paste here?

@RangerMauve we have some wording here, but it may not be definitive:

specs/http-gateways/PATH_GATEWAY.md

Lines 184 to 187 in 67fab21

- [application/vnd.ipld.dag-json](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-json) – requests [IPLD Data Model](https://ipld.io/docs/data-model/) representation serialized into [DAG-JSON format](https://ipld.io/docs/codecs/known/dag-json/)

- [application/vnd.ipld.dag-cbor](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-cbor) – requests [IPLD Data Model](https://ipld.io/docs/data-model/) representation serialized into [DAG-CBOR format](https://ipld.io/docs/codecs/known/dag-cbor/)

- [application/json](https://www.iana.org/assignments/media-types/application/json) – same as `application/vnd.ipld.dag-json`, unless the CID's codec is JSON. Then, the raw JSON block can be returned

- [application/cbor](https://www.iana.org/assignments/media-types/application/cbor) – same as `application/vnd.ipld.dag-cbor`, unless the CID's codec is CBOR. Then, the raw CBOR block can be returned
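The rule in that wording can be sketched as a small lookup. The codec numbers come from the multicodec table; the function name and the `raw-json-block` / `raw-cbor-block` labels are invented for this example and are not part of either spec.

```javascript
// Multicodec codes for the codecs involved (from the multicodec table).
const CODEC = { cbor: 0x51, 'dag-cbor': 0x71, 'dag-json': 0x0129, json: 0x0200 }

// Pick the response encoding for a request, given the Accept media type
// and the codec code embedded in the requested CID.
function responseFormat (accept, cidCodec) {
  switch (accept) {
    case 'application/vnd.ipld.dag-json':
      return 'dag-json'
    case 'application/vnd.ipld.dag-cbor':
      return 'dag-cbor'
    case 'application/json':
      // Plain JSON blocks may not be valid DAG-JSON, so return them as-is.
      return cidCodec === CODEC.json ? 'raw-json-block' : 'dag-json'
    case 'application/cbor':
      // Same reasoning for plain CBOR blocks.
      return cidCodec === CODEC.cbor ? 'raw-cbor-block' : 'dag-cbor'
    default:
      return null // not one of the four formats listed above
  }
}

console.log(responseFormat('application/json', CODEC.json)) // raw-json-block
console.log(responseFormat('application/json', CODEC['dag-cbor'])) // dag-json
```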


lidel · 2022-10-12T13:32:00Z

http-gateways/IPLD_GATEWAY.md

+} representation tuple
+```
+
+The CID for the DMT of this schema is `bafyreibvheoym4avfsjfw63yhsymovm7o54ftcnxwxovqf5xxcbjddanze`


nit: was unable to inspect this via ipfs dag get --output-codec=dag-json bafyreibvheoym4avfsjfw63yhsymovm7o54ftcnxwxovqf5xxcbjddanze | jq

As a rule of thumb, CIDs used in IPIP should be publicly available and pinned (e.g. to https://estuary.tech and https://web3.storage, do not use Pinata as afaik it does not announce CIDs on DHT).

We will have automation for this, but for now it is up to IPIP author to handle.

Should I maybe include some CBOR files with the fixtures that are relevant to the spec?

lidel · 2022-10-12T13:35:23Z

http-gateways/IPLD_GATEWAY.md

+For example, given the following schema (note it is written in DSL form, but must be converted to the DMT in order to be refernced):
+
+```ipldschema
+type Example struct {


nit: I've read this section and tbh have no idea what is the value to end user – CBOR traversal and field resolution with extra steps so the output looks a certain way?

In ADL section we have good use case "ADL that's used to represent large maps" – we need similar real world example for schemas.

What would a schema be useful for irl? I feel the spec here needs better Example, so the value is obvious.

@rvagg @warpfork would you be able to comment on real world uses of IPLD Schema that would be relevant here?

One use case for schemas is to use the representation functionality to render data in more human/application readable formats from formats that are more size efficient.

e.g. some things might be using a listpairs representation which would look like an array of arrays by default. But with a schema you can transform the representation to be more human readable.

I'm gonna be doing stuff along this line for the Prolly Tree work where we'll be encoding tree nodes more efficiently, but having a way to put them through a schema before the application code starts working with them.
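As a hand-rolled illustration of the listpairs case (this is not an IPLD Schema implementation, and both function names are hypothetical), the transformation between the compact array-of-arrays form and a readable map is roughly:

```javascript
// Convert a listpairs-encoded map ([[key, value], ...]) into a plain object.
function fromListpairs (pairs) {
  const result = {}
  for (const pair of pairs) {
    if (!Array.isArray(pair) || pair.length !== 2) {
      throw new TypeError('Expected [key, value] pairs')
    }
    result[pair[0]] = pair[1]
  }
  return result
}

// And back again, for writing the compact form.
function toListpairs (object) {
  return Object.entries(object)
}

console.log(fromListpairs([['name', 'example'], ['size', 3]]))
// { name: 'example', size: 3 }
```

A schema-aware gateway would apply this kind of mapping based on the schema's representation clause rather than on hard-coded functions.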


Co-authored-by: Marcin Rataj <lidel@lidel.org>

RangerMauve · 2022-11-10T21:06:32Z

@darobin would also appreciate help on this spec since it's relevant to a bunch of IPLD stuff I'd like to surface. Probably lower priority than the writable gateway stuff.

BigLep · 2022-11-15T23:44:52Z

2022-11-15 IPLD Triage conversation: We're still a ways off on this. This likely wouldn't be a candidate for merge until 2023Q1.

@RangerMauve wants to:

spend more time on the resolver
spec out traversing into paths vs. to CIDs
generate more fixtures
spec out more of the URL spec with IPLD schemas
review the divergence of "patch" implementation across Go and JS

hannahhoward · 2022-11-22T23:11:40Z

@RangerMauve just curious what's the deal with trustlessness and multi block retrievals here?

It seems like all of the Accept formats are dag-json/dag-cbor, but that implies a single block response. So that means this is a trust based protocol I think? (since anything other than root cid is not verifiable)

RangerMauve · 2022-11-22T23:48:16Z

@hannahhoward Yes, this spec currently relies on the same trustful semantics as the IPFS gateway for loading content.

I think if somebody wants to have trustlessness, then downloading CARs and doing traversal / verification at the application level would be the way to go.

I could see there being some more complex API endpoints which could yield CARs with proofs of everything, but tbh I think it'd be overkill unless there's a specific use case folks had in mind.

One goal of specifying stuff in terms of URLs and methods is that this could be abstracted over where the "backend" could literally be a backend that runs this gateway, or a library could implement these things using a light IPFS node which just does bitswap with a trustless gateway and does the validation and traversal client-side, or it could be running alongside a full local node that does all the p2p bits as well.

RangerMauve · 2022-12-07T19:48:24Z

So, I've been messing with this some more.

Some updates: I've been thinking that instead of using querystring parameters for applying parameters to the root, we could store all that in the hostname. The main reason is that it keeps the parameters being applied to a node beside the node itself, and applying different ADLs/Schemas on something feels like it should modify the "origin" due to the added transformations. It should also make it easier to create relative URLs: with `example://whatever?something`, navigating to `/else` would yield `example://whatever/else` and lose the `?something`. When the parameters are in the hostname they will stay there during relative navigations.

This means that before a URL might have looked like

ipld://cid/path/here?schema=cidhere&type=Example

Now it'd look like

ipld://cid;schema=cidhere;type=Example/path/here
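The relative-navigation behaviour can be checked with the standard WHATWG `URL` class, which treats `ipld:` and `example:` as ordinary non-special schemes. The `/other/path` target is just a made-up destination for the example.

```javascript
// Query-string parameters are dropped by relative navigation...
const withQuery = new URL('/else', 'example://whatever?something')
console.log(withQuery.href) // example://whatever/else

// ...while parameters kept in the hostname survive it.
const base = 'ipld://cid;schema=cidhere;type=Example/path/here'
const withHost = new URL('/other/path', base)
console.log(withHost.hostname) // cid;schema=cidhere;type=Example
console.log(withHost.href) // ipld://cid;schema=cidhere;type=Example/other/path
```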

As well, I'm thinking of giving the ending / in URLs special meaning to account for traversing to a CID vs traversing into a CID. As encapsulated here: ipld/ipld#250

My proposal is to treat a lack of a trailing / as traversing to the CID without resolving it, while a trailing / means traversing into the CID.

So ipld://cid/foo would yield {"/": "bafywhatever"} but ipld://cid/foo/ would yield the thing that foo points to which is {"hello": "world"}.
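A toy resolver makes the proposed distinction concrete. It runs over an in-memory map with placeholder CIDs (`cid`, `bafywhatever`) and is only a sketch of the trailing-slash rule, not of the resolver in any library.

```javascript
// Toy block store: CID string -> decoded node. The CIDs are placeholders.
const blocks = new Map([
  ['cid', { foo: { '/': 'bafywhatever' } }],
  ['bafywhatever', { hello: 'world' }]
])

// A link is represented DAG-JSON style, as {"/": "<cid>"}.
const isLink = (node) =>
  node !== null && typeof node === 'object' && typeof node['/'] === 'string'

// Resolve ipld://root/seg (to the CID) or ipld://root/seg/ (into the CID).
function resolve (url) {
  const into = url.endsWith('/')
  const [root, ...segments] = url
    .replace(/^ipld:\/\//, '')
    .split('/')
    .filter(Boolean)
  let node = blocks.get(root)
  for (const segment of segments) {
    // Links in the middle of a path are always followed.
    if (isLink(node)) node = blocks.get(node['/'])
    node = node[segment]
  }
  // A trailing slash means the final link is followed as well.
  if (into && isLink(node)) node = blocks.get(node['/'])
  return node
}

console.log(resolve('ipld://cid/foo')) // { '/': 'bafywhatever' }
console.log(resolve('ipld://cid/foo/')) // { hello: 'world' }
```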

RangerMauve · 2023-04-12T02:11:44Z

Sadly I'm not working for the IPLD team anymore, but I recently added the new IPLD Patch stuff and advanced pathing into the latest version of Agregore.

Overall I'm really happy with the ergonomics of applying lenses and patching over them using a simple declarative (and deterministic!) interface.

https://github.com/AgregoreWeb/agregore-browser/releases/tag/v2.1.0

Add initial IPLD Gateway Specs outline

b9d46c4

RangerMauve mentioned this pull request Jun 25, 2022

IPLD support on Gateways ipfs/in-web-browsers#182

Open

aschmahmann mentioned this pull request Jun 28, 2022

gateway/dir-index-html: switch dir listing sizes to Tsize ipfs/kubo#9058

Closed

BigLep assigned RangerMauve Jun 28, 2022

RangerMauve mentioned this pull request Jun 30, 2022

Add talk on IPLD in gateways to data agony track ipfs-shipyard/ipfs-thing-2022#55

Merged

lidel deleted the branch ipfs:main July 1, 2022 21:55

lidel closed this Jul 1, 2022

lidel reopened this Jul 2, 2022

lidel changed the base branch from feat/gateway-specs to main July 2, 2022 15:05

lidel changed the title ~~Add IPLD Gateway Specs~~ IPIP: Add IPLD Gateway Specs Jul 2, 2022

RangerMauve mentioned this pull request Feb 17, 2023

IPLD gateway support n0-computer/beetle#229

Open

Merge branch 'main' into feat/gateway-specs-ipld

2488113

RangerMauve mentioned this pull request Sep 14, 2022

Standardize on having "parameters" for lenses ipld/ipld#242

Open

RangerMauve and others added 2 commits October 4, 2022 19:18

Add more details to spec

5cc59f2

Merge branch 'main' into feat/gateway-specs-ipld

3674381

RangerMauve mentioned this pull request Oct 11, 2022

IPLD Patch support? multiformats/js-multiformats#207

Closed

Merge branch 'main' into feat/gateway-specs-ipld

8e59a2b

lidel reviewed Oct 12, 2022


RangerMauve and others added 8 commits October 17, 2022 15:51

Update http-gateways/IPLD_GATEWAY.md

4c9e3ad

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

a471eb4

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

b7e48a3

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

c96abcd

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

0372363

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

5723b06

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Update http-gateways/IPLD_GATEWAY.md

447118c

Co-authored-by: Marcin Rataj <lidel@lidel.org>

Clarify ADL wording

c447f90

RangerMauve mentioned this pull request Oct 17, 2022

Spec writable protocol handlers #335

Open

hacdias mentioned this pull request Oct 20, 2022

feat(gateway): JSON and CBOR response formats (IPIP-328) ipfs/kubo#9335

Merged

7 tasks

lidel changed the title ~~IPIP: Add IPLD Gateway Specs~~ IPIP-293: Add /ipld Gateway Specs Oct 26, 2022

Merge branch 'main' into feat/gateway-specs-ipld

b140bdd

aschmahmann mentioned this pull request Dec 9, 2022

Propose Naam as naming system powered by IPNI ipni/specs#4

Open

lidel mentioned this pull request Jan 18, 2023

Create IPIP with Gateway spec for partial CAR exports #348

Closed

4 tasks

BigLep mentioned this pull request Jan 24, 2023

Specify pathing into a CID vs to a CID ipld/ipld#250

Open

aschmahmann mentioned this pull request Apr 21, 2023

IPIP-402: Partial CAR Support on Trustless Gateways #402

Merged
