Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support any IPLD decoder #587

Merged
merged 1 commit into from
Oct 20, 2021
Merged

Conversation

alanshaw
Copy link
Contributor

Not sure if this is a good idea...now that I've opened the PR I'm thinking that maybe we don't want to allow arbitrarily encoded NFTs.

Sister PR to web3-storage/web3.storage#365

@alanshaw alanshaw requested review from Gozala and hugomrdias October 12, 2021 13:46
@cloudflare-workers-and-pages
Copy link

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: d15fdef
Status: ✅  Deploy successful!
Preview URL: https://66a5662a.nft-storage.pages.dev

View logs

@Gozala
Copy link
Contributor

Gozala commented Oct 12, 2021

The fact that you need to supply codecs along with the car seems like a leaky abstraction. I am not yet sure why carbites requires codecs to do the car splitting but seems like it should not. Seems like the whole point of the car format is to import / export DAGs despite codec configuration on nodes. If we can not achieve that without providing codecs maybe we should consider alternative, simpler upload chunking strategy that would be free of this limitation.

@alanshaw
Copy link
Contributor Author

I am not yet sure why carbites requires codecs to do the car splitting but seems like it should not.

Possibly. Just for your knowledge:

The strategy we use divides the tree into sub-graphs. We need codecs to walk the tree - we cannot find links in the current node without decoding it.

I explored multiple strategies, not all of them require decoders: https://github.com/nftstorage/carbites#usage

This presentation may help explain the differences: https://docs.google.com/presentation/d/1DqyJimvgBc_LNHqURVjERpqgOPS4g_51pqKQ07Vb-BY

We used the treewalk strategy because it allows us to associate multiple CAR chunks together by root CID. This means that there's no special handling for uploads that are split or not split. It also ensures that all blocks in the CAR are referenced by the root. It works out of the box for 99% of IPFS data and if you're in the 1% that's encoding data with a non-default encoder then you likely have the decoder to hand (I realise it's not great DX, but this is about enablement). The treewalk strategy also has good implications for garbage collection during a large upload (you won't lose any unreachable blocks) albeit unused.

Please, feedback is great but if it is critical then I'd prefer it acompanied with suggestions that are arguably better than what we have now. I'm also not clear if you're against this PR or not given the current state of the client?

@Gozala
Copy link
Contributor

Gozala commented Oct 15, 2021

Please, feedback is great but if it is critical then I'd prefer it acompanied with suggestions that are arguably better than what we have now. I'm also not clear if you're against this PR or not given the current state of the client?

I apologize for the unactionable feedback (it is the worst). I just address description in the pull, started looking into carbites so I can provide more feedback but then got sidetracked.

@Gozala
Copy link
Contributor

Gozala commented Oct 15, 2021

@alanshaw my feedback here is going to be broader than this specific pull request. Changes themself are fine, but overall direction is wrong in my opinion that I'll try to make arguments for below. Ultimately I think it is your call to decide what direction to take this, and if you choose to stick with current this feedback is not opposing the changes proposed here.

I think it is a mistake to expose cloudfare limitations through our API. While it is tempting to overcome it through clever graph segmentation it is incidental complexity for the user. Furthermore proposed changes make it obvious that this is a leaky abstraction only in place to:

  1. Overcome cloudfare limits
  2. IPFS internal implementation details

    We used the treewalk strategy because it allows us to associate multiple CAR chunks together by root CID. This means that there's no special handling for uploads that are split or not split. It also ensures that all blocks in the CAR are referenced by the root.

While this abstraction is mostly hidden from the user for 99% use cases, it still:

  1. Does not help users using http API instead.
  2. Does not address 1% use cases today, but also prevents that use case from growing beyond 1% mark.

    I would argue that this is also why other codecs beside dag-pb and dag-cbor never got momentum in IPFS ecosystem either. Too great of an upfront cost for adoption.

  3. Fact that I have dag with uncommon coded does not imply that I have decoder written in JS. I could have one in rust, or I might be a middleware that bridges things for other system.

what instead ?

I think it would be much better to overcome cloudfare limitations by using a solution that does not require ability of traverse a graph. One such solution could be simply chunking file into smaller pieces and uploading them one by one or in parallel. While it is less elegant from technical perspective and would require some state handling on the server it would not require client libraries and would be trivial to use across spectrum of languages. It will support 100% use cases despite how use case distribution may change over time.

It may even enable us to develop a middleware on some other host that could take care of even chunking for users.

If in the future we develop alternative to car format it would support that use case as well.

Copy link
Contributor

@Gozala Gozala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are fine, but I think overall direction is not. Provided details on that in the comment.

@Gozala
Copy link
Contributor

Gozala commented Oct 15, 2021

Wrote more concrete proposal here #620

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants