Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipfs object patch does not work with HAMT-sharded directories #7421

Closed
lidel opened this issue Jun 5, 2020 · 9 comments · Fixed by #8414
Closed

ipfs object patch does not work with HAMT-sharded directories #7421

lidel opened this issue Jun 5, 2020 · 9 comments · Fixed by #8414
Assignees
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization P0 Critical: Tackled by core team ASAP topic/UnixFS Topic UnixFS

Comments

@lidel
Copy link
Member

lidel commented Jun 5, 2020

Version information:

go-ipfs version: 0.6.0-rc1-aa169520c
Repo version: 9
System version: amd64/linux
Golang version: go1.14.3

Description:

Problem 1: object patch does not work with sharded dirs

  • ipfs object patch add-link is not HAMT-aware
  • patched directory was hamt-sharded-directory
  • produced directory is also marked as hamt-sharded-directory
  • the file added to HAMT-sharded dir does not follow HAMT-sharding
  • the CLI output truncates filename and makes things even more confusing

Details / Repro

Given this specific HAMT directory (note stript.js being already present):

$ ipfs ls bafybeibvfzwz5ghenwdzeqrlx7sgzbo3zcdaomscc42ina4mv3ritby7qa
bafkreifpfoilyfbc4lbvkzupje2e7xgti2t5zcpg6ryekzvdyqpmzmovpa 2808 index.html
bafybeiavgkssvw57furrh3tzld2bo6ryjbpki4qpyyvishomwtj3gkxut4 -    ipfs-css/
bafkreigtdgsgv2f3bkhsmxvku3bpnnqzubcxeupf7fff5f7l7tlm2v237a 3232 ipfs-logo.svg
bafkreidgfrdvq7lcifj23cxkr2sxumpakzyhai7qkbgiszktqgr5hnktlm 3193 script.js
bafybeibuo42uthah7qfhcqw4loxz2g2sefay7y3sjrbrfiqtueq3dd5s6y -    tachyons/

Really confusing thing happens when I attempt to ipfs object patch add-link above directory to add a new CID (bafkreidfofyuw4ywqwjjdgxyhjictt7f2iufxto2v5n4rnm4atp57ajjne) using already existing name script.js, and then inspect result via CLI:

$ ipfs object patch add-link bafybeibvfzwz5ghenwdzeqrlx7sgzbo3zcdaomscc42ina4mv3ritby7qa script.js bafkreidfofyuw4ywqwjjdgxyhjictt7f2iufxto2v5n4rnm4atp57ajjne
bafybeigq3ctbwpa6ydvhszc5qumd6pp2ockyp7lmpyynvqywwgo4uu5ffy

$ ipfs ls bafybeigq3ctbwpa6ydvhszc5qumd6pp2ockyp7lmpyynvqywwgo4uu5ffy
bafkreifpfoilyfbc4lbvkzupje2e7xgti2t5zcpg6ryekzvdyqpmzmovpa 2808 index.html
bafybeiavgkssvw57furrh3tzld2bo6ryjbpki4qpyyvishomwtj3gkxut4 -    ipfs-css/
bafkreigtdgsgv2f3bkhsmxvku3bpnnqzubcxeupf7fff5f7l7tlm2v237a 3232 ipfs-logo.svg
bafkreidfofyuw4ywqwjjdgxyhjictt7f2iufxto2v5n4rnm4atp57ajjne 3182 ript.js
bafkreidgfrdvq7lcifj23cxkr2sxumpakzyhai7qkbgiszktqgr5hnktlm 3193 script.js
bafybeibuo42uthah7qfhcqw4loxz2g2sefay7y3sjrbrfiqtueq3dd5s6y -    tachyons/

Namely, unexpected part is ript.js:

bafkreidfofyuw4ywqwjjdgxyhjictt7f2iufxto2v5n4rnm4atp57ajjne 3182 ript.js
bafkreidgfrdvq7lcifj23cxkr2sxumpakzyhai7qkbgiszktqgr5hnktlm 3193 script.js

Problem 2: object patch does not produce sharded dirs

(This requires Problem 1 to be fixed first)

Patching a regular dir and adding new items to it means a block can grow indefinitely.

This is a separate problem, because while everything "works fine locally", the block grows to the point (>1MB) where it can't be exchanged over bitswap, effectively making it unavailable to other peers.

The fix here is to detect when a newly produced dir is over the limit of the maximum block that can be sent over bitswap sesssion, and re-create it using HAMT sharding, and return a CID of hamt-sharded-directory.

@lidel lidel added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Jun 5, 2020
@lidel lidel changed the title ipfs object patch add-link breaks HAMT-sharded directory ipfs object patch does not work with HAMT-sharded directories Jun 5, 2020
@lidel lidel added the topic/UnixFS Topic UnixFS label May 2, 2021
@lidel lidel added the P0 Critical: Tackled by core team ASAP label May 2, 2021
@BigLep
Copy link
Contributor

BigLep commented May 3, 2021

Per 2021-05-03 verbal, many ipfs commands won't work if you don't observe these limits.

Idea is to print errors and potentially allow a force flag?

We really want #7022 to get in place.

@lidel
Copy link
Member Author

lidel commented Sep 1, 2021

@schomatis I believe we want ipfs object patch add-link|rm-link|... to do automatic sharding/unsharding just like #8114 does for MFS (ipfs files).

Just raising this so we don't miss it.
I did not look at go-ipfs sources, perhaps its already using the same unixfs lib and no additional work is needed?

@aschmahmann
Copy link
Contributor

@lidel can't we close this issue out since ipfs object was deprecated #7936? Also, IIRC ipfs object is defined on DagPB rather than UnixFS data, which is largely why it was deprecated in favor of ipfs dag and ipfs files, and so doing automatic sharding/unsharding doesn't seem like it really makes sense.

If the complaint is that ipfs files is somehow insufficient here then we can add a new issue to improve ipfs files to match the behavior that ipfs object patch would have supported.

@lidel
Copy link
Member Author

lidel commented Sep 1, 2021

Agree that investing significant dev time into deprecated ipfs object is not something we want,
so if we can't do it with minimal effort, then ok to reduce scope here.

Still, there is a footgun gap that I believe we should address somehow:

  • ipfs object patch add-link will silently create DAG that is too big to be transferred over bitswap

If we keep current behavior, people will be producing problematic blocks with the object API, and we will see content routing errors caused by this.

@aschmahmann thoughts on reducing scope here to just checking the size of block produced by ipfs object patch add-link and returning error when result gets too big?

  • Error: object API does not support HAMT-sharding. To create big directories, please use the files API (MFS)

@aschmahmann
Copy link
Contributor

Yep, erroring in that case sounds very reasonable to me.

@schomatis
Copy link
Contributor

schomatis commented Sep 6, 2021

My takeaway from the last couple of comments is to execute only on the error implementation.

(If after landing the sharding work we can incorporate that here with minimal effort will also try to do that but only as a secondary objective.)

@BigLep
Copy link
Contributor

BigLep commented Sep 7, 2021 via email

@lidel
Copy link
Member Author

lidel commented Sep 28, 2021

I've merged the bare minimum implemented in #8414

This way folks who use ipfs object patch add-link will get feedback about potential problem caused by big, non-transferrable blocks and can make an informed decision if they want to switch to MFS or continue using the deprecated object patch API by passing --allow-big-block=true

Given object API is deprecated and we want to invest time into dag and IPLD tooling instead, I don't expect stewards team to do any more work here, but if someone wants to introduce similar checks for other object patch operations, PR welcome.

@schomatis
Copy link
Contributor

@lidel Small nit: #8414 applies the same check for all 4 ipfs object patch commands (--allow-big-block is declared a patch flag and not a specific add-link one), but the sharness test only checks add-link. I can try extending that test to the other commands a well if you like (it will mostly be a copy-paste operation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization P0 Critical: Tackled by core team ASAP topic/UnixFS Topic UnixFS
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants