Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cleanChild() #57

Merged
merged 2 commits into from
Aug 17, 2020

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Aug 6, 2020

Closes: https://github.com/ipfs/go-hamt-ipld/issues/56

I've tried to push cleanChild() around in the various ways I was suspecting it might be flawed, but couldn't make it misbehave, so I'm going to call it good. Added more testing for CHAMP compaction along with a printHamt() that can show you the kinds of shapes you have.

(edit: note that the below examples are using identity hash with a bitWidth of 8 so each character is treated as an index at one level, so AAA1 and AAA2 collide for 3 levels, it's a useful way to force perverse structures without having to deal with the annoying randomness of a hash function).

So in this test we do things like:

‣ bafy2bzaceaabidljpjqxkb3bwz2kymapl6w6zazcaehtwh55d3solznpudfji:                                                                                     
  ⇶ [ AAAAAA11, AAAAAA12, AAAAAA21 ]                                                                                                                  

overflows when we add another colliding element to:

‣ bafy2bzacec2e7uxlmpn5bzfjfhrjbvgkgyfbi76k43epdijp626i3baaao436:                                                                                     
  ‣ bafy2bzacebgcvdzkq4glgp2kl65oyz65w2qb6ptbzjwxgbvnw2hwzs25bgeaw:                                                                                   
    ‣ bafy2bzacect7ahd4hdy24w7jfrttzgx5mh264lzobxdq5dyisqeelol6dia5o:                                                                                 
      ‣ bafy2bzacebxdmdcvhpnpsiia632eiov45yynuijl3faw7zzfzbupg4tombhnu:                                                                               
        ‣ bafy2bzacecrptawydwevr4rt3w4mbvuvh6wteh2w3kplwohkzu64zip5lslc2:                                                                             
          ‣ bafy2bzacedxu2dpomrsnqgj45pbkgyxrc4nujwfgjrwlpm3nkjvfomhkmqang:                                                                           
            ‣ bafy2bzaceadtlk6fe4pishrnyakhlz32iiakebraxudz34r4o5um3sgof3azs:                                                                         
              ⇶ [ AAAAAA11, AAAAAA12 ]                                                                                                                
              ⇶ [ AAAAAA21, AAAAAA22 ]                                                                                                                

and then back again.

It also tests a mid-way state where collisions happen at half that tree height with a collection of elements that collide in another way:

‣ bafy2bzaceacaykqvidugqydd2lskm7ajgdionqzgh22odxcmtfo6jkj6legqe:
  ‣ bafy2bzacear4ozwsojhwrjc3yroecudxdchelx74k2h6xlgzwtjsrpgo7sz3o:
    ‣ bafy2bzacec4owi2jruh53fve4vn5tyzg4yw7lfmrsuvshdrlsk4rdaa3nnbga:
      ‣ bafy2bzacedvt2qjfegib2k7bzfj4oxe4nxy46amtfwmybhhqjv4vs56xlv6gq:
        ‣ bafy2bzacecknjra3gxqdsaydwjwhgkh3pvun4kphn3xgpm6ek36mg3kzf3zrc:
          ⇶ [ AAA11AA ]
          ⇶ [ AAA12AA ]
          ⇶ [ AAA13AA ]
          ⇶ [ AAA14AA ]
        ‣ bafy2bzacecrptawydwevr4rt3w4mbvuvh6wteh2w3kplwohkzu64zip5lslc2:
          ‣ bafy2bzacedxu2dpomrsnqgj45pbkgyxrc4nujwfgjrwlpm3nkjvfomhkmqang:
            ‣ bafy2bzaceadtlk6fe4pishrnyakhlz32iiakebraxudz34r4o5um3sgof3azs:
              ⇶ [ AAAAAA11, AAAAAA12 ]
              ⇶ [ AAAAAA21, AAAAAA22 ]

and rolls that back out to see if it behaves properly. And it does.

I've refactored cleanChild() to a state that I think has more clarity about the various states it can be in and reasons it has to take action or not. I think this more closely matches the CHAMP semantics if you want to read the paper and see those semantics in this code. Maybe that's subjective, but I think I've reduced the number of branches here which should in itself clear up some confusion.

@ianopolous
Copy link
Contributor

I would recommend fuzz testing this as well (generate random trees, add and remove a mapping and check the root cid hasn't changed). We found that useful after @rvagg pointed out the same problem in our champ implementation in Peergos.

@rvagg rvagg force-pushed the rvagg/cleanChild-refactor branch from 1308362 to 01a1eaf Compare August 6, 2020 11:03
hamt.go Show resolved Hide resolved
Copy link
Contributor

@warpfork warpfork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaving a couple of blobs of notes that I made as I both reviewed this and wandered around the codebase at large -- so if they're not topical to you, just mark 'em resolved. :)

Love the very explicit way the TestFillAndCollapse function steps through things. It talks its way through a lot of the scenarios I felt I was still looking for as I took stock of the other existing tests.

hamt.go Show resolved Hide resolved
hamt_test.go Outdated Show resolved Hide resolved
@rvagg rvagg force-pushed the rvagg/cleanChild-refactor branch 2 times, most recently from e96eff7 to 853bd9c Compare August 8, 2020 03:56
@rvagg rvagg force-pushed the rvagg/cleanChild-refactor branch from 853bd9c to 18c8cac Compare August 8, 2020 03:56
@rvagg
Copy link
Member Author

rvagg commented Aug 8, 2020

rebased to master and autosquashed

Copy link
Contributor

@warpfork warpfork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why I balked on giving this a thumbs-up earlier, but looking at it more, the tests again look pretty comprehensive, so it seems likely that this is pretty solid.

@Stebalien Stebalien merged commit be2d564 into filecoin-project:master Aug 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix cleanChild() algorithm
4 participants