Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HAMTv4 using generics instead of cbg.Deferred for perf gains #298

Closed
wants to merge 2 commits into from

Conversation

rvagg
Copy link
Member

@rvagg rvagg commented Aug 3, 2024

Ref: #285

Continuing on the optimisation journey with the belief that allocations and GC are taking up a significant portion of the time, I experimented with removing the cbg.Deferred use in go-hamt-ipld and using Go generics to replicate that functionality. When the HAMT doesn't know what value type it's got stored, it skips over the CBOR tokens until it finds a whole unit (single scalar token, whole nested object, whatever) and then slices the bytes for that value and passes it on in cbg.Deferred for the caller to decode themselves. This means double CBOR parsing and the allocation and GC of a byte slice that really shouldn't be there. So instead we can pass our serializable type down into the HAMT and it can just defer directly to it to decode its values. That work is in filecoin-project/go-hamt-ipld#122 and this branch depends on that branch as github.com/filecoin-project/go-hamt-ipld/v4, keeping /v3 intact.

Summary of changes so far:

  • builtin.ActorTree is now an interface which only has the ActorsV5 method variants and I've also made builtin.LegacyActorTree which has those plus the ActorsV4 method variants.
  • builtin.NewLegacyTree and builtin.LoadLegacyTree are the new names for the original form, these will give you what was always returned before.
  • GetStore() on both interfaces, and GetMap() on LegacyActorTree—I need the latter for lotus, which only knows about adt.Map for now but with this change adt.Map can become a legacy concern and we can just use hamt.Node[*ActorV5] to get basically the same thing.
  • RunMigration only takes an ActorTree, it doesn't need the v4 actors; it also uses builtin.NewTree to make the new v4 hamt with generics variety
  • v14 migration uses builtin.LoadTree to get the new v4 hamt with generics to pass to RunMigration; the other migrations still use LoadLegacyTree.

So the impact of this is that when running the v14 migration we read and write the actors HAMT with the new v4 HAMT code that uses generics instead of cbg.Deferred.


Results

lotus-shed migrate-state 23 bafy2bzacecnpvunvyytfzmdofrdwk2jr5sf4cuzit6o6uctrzqdltqwzbwwmk for both of them:

Current lotus & go-state-types

migration height 4145482
old cid bafy2bzacedpcjm7wgoq6ft3x7jdjtjdmbaqil2zjd2pa6exl2mwpv5opik4s2
new cid bafy2bzacebpjrptlb44oaj4vyaaa5th3lvqwv5cogdfd6agzvjqr223grwhli
completed round actual (without cache), took 11.323758969s
completed round actual (with cache), took 9.755098713s

This branch and HAMT v4

migration height 4145482
old cid bafy2bzacedpcjm7wgoq6ft3x7jdjtjdmbaqil2zjd2pa6exl2mwpv5opik4s2
new cid bafy2bzacebpjrptlb44oaj4vyaaa5th3lvqwv5cogdfd6agzvjqr223grwhli
completed round actual (without cache), took 10.133880646s
completed round actual (with cache), took 8.987781194s

So we're about 10.5% faster without cache and nearly 8% faster with cache.

Flame graphs

Before:
flame

After:
flame2

You can see in there:

  • ForEach on the actors tree goes from 30.9% to 28.90% (most of the action is captured under here since it nests the Set too
  • Set is 11.25% vs 6.22% (this is dramatic because we get to defer serialization whereas previously we had to serialize the value into a slice and hold onto it till a flush)
  • Flush goes up from 4.97% to 6.51% (this is where the value serialization is moved to, but it's much cheaper because we go directly to the io.Writer rather than a slice allocation [+GC], hold, then write)

Profiles

(click to see full profiles)

Before:
migrate-state cpuprofile

After:
migrate-state cpuprofile2

These are some of the bits I was most interested in:

Screenshot 2024-08-03 at 7 14 16 PM vs Screenshot 2024-08-03 at 7 14 29 PM
Screenshot 2024-08-03 at 7 15 54 PM vs Screenshot 2024-08-03 at 7 16 08 PM

(GC improvements not as dramatic as I hoped, but still there)

There's probably a few more tweaks that could be done to make this nicer. There's some internal mutation code in the HAMT that I think should be changed now with the generics that may speed up the path through the code for inserts and deletes.

@rvagg rvagg changed the title Rvagg/actor tree HAMTv4 using generics instead of cbg.Deferred for perf gains Aug 3, 2024
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 4.46097% with 257 lines in your changes missing coverage. Please review.

Project coverage is 3.59%. Comparing base (41f131b) to head (3308f25).

Files Patch % Lines
builtin/actor_tree.go 8.88% 82 Missing ⚠️
builtin/v10/check.go 0.00% 23 Missing ⚠️
builtin/v11/check.go 0.00% 23 Missing ⚠️
builtin/v12/check.go 0.00% 23 Missing ⚠️
builtin/v13/check.go 0.00% 23 Missing ⚠️
builtin/v14/check.go 0.00% 23 Missing ⚠️
builtin/v9/check.go 0.00% 22 Missing ⚠️
builtin/v8/check.go 0.00% 20 Missing ⚠️
builtin/v10/init/invariants.go 0.00% 2 Missing ⚠️
builtin/v10/migration/top.go 0.00% 2 Missing ⚠️
... and 9 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #298      +/-   ##
=========================================
+ Coverage    3.57%   3.59%   +0.01%     
=========================================
  Files         520     520              
  Lines       96959   97037      +78     
=========================================
+ Hits         3468    3484      +16     
- Misses      92140   92197      +57     
- Partials     1351    1356       +5     
Files Coverage Δ
builtin/v9/migration/test/util.go 83.33% <100.00%> (ø)
builtin/v9/migration/top.go 59.72% <100.00%> (ø)
builtin/v11/migration/top.go 0.00% <0.00%> (ø)
builtin/v12/migration/top.go 0.00% <0.00%> (ø)
builtin/v13/migration/top.go 0.00% <0.00%> (ø)
migration/runner.go 0.00% <0.00%> (ø)
builtin/v10/init/invariants.go 0.00% <0.00%> (ø)
builtin/v10/migration/top.go 0.00% <0.00%> (ø)
builtin/v11/init/invariants.go 0.00% <0.00%> (ø)
builtin/v12/init/invariants.go 0.00% <0.00%> (ø)
... and 11 more

... and 1 file with indirect coverage changes

@rvagg
Copy link
Member Author

rvagg commented Oct 8, 2024

Bailing on this for now, as per comment @ whyrusleeping/cbor-gen#103 (comment)

@rvagg rvagg closed this Oct 8, 2024
@rvagg rvagg deleted the rvagg/actor-tree branch October 8, 2024 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants