Garbage collect inactive hotspots from the h3dex #1173
@Vagabond this looks good and should be straightforward to plug into the POC branch, although the GC side will need further consideration, as GWs will not be challenging at that point. I have not really considered the performance, but given that you are limiting the targeting to a subset of hexes it should be much better. Your comments above look stale; can you update them with any additional considerations you think still apply? I can integrate this into the POC branch and have a go at any outstanding work items if you would like.
Stuff outstanding:
As you can see, almost all of the absorb time in assert_location is spent updating the
I think we could also GC when computing hip17 information?
I like this approach (and I still need to go over Andy's PR again), but I have concerns about how often we rework the targeting index. I think that GCing every block is probably OK, but we get several asserts or adds per block now, and this approach seems slower than the existing one. The timings from chat seem to be on the order of a second per recalculation, vs ~40ms for the existing approach. Given the number of receipts, maybe it nets out better, but I wonder if we couldn't move it to a periodic recalculation, at the cost of some spots not being immediately challengeable.
As noted in Discord, the recalculation only happens when a hex goes from populated to unpopulated, or vice versa, so at res5 this is pretty rare.
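A minimal sketch of that rule, assuming a plain in-memory map of `Hex => [Hotspot]` rather than the real ledger storage. The module name and both function names are hypothetical and are not part of `blockchain_ledger_v1`; the only point is that a rebuild is flagged on a populated/unpopulated transition and nowhere else.

```erlang
-module(h3dex_transition_sketch).
-export([add/3, remove/3]).

%% Index is a map of Hex => [Hotspot] (an assumed, simplified layout).
%% Both functions return {NewIndex, NeedsRebuild}. NeedsRebuild is true
%% only when the hex switched between populated and unpopulated.

-spec add(term(), term(), map()) -> {map(), boolean()}.
add(Hex, Hotspot, Index) ->
    case maps:find(Hex, Index) of
        error ->
            %% unpopulated -> populated: the targeting lookup changes
            {Index#{Hex => [Hotspot]}, true};
        {ok, Hotspots} ->
            %% already populated: lookup is unchanged, duplicates collapse
            {Index#{Hex => lists:usort([Hotspot | Hotspots])}, false}
    end.

-spec remove(term(), term(), map()) -> {map(), boolean()}.
remove(Hex, Hotspot, Index) ->
    case maps:find(Hex, Index) of
        error ->
            %% nothing stored for this hex, nothing to do
            {Index, false};
        {ok, Hotspots} ->
            case lists:delete(Hotspot, Hotspots) of
                [] ->
                    %% populated -> unpopulated: the targeting lookup changes
                    {maps:remove(Hex, Index), true};
                Rest ->
                    {Index#{Hex => Rest}, false}
            end
    end.
```

With this shape, several asserts into an already populated hex within one block never trigger a recalculation, which is why the cost at res5 should be rare in practice.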
c9ffb7c to 907371e
Notes on aux ledger testing:
Now you can run a targeting run, comparing both modules with the same pubkey and the same random seed:
This outputs a list of 2-tuples holding the pubkey of the targeted hotspot: the first element is the current v3 targeting, the second is the v4 targeting against the aux ledger. You can also compare the times taken:
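As a rough sketch of this kind of side-by-side comparison (not the aux ledger commands themselves), the module below runs two targeting funs over the same seeds, pairs their results as 2-tuples, and times each side with `timer:tc/1`. The module name, `compare/3`, `agreement/1`, and the shape of the funs (seed in, pubkey out) are assumptions made for illustration; the real v3 and v4 modules also take a challenger pubkey and a ledger.

```erlang
-module(targeting_compare_sketch).
-export([compare/3, agreement/1]).

%% TargetV3 and TargetV4 are hypothetical stand-ins for the two targeting
%% implementations: each takes a seed and returns the targeted pubkey.
-spec compare(fun((term()) -> term()), fun((term()) -> term()), [term()]) ->
    {[{term(), term()}], integer(), integer()}.
compare(TargetV3, TargetV4, Seeds) ->
    %% timer:tc/1 returns {ElapsedMicroseconds, Result}
    {T3, R3} = timer:tc(fun() -> [TargetV3(S) || S <- Seeds] end),
    {T4, R4} = timer:tc(fun() -> [TargetV4(S) || S <- Seeds] end),
    %% first element is the v3 target, second is the v4 target
    {lists:zip(R3, R4), T3, T4}.

%% Number of runs where both implementations picked the same hotspot.
-spec agreement([{term(), term()}]) -> non_neg_integer().
agreement(Pairs) ->
    length([ok || {A, B} <- Pairs, A =:= B]).
```

Exact agreement between the two columns is not expected, since the two versions select differently; the pairs are more useful as input to a frequency count per hotspot when judging fairness.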
4cad127 to a8701aa
%% v4 targeting enabled, build the h3dex lookup
{ok, Res} = blockchain_ledger_v1:config(?poc_target_hex_parent_res, Ledger),
blockchain_ledger_v1:build_random_hex_targeting_lookup(Res, Ledger),
ok;
should we delete the old hexes stuff here?
there's a separate var that disables updating hexes that has a hook that deletes the hexes list
var_hook(_Var, _Value, _Ledger) ->
    ok.

-spec unset_hook(Var :: atom(), Ledger :: blockchain_ledger_v1:ledger()) -> ok.
unset_hook(?poc_targeting_version, Ledger) ->
    %% going back to the default, which is v3 so remove the h3dex lookup
    blockchain_ledger_v1:clean_random_hex_targeting_lookup(Ledger),
do we need a bootstrap hex here too?
In theory we'd enable targeting v4 and leave the hexes being updated for a bit before disabling hexes updating
I think that this is ready to go once we've got approval and some indication that fairness is the same as or better than with the old code.
Sampling to a file:
add a tuning knob for h3dex gc res
Problem to solve: we have two problems with targeting. First, it uses the hexes, which are giant monolithic data structures that are slow to update and are scaling poorly across 400k+ hotspots. Second, neither the hexes nor the h3dex ledger data takes inactive hotspots into account when giving you a raw count of hotspots in an area.

This PR starts work on garbage collecting inactive hotspots from the h3dex. Additional work is needed to rewrite targeting to use the h3dex; maybe @andymck has some work he's done here that we can reuse. Targeting using the h3dex is a bit more complicated to do performantly than I'd expected, and I think we need to explore the idea of dropping the requirement that targeting evaluates every possible res 5 hex (of which we have 14,000 populated out of a possible 2 million). Something that picked a random subset of populated res 5 hexes and then targeted across those might be better?
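To make the two ideas concrete, here is a small sketch under stated assumptions: the index is an in-memory map of `Hex => [{Hotspot, LastActiveHeight}]`, not the real h3dex column family, and the module name, `gc_inactive/3`, and `target/3` are hypothetical. `gc_inactive/3` drops hotspots that have been inactive for more than a given number of blocks and removes hexes that end up empty. `target/3` picks a seeded random subset of populated hexes, chooses one weighted by its hotspot count, then chooses a hotspot inside it uniformly. The weighting is an illustrative choice, not the scheme this PR settles on.

```erlang
-module(h3dex_subset_sketch).
-export([gc_inactive/3, target/3]).

%% Index: #{Hex => [{Hotspot, LastActiveHeight}]} (assumed layout).
%% Keeps hotspots active within MaxInactive blocks of Height and drops
%% any hex that has no active hotspots left.
-spec gc_inactive(map(), integer(), integer()) -> map().
gc_inactive(Index, Height, MaxInactive) ->
    maps:fold(
        fun(Hex, Hotspots, Acc) ->
            case [HS || {_, Last} = HS <- Hotspots, Height - Last =< MaxInactive] of
                [] -> Acc;
                Active -> Acc#{Hex => Active}
            end
        end,
        #{},
        Index).

%% Assumes every hex in Index has at least one hotspot, which
%% gc_inactive/3 guarantees for its output.
-spec target(map(), pos_integer(), rand:state()) ->
    {ok, term(), rand:state()} | {error, empty}.
target(Index, SubsetSize, RandState0) when map_size(Index) > 0, SubsetSize > 0 ->
    %% Tag each hex with a random float and sort: a seeded shuffle.
    %% Keys are sorted first so the result depends only on the seed.
    {Tagged, RandState1} = lists:mapfoldl(
        fun(Hex, S) ->
            {R, S1} = rand:uniform_s(S),
            {{R, Hex}, S1}
        end,
        RandState0,
        lists:sort(maps:keys(Index))),
    Subset = [Hex || {_, Hex} <- lists:sublist(lists:sort(Tagged), SubsetSize)],
    %% Weight each hex in the subset by its hotspot count.
    Weights = [{Hex, length(maps:get(Hex, Index))} || Hex <- Subset],
    Total = lists:sum([W || {_, W} <- Weights]),
    {N, RandState2} = rand:uniform_s(Total, RandState1),
    Chosen = pick_weighted(Weights, N),
    Hotspots = maps:get(Chosen, Index),
    {I, RandState3} = rand:uniform_s(length(Hotspots), RandState2),
    {Hotspot, _LastActive} = lists:nth(I, Hotspots),
    {ok, Hotspot, RandState3};
target(_Index, _SubsetSize, _RandState) ->
    {error, empty}.

%% Walk the weight list until the 1-based draw N falls inside a bucket.
pick_weighted([{Hex, W} | _], N) when N =< W -> Hex;
pick_weighted([{_, W} | Rest], N) -> pick_weighted(Rest, N - W).
```

Called with a state from `rand:seed_s/2`, the same seed always yields the same target, and the work per call scales with the number of populated hexes rather than with every possible res 5 hex.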