Skip to content
This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Garbage collect inactive hotspots from the h3dex #1173

Merged
merged 11 commits into from
Mar 1, 2022
Merged

Conversation

Vagabond
Copy link
Contributor

Problem to solve: we have 2 problems with targeting. First that it uses the hexes which are giant monolithic datastructures that are slow to update and are scaling poorly across 400k+ hotspots. Secondly both the hexes and the h3dex ledger data does not take into account inactive hotspots when giving you a raw count of hotspots in an area.

This PR starts work on garbage collecting inactive hotspots from the h3dex. Additional work is needed to rewrite targeting to use the h3index, maybe @andymck has some work he's done here we can reuse. Targeting using the h3dex is a bit more complicated to do performantly than I'd expected and I think we need to explore the idea of dropping the requirement that targeting evaluates every possible res 5 hex (of which we have 14,000 populated out of a possible 2 million). Something that picked a random subset of populated res 5 hexes and then targeted across those might be better?

@evanmcc evanmcc marked this pull request as ready for review December 29, 2021 03:39
@evanmcc evanmcc marked this pull request as draft December 29, 2021 03:46
@andymck
Copy link
Contributor

andymck commented Jan 13, 2022

@Vagabond this looks good and should be straightforward to plug into the POC branch, although the GC side will need further consideration as GWs will not be challenging at that point. Not really considered the performance but given you are limiting the targeting to a subset of hexes it should be much better.

Your comments above look to be stale, can you update them with any additional considerations you think are still applicable? I can integrate to the POC branch and have a go at any outstanding work items if you would like.

@Vagabond
Copy link
Contributor Author

Vagabond commented Jan 20, 2022

Stuff outstanding:

  • We should review/merge the var hooks PR and use it to re-calculate the h3dex if the targeting resolution changes
  • This needs to be tested to see if it actually works as expected
  • We probably need more chain vars to enable the switch over
  • We need a strategy to drop/stop updating the old hexes data once the switch over is complete and deemed stable because it's very expensive

@Vagabond
Copy link
Contributor Author

2022-01-20 16:36:54.270 1 [info] <0.349.1>@blockchain_utils:maybe_log_duration:681 updated_h3dex took 7880 usec
2022-01-20 16:36:54.270 1 [info] <0.349.1>@blockchain_txn:absorb:{562,21} took 316 ms to absorb blockchain_txn_assert_location_v2

As you can see, almost all of the absorb time in assert_location is spent updating the hexes list, and h3dex is much, much cheaper.

@Vagabond
Copy link
Contributor Author

Vagabond commented Feb 2, 2022

I think we could also GC when computing hip17 information?

Copy link
Contributor

@evanmcc evanmcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach, (and I still need to go over andy's PR again), but I have concerns about how often we rework the targeting index. I think that GCing every block is probably OK, but we get several asserts or adds per block now and this approach seems slower than the existing approach? the timings from chat seems to be on the order of a second per recalculation, vs ~40ms for the existing approach. given the number of receipts, maybe it nets out better, but I wonder if we couldn't move it to a periodic recalculation at the cost of some spots not being able to be immediately challenged.

src/ledger/v1/blockchain_ledger_v1.erl Outdated Show resolved Hide resolved
@Vagabond Vagabond marked this pull request as ready for review February 7, 2022 14:02
@Vagabond
Copy link
Contributor Author

Vagabond commented Feb 7, 2022

I like this approach, (and I still need to go over andy's PR again), but I have concerns about how often we rework the targeting index. I think that GCing every block is probably OK, but we get several asserts or adds per block now and this approach seems slower than the existing approach? the timings from chat seems to be on the order of a second per recalculation, vs ~40ms for the existing approach. given the number of receipts, maybe it nets out better, but I wonder if we couldn't move it to a periodic recalculation at the cost of some spots not being able to be immediately challenged.

As noted in discord, the recalculation only happens when a hex goes from populated to unpopulated, or vice versa, so at res5 this is pretty rare.

include/blockchain_vars.hrl Outdated Show resolved Hide resolved
@Vagabond
Copy link
Contributor Author

Vagabond commented Feb 8, 2022

Notes on aux ledger testing:

  • Add an aux ledger dir to the blockchain stanza of your config: {aux_ledger_dir, "aux"}
  • Start the node (this might take a moment to bootstrap the aux ledger)
  • miner repair sync_pause
  • Check you have a working aux ledger: miner eval 'blockchain_ledger_v1:has_aux(blockchain:ledger()).
  • blockchain_aux_ledger_v1:set_vars(#{poc_target_pool_size => 100, poc_targeting_version => 4}, blockchain_ledger_v1:mode(aux, blockchain:ledger())).
  • TempLedger = blockchain_ledger_v1:new_context(blockchain_ledger_v1:mode(aux, blockchain:ledger())), blockchain_ledger_v1:build_random_hex_targeting_lookup(5, TempLedger), blockchain_ledger_v1:commit_context(TempLedger).
Vars = [poc_version,poc_v4_target_challenge_age,                                                                                                                                                                                                                                 poc_v5_target_prob_randomness_wt,
 poc_witness_consideration_limit,poc_v4_prob_rssi_wt,
 poc_v4_prob_time_wt,poc_v4_prob_count_wt,
 poc_v4_prob_no_rssi,poc_v4_prob_good_rssi,
 poc_v4_prob_bad_rssi,poc_v4_parent_res,
 poc_v4_exclusion_cells,poc_v4_randomness_wt,
 poc_centrality_wt,poc_good_bucket_low,poc_good_bucket_high,
 poc_max_hop_cells,poc_v4_target_prob_score_wt,
 poc_v4_target_prob_edge_wt,poc_v5_target_prob_randomness_wt,
 poc_v4_target_score_curve,poc_v4_target_exclusion_cells,
 poc_path_limit]

Now you can run a targeting run, comparing both modules with the same pubkey and the same random seed:

f(PubKey), f(PubKeyBin), #{public:=PubKey} = libp2p_crypto:generate_keys(ecc_compact), PubKeyBin = libp2p_crypto:pubkey_to_bin(PubKey), Ledger = blockchain:ledger(), AuxLedger= blockchain_ledger_v1:mode(aux, blockchain:ledger()), [ begin Hash = crypto:strong_rand_bytes(32), {element(1, element(2, blockchain_poc_target_v3:target(PubKeyBin, Hash, Ledger, blockchain_utils:get_vars(Vars, Ledger)))), element(1, element(2, blockchain_poc_target_v4:target(PubKeyBin, Hash, AuxLedger, blockchain_utils:get_vars(Vars, AuxLedger))))} end || _ <- lists:seq(1, 100)].

This outputs a list of 2-tuples which are the pubkey of the targeted hotspot the first element is the current v3 targeting, the second is the v4 targeting against the aux ledger.

You can also compare the times taken:

f(PubKey), f(PubKeyBin), #{public:=PubKey} = libp2p_crypto:generate_keys(ecc_compact), PubKeyBin = libp2p_crypto:pubkey_to_bin(PubKey), Ledger = blockchain:ledger(), AuxLedger= blockchain_ledger_v1:mode(aux, blockchain:ledger()), [ begin Hash = crypto:strong_rand_bytes(32), {timer:tc(fun() -> element(1, element(2, blockchain_poc_target_v3:target(PubKeyBin, Hash, Ledger, blockchain_utils:get_vars(Vars, Ledger))))end), timer:tc(fun() -> element(1, element(2, blockchain_poc_target_v4:target(PubKeyBin, Hash, AuxLedger, blockchain_utils:get_vars(Vars, AuxLedger)))) end)} end || _ <- lists:seq(1, 100)].

@Vagabond
Copy link
Contributor Author

Vagabond commented Feb 8, 2022

include/blockchain_vars.hrl Outdated Show resolved Hide resolved
%% v4 targeting enabled, build the h3dex lookup
{ok, Res} = blockchain_ledger_v1:config(?poc_target_hex_parent_res, Ledger),
blockchain_ledger_v1:build_random_hex_targeting_lookup(Res, Ledger),
ok;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we delete the old hexes stuff here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a separate var that disables updating hexes that has a hook that deletes the hexes list

var_hook(_Var, _Value, _Ledger) ->
ok.

-spec unset_hook(Var :: atom(), Ledger :: blockchain_ledger_v1:ledger()) -> ok.
unset_hook(?poc_targeting_version, Ledger) ->
%% going back to the default, which is v3 so remove the h3dex lookup
blockchain_ledger_v1:clean_random_hex_targeting_lookup(Ledger),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a boostrap hex here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we'd enable targeting v4 and leave the hexes being updated for a bit before disabling hexes updating

@evanmcc
Copy link
Contributor

evanmcc commented Feb 15, 2022

I think that this is ready to go once we've got approval and some indications that fairness the same or better than the old code.

@Vagabond
Copy link
Contributor Author

Sampling to a file:

f(PubKey), f(PubKeyBin), f(F), {ok, F} = file:open("/tmp/target-100", [raw,  write]), #{public:=PubKey} = libp2p_crypto:generate_keys(ecc_compact), PubKeyBin = libp2p_crypto:pubkey_to_bin(PubKey), Ledger = blockchain:ledger(), AuxLedger= blockchain_ledger_v1:mode(aux, blockchain:ledger()), [ begin Hash = crypto:strong_rand_bytes(32), file:write(F, io_lib:format("~s,~s~n", [libp2p_crypto:bin_to_b58(element(1, element(2, blockchain_poc_target_v3:target(PubKeyBin, Hash, Ledger, blockchain_utils:get_vars(Vars, Ledger))))), libp2p_crypto:bin_to_b58(element(1, element(2, blockchain_poc_target_v4:target(PubKeyBin, Hash, AuxLedger, blockchain_utils:get_vars(Vars, AuxLedger)))))])) end || _ <- lists:seq(1, 750000)], file:close(F).

@evanmcc evanmcc merged commit 36dd1b5 into master Mar 1, 2022
@evanmcc evanmcc deleted the adt/gc-h3dex branch March 1, 2022 23:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants