Doc and performance followups to #2551 #2774

TheBlueMatt · 2023-12-06T18:34:48Z

A bunch of doc and cleanup changes, followed by a few performance optimizations, cause why not.

tnull

LGTM, mod some doc nits and minor questions. Will run some benches to confirm.

tnull · 2023-12-07T09:53:43Z

lightning/src/routing/router.rs

+		///
+		/// [`find_route`] validates this prior to constructing a [`CandidateRouteHop`].
+		details: &'a ChannelDetails,
+		/// The node id of the router payer, which is also the source side of this candidate route


nit: What is a "router payer"? Maybe at least tick payer?

But payer isn't a reference to a variable or any constant?

Ah, right, thought you were referring to the parameter of `find_route`, but that's named `our_node_pubkey`. Then it probably doesn't need ticks, but I'm still not quite sure what a "router payer" is supposed to be? Someone paying a router? ...

I dropped the "router" part :)

tnull · 2023-12-07T10:13:42Z

lightning/src/routing/router.rs

@@ -2195,7 +2195,9 @@ where L::Target: Logger {
 			if !skip_node {
 				if let Some(first_channels) = first_hop_targets.get(&$node_id) {
 					for details in first_channels {
-						let candidate = CandidateRouteHop::FirstHop { details, node_id: our_node_id };
+						let candidate = CandidateRouteHop::FirstHop {
+							details, payer_node_id: &our_node_id,


nit: Could take this opportunity to unify variable names here, i.e., rename `our_node_id` to `payer_node_id`.

Will look at that in one of the followups, since there are already a few PRs stacked on top of this one.

tnull · 2023-12-07T18:27:17Z

Benchmarks finished (n=10000, Ubuntu on Intel(R) Xeon(R) CPU E3-1226 v3 @ 3.30GHz):

Benchmarking generate_routes_with_zero_penalty_scorer: Collecting 10000 samples in estimated 1362.2 s (10k iterations)
Benchmarking generate_routes_with_zero_penalty_scorer: Analyzing
generate_routes_with_zero_penalty_scorer
                        time:   [129.57 ms 130.38 ms 131.18 ms]
                        change: [+1.3864% +2.3659% +3.3959%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 200 outliers among 10000 measurements (2.00%)
  200 (2.00%) high mild

Benchmarking generate_mpp_routes_with_zero_penalty_scorer
Benchmarking generate_mpp_routes_with_zero_penalty_scorer: Warming up for 3.0000 s

Warning: Unable to complete 10000 samples in 5.0s. You may wish to increase target time to 1484.8s, or reduce sample count to 30.
Benchmarking generate_mpp_routes_with_zero_penalty_scorer: Collecting 10000 samples in estimated 1484.8 s (10k iterations)
Benchmarking generate_mpp_routes_with_zero_penalty_scorer: Analyzing
generate_mpp_routes_with_zero_penalty_scorer
                        time:   [136.90 ms 137.88 ms 138.85 ms]
                        change: [+9.3886% +10.661% +11.922%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 200 outliers among 10000 measurements (2.00%)
  200 (2.00%) high mild

Benchmarking generate_routes_with_probabilistic_scorer
Benchmarking generate_routes_with_probabilistic_scorer: Warming up for 3.0000 s

Warning: Unable to complete 10000 samples in 5.0s. You may wish to increase target time to 1788.8s, or reduce sample count to 20.
Benchmarking generate_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 1788.8 s (10k iterations)
Benchmarking generate_routes_with_probabilistic_scorer: Analyzing
generate_routes_with_probabilistic_scorer
                        time:   [180.62 ms 181.74 ms 182.84 ms]
                        change: [-3.3071% -2.6039% -1.8674%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 400 outliers among 10000 measurements (4.00%)
  400 (4.00%) low mild

Benchmarking generate_mpp_routes_with_probabilistic_scorer
Benchmarking generate_mpp_routes_with_probabilistic_scorer: Warming up for 3.0000 s

Warning: Unable to complete 10000 samples in 5.0s. You may wish to increase target time to 1814.4s, or reduce sample count to 20.
Benchmarking generate_mpp_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 1814.4 s (10k iterations)
Benchmarking generate_mpp_routes_with_probabilistic_scorer: Analyzing
generate_mpp_routes_with_probabilistic_scorer
                        time:   [196.95 ms 198.11 ms 199.26 ms]
                        change: [-1.5444% -0.8202% -0.1784%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 600 outliers among 10000 measurements (6.00%)
  400 (4.00%) low mild
  200 (2.00%) high mild

Benchmarking generate_large_mpp_routes_with_probabilistic_scorer
Benchmarking generate_large_mpp_routes_with_probabilistic_scorer: Warming up for 3.0000 s

Warning: Unable to complete 10000 samples in 5.0s. You may wish to increase target time to 6297.6s, or reduce sample count to 10.
Benchmarking generate_large_mpp_routes_with_probabilistic_scorer: Collecting 10000 samples in estimated 6297.6 s (10k iterations)
Benchmarking generate_large_mpp_routes_with_probabilistic_scorer: Analyzing
generate_large_mpp_routes_with_probabilistic_scorer
                        time:   [411.34 ms 416.69 ms 422.07 ms]
                        change: [-26.570% -25.346% -24.193%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 10000 measurements (0.01%)
  1 (0.01%) high mild

Looks like `generate_mpp_routes_with_zero_penalty_scorer` is taking a hit, but generally like an improvement?

TheBlueMatt · 2023-12-08T00:18:03Z

> Looks like `generate_mpp_routes_with_zero_penalty_scorer` is taking a hit, but generally like an improvement?

Yeah, I'm not quite sure I understand why, but I saw something similar. In any case, we don't really care.

codecov-commenter · 2023-12-08T00:20:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (becdf6f) 88.54% compared to head (6df0dd2) 89.53%.
Report is 43 commits behind head on main.

❗ Current head 6df0dd2 differs from pull request most recent head 1171bc1. Consider uploading reports for the commit 1171bc1 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2774      +/-   ##
==========================================
+ Coverage   88.54%   89.53%   +0.98%     
==========================================
  Files         115      115              
  Lines       90653    96279    +5626     
  Branches    90653    96279    +5626     
==========================================
+ Hits        80268    86202    +5934     
+ Misses       7974     7745     -229     
+ Partials     2411     2332      -79

tnull

Feel free to squash, at least from my side.

Short channel "ID"s are not globally unique when they come from a BOLT 11 route hint or a first hop (which can be an outbound SCID alias). In those cases, it's rather confusing that we have a `short_channel_id` method which mixes them all together, and even more confusing that we have a `CandidateHopId` which is not, in fact, returning a unique identifier. In our routing logic this is mostly fine: the cost of a collision isn't super high and we should still do just fine finding a route. However, the same may not be true for downstream users, as they may or may not rely on the apparent guarantees. Thus, here, we privatise the SCID and id accessors.

Rather than calling `CandidateRouteHop::FirstHop::node_id` just `node_id`, we should call it `payer_node_id` to provide more context. We also take this opportunity to make it a reference, avoiding bloating `CandidateRouteHop`.

`TestRouter` tries to make scoring calls that mimic what an actual router would do, but the changes in f0ecc3e failed to make scoring calls for private hints or if we take a public hop for the last hop. This fixes those regressions, though no tests currently depend on this behavior.

We'd previously aggressively cached elements in the `PathBuildingHop` struct (and its sub-structs), which resulted in a rather bloated size. This implied cache misses as we read from and write to multiple cache lines during processing of a single channel.

Here, we reduce caching in `DirectedChannelInfo`, fitting the `(NodeId, PathBuildingHop)` tuple in exactly 128 bytes. While this should fit in a single cache line, it sadly does not generally lie in only two lines, as glibc returns large buffers from `malloc` which are very well aligned, plus 16 bytes (for its own allocation tracking). Thus, we try to avoid reading from the last 16 bytes of a `PathBuildingHop`, but luckily that isn't super hard.

Note that here we make accessing `DirectedChannelInfo::effective_capacity` somewhat slower, but that's okay as it's only ever done once per `DirectedChannelInfo` anyway. While our routing benchmarks are quite noisy, this appears to result in a performance improvement of between 5% and 15% in the probabilistic scoring benchmarks.
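The alignment argument above can be checked with a little arithmetic. The following is only a sketch: it assumes 64-byte cache lines and a buffer that starts 16 bytes past a line boundary, and the helper `cache_lines_touched` is a hypothetical name, not something from LDK.

```rust
/// Assumed cache-line size in bytes (an assumption; common on x86-64).
const CACHE_LINE: usize = 64;

/// Returns how many distinct cache lines are touched when reading `len`
/// bytes starting at address `addr`. `len` must be non-zero.
fn cache_lines_touched(addr: usize, len: usize) -> usize {
    let first_line = addr / CACHE_LINE;
    let last_line = (addr + len - 1) / CACHE_LINE;
    last_line - first_line + 1
}
```

Under those assumptions, a 128-byte entry starting at a line boundary touches two lines, the same entry shifted by 16 bytes touches three, and reading only its first 112 bytes at that shifted address is back to two.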

`RouteGraphNode` currently recalculates scores in its `Ord` implementation, wasting time while sorting the main Dijkstra's heap. Further, some time ago, when implementing the `htlc_maximum_msat` amount reduction while walking the graph, we added `PathBuildingHop::was_processed`, looking up the source node in `dist` each time we popped an element off of the binary heap. As a result, we now have a reference to our `PathBuildingHop` when processing a best-node's channels, leading to several fields in `RouteGraphNode` being entirely redundant. Here we drop those fields, but add a pre-calculated score field, as well as force a suboptimal `RouteGraphNode` layout, retaining its existing 64-byte size. Without the suboptimal layout, performance is very mixed, but with it performance is mostly improved, by around 10% in most tests.
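As an illustration of the pre-calculated score idea, here is a minimal sketch of a heap entry whose `Ord` only compares a stored value. `HeapNode` and `pop_order` are hypothetical stand-ins, not LDK's actual `RouteGraphNode` or routing loop.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical heap entry; the score is computed once, before pushing.
#[derive(Debug, PartialEq, Eq)]
struct HeapNode {
    node_index: u32,
    score: u64,
}

impl Ord for HeapNode {
    fn cmp(&self, other: &Self) -> Ordering {
        // `BinaryHeap` is a max-heap, so the comparison is reversed to pop
        // the lowest score first. No score is recomputed here.
        other.score.cmp(&self.score)
            .then_with(|| other.node_index.cmp(&self.node_index))
    }
}

impl PartialOrd for HeapNode {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Pushes `(node_index, score)` pairs and returns node indices in pop order.
fn pop_order(entries: &[(u32, u64)]) -> Vec<u32> {
    let mut heap = BinaryHeap::new();
    for &(node_index, score) in entries {
        heap.push(HeapNode { node_index, score });
    }
    let mut order = Vec::new();
    while let Some(node) = heap.pop() {
        order.push(node.node_index);
    }
    order
}
```

Each comparison during sift-up and sift-down is then a plain integer compare, which is the cost the commit message is trying to reach.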

TheBlueMatt · 2023-12-08T20:45:51Z

Squashed with one minor further tweak:

$ git diff-tree -U2 6df0dd23 1171bc19
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index 2c9247049..942307874 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -941,5 +941,6 @@ impl Readable for RouteHint {
 ///
 /// While this generally comes from BOLT 11's `r` field, this struct includes more fields than are
-/// available in BOLT 11.
+/// available in BOLT 11. Thus, encoding and decoding this via `lightning-invoice` is lossy, as
+/// fields not supported in BOLT 11 will be stripped.
 #[derive(Clone, Debug, Hash, Eq, PartialEq, Ord, PartialOrd)]
 pub struct RouteHintHop {

TheBlueMatt mentioned this pull request Dec 6, 2023

Stop decaying liquidity information during scoring #2656

Merged

tnull reviewed Dec 7, 2023

jkczyz self-requested a review December 7, 2023 18:31

jkczyz reviewed Dec 7, 2023

TheBlueMatt force-pushed the 2023-12-2551-followups branch from d4f56c1 to 6df0dd2 Compare December 8, 2023 00:13

tnull reviewed Dec 8, 2023

TheBlueMatt added 14 commits December 8, 2023 20:45

Rewrite docs in CandidateRouteHop to be somewhat more descriptive

9973331

Rename DirectedChannelInfo::outbound to from_node_one

98ed285

...to give a bit more readability on accessing sites.

Fix and re-enable the remembers_historical_failures test

2caccc5

f0ecc3e introduced a regression in the `remembers_historical_failures` test, and disabled it by simply removing the `#[test]` annotation. This fixes the test and marks it as a test again.

Fix new unused warnings in scoring.rs

fc44e84

#[inline] CandidateRouteHop accessors

57857fd

These are used in the performance-critical routing and scoring operations, which may happen outside of our crate. Thus, we really need to allow downstream crates to inline these accessors into their code, which we do here.
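A minimal sketch of what such an accessor looks like follows. `HopSketch` and its field are hypothetical and only illustrate the attribute placement; how much `#[inline]` matters in practice depends on the compiler version and on whether LTO is enabled.

```rust
/// Hypothetical public type exposed by a library crate.
pub struct HopSketch {
    fee_base_msat: u32,
}

impl HopSketch {
    pub const fn new(fee_base_msat: u32) -> Self {
        Self { fee_base_msat }
    }

    /// `#[inline]` makes the function body available to downstream crates,
    /// so a caller's hot loop can inline this read instead of making a call.
    #[inline]
    pub fn fee_base_msat(&self) -> u32 {
        self.fee_base_msat
    }
}
```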

Fix indentation in router.rs broken in a1d15ac

e9bad1b

Make CandidateRouteHop method docs somewhat more descriptive

6ae0516

Make CandidateRouteHop::PrivateHop::target_node_id a reference

d0084c2

This avoids bloating `CandidateRouteHop` with a full 33-byte node_id (and avoids repeated public key serialization when we do multiple pathfinding passes).
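The size effect can be seen in a small sketch. The types below are hypothetical simplifications (a 33-byte id and two variants), not the real `CandidateRouteHop`, and exact sizes depend on the target.

```rust
use std::mem::size_of;

/// Hypothetical 33-byte id, the size of a compressed public key.
#[allow(dead_code)]
struct NodeIdSketch([u8; 33]);

/// Variants that store the id inline.
#[allow(dead_code)]
enum OwnedHop<'a> {
    First { details: &'a u64, node_id: NodeIdSketch },
    Private { hint: &'a u64, target_node_id: NodeIdSketch },
}

/// Variants that only borrow the id.
#[allow(dead_code)]
enum BorrowedHop<'a> {
    First { details: &'a u64, payer_node_id: &'a NodeIdSketch },
    Private { hint: &'a u64, target_node_id: &'a NodeIdSketch },
}

/// Returns `(owned_size, borrowed_size)` in bytes for the current target.
fn sizes() -> (usize, usize) {
    (size_of::<OwnedHop<'static>>(), size_of::<BorrowedHop<'static>>())
}
```

The borrowed form replaces 33 inline bytes with one pointer per variant, so the whole enum shrinks.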

Reorder PathBuildingHop fields somewhat

8ba3e83

Given `PathBuildingHop` is now an even multiple of cache lines, we can pick which fields "fall off" the cache line we have visible when dealing with hops, which we do here.
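One way to control which fields end up at the tail is to fix the declaration order with `#[repr(C)]`. The sketch below assumes that approach; `HopLayoutSketch` and its field names are invented for illustration and do not mirror the real `PathBuildingHop`.

```rust
/// Hypothetical 128-byte entry. `#[repr(C)]` keeps fields in declaration
/// order, so rarely-read fields can be placed in the trailing 16 bytes.
#[repr(C)]
#[derive(Default)]
#[allow(dead_code)]
struct HopLayoutSketch {
    // Hot fields, read while processing every channel.
    total_fee_msat: u64,
    path_penalty_msat: u64,
    other_hot_fields: [u64; 12],
    // Cold fields, deliberately last.
    cold_value: u64,
    cold_flags: u64,
}

/// Returns `(offset of the first cold field, total struct size)` in bytes.
fn cold_offset_and_size() -> (usize, usize) {
    let hop = HopLayoutSketch::default();
    let base = &hop as *const HopLayoutSketch as usize;
    let cold = &hop.cold_value as *const u64 as usize;
    (cold - base, std::mem::size_of::<HopLayoutSketch>())
}
```

With this ordering the cold fields start at byte 112, so code that only touches the hot fields never reads the last 16 bytes.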

TheBlueMatt force-pushed the 2023-12-2551-followups branch from 6df0dd2 to 1171bc1 Compare December 8, 2023 20:45

jkczyz approved these changes Dec 8, 2023

tnull approved these changes Dec 8, 2023

tnull merged commit e94af0c into lightningdevkit:main Dec 8, 2023
