-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Doc and performance followups to #2551 #2774
Doc and performance followups to #2551 #2774
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, mod some doc nits and minor questions. Will run some benches to confirm.
lightning/src/routing/router.rs
Outdated
/// | ||
/// [`find_route`] validates this prior to constructing a [`CandidateRouteHop`]. | ||
details: &'a ChannelDetails, | ||
/// The node id of the router payer, which is also the source side of this candidate route |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: What is a "router payer"? Maybe at least tick payer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But payer
isn't a reference to a variable or any constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, thought you were referring to the parameter of find_route
, but that's named our_node_pubkey
. Then it probably doesn't need ticks, but I'm still not quite sure what a "router payer" is supposed to be? Someone paying a router? ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped the "router" part :)
@@ -2195,7 +2195,9 @@ where L::Target: Logger { | |||
if !skip_node { | |||
if let Some(first_channels) = first_hop_targets.get(&$node_id) { | |||
for details in first_channels { | |||
let candidate = CandidateRouteHop::FirstHop { details, node_id: our_node_id }; | |||
let candidate = CandidateRouteHop::FirstHop { | |||
details, payer_node_id: &our_node_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Could take this opportunity to unify variable names here, i.e., rename our_node_id
to payer_node_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will look in some of the followups, cause there's already a few PRs stacked on top here.
Benchmarks finished (n=10000, Ubuntu on Intel(R) Xeon(R) CPU E3-1226 v3 @ 3.30GHz):
Looks like |
d4f56c1
to
6df0dd2
Compare
Yea, not quite sure I understand why, but I saw something similar. Of course, indeed, we don't really care. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #2774 +/- ##
==========================================
+ Coverage 88.54% 89.53% +0.98%
==========================================
Files 115 115
Lines 90653 96279 +5626
Branches 90653 96279 +5626
==========================================
+ Hits 80268 86202 +5934
+ Misses 7974 7745 -229
+ Partials 2411 2332 -79 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to squash, at least from my side.
...to give a bit more readability on accessing sites.
f0ecc3e introduced a regression in the `remembers_historical_failures` test, and disabled it by simply removing the `#[test]` annotation. This fixes the test and marks it as a test again.
Short channel "ID"s are not globally unique when they come from a BOLT 11 route hint or a first hop (which can be an outbound SCID alias). In those cases, its rather confusing that we have a `short_channel_id` method which mixes them all together, and even more confusing that we have a `CandidateHopId` which is not, in fact returning a unique identifier. In our routing logic this is mostly fine - the cost of a collision isn't super high and we should still do just fine finding a route, however the same can't be true for downstream users, as they may or may not rely on the apparent guarantees. Thus, here, we privatise the SCID and id accessors.
These are used in the performance-critical routing and scoring operations, which may happen outside of our crate. Thus, we really need to allow downstream crates to inline these accessors into their code, which we do here.
Rather than calling `CandidateRouteHop::FirstHop::node_id` just `node_id`, we should call it `payer_node_id` to provide more context. We also take this opportunity to make it a reference, avoiding bloating `CandidateRouteHop`.
`TestRouter` tries to make scoring calls that mimic what an actual router would do, but the changes in f0ecc3e failed to make scoring calls for private hints or if we take a public hop for the last hop. This fixes those regressions, though no tests currently depend on this behavior.
This avoids bloating `CandidateRouteHop` with a full 33-byte node_id (and avoids repeated public key serialization when we do multiple pathfinding passes).
We'd previously aggressively cached elements in the `PathBuildingHop` struct (and its sub-structs), which resulted in a rather bloated size. This implied cache misses as we read from and write to multiple cache lines during processing of a single channel. Here, we reduce caching in `DirectedChannelInfo`, fitting the `(NodeId, PathBuildingHop)` tuple in exactly 128 bytes. While this should fit in a single cache line, it sadly does not generally lie in only two lines, as glibc returns large buffers from `malloc` which are very well aligned, plus 16 bytes (for its own allocation tracking). Thus, we try to avoid reading from the last 16 bytes of a `PathBuildingHop`, but luckily that isn't super hard. Note that here we make accessing `DirectedChannelInfo::effective_capacity` somewhat slower, but that's okay as its only ever done once per `DirectedChannelInfo` anyway. While our routing benchmarks are quite noisy, this appears to result in between a 5% and 15% performance improvement in the probabilistic scoring benchmarks.
Given `PathBuildingHop` is now an even multiple of cache lines, we can pick which fields "fall off" the cache line we have visible when dealing with hops, which we do here.
`RouteGraphNode` currently recalculates scores in its `Ord` implementation, wasting time while sorting the main Dijkstra's heap. Further, some time ago, when implementing the `htlc_maximum_msat` amount reduction while walking the graph, we added `PathBuildingHop::was_processed`, looking up the source node in `dist` each time we pop'ed an element off of the binary heap. As a result, we now have a reference to our `PathBuildingHop` when processing a best-node's channels, leading to several fields in `RouteGraphNode` being entirely redundant. Here we drop those fields, but add a pre-calculated score field, as well as force a suboptimal `RouteGraphNode` layout, retaining its existing 64 byte size. Without the suboptimal layout, performance is very mixed, but with it performance is mostly improved, by around 10% in most tests.
6df0dd2
to
1171bc1
Compare
Squashed with one minor further tweak: $ git diff-tree -U2 6df0dd23 1171bc19
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index 2c9247049..942307874 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -941,5 +941,6 @@ impl Readable for RouteHint {
///
/// While this generally comes from BOLT 11's `r` field, this struct includes more fields than are
-/// available in BOLT 11.
+/// available in BOLT 11. Thus, encoding and decoding this via `lightning-invoice` is lossy, as
+/// fields not supported in BOLT 11 will be stripped.
#[derive(Clone, Debug, Hash, Eq, PartialEq, Ord, PartialOrd)]
pub struct RouteHintHop { |
A bunch of doc and cleanup changes, followed by a few performance optimizations, cause why not.