Store `channels` per-peer #1507

ViktorTigerstrom · 2022-06-01T00:06:19Z

Closes #105 and enables #422 and #1278.

Moves the storage of our channels from the ChannelManager::ChannelHolder to the ChannelHolder::PeerState to enable that we store the channels per-peer instead of a single map.

This allows us to have channels with the same temporary_channel_id for different peers, as specified by BOLT2:
https://github.com/lightning/bolts/blob/4b62d26af93a07166d6a9de2cb5eff80849d9c19/02-peer-protocol.md?plain=1#L172

codecov-commenter · 2022-06-01T00:31:30Z

Codecov Report

Base: 90.74% // Head: 90.72% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (edef5e8) compared to base (d8a20ed).
Patch coverage: 86.25% of modified lines in pull request are covered.

❗ Current head edef5e8 differs from pull request most recent head 23fdcca. Consider uploading reports for the commit 23fdcca to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1507      +/-   ##
==========================================
- Coverage   90.74%   90.72%   -0.03%     
==========================================
  Files          96       96              
  Lines       50133    50523     +390     
  Branches    50133    50523     +390     
==========================================
+ Hits        45493    45835     +342     
- Misses       4640     4688      +48

Impacted Files	Coverage Δ
lightning/src/ln/channelmanager.rs	`87.18% <ø> (+0.52%)`	⬆️
lightning/src/ln/functional_test_utils.rs	`91.46% <45.00%> (-2.31%)`	⬇️
lightning-persister/src/lib.rs	`93.51% <92.85%> (+0.06%)`	⬆️
lightning/src/ln/functional_tests.rs	`96.98% <95.94%> (-0.08%)`	⬇️
lightning/src/ln/chanmon_update_fail_tests.rs	`97.59% <100.00%> (ø)`
lightning/src/ln/monitor_tests.rs	`99.56% <100.00%> (ø)`
lightning/src/ln/onion_route_tests.rs	`97.64% <100.00%> (+<0.01%)`	⬆️
lightning/src/ln/payment_tests.rs	`98.73% <100.00%> (-0.01%)`	⬇️
lightning/src/ln/reload_tests.rs	`95.58% <100.00%> (+0.01%)`	⬆️
lightning/src/ln/reorg_tests.rs	`100.00% <100.00%> (ø)`
... and 10 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

ViktorTigerstrom · 2022-06-01T00:37:32Z

Snap, will address the failing check_commits check tomorrow. Sorry about that!

TheBlueMatt

Left some comments with some examples of a few high-level patterns here - you've added a lot of unreachable branches, some of which I think look quite reachable, at a minimum there should probably be a comment in a few of the cases explaining why you're confident those are unreachable. Similarly, there are quite a few places you take the per_peer_state write lock where I think you only need the read lock. I'd really like to see some kind of writeup in the documentation for the various internal fields explaining the consistency guarantees between the channel_state fields and the per_peer_state stuff, especially given we'd really like to remove stuff from channel_state over time and take individual locks for a shorter period to get better parallelism.

TheBlueMatt · 2022-06-01T01:01:13Z

lightning/src/ln/channelmanager.rs

+	/// added to `ChannelMonitor`s in the future, this map should be removed, as only the
+	/// `ChannelManager::per_peer_state` is required to access the channel if the
+	/// `counterparty_node_id` is passed with `MonitorEvent`s.
+	pub(super) id_to_peer: HashMap<[u8; 32], PublicKey>,


Maybe lets move this out of the ChannelHolder, not sure what reason we'd have to put it in there and easy to just track it separately? Once you do that, I think we can drop the $id_to_peer argument in most of the macros and just take the lock for a very short period by accessing $self.

TheBlueMatt · 2022-06-01T17:38:35Z

lightning/src/ln/functional_test_utils.rs

@@ -511,31 +511,32 @@ macro_rules! get_htlc_update_msgs {

 #[cfg(test)]
 macro_rules! get_channel_ref {
-	($node: expr, $lock: ident, $channel_id: expr) => {
+	($node: expr, $peer_state_lock: ident, $channel_id: expr) => {


Lets keep the old style here where we do the lock inside of the macro to DRY up the callsites, which are now a bit more verbose.

Ok, I opted for this version as I simply had too many problems in getting your suggested version to work. But I'll give it another go and try to fix that :)

TheBlueMatt · 2022-06-01T17:40:59Z

lightning/src/ln/channelmanager.rs

-				hash_map::Entry::Vacant(_) => return Err(APIError::ChannelUnavailable{err: "No such channel".to_owned()})
+						break Ok(());
+					},
+					hash_map::Entry::Vacant(_) => return Err(APIError::ChannelUnavailable{err: format!("No such channel for the passed counterparty_node_id {}", counterparty_node_id) })


Maybe update the error messages in a separate commit? This single commit is...kinda huge.

Thanks! Yeah, huge is a bit of an understatement hah 😄

TheBlueMatt · 2022-06-01T17:46:43Z

lightning/src/ln/channelmanager.rs

+	/// the handling of the events.
+	///
+	/// TODO:
+	/// The `counterparty_node_id` can't be passed with `MonitorEvent`s currently, as adding it to


requiring it in MonitorEvents breaks backwards compat, adding it does not :) Maybe phrase this as a "we should add this to the monitor events and then eventually rely on it".

TheBlueMatt · 2022-06-01T17:47:37Z

lightning/src/ln/channelmanager.rs

-	/// Because adding or removing an entry is rare, we usually take an outer read lock and then
-	/// operate on the inner value freely. Sadly, this prevents parallel operation when opening a
-	/// new channel.
+	/// Currently, we usually take an outer read lock and then operate on the inner value freely.


Huh? Multiple threads can take the read lock, so that still allows for parallel operation?

Snap thanks! That ment to say write lock, but given that I'll update that, I'll change this as well :)

TheBlueMatt · 2022-06-01T17:53:44Z

lightning/src/ln/channelmanager.rs

+				},
+				hash_map::Entry::Vacant(entry) => { entry.insert(channel); }
+			}
+		} else { unreachable!() }


Why is this unreachable? I don't think it is.

Given your feedback on "taking the write lock, then dropping it, then taking it again", I'd agree :). I'll make sure to be more careful about this and other unreachable clauses.

TheBlueMatt · 2022-06-01T17:56:25Z

lightning/src/ln/channelmanager.rs

-							if is_permanent {
-								remove_channel!(self, channel_state, chan_entry);
-								break result;
+			let per_peer_state = self.per_peer_state.write().unwrap();


Why is this taking a write lock? In general it seems you take a write lock in quite a few places where its not necessary.

Right, the reasoning behind taking the write lock here as well as in other places where it's not really needed is because of what i said in the PR message above. But I'll make sure to change that given that it shouldn't be necessary then :)

Reading through this again I realize that I should probably specify a scenario I'm worried about without digging into the message/request thread execution further:
If we grab the read lock only, in the places where we only need operate on the inner lock, to I'm a bit worried for a scenario where for example large routing nodes would have threads that grab the outer read lock rapidly, so that there's almost always at least one read lock acquired. This would then cause issues for the threads that want to access the outer lock with write access, as they would need to wait an extensive amount of time to access the write lock, locking the thread meanwhile.
Though, like I stated in my PR message, I'm haven't digged into that deep enough to understand if this is a cause for concern, or if that's nothing we need to worry about :)!

Ah, yea, writer starvation could be a concern. We have a FairRwLock that we use in PeerManager for this, which would let you take read locks without worrying too much about it :).

TheBlueMatt · 2022-06-01T18:01:05Z

lightning/src/ln/channelmanager.rs

-				},
-				None => { return Err(APIError::ChannelUnavailable { err: "No such channel".to_owned() }) },
+			let (res, chan) = {
+				let per_peer_state = self.per_peer_state.write().unwrap();


Lets avoid taking the write lock, then dropping it, then taking it again, it makes it really hard to reason about the consistency across the two parts of a single function, and I think it likely makes the unreachable down in the else branch of the second per_peer_state lock actually reachable.

ViktorTigerstrom · 2022-06-01T21:48:40Z

Thanks a lot for the feedback @TheBlueMatt! Will address it as soon as I can. I will therefore push the failing check commits fix together with the feedback addressing as well :)

ViktorTigerstrom · 2022-06-01T22:06:20Z

Would you like me to update the name of the ChannelHolder as well as channel_state given that the channels aren't stored there any longer?

ariard · 2022-06-05T23:37:47Z

lightning/src/ln/channelmanager.rs

 	/// SCIDs (and outbound SCID aliases) to the real channel id. Outbound SCID aliases are added
 	/// here once the channel is available for normal use, with SCIDs being added once the funding
 	/// transaction is confirmed at the channel's required confirmation depth.
-	pub(super) short_to_id: HashMap<u64, [u8; 32]>,
+	pub(super) short_to_id: HashMap<u64, (PublicKey, [u8; 32])>,


Maybe update the variable name and consequence now you're storing the counterparty_node_id here.

ariard · 2022-06-06T00:10:24Z

Would you like me to update the name of the ChannelHolder as well as channel_state given that the channels aren't stored there any longer?

Could be something like CrossChanStateHolder or CrossStateHolder.

TheBlueMatt · 2022-06-06T17:12:43Z

Would you like me to update the name of the ChannelHolder as well as channel_state given that the channels aren't stored there any longer?

I guess you could, no strong opinion. Ideally we'll just remove it sooner or later anyway, so dunno if its worth touching every line with channel_state, but up to you :).

See-also #1403 (comment) - dunno if you want to take that part of this PR out to help move #1403 forward.

ViktorTigerstrom · 2022-06-06T18:16:56Z

Could be something like CrossChanStateHolder or CrossStateHolder.

I guess you could, no strong opinion. Ideally we'll just remove it sooner or later anyway, so dunno if its worth touching every line with channel_state, but up to you :).

Thanks a lot for the review @ariard!

Ok, given that there's no immediate need to update the name, I'll leave it for now as this PR is getting quite huge, and it might therefore be better to leave the name change for a follow-up later because of that.

Sorry for the delay on this one, I'm gonna try to get time to address the feedback later today!

TheBlueMatt · 2022-06-09T11:34:30Z

Looks like this needs rebase after #1527. I do wonder if we can split this PR up a bit more - either more commits or ideally even more PRs - to do various smaller, more reviewable new-parameters and such changes before the big honking "move things around" change.

ViktorTigerstrom · 2022-06-09T12:16:26Z

Looks like this needs rebase after #1527. I do wonder if we can split this PR up a bit more - either more commits or ideally even more PRs - to do various smaller, more reviewable new-parameters and such changes before the big honking "move things around" change.

Sure I'll try to split it up as much as possible! I'll push the new maps in a separate PR and such. Unfortunately though, I think it's quite unavoidable to not make the big "move things around" commit huge, as long as we want every separate commit to build. I'll try to focus it as much as possible on only moving things around, and not changing logic, error messages and such in the same commit though :).

TheBlueMatt · 2022-06-13T16:55:59Z

Yea, there's no question that final commit is going to be big no matter what we do. My only point here is if there's anything we can split at all, we absolutely should, and if possible even into a new PR.

ViktorTigerstrom · 2022-06-14T20:21:31Z

Apologies for taking so long time to address the feedback! This PR has now been split up and is now based on #1542. So this can be marked as blocked by dependent PR.

I'd really like to see some kind of writeup in the documentation for the various internal fields explaining the consistency guarantees between the channel_state fields and the per_peer_state stuff, especially given we'd really like to remove stuff from channel_state over time and take individual locks for a shorter period to get better parallelism.

@TheBlueMatt, could you please specify a bit what consistency guarantees you would like to have documented? Is it mainly the consistency guarantees between the different maps you're interested in :)?

I still need to update some of the commit messages, but will get to that when i squash the commits :).

Happy to address any feedback :)

TheBlueMatt · 2022-06-16T17:06:02Z

@TheBlueMatt, could you please specify a bit what consistency guarantees you would like to have documented? Is it mainly the consistency guarantees between the different maps you're interested in :)?

Yea, basically that - in what order do locks need to be taken, what lock ensures what is consistent with what, etc.

TheBlueMatt · 2022-07-13T01:03:46Z

Needs rebase 🚀

ViktorTigerstrom · 2022-07-13T08:03:07Z

Rebased on main now that the blocking PR has been merged :)!

TheBlueMatt · 2022-07-13T19:03:30Z

Can you go ahead and squash fixups down and clean up the git history? Then I'll do another round of review.

ViktorTigerstrom · 2022-07-13T19:20:24Z

Can you go ahead and squash fixups down and clean up the git history? Then I'll do another round of review.

Sure squashed :)! Though I've realized I still haven't written the docs for the consistency guarantees. A question before I write the documentation though. Do we want this PR to not change the lock order for the channel_state and the peer_state locks, and leave that for a follow-up PR?

TheBlueMatt · 2022-07-13T20:08:30Z

On second thought, I'm gonna try to get #1420 passing CI and we'll want to test this with that patch as well (doesn't need to wait on #1420 to land, just want to at least run it with both to make sure there's no errors caught with the additions there).

This passes.

After `channels` are now stored in the `per_peer_state`, some logic can be simplified and extra accessing of the `per_peer_state` can be removed.

ViktorTigerstrom · 2023-01-09T22:51:30Z

Feel free to always squash in fixups since this is super close :)

Thanks, squashed!

TheBlueMatt

Ooookkayyy this looks good to me. There's some notes here that I'd like to see addressed in a followup, but they're not bugs, and in the interest of delaying this, I'm happy to see it land.

TheBlueMatt · 2023-01-09T20:14:41Z

lightning/src/ln/channelmanager.rs

@@ -1651,24 +1686,18 @@ where

 		let mut failed_htlcs: Vec<(HTLCSource, PaymentHash)>;
 		let result: Result<(), _> = loop {
-			let mut channel_state_lock = self.channel_state.lock().unwrap();


In general, when we remove a channel we need to check and see if the peer the channel was connected to has since disconnected and remove the per_peer entry for it. ie if we force-close a channel for a peer we aren't connected to we have to clean the per_peer entry. I'm not 100% sure how best to do this without always taking the per_peer_state write lock, which would kill paralleism, so maybe we can do it in the timer. In any case, we should ensure it doesn't leave us vulnerable to a DoS like in #1889.

TheBlueMatt · 2023-01-09T21:41:43Z

lightning/src/ln/channelmanager.rs

+				break;
+			}
+
+			let peer_state_mutex = per_peer_state.as_ref().unwrap().get(&counterparty_node_id).unwrap();


Please don't do redundant lookups back-to-back :)

TheBlueMatt · 2023-01-09T21:43:53Z

lightning/src/ln/channelmanager.rs

@@ -3684,73 +3771,95 @@ where
 	}

 	fn claim_funds_from_hop<ComplFunc: FnOnce(Option<u64>) -> Option<MonitorUpdateCompletionAction>>(&self,
-		mut channel_state_lock: MutexGuard<ChannelHolder<<K::Target as SignerProvider>::Signer>>,
+		per_peer_state_lock: RwLockReadGuard<HashMap<PublicKey, Mutex<PeerState<<K::Target as SignerProvider>::Signer>>>>,


Similar to claim_funds_internal, I think we should just drop the lock argument here.

TheBlueMatt · 2023-01-09T21:53:52Z

lightning/src/ln/channelmanager.rs

+		};
+
+		let (found_channel, mut peer_state_opt) = if counterparty_node_id_opt.is_some() && per_peer_state_lock.get(&counterparty_node_id_opt.unwrap()).is_some() {
+			let peer_mutex = per_peer_state_lock.get(&counterparty_node_id_opt.unwrap()).unwrap();


Same here, you're doing two lookups in a row.

TheBlueMatt · 2023-01-09T21:57:10Z

lightning/src/ln/channelmanager.rs

+
+		if found_channel {
+			let peer_state = &mut *peer_state_opt.as_mut().unwrap();
+			if let hash_map::Entry::Occupied(mut chan) = peer_state.channel_by_id.entry(chan_id) {


Same here, we look up twice.

TheBlueMatt · 2023-01-09T22:19:55Z

lightning/src/ln/channelmanager.rs

+			let per_peer_state = self.per_peer_state.read().unwrap();
+			let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id);
+			if let None = peer_state_mutex_opt {
+				return Err(MsgHandleErrInternal::send_err_msg_no_close(format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id), msg.temporary_channel_id))


In the message handling functions, we should debug_assert!(false) here - the peer handler should never pass us messages for peers we don't think are connected.

TheBlueMatt · 2023-01-09T22:50:09Z

lightning/src/ln/functional_test_utils.rs

+	}});
+	if ev_index.is_some() {
+		let mut updated_msg_events = msg_events.to_vec();
+		let ev = updated_msg_events.remove(ev_index.unwrap());


You're cloning the entire event vec just to remove one and return it? Let's instead pass in a mut reference to msg_events and remove that way.

TheBlueMatt · 2023-01-09T23:46:01Z

lightning/src/ln/channelmanager.rs

+				let mut peer_state_lock = peer_state_mutex.lock().unwrap();
+				let peer_state = &mut *peer_state_lock;
+				if peer_state.pending_msg_events.len() > 0 {
+					let mut peer_pending_events = Vec::new();


nit: you can drop this entirely, no? append "leaves the other vec empty"

valentinewallace

First pass minus tests, nothing major. I think we're going to land this and I'll do more post-merge review this week, so we can get this out of the way :)

valentinewallace · 2023-01-09T21:03:51Z

lightning/src/ln/channelmanager.rs

 		channel_state.pending_msg_events.push(events::MessageSendEvent::SendOpenChannel {
 			node_id: their_network_key,
 			msg: res,
 		});
 		Ok(temporary_channel_id)
 	}

-	fn list_channels_with_filter<Fn: FnMut(&(&[u8; 32], &Channel<<K::Target as SignerProvider>::Signer>)) -> bool>(&self, f: Fn) -> Vec<ChannelDetails> {
+	fn list_channels_with_filter<Fn: FnMut(&(&[u8; 32], &Channel<<K::Target as SignerProvider>::Signer>)) -> bool + Copy>(&self, f: Fn) -> Vec<ChannelDetails> {
 		let mut res = Vec::new();


nit: use with_capacity for conciseness

valentinewallace · 2023-01-09T22:34:27Z

lightning/src/ln/channelmanager.rs

 	/// Because adding or removing an entry is rare, we usually take an outer read lock and then
-	/// operate on the inner value freely. Sadly, this prevents parallel operation when opening a
-	/// new channel.
+	/// operate on the inner value freely. This opens up for parallel per-peer operation for


Suggested change

/// operate on the inner value freely. This opens up for parallel per-peer operation for

/// operate on the inner value freely. This opens up parallel per-peer operation for

valentinewallace · 2023-01-09T23:08:13Z

lightning/src/ln/channelmanager.rs

+	/// The bulk of our storage will eventually be here (message queues and the like). Currently
+	/// the `per_peer_state` stores our channels on a per-peer basis, as well as the peer's latest
+	/// features.


Didn't message queues move in this PR?

Suggested change

/// The bulk of our storage will eventually be here (message queues and the like). Currently

/// the `per_peer_state` stores our channels on a per-peer basis, as well as the peer's latest

/// features.

/// The bulk of our storage. Stores our channels on a per-peer basis, as well as the peer's latest

/// features.

valentinewallace · 2023-01-09T23:13:19Z

lightning/src/ln/channelmanager.rs

+		let per_peer_state = self.per_peer_state.read().unwrap();
+
+		let peer_state_mutex_opt = per_peer_state.get(&their_network_key);
+		if let None = peer_state_mutex_opt {


nit: use is_none() and elsewhere

valentinewallace · 2023-01-09T23:28:49Z

lightning/src/ln/channelmanager.rs

-					}
-				);
+				let per_peer_state = self.per_peer_state.read().unwrap();
+				if let Some(peer_state_mutex) = per_peer_state.get(&counterparty_node_id) {


I think we should error if the peer is not found in this method, fine to save for followup though

Hmm I'm not 100% sure what the correct result is here. We successfully managed to force_close the channel, but we're unable send this message to the peer. The message would just be dropped anyways if we're no longer connected to the peer, so it maybe should be an OK(()) res?

I'll return Err here in the follow-ups though, and then we can take decide there what the correct approach is :)!

On second glance, I think you're right and we shouldn't bother Erring :)

valentinewallace · 2023-01-09T23:33:05Z

lightning/src/ln/channelmanager.rs

@@ -2187,7 +2225,7 @@ where
 	/// public, and thus should be called whenever the result is going to be passed out in a
 	/// [`MessageSendEvent::BroadcastChannelUpdate`] event.
 	///
-	/// May be called with channel_state already locked!
+	/// May be called with peer_state already locked!


This comment confuses me because since a Channel is being passed in, peer_state must be locked, but not a big deal

valentinewallace · 2023-01-10T00:01:58Z

lightning/src/ln/channelmanager.rs

+		let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id);
+		if let None = peer_state_mutex_opt {
+			return Err(APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) });
+		}
+		let mut peer_state_lock = peer_state_mutex_opt.unwrap().lock().unwrap();


Something like this would be more idiomatic/concise:

Suggested change

let peer_state_mutex_opt = per_peer_state.get(counterparty_node_id);

if let None = peer_state_mutex_opt {

return Err(APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) });

}

let mut peer_state_lock = peer_state_mutex_opt.unwrap().lock().unwrap();

let peer_state_mutex = per_peer_state.get(counterparty_node_id)

.ok_or_else(|| APIError::APIMisuseError{ err: format!("Can't find a peer matching the passed counterparty node_id {}", counterparty_node_id) })?;

let mut peer_state_lock = peer_state_mutex.lock().unwrap();

and elsewhere

ViktorTigerstrom · 2023-01-10T00:36:48Z

Thanks a lot @TheBlueMatt and @valentinewallace for reviewing this and the preceding PRs! Super happy to see this finally getting in ready state as it's been quite a while since this was initially opened 🚀!

I won't be able to address the feedback today as it's already night here. I'll start working on it tomorrow though :).
If you want to merge this to avoid further merge conflicts with other PRs, I'm super ok with addressing it all in follow-ups instead, but either way is ok for me. Most of it should be quick and straight forward with the exception of #1507 (comment) :).

TheBlueMatt · 2023-01-10T06:43:55Z

#1944 should fix the windows issues. Once that gets a review or two will land this.

ViktorTigerstrom · 2023-01-10T18:40:57Z

Thanks a lot for digging into the issue @TheBlueMatt!
Would you like me to rebase this on top of #1944 just to see that the regex matching fixes the issue, or are we ok with assuming it's solved?

TheBlueMatt · 2023-01-10T18:44:05Z

I think don't bother touching this PR, I tested an earlier variant of #1944 on my own branch based on this and it seemed to work so I'm happy with it, and it should be easy enough to fix if I'm wrong.

ViktorTigerstrom · 2023-01-10T18:46:43Z

I think don't bother touching this PR, I tested an earlier variant of #1944 on my own branch based on this and it seemed to work so I'm happy with it, and it should be easy enough to fix if I'm wrong.

Ok great!

valentinewallace

LGTM, will take another look after the follow-up

valentinewallace · 2023-01-11T21:01:09Z

lightning/src/ln/channelmanager.rs

+					#[cfg(debug_assertions)]
+					{
+						if let None = per_peer_state.get(&$counterparty_node_id) {
+							// This shouldn't occour in tests unless an unkown counterparty_node_id


nit: s/occour/occur, s/unkown/unknown

valentinewallace · 2023-01-11T21:18:20Z

lightning/src/ln/channelmanager.rs

@@ -3481,11 +3557,12 @@ where
 	fn fail_htlc_backwards_internal(&self, source: &HTLCSource, payment_hash: &PaymentHash, onion_error: &HTLCFailReason, destination: HTLCDestination) {
 		#[cfg(debug_assertions)]
 		{
-			// Ensure that the `channel_state` lock is not held when calling this function.
+			// Ensure that no peer state channel storage lock is not held when calling this


s/no peer state/the peer state

valentinewallace · 2023-01-11T22:34:58Z

lightning/src/ln/channelmanager.rs

-						if let Some(channel) = by_id.get_mut(&previous_channel_id) {
-							channel.claim_htlc_while_disconnected_dropping_mon_update(claimable_htlc.prev_hop.htlc_id, payment_preimage, &args.logger);
+						if let Some(peer_node_id) = id_to_peer.get(&previous_channel_id){
+							let peer_state_mutex = per_peer_state.get(peer_node_id).unwrap();


Do we know this unwrap is safe?

The id_to_peer map isn't written/read from disk, its built as we deserialize channels, so at this point it should be guaranteed to be consistent with our channel set.

Yeah agree that this is a bit confusing :). Even the channel existence check below is actually redundant, as we're guaranteed to have the channel here if it's exists in the id_to_peer map. I can of course add an extra if clause here though if you think it makes the code less confusing :).

Another general problem here going forward though: I'm a little worried here what we're gonna do here once we remove the id_to_peer map. Ideally we should be using the short_to_chan_info map here instead. Gonna look more into if we're guaranteed to have all channels we can have serialized claims for in the short_to_chan_info map here, especially channels that's not fully closed but about to (ie. is_usable is false).

ViktorTigerstrom · 2023-01-14T00:50:38Z

Aiming to have the follow-ups ready by tomorrow!

ViktorTigerstrom · 2023-01-18T01:45:56Z

Sorry for the delay. I've now opened the followup PR #1966 were the feedback has been addressed :)!

0.0.114 - Mar 3, 2023 - "Faster Async BOLT12 Retries" API Updates =========== * `InvoicePayer` has been removed and its features moved directly into `ChannelManager`. As such it now requires a simplified `Router` and supports `send_payment_with_retry` (and friends). `ChannelManager::retry_payment` was removed in favor of the automated retries. Invoice payment utilities in `lightning-invoice` now call the new code (lightningdevkit#1812, lightningdevkit#1916, lightningdevkit#1929, lightningdevkit#2007, etc). * `Sign`/`BaseSign` has been renamed `ChannelSigner`, with `EcdsaChannelSigner` split out in anticipation of future schnorr/taproot support (lightningdevkit#1967). * The catch-all `KeysInterface` was split into `EntropySource`, `NodeSigner`, and `SignerProvider`. `KeysManager` implements all three (lightningdevkit#1910, lightningdevkit#1930). * `KeysInterface::get_node_secret` is now `KeysManager::get_node_secret_key` and is no longer required for external signers (lightningdevkit#1951, lightningdevkit#2070). * A `lightning-transaction-sync` crate has been added which implements keeping LDK in sync with the chain via an esplora server (lightningdevkit#1870). Note that it can only be used on nodes that *never* ran a previous version of LDK. * `Score` is updated in `BackgroundProcessor` instead of via `Router` (lightningdevkit#1996). * `ChainAccess::get_utxo` (now `UtxoAccess`) can now be resolved async (lightningdevkit#1980). * BOLT12 `Offer`, `InvoiceRequest`, `Invoice` and `Refund` structs as well as associated builders have been added. Such invoices cannot yet be paid due to missing support for blinded path payments (lightningdevkit#1927, lightningdevkit#1908, lightningdevkit#1926). * A `lightning-custom-message` crate has been added to make combining multiple custom messages into one enum/handler easier (lightningdevkit#1832). * `Event::PaymentPathFailure` is now generated for failure to send an HTLC over the first hop on our local channel (lightningdevkit#2014, lightningdevkit#2043). * `lightning-net-tokio` no longer requires an `Arc` on `PeerManager` (lightningdevkit#1968). * `ChannelManager::list_recent_payments` was added (lightningdevkit#1873). * `lightning-background-processor` `std` is now optional in async mode (lightningdevkit#1962). * `create_phantom_invoice` can now be used in `no-std` (lightningdevkit#1985). * The required final CLTV delta on inbound payments is now configurable (lightningdevkit#1878) * bitcoind RPC error code and message are now surfaced in `block-sync` (lightningdevkit#2057). * Get `historical_estimated_channel_liquidity_probabilities` was added (lightningdevkit#1961). * `ChannelManager::fail_htlc_backwards_with_reason` was added (lightningdevkit#1948). * Macros which implement serialization using TLVs or straight writing of struct fields are now public (lightningdevkit#1823, lightningdevkit#1976, lightningdevkit#1977). Backwards Compatibility ======================= * Any inbound payments with a custom final CLTV delta will be rejected by LDK if you downgrade prior to receipt (lightningdevkit#1878). * `Event::PaymentPathFailed::network_update` will always be `None` if an 0.0.114-generated event is read by a prior version of LDK (lightningdevkit#2043). * `Event::PaymentPathFailed::all_paths_removed` will always be false if an 0.0.114-generated event is read by a prior version of LDK. Users who rely on it to determine payment retries should migrate to `Event::PaymentFailed`, in a separate release prior to upgrading to LDK 0.0.114 if downgrading is supported (lightningdevkit#2043). Performance Improvements ======================== * Channel data is now stored per-peer and channel updates across multiple peers can be operated on simultaneously (lightningdevkit#1507). * Routefinding is roughly 1.5x faster (lightningdevkit#1799). * Deserializing a `NetworkGraph` is roughly 6x faster (lightningdevkit#2016). * Memory usage for a `NetworkGraph` has been reduced substantially (lightningdevkit#2040). * `KeysInterface::get_secure_random_bytes` is roughly 200x faster (lightningdevkit#1974). Bug Fixes ========= * Fixed a bug where a delay in processing a `PaymentSent` event longer than the time taken to persist a `ChannelMonitor` update, when occurring immediately prior to a crash, may result in the `PaymentSent` event being lost (lightningdevkit#2048). * Fixed spurious rejections of rapid gossip sync data when the graph has been updated by other means between gossip syncs (lightningdevkit#2046). * Fixed a panic in `KeysManager` when the high bit of `starting_time_nanos` is set (lightningdevkit#1935). * Resolved an issue where the `ChannelManager::get_persistable_update_future` future would fail to wake until a second notification occurs (lightningdevkit#2064). * Resolved a memory leak when using `ChannelManager::send_probe` (lightningdevkit#2037). * Fixed a deadlock on some platforms at least when using async `ChannelMonitor` updating (lightningdevkit#2006). * Removed debug-only assertions which were reachable in threaded code (lightningdevkit#1964). * In some cases when payment sending fails on our local channel retries no longer take the same path and thus never succeed (lightningdevkit#2014). * Retries for spontaneous payments have been fixed (lightningdevkit#2002). * Return an `Err` if `lightning-persister` fails to read the directory listing rather than panicing (lightningdevkit#1943). * `peer_disconnected` will now never be called without `peer_connected` (lightningdevkit#2035) Security ======== 0.0.114 fixes several denial-of-service vulnerabilities which are reachable from untrusted input from channel counterparties or in deployments accepting inbound connections or channels. It also fixes a denial-of-service vulnerability in rare cases in the route finding logic. * The number of pending un-funded channels as well as peers without funded channels is now limited to avoid denial of service (lightningdevkit#1988). * A second `channel_ready` message received immediately after the first could lead to a spurious panic (lightningdevkit#2071). This issue was introduced with 0conf support in LDK 0.0.107. * A division-by-zero issue was fixed in the `ProbabilisticScorer` if the amount being sent (including previous-hop fees) is equal to a channel's capacity while walking the graph (lightningdevkit#2072). The division-by-zero was introduced with historical data tracking in LDK 0.0.112. In total, this release features 130 files changed, 21457 insertions, 10113 deletions in 343 commits from 18 authors, in alphabetical order: * Alec Chen * Allan Douglas R. de Oliveira * Andrei * Arik Sosman * Daniel Granhão * Duncan Dean * Elias Rohrer * Jeffrey Czyz * John Cantrell * Kurtsley * Matt Corallo * Max Fang * Omer Yacine * Valentine Wallace * Viktor Tigerström * Wilmer Paulino * benthecarman * jurvis

TheBlueMatt reviewed Jun 1, 2022

View reviewed changes

TheBlueMatt mentioned this pull request Jun 4, 2022

Add HTLCHandlingFailed event #1403

Merged

ariard reviewed Jun 6, 2022

View reviewed changes

This was referenced Jun 6, 2022

Pass counterparty_node_id to fail_holding_cell_htlcs #1524

Merged

Expose API to update a channel's ChannelConfig #1527

Merged

ViktorTigerstrom mentioned this pull request Jun 14, 2022

Preparations for storing channels per peer #1542

Merged

ViktorTigerstrom force-pushed the 2022-05-store-channels-per-peer branch from 1b262ca to ae665ce Compare June 14, 2022 20:14

ViktorTigerstrom marked this pull request as ready for review June 14, 2022 20:21

TheBlueMatt added the blocked on dependent pr label Jun 16, 2022

ViktorTigerstrom mentioned this pull request Jun 28, 2022

Forward over substitute channels #1578

Closed

TheBlueMatt removed the blocked on dependent pr label Jul 13, 2022

ViktorTigerstrom force-pushed the 2022-05-store-channels-per-peer branch from ae665ce to cdc0f13 Compare July 13, 2022 08:01

ViktorTigerstrom force-pushed the 2022-05-store-channels-per-peer branch from 1e60496 to d536b69 Compare July 13, 2022 19:14

ViktorTigerstrom added 14 commits January 9, 2023 23:50

Store channels per peer

1ab25a0

Remove unnecessary per_peer_state branch

b8ca7c9

After `channels` are now stored in the `per_peer_state`, some logic can be simplified and extra accessing of the `per_peer_state` can be removed.

Avoid unnecessary immediate retake per_peer_state lock

ff9840e

Unify failure to query Channel error messages

0eb74ec

Add duplicate temporary_channel_id for 2 peers test

8c175f5

Add handle unkown peer test

5578d79

Remove unnecessary channel counterparty checks

c42323c

Remove one tab level when accessing a peer_state

9e7c02c

Update id_to_peer docs with consistency guarantees info

e2c5cb5

Avoid unnecessary looping over all peers' channels

1b70a1e

Use correct node in test_invalid_upfront_shutdown_script

228fbc2

Store pending_msg_events per peer

6cb9853

Expect pending_msg_events to be in random peer order in tests

cb952f6

Remove the ChannelManager::channel_state

23fdcca

ViktorTigerstrom force-pushed the 2022-05-store-channels-per-peer branch from c06abcd to 23fdcca Compare January 9, 2023 22:50

TheBlueMatt approved these changes Jan 9, 2023

View reviewed changes

valentinewallace approved these changes Jan 10, 2023

View reviewed changes

TheBlueMatt merged commit 5221e4a into lightningdevkit:main Jan 10, 2023

TheBlueMatt mentioned this pull request Jan 10, 2023

1507 followups #1945

Closed

valentinewallace reviewed Jan 11, 2023

View reviewed changes

ViktorTigerstrom mentioned this pull request Jan 18, 2023

1507 followups #1966

Merged

	/// operate on the inner value freely. This opens up for parallel per-peer operation for
	/// operate on the inner value freely. This opens up parallel per-peer operation for

Store channels per-peer #1507

Store channels per-peer #1507

Uh oh!

Conversation

ViktorTigerstrom commented Jun 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Jun 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ViktorTigerstrom commented Jun 1, 2022

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ViktorTigerstrom Jun 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ViktorTigerstrom commented Jun 1, 2022

Uh oh!

ViktorTigerstrom commented Jun 1, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ariard commented Jun 6, 2022

Uh oh!

TheBlueMatt commented Jun 6, 2022

Uh oh!

ViktorTigerstrom commented Jun 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheBlueMatt commented Jun 9, 2022

Uh oh!

ViktorTigerstrom commented Jun 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheBlueMatt commented Jun 13, 2022

Uh oh!

ViktorTigerstrom commented Jun 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheBlueMatt commented Jun 16, 2022

Uh oh!

TheBlueMatt commented Jul 13, 2022

Uh oh!

ViktorTigerstrom commented Jul 13, 2022

Uh oh!

TheBlueMatt commented Jul 13, 2022

Store `channels` per-peer #1507

Store `channels` per-peer #1507

ViktorTigerstrom commented Jun 1, 2022 •

edited

Loading

codecov-commenter commented Jun 1, 2022 •

edited

Loading

ViktorTigerstrom Jun 2, 2022 •

edited

Loading

ViktorTigerstrom commented Jun 6, 2022 •

edited

Loading

ViktorTigerstrom commented Jun 9, 2022 •

edited

Loading

ViktorTigerstrom commented Jun 14, 2022 •

edited

Loading

TheBlueMatt commented Jul 13, 2022 •

edited

Loading