Peer Storage (Part 3): Identifying Lost Channel States #3897

Open · wants to merge 4 commits into base: main
Conversation

@adi2011 adi2011 commented Jun 28, 2025

In this PR, we begin serializing the ChannelMonitors and sending them over so that, upon retrieval, we can determine whether any states were lost.

The next PR will be the final one, where we use FundRecoverer to initiate a force close and potentially go on-chain using a penalty transaction.

Sorry for the delay!
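To illustrate the idea behind this PR, here is a minimal sketch of the lost-state check: each channel's backup carries a small piece of metadata recording how far the channel had progressed when the backup was written, and on retrieval we compare it against our current view. The type and field names below are illustrative stand-ins, not LDK's actual API.

```rust
// Hypothetical stand-in for the per-channel metadata kept in peer storage.
#[derive(Clone)]
struct PeerStorageMonitorHolder {
    channel_id: [u8; 32],
    // Commitment counter recorded when the backup was written. Treated here
    // as a simple monotonic counter for illustration (LDK's real convention
    // counts commitment numbers downward).
    backup_commitment_counter: u64,
}

// Returns true if our current view of the channel is older than the backup,
// i.e. we lost state between writing the backup and restarting.
fn lost_state(current_commitment_counter: u64, backed_up: &PeerStorageMonitorHolder) -> bool {
    current_commitment_counter < backed_up.backup_commitment_counter
}
```

In the real patch this comparison is driven by data deserialized from the retrieved peer-storage blob, and a detected loss feeds into the recovery flow described for the follow-up PR.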

ldk-reviews-bot commented Jun 28, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull tnull requested review from tnull and removed request for joostjager June 28, 2025 11:17
@adi2011 adi2011 force-pushed the peer-storage/serialise-deserialise branch from 101f31c to a35566a Compare June 29, 2025 05:03
adi2011 added 4 commits June 29, 2025 10:34
`PeerStorageMonitorHolder` is used to wrap a single ChannelMonitor; here we are
adding some fields separately so that we do not need to read the whole ChannelMonitor
to identify whether we have lost some states.

`PeerStorageMonitorHolderList` is used to keep the list of all the channels which would
be sent over the wire inside Peer Storage.
Create a utility function to prevent code duplication while writing ChannelMonitors.

Serialise them inside ChainMonitor::send_peer_storage and send them over.
TODO: Peer storage should not cross 64k limit.
Deserialise the ChannelMonitors and compare the data to determine if we have
lost some states.
Node should now determine lost states using retrieved peer storage.
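The commit messages above can be sketched as the following data shapes, together with a guard for the size limit mentioned in the TODO. Everything here is a simplified stand-in: LDK uses its own `Writeable` serialization traits, and the exact byte limit for the peer-storage payload is an assumption for illustration.

```rust
// Assumed payload cap for a single peer-storage blob (~64 KiB).
const MAX_PEER_STORAGE_SIZE: usize = 65_531;

// Stand-in for the wrapper around one serialized ChannelMonitor.
struct PeerStorageMonitorHolder {
    channel_id: [u8; 32],
    backup_commitment_counter: u64,
    monitor_bytes: Vec<u8>, // serialized ChannelMonitor payload
}

// Stand-in for the list sent over the wire inside Peer Storage.
struct PeerStorageMonitorHolderList {
    monitors: Vec<PeerStorageMonitorHolder>,
}

impl PeerStorageMonitorHolderList {
    // Naive length-prefixed encoding; the real code shares a write helper
    // across monitors instead of duplicating serialization logic.
    fn serialize(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for m in &self.monitors {
            out.extend_from_slice(&m.channel_id);
            out.extend_from_slice(&m.backup_commitment_counter.to_be_bytes());
            out.extend_from_slice(&(m.monitor_bytes.len() as u32).to_be_bytes());
            out.extend_from_slice(&m.monitor_bytes);
        }
        out
    }
}

// One possible way to honor the size limit: drop trailing monitors until
// the serialized blob fits. How to prioritize channels is a design choice.
fn fit_within_limit(mut list: PeerStorageMonitorHolderList, limit: usize) -> Vec<u8> {
    let mut blob = list.serialize();
    while blob.len() > limit && !list.monitors.is_empty() {
        list.monitors.pop();
        blob = list.serialize();
    }
    blob
}
```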
@adi2011 adi2011 force-pushed the peer-storage/serialise-deserialise branch from a35566a to 4c9f3c3 Compare June 29, 2025 05:04
codecov bot commented Jun 29, 2025

Codecov Report

Attention: Patch coverage is 54.29864% with 101 lines in your changes missing coverage. Please review.

Project coverage is 88.86%. Comparing base (61a37b1) to head (4c9f3c3).

Files with missing lines                  Patch %   Lines
lightning/src/chain/channelmonitor.rs     40.17%    8 Missing and 62 partials ⚠️
lightning/src/ln/channelmanager.rs        58.49%    22 Missing ⚠️
lightning/src/ln/our_peer_storage.rs      70.37%    0 Missing and 8 partials ⚠️
lightning/src/chain/chainmonitor.rs       95.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3897      +/-   ##
==========================================
- Coverage   88.86%   88.86%   -0.01%     
==========================================
  Files         165      165              
  Lines      118886   118962      +76     
  Branches   118886   118962      +76     
==========================================
+ Hits       105650   105710      +60     
- Misses      10911    10923      +12     
- Partials     2325     2329       +4     

☔ View full report in Codecov by Sentry.
///
/// [`ChainMonitor`]: crate::chain::chainmonitor::ChainMonitor
#[rustfmt::skip]
pub(crate) fn write_util<Signer: EcdsaChannelSigner, W: Writer>(channel_monitor: &ChannelMonitorImpl<Signer>, is_stub: bool, writer: &mut W) -> Result<(), Error> {
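The `is_stub` flag above lets full and emergency serialization share one write path. A hedged sketch of that pattern, with stand-in types rather than LDK's `ChannelMonitorImpl`, showing a bulky field being gated out of the stub form:

```rust
use std::io::{self, Write};

// Stand-in for the monitor state being written; field names are illustrative.
struct MonitorData {
    channel_id: [u8; 32],
    commitment_counter: u64,
    // A large field we would want to omit from stubs (analogous to the
    // counterparty_claimable_outpoints discussed in review).
    claimable_outpoints: Vec<[u8; 36]>,
}

// Shared write helper: stubs skip the bulky field, full writes include it.
fn write_util<W: Write>(m: &MonitorData, is_stub: bool, w: &mut W) -> io::Result<()> {
    w.write_all(&m.channel_id)?;
    w.write_all(&m.commitment_counter.to_be_bytes())?;
    if !is_stub {
        w.write_all(&(m.claimable_outpoints.len() as u32).to_be_bytes())?;
        for op in &m.claimable_outpoints {
            w.write_all(op)?;
        }
    }
    Ok(())
}
```

The design question raised in review is which fields can safely be cut this way without forking the stub format too far from the regular one.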
Collaborator
@wpaulino what do you think we should reasonably cut here to reduce the size of a ChannelMonitor without making the emergency-case ChannelMonitors all that different from the regular ones to induce more code changes across channelmonitor.rs? Obviously we should avoid counterparty_claimable_outpoints, but how much code is gonna break in doing so?

}
},
None => {
// TODO: Figure out if this channel is so old that we have forgotten about it.
Collaborator

There's no need to worry here, I think. If the channel is gone, we either haven't fallen behind (probably) or we already broadcast a stale state (because we broadcast on startup if the channel is gone and we have a ChannelMonitor), at which point we're screwed. So nothing to do here.

@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Contributor

@tnull tnull left a comment


Took a first look, but will hold off on going into more detail until we've decided which way to go with the ChannelMonitor stub.

},

Err(e) => {
panic!("Wrong serialisation of PeerStorageMonitorHolderList: {}", e);
Contributor


I don't think we should ever panic in any of this code. Yes, something might be wrong if we have peer storage data we can't read anymore, but really no reason to refuse to at least keep other potential channels operational.
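The suggestion above can be sketched as follows: on a deserialization failure, log and skip the unreadable blob instead of panicking, so any other channels stay operational. The decoder and error type here are hypothetical stand-ins, not LDK's actual `PeerStorageMonitorHolderList` deserialization.

```rust
// Hypothetical error type for an unreadable peer-storage blob.
#[derive(Debug)]
enum DecodeError {
    BadData,
}

// Stand-in decoder: accept only blobs with a version prefix we understand.
fn decode_list(bytes: &[u8]) -> Result<Vec<u8>, DecodeError> {
    match bytes.first() {
        Some(&1) => Ok(bytes[1..].to_vec()),
        _ => Err(DecodeError::BadData),
    }
}

// Degrade gracefully: return None (after logging) rather than panicking,
// leaving other channels unaffected by one corrupt backup.
fn handle_retrieved_peer_storage(bytes: &[u8]) -> Option<Vec<u8>> {
    match decode_list(bytes) {
        Ok(list) => Some(list),
        Err(e) => {
            eprintln!("Unreadable peer storage blob ({:?}); ignoring and continuing", e);
            None
        }
    }
}
```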

4 participants