Peer Storage (Part 3): Identifying Lost Channel States #3897

Open · wants to merge 4 commits into base: main
Conversation

@adi2011 adi2011 commented Jun 28, 2025

In this PR, we begin serializing the ChannelMonitors and sending them over so that, upon retrieval, we can determine whether any states were lost.

The next PR will be the final one, where we use FundRecoverer to initiate a force close and potentially go on-chain using a penalty transaction.

Sorry for the delay!
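To illustrate the idea behind this PR, here is a minimal sketch of the lost-state check: each channel's backup carries a small piece of metadata recording how far the channel had progressed when the backup was written, and on retrieval we compare it against our current view. The type and field names below are illustrative stand-ins, not LDK's actual API.

```rust
// Hypothetical stand-in for the per-channel metadata kept in peer storage.
#[derive(Clone)]
struct PeerStorageMonitorHolder {
    channel_id: [u8; 32],
    // Commitment counter recorded when the backup was written. Treated here
    // as a simple monotonic counter for illustration (LDK's real convention
    // counts commitment numbers downward).
    backup_commitment_counter: u64,
}

// Returns true if our current view of the channel is older than the backup,
// i.e. we lost state between writing the backup and restarting.
fn lost_state(current_commitment_counter: u64, backed_up: &PeerStorageMonitorHolder) -> bool {
    current_commitment_counter < backed_up.backup_commitment_counter
}
```

In the real patch this comparison is driven by data deserialized from the retrieved peer-storage blob, and a detected loss feeds into the recovery flow described for the follow-up PR.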

ldk-reviews-bot commented Jun 28, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull tnull requested review from tnull and removed request for joostjager June 28, 2025 11:17
@adi2011 adi2011 force-pushed the peer-storage/serialise-deserialise branch from 101f31c to a35566a Compare June 29, 2025 05:03
adi2011 added 4 commits June 29, 2025 10:34
`PeerStorageMonitorHolder` is used to wrap a single ChannelMonitor; here we are
adding some fields separately so that we do not need to read the whole ChannelMonitor
to identify whether we have lost some states.

`PeerStorageMonitorHolderList` is used to keep the list of all the channels which would
be sent over the wire inside Peer Storage.
Create a utility function to prevent code duplication while writing ChannelMonitors.

Serialise them inside ChainMonitor::send_peer_storage and send them over.
TODO: Peer storage should not cross 64k limit.
Deserialise the ChannelMonitors and compare the data to determine if we have
lost some states.
Node should now determine lost states using retrieved peer storage.
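The commit messages above can be sketched as the following data shapes, together with a guard for the size limit mentioned in the TODO. Everything here is a simplified stand-in: LDK uses its own `Writeable` serialization traits, and the exact byte limit for the peer-storage payload is an assumption for illustration.

```rust
// Assumed payload cap for a single peer-storage blob (~64 KiB).
const MAX_PEER_STORAGE_SIZE: usize = 65_531;

// Stand-in for the wrapper around one serialized ChannelMonitor.
struct PeerStorageMonitorHolder {
    channel_id: [u8; 32],
    backup_commitment_counter: u64,
    monitor_bytes: Vec<u8>, // serialized ChannelMonitor payload
}

// Stand-in for the list sent over the wire inside Peer Storage.
struct PeerStorageMonitorHolderList {
    monitors: Vec<PeerStorageMonitorHolder>,
}

impl PeerStorageMonitorHolderList {
    // Naive length-prefixed encoding; the real code shares a write helper
    // across monitors instead of duplicating serialization logic.
    fn serialize(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for m in &self.monitors {
            out.extend_from_slice(&m.channel_id);
            out.extend_from_slice(&m.backup_commitment_counter.to_be_bytes());
            out.extend_from_slice(&(m.monitor_bytes.len() as u32).to_be_bytes());
            out.extend_from_slice(&m.monitor_bytes);
        }
        out
    }
}

// One possible way to honor the size limit: drop trailing monitors until
// the serialized blob fits. How to prioritize channels is a design choice.
fn fit_within_limit(mut list: PeerStorageMonitorHolderList, limit: usize) -> Vec<u8> {
    let mut blob = list.serialize();
    while blob.len() > limit && !list.monitors.is_empty() {
        list.monitors.pop();
        blob = list.serialize();
    }
    blob
}
```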
@adi2011 adi2011 force-pushed the peer-storage/serialise-deserialise branch from a35566a to 4c9f3c3 Compare June 29, 2025 05:04
codecov bot commented Jun 29, 2025

Codecov Report

Attention: Patch coverage is 54.29864% with 101 lines in your changes missing coverage. Please review.

Project coverage is 88.86%. Comparing base (61a37b1) to head (4c9f3c3).

Files with missing lines                  Patch %   Lines
lightning/src/chain/channelmonitor.rs     40.17%    8 Missing and 62 partials ⚠️
lightning/src/ln/channelmanager.rs        58.49%    22 Missing ⚠️
lightning/src/ln/our_peer_storage.rs      70.37%    0 Missing and 8 partials ⚠️
lightning/src/chain/chainmonitor.rs       95.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3897      +/-   ##
==========================================
- Coverage   88.86%   88.86%   -0.01%     
==========================================
  Files         165      165              
  Lines      118886   118962      +76     
  Branches   118886   118962      +76     
==========================================
+ Hits       105650   105710      +60     
- Misses      10911    10923      +12     
- Partials     2325     2329       +4     

☔ View full report in Codecov by Sentry.
///
/// [`ChainMonitor`]: crate::chain::chainmonitor::ChainMonitor
#[rustfmt::skip]
pub(crate) fn write_util<Signer: EcdsaChannelSigner, W: Writer>(channel_monitor: &ChannelMonitorImpl<Signer>, is_stub: bool, writer: &mut W) -> Result<(), Error> {
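The `is_stub` flag above lets full and emergency serialization share one write path. A hedged sketch of that pattern, with stand-in types rather than LDK's `ChannelMonitorImpl`, showing a bulky field being gated out of the stub form:

```rust
use std::io::{self, Write};

// Stand-in for the monitor state being written; field names are illustrative.
struct MonitorData {
    channel_id: [u8; 32],
    commitment_counter: u64,
    // A large field we would want to omit from stubs (analogous to the
    // counterparty_claimable_outpoints discussed in review).
    claimable_outpoints: Vec<[u8; 36]>,
}

// Shared write helper: stubs skip the bulky field, full writes include it.
fn write_util<W: Write>(m: &MonitorData, is_stub: bool, w: &mut W) -> io::Result<()> {
    w.write_all(&m.channel_id)?;
    w.write_all(&m.commitment_counter.to_be_bytes())?;
    if !is_stub {
        w.write_all(&(m.claimable_outpoints.len() as u32).to_be_bytes())?;
        for op in &m.claimable_outpoints {
            w.write_all(op)?;
        }
    }
    Ok(())
}
```

The design question raised in review is which fields can safely be cut this way without forking the stub format too far from the regular one.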
Collaborator
@wpaulino what do you think we should reasonably cut here to reduce the size of a ChannelMonitor without making the emergency-case ChannelMonitors all that different from the regular ones to induce more code changes across channelmonitor.rs? Obviously we should avoid counterparty_claimable_outpoints, but how much code is gonna break in doing so?

}
},
None => {
// TODO: Figure out if this channel is so old that we have forgotten about it.
Collaborator

There's no need to worry here, I think. If the channel is gone, we either haven't fallen behind (probably) or we already broadcast a stale state (because we broadcast on startup if the channel is gone and we have a ChannelMonitor), at which point we're screwed. So nothing to do here.

@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Contributor

@tnull tnull left a comment


Took a first look, but will hold off on going into more detail until we've decided which way to go with the ChannelMonitor stub.

},

Err(e) => {
panic!("Wrong serialisation of PeerStorageMonitorHolderList: {}", e);
Contributor


I don't think we should ever panic in any of this code. Yes, something might be wrong if we have peer storage data we can't read anymore, but really no reason to refuse to at least keep other potential channels operational.
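The suggestion above can be sketched as follows: on a deserialization failure, log and skip the unreadable blob instead of panicking, so any other channels stay operational. The decoder and error type here are hypothetical stand-ins, not LDK's actual `PeerStorageMonitorHolderList` deserialization.

```rust
// Hypothetical error type for an unreadable peer-storage blob.
#[derive(Debug)]
enum DecodeError {
    BadData,
}

// Stand-in decoder: accept only blobs with a version prefix we understand.
fn decode_list(bytes: &[u8]) -> Result<Vec<u8>, DecodeError> {
    match bytes.first() {
        Some(&1) => Ok(bytes[1..].to_vec()),
        _ => Err(DecodeError::BadData),
    }
}

// Degrade gracefully: return None (after logging) rather than panicking,
// leaving other channels unaffected by one corrupt backup.
fn handle_retrieved_peer_storage(bytes: &[u8]) -> Option<Vec<u8>> {
    match decode_list(bytes) {
        Ok(list) => Some(list),
        Err(e) => {
            eprintln!("Unreadable peer storage blob ({:?}); ignoring and continuing", e);
            None
        }
    }
}
```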

4 participants