lnd: implement "safe mode" node stand up #3287
Comments
I am going to take a shot at working on this. |
It would be nice to have a flag on force close to do a "dangerous" force close.
@alexbosworth: Is the case in which that would be helpful when you want to make sure that no other channels are force-closed besides ones you specifically choose? I could see that being useful. What do you think about adding a confirmation message in the case of a "dangerous force close" command received over RPC (I'm not sure if there is a precedent for that type of user interaction)? Also, I'm assuming that automated force closing requests would still be prohibited. Alternatively, one could argue that "safe mode" should prevent a user from performing dangerous operations, and that the user should restart |
The use case I am specifically thinking about is one where you have an out-of-date backup and you want to use it to recover funds. I'm not sure if this is handled in this PR, but blanket banning force closes seems risky to me in the event of race conditions relating to HTLC resolution. So where I would see using this is:
Should possibly also reject channel state updates.
This is already done for restored channels (sorry, tabbed to Close and Comment lol) |
Yeah, but in this case regular channels won't be marked borked/restored, so they can still have updates. |
why wouldn't they? isn't this supposed to be used after restoring w/ SCB? |
It's a little unclear when it's possible to leave "safe mode" and resume normal operation. If our node has a bad state, contacts the other peer to initiate a force close, and then leaves safe mode before the peer's force close tx is confirmed, it's possible for our node to force close. Also I agree with @alexbosworth above that if all force closes are banned, there could be some legitimate, synced channels which need to be force closed but aren't. |
If you are restoring from SCBs then I don't think safe mode is necessary, since you don't have any toxic data. |
Yup, we reviewed this PR in the lnd review club, and conner also suggested maybe disabling several other features in safe mode: no bootstrap, no graph sync, and no channel acceptance, in addition to no force closures.
I think users don't always know whether they are in an old state, so I wonder if it makes sense to delay channel_arbitrator actions such as going on chain for an expired HTLC, and instead at least wait for the peer connection to be established. If our channel state is wrong, the peer would then force close the channel, which would probably keep us from going on chain with the wrong state.
@ziggie1984 good point. One way would be to have a mode to start in safe mode (so people could do it all the time), then later on check an endpoint to see if any actions would've been executed, then allow an API call to upgrade to regular operation.
FWIW, this would be the opposite of what was suggested in #8166. I think a middle ground could make sense though. Need to think about it further.
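To make the "start safe, inspect, then upgrade" idea concrete, here is a minimal Go sketch. Everything in it (`SafeModeController`, `DeferredAction`, `TryAct`, `Pending`, `Upgrade`) is a hypothetical name invented for illustration and is not part of lnd; it only shows the shape of recording suppressed actions and releasing them on an explicit call.

```go
package main

import (
	"fmt"
	"sync"
)

// DeferredAction is a hypothetical record of an action that was held back
// while the node was in safe mode.
type DeferredAction struct {
	ChanPoint string
	Reason    string
}

// SafeModeController is a hypothetical gate: it starts in safe mode, records
// every action it suppresses, and only lets actions through after Upgrade.
type SafeModeController struct {
	mu       sync.Mutex
	safe     bool
	deferred []DeferredAction
}

// NewSafeModeController returns a controller that starts in safe mode.
func NewSafeModeController() *SafeModeController {
	return &SafeModeController{safe: true}
}

// TryAct reports whether the action may run now. In safe mode the action is
// recorded for later inspection and false is returned.
func (c *SafeModeController) TryAct(a DeferredAction) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.safe {
		c.deferred = append(c.deferred, a)
		return false
	}
	return true
}

// Pending returns a copy of the actions that would have been executed.
func (c *SafeModeController) Pending() []DeferredAction {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make([]DeferredAction, len(c.deferred))
	copy(out, c.deferred)
	return out
}

// Upgrade leaves safe mode and returns the actions that were held back so
// the caller can decide what to do with them.
func (c *SafeModeController) Upgrade() []DeferredAction {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.safe = false
	held := c.deferred
	c.deferred = nil
	return held
}

func main() {
	ctl := NewSafeModeController()
	ctl.TryAct(DeferredAction{ChanPoint: "txid:0", Reason: "expired HTLC"})
	fmt.Println("pending actions:", len(ctl.Pending()))
	ctl.Upgrade()
	fmt.Println("allowed after upgrade:", ctl.TryAct(DeferredAction{ChanPoint: "txid:1"}))
}
```

Whether held-back actions should be replayed automatically after the upgrade or only reported is an open design question that this sketch deliberately leaves to the caller.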
Perhaps safe mode could be automatically enabled on startup if the node is more than X blocks behind the chain. The more blocks behind, the more likely the DB is out-of-date. This would allow #8166 DoS protections in the crash and restart case, while a node that's been offline for a while will use safe mode until upgraded. |
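A rough Go sketch of that startup check, under stated assumptions: `shouldEnterSafeMode` and the threshold of 144 blocks are hypothetical and chosen only for illustration; the thread does not fix a value for X, and lnd does not expose this helper.

```go
package main

import "fmt"

// shouldEnterSafeMode is a hypothetical helper: it reports whether the node's
// last synced height lags the chain tip by more than maxBlocksBehind blocks.
func shouldEnterSafeMode(lastSyncedHeight, chainTipHeight, maxBlocksBehind uint32) bool {
	// Guard against underflow if the local height is at or ahead of the tip.
	if chainTipHeight <= lastSyncedHeight {
		return false
	}
	return chainTipHeight-lastSyncedHeight > maxBlocksBehind
}

func main() {
	// Assumed threshold for illustration only (roughly one day of blocks).
	const maxBlocksBehind = 144

	// Short crash-and-restart gap: stay in normal operation.
	fmt.Println(shouldEnterSafeMode(800000, 800010, maxBlocksBehind))

	// Long offline period: start in safe mode.
	fmt.Println(shouldEnterSafeMode(800000, 801000, maxBlocksBehind))
}
```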
Although we now have the proper base set of tools in place (SCB) to enable nodes to safely reclaim their channels in the event of data loss, it's still possible that a node boots up with stale data. If this is the case, then the node is at risk of breaching the other channel peer inadvertently. Due to systems like the `contractcourt`, which will automatically force close channels that have expired HTLCs, this can happen in an automated fashion on restarts.

Rather than resuming normal operation if one knows they may be restoring with an outdated state, we can instead implement a "safe mode" of sorts. When users boot up in this mode, all commitment broadcasts are forbidden. Once `lnd` has booted up, the user can then examine the set of channel states to see if they become borked once we connect to peers (indicative of local data loss).

Steps To Completion

- Add new `--safemode` config parameter to the `lnd` binary.
- If safe mode is enabled, reject all RPC level force close requests.
- If safe mode is enabled, reject all automated force close requests by channel arbitrators.