Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debug assertion confirmation_height >= latest_broadcast_height triggers frequently #3456

Open
whfuyn opened this issue Dec 11, 2024 · 4 comments

Comments

@whfuyn
Copy link

whfuyn commented Dec 11, 2024

Self::PendingFirstConfirmation {
first_broadcast_hash,
latest_broadcast_height,
..
} => {
debug_assert!(confirmation_height >= *latest_broadcast_height);
*self = Self::PendingThresholdConfirmations {
first_broadcast_hash: *first_broadcast_hash,
latest_broadcast_height: *latest_broadcast_height,
latest_spending_tx,
confirmation_height,
confirmation_hash,
};
},

[rust-lightning/lightning/src/util/sweep.rs:179:6] "ERROR: confirmation_height < latest_broadcast_height" = "ERROR: confirmation_height < latest_broadcast_height"
[rust-lightning/lightning/src/util/sweep.rs:180:6] confirmation_height = 57500
[rust-lightning/lightning/src/util/sweep.rs:180:6] &*latest_broadcast_height = 57501

This debug assertion frequently triggers on both regtest and testnet. As it's possible for a new tx broadcast to be attempted before we receive the confirmation notification, would it be better to remove this debug assertion?

@whfuyn whfuyn changed the title Debug assertion confirmation_height < latest_broadcast_height triggers frequently Debug assertion confirmation_height >= latest_broadcast_height triggers frequently Dec 11, 2024
@tnull
Copy link
Contributor

tnull commented Dec 11, 2024

As it's possible for a new tx broadcast to be attempted before we receive the confirmation notification, would it be better to remove this debug assertion?

Huh, how are you syncing the sweeper? Via Confirm or Listen? I assume the former and you're describing a case where you calling best_block_updated before transactions_confirmed and the broadcast attempt would be interleaved?

@whfuyn
Copy link
Author

whfuyn commented Dec 11, 2024

Yes, it's syncing via Confirm, and I'm actually using ldk-node! :)

It looks like best_block_updated is indeed called before transactions_confirmed in lightning-transaction-sync.

match maybe_await!(self.sync_best_block_updated(
&confirmables,
&mut sync_state,
&tip_hash
)) {
Ok(()) => {},
Err(InternalError::Inconsistency) => {
// Immediately restart syncing when we encounter any inconsistencies.
log_debug!(
self.logger,
"Encountered inconsistency during transaction sync, restarting."
);
sync_state.pending_sync = true;
continue;
},
Err(err) => {
// (Semi-)permanent failure, retry later.
log_error!(self.logger,
"Failed during transaction sync, aborting. Synced so far: {} confirmed, {} unconfirmed.",
num_confirmed,
num_unconfirmed
);
sync_state.pending_sync = true;
return Err(TxSyncError::from(err));
},
}
}
match maybe_await!(self.get_confirmed_transactions(&sync_state)) {
Ok(confirmed_txs) => {
// Double-check the tip hash. If it changed, a reorg happened since
// we started syncing and we need to restart last-minute.
match maybe_await!(self.client.get_tip_hash()) {
Ok(check_tip_hash) => {
if check_tip_hash != tip_hash {
tip_hash = check_tip_hash;
log_debug!(self.logger,
"Encountered inconsistency during transaction sync, restarting.");
sync_state.pending_sync = true;
continue;
}
num_confirmed += confirmed_txs.len();
sync_state
.sync_confirmed_transactions(&confirmables, confirmed_txs);
},

I made a reproduction of this problem. It typically occurs when force closing a channel.
https://github.com/whfuyn/ldk-node/tree/rust-lightning-3456

cargo test --package ldk-node --test integration_tests_rust -- confirmation_height_lt_latest_broadcast_height --exact --nocapture
== Node A ==
Setting network: regtest
Setting random LDK storage dir: /tmp/BqOOgk1
Setting random LDK listening addresses: [TcpIpV4 { addr: [127, 0, 0, 1], port: 44592 }, TcpIpV4 { addr: [127, 0, 0, 1], port: 7716 }]
Setting random LDK node alias: Some(NodeAlias([108, 100, 107, 45, 110, 111, 100, 101, 45, 54, 57, 55, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))

== Node B ==
Setting network: regtest
Setting random LDK storage dir: /tmp/mdM2nf0
Setting random LDK listening addresses: [TcpIpV4 { addr: [127, 0, 0, 1], port: 13798 }, TcpIpV4 { addr: [127, 0, 0, 1], port: 49109 }]
Setting random LDK node alias: Some(NodeAlias([108, 100, 107, 45, 110, 111, 100, 101, 45, 52, 57, 50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
Generating 101 blocks... Done!

Generating 1 blocks... Done!


A -- open_channel -> B
Generating 6 blocks... Done!

Generating 6 blocks... Done!

Generating 10 blocks... Done!

Generating 10 blocks... Done!

thread 'confirmation_height_lt_latest_broadcast_height' panicked at /home/_/.cargo/registry/src/index.crates.io-6f17d22bba15001f/lightning-0.0.125/src/util/sweep.rs:176:17:
assertion failed: confirmation_height >= *latest_broadcast_height
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test confirmation_height_lt_latest_broadcast_height ... FAILED

failures:

failures:
    confirmation_height_lt_latest_broadcast_height

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 17 filtered out; finished in 42.45s

error: test failed, to rerun pass `-p ldk-node --test integration_tests_rust`

@tnull
Copy link
Contributor

tnull commented Dec 12, 2024

Hmm, I see. We could consider removing the debug_assert, but tbh. it's not exactly wrong to check for the given conditional. That this can be hit is more a (known) shortcoming of how our Confirm interface works, i.e., that transactions_confirmed and best_block_updated can get called independently from each other and in arbitrary order. I'm tempted to resolve this by finally refactoring the interface to have a single method that takes an argument including all the changes that happened during connecting a new chain tip.

@TheBlueMatt
Copy link
Collaborator

Would be nice, but we shouldn't leave a known-reachable debug assertion in place either :/. I assume there's no actual logic bug here, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants