Make SpendableOutput claims more robust #103

Merged: 2 commits merged into lightningdevkit:main on May 19, 2023

Conversation

@TheBlueMatt (Contributor) commented May 7, 2023

Rather than only trying once to claim a SpendableOutputDescriptor, try in a loop forever. This is important as feerates rise: a sweep may fail and fall out of the mempool, and without retries those funds would be lost forever.

This is currently entirely untested; we will need to test it before we land it.
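
For reference, the retry-forever approach boils down to something like the sketch below. This is an illustration, not the PR's code: `read_pending_descriptor_sets` is a hypothetical stand-in for the file-based persistence added here, and the module paths reflect the LDK 0.0.115-era layout, which differs across versions.

use std::sync::Arc;
use std::time::Duration;

use bitcoin::secp256k1::Secp256k1;
use bitcoin::Script;
use lightning::chain::chaininterface::{BroadcasterInterface, ConfirmationTarget, FeeEstimator};
use lightning::chain::keysinterface::{KeysManager, SpendableOutputDescriptor};

// Hypothetical: load each previously-persisted batch of descriptors.
fn read_pending_descriptor_sets() -> Vec<Vec<SpendableOutputDescriptor>> {
	unimplemented!()
}

async fn sweep_forever<F: FeeEstimator, B: BroadcasterInterface>(
	keys_manager: Arc<KeysManager>, fee_estimator: Arc<F>, broadcaster: Arc<B>,
	destination_script: Script,
) {
	let mut interval = tokio::time::interval(Duration::from_secs(3600));
	loop {
		interval.tick().await;
		for descriptors in read_pending_descriptor_sets() {
			let refs: Vec<&SpendableOutputDescriptor> = descriptors.iter().collect();
			// Rebuild the sweep at the current feerate on every pass, so a
			// transaction that fell out of the mempool is simply re-created
			// and rebroadcast rather than lost.
			if let Ok(sweep_tx) = keys_manager.spend_spendable_outputs(
				&refs, Vec::new(), destination_script.clone(),
				fee_estimator.get_est_sat_per_1000_weight(ConfirmationTarget::Background),
				&Secp256k1::new(),
			) {
				broadcaster.broadcast_transaction(&sweep_tx);
			}
		}
	}
}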

@TheBlueMatt marked this pull request as draft May 7, 2023 18:17
@TheBlueMatt force-pushed the main branch 4 times, most recently from 53dd65e to f643837 on May 9, 2023 18:21
@TheBlueMatt (Contributor, Author)

I did manage to get one claim through this pipeline on testnet, but not a batch one. I would appreciate some review, but will hold off on merging until I get a batch claim through.

@TheBlueMatt marked this pull request as ready for review May 9, 2023 21:48
src/main.rs Outdated
Comment on lines 348 to 351
let key = hex_utils::hex_str(&keys_manager.get_secure_random_bytes());
// Note that if the type here changes our read code needs to change as well.
let output: SpendableOutputDescriptor = output;
persister.persist(&format!("pending_spendable_outputs/{}", key), &output).unwrap();

consider letting the persister implementation decide what kind of encoding/formatting it wants

Contributor:

The persister trait already allows arbitrary keys, this is us specifying our own for this particular implementation.

Contributor (Author):

Did you mean for the keys or for the object itself? Indeed, the persister trait currently allows arbitrary keys; we need to clean that up, but for now this is an easy way to write a file durably (i.e., doing the stupid fsync-file-fsync-folder dance you have to do).
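
(For readers unfamiliar with that dance, a durable write on a Unix filesystem looks roughly like the sketch below; this is a generic illustration, not the persister's actual code.)

use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Write to a temp file, fsync it, rename it into place, then fsync the
// parent directory so the rename itself survives a crash.
fn write_durably(dir: &Path, name: &str, data: &[u8]) -> std::io::Result<()> {
	let tmp_path = dir.join(format!("{}.tmp", name));
	let mut tmp = File::create(&tmp_path)?;
	tmp.write_all(data)?;
	// Flush the file contents to disk before the rename makes them visible.
	tmp.sync_all()?;
	fs::rename(&tmp_path, dir.join(name))?;
	// On Unix, fsync the directory so the new directory entry is durable.
	#[cfg(unix)]
	File::open(dir)?.sync_all()?;
	Ok(())
}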

src/main.rs Outdated
Comment on lines 814 to 853
let pending_spendables_dir = format!("{}/pending_spendable_outputs", ldk_data_dir);
let processing_spendables_dir = format!("{}/processing_spendable_outputs", ldk_data_dir);
let spendables_dir = format!("{}/spendable_outputs", ldk_data_dir);
let spending_keys_manager = Arc::clone(&keys_manager);
let spending_logger = Arc::clone(&logger);
tokio::spawn(async move {
	let mut interval = tokio::time::interval(Duration::from_secs(3600));
	loop {
		interval.tick().await;
		if let Ok(dir_iter) = fs::read_dir(&pending_spendables_dir) {
			// Move any spendable descriptors from the pending folder so that we don't have any
			// races with new files being added.
			for file_res in dir_iter {
				let file = file_res.unwrap();
				// Only move a file if it's a 32-byte-hex'd filename, otherwise it might be a
				// temporary file.
				if file.file_name().len() == 64 {
					fs::create_dir_all(&processing_spendables_dir).unwrap();
					let mut holding_path = PathBuf::new();
					holding_path.push(&processing_spendables_dir);
					holding_path.push(&file.file_name());
					fs::rename(file.path(), holding_path).unwrap();
				}
			}
			// Now concatenate all the pending files we moved into one file in the
			// `spendable_outputs` directory and drop the processing directory.
			let mut outputs = Vec::new();
			if let Ok(processing_iter) = fs::read_dir(&processing_spendables_dir) {
				for file_res in processing_iter {
					outputs.append(&mut fs::read(file_res.unwrap().path()).unwrap());
				}
			}
			if !outputs.is_empty() {
				let key = hex_utils::hex_str(&spending_keys_manager.get_secure_random_bytes());
				persister
					.persist(&format!("spendable_outputs/{}", key), &WithoutLength(&outputs))
					.unwrap();
				fs::remove_dir_all(&processing_spendables_dir).unwrap();
			}
		}

this would also be persister-implementation-specific

Contributor:

Not necessarily. We're already using a filesystem implementation of the persister, and this specific way of batching only applies to us. We could definitely provide a crate/module to do this type of batching in a more agnostic way, but that would be done as part of rust-lightning.

);
// Don't bother trying to announce if we don't have any public channels, though our
// peers should drop such an announcement anyway.
if chan_man.list_channels().iter().any(|chan| chan.is_public) {
Contributor:

Should also check it has 6 confs, no?

Contributor (Author):

Eh? I mean, yes, the announcement won't propagate until then, but I'm lazy. I noted it in a comment instead.

@tnull (Contributor), May 12, 2023:

But it should be at least list_usable_channels, as otherwise we may end up broadcasting when we have no connected peers? And if we miss it the first time around, we'd wait a full hour to try again? So maybe only tick if we have usable channels or !peer_man.get_node_ids().is_empty(), and sleep a bit otherwise?

(see lightningdevkit/ldk-node@7225d42#diff-b1a35a68f14e696205874893c07fd24fdb88882b47c23cc0e0c80a30c7d53759R786-R809)

Contributor:

I don't see an issue with that – this is more about making sure the broadcast is valid. Also, list_usable_channels isn't what we want here since we could be connected to non-channel peers and they should still receive our updates regardless of whether we're connected to our channel counterparties.

@tnull (Contributor), May 12, 2023:

Right, I think we want to check whether we have any online connection; it doesn't have to be to the counterparty of the public channel.

The issue is that we may "waste" the broadcast tick when we're not connected and then only try again after an hour.

Contributor (Author):

There's no issue with "wasting" the broadcast tick, as we don't try again until the next tick anyway. We could add a bunch of complexity to retry the claim after 10 seconds instead of an hour and check whether we have peers, but I'm really not convinced it's worth it? This is only for public routing nodes anyway, which need to be online reliably with reasonable uptime, and broadcasts are only valid after an hour (6 blocks), so it's not like we're really in a rush to get an announcement out super quick. We can just try again in an hour.
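
For completeness, tnull's suggestion would look roughly like the sketch below. This is an illustration of the alternative, not what was merged: it assumes the sample's `ChannelManager`/`PeerManager` type aliases from src/main.rs, uses `get_peer_node_ids` (presumably the LDK method tnull's `get_node_ids` refers to), and the 30-second retry delay is arbitrary.

use std::sync::Arc;
use std::time::Duration;

use lightning::ln::msgs::NetAddress;

// Only consume an announcement "tick" once we have a public channel and at
// least one connected peer to relay it; otherwise retry soon instead of
// waiting a full hour.
async fn announce_when_connected(
	chan_man: Arc<ChannelManager>, peer_man: Arc<PeerManager>,
	node_alias: [u8; 32], addresses: Vec<NetAddress>,
) {
	loop {
		let have_public_chan = chan_man.list_channels().iter().any(|chan| chan.is_public);
		if have_public_chan && !peer_man.get_peer_node_ids().is_empty() {
			peer_man.broadcast_node_announcement([0; 3], node_alias, addresses.clone());
			tokio::time::sleep(Duration::from_secs(3600)).await;
		} else {
			tokio::time::sleep(Duration::from_secs(30)).await;
		}
	}
}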

@TheBlueMatt (Contributor, Author)

Squashed and pushed with some comment fixes:

$ git diff-tree -U1 49b9e76 353ca38
diff --git a/src/main.rs b/src/main.rs
index 33da6f6..c31f58c 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -805,3 +805,4 @@ async fn start_ldk() {
 			// Don't bother trying to announce if we don't have any public channls, though our
-			// peers should drop such an announcement anyway.
+			// peers should drop such an announcement anyway. Note that announcement may not
+			// propagate until we have a channel with 6+ confirmations.
 			if chan_man.list_channels().iter().any(|chan| chan.is_public) {
diff --git a/src/sweep.rs b/src/sweep.rs
index f21436d..207a50c 100644
--- a/src/sweep.rs
+++ b/src/sweep.rs
@@ -83,2 +83,7 @@ pub(crate) async fn periodic_sweep(
 		// them.
+		// Note that here we try to claim each set of spendable outputs over and over again
+		// forever, even long after its been claimed. While this isn't an issue per se, in practice
+		// you may wish to track when the claiming transaction has confirmed and remove the
+		// spendable outputs set. You may also wish to merge groups of unspent spendable outputs to
+		// combine batches.
 		if let Ok(dir_iter) = fs::read_dir(&spendables_dir) {


@TheBlueMatt (Contributor, Author)

Should I go ahead and squash this?

@tnull (Contributor) commented May 16, 2023

> Should I go ahead and squash this?

Excuse the delay! Yes, please go ahead!

Even if we don't have any listen addresses, it's still useful to
broadcast a node_announcement to get our node_features out there.

Here we do this and also improve the timing of our
node_announcement updates to be less spammy.

Rather than trying to sweep `SpendableOutputs` every time we see
one, try to sweep them in bulk once an hour. Then, try to sweep
them repeatedly every hour forever, even after they're claimed.

Fixes lightningdevkit#102.
@tnull merged commit 30a9cb7 into lightningdevkit:main May 19, 2023