Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait for bitcoind if it's gone backwards, don't abort. #7342

Merged

Conversation

rustyrussell
Copy link
Contributor

@ShahanaFarooqui and I spent about 2-3 hours trying to fix this, but the code was fragile and we kept breaking existing cases.

So instead, I started by reworking the code to make it simple, then I change it. The result is much neater, and will serve us well for any future changes.

Fixes: #6924

@rustyrussell rustyrussell added this to the v24.08 milestone May 26, 2024
@rustyrussell rustyrussell force-pushed the guilt/bitcoind-backwards-wait branch 2 times, most recently from 42b5f99 to b196d57 Compare May 26, 2024 06:09
Copy link
Collaborator

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh What is the main reason that makes #6924 difficult to implement with the current mainline?

I do this in folgore all the time by using last_height given by the getchaininfo RPC In #6924 and blocking till the bitcoind is not fully sync.

Do you think that we need all this pile of change while we are under rc?

@rustyrussell
Copy link
Contributor Author

Mh What is the main reason that makes #6924 difficult to implement with the current mainline?

The code is ridiculously fragile. As I said, every time we tried, we broke something else.

I do this in folgore all the time by using last_height given by the getchaininfo RPC In #6924 and blocking till the bitcoind is not fully sync.

That only works because we do not time you out, but it's silly: we should handle this internally.

Do you think that we need all this pile of change while we are under rc?

No, that's why it's set to 24.08 milestone?

@rustyrussell rustyrussell force-pushed the guilt/bitcoind-backwards-wait branch from b196d57 to cb319e2 Compare May 27, 2024 00:39
@vincenzopalazzo
Copy link
Collaborator

That only works because we do not time you out, but it's silly: we should handle this internally.

I agree

No, that's why it's set to 24.08 milestone?

Ah sorry I read it wrong!

@ShahanaFarooqui
Copy link
Collaborator

ACK b5b834a

Copy link
Collaborator

@ShahanaFarooqui ShahanaFarooqui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@rustyrussell rustyrussell force-pushed the guilt/bitcoind-backwards-wait branch 3 times, most recently from 8dc5386 to 7ab549d Compare June 21, 2024 01:37
Several callers stash this, but we have it, so make it explicit:
here's the block you asked for.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The current code is confusing: there are polling loops, but we wait for
them to run once. 

Be explicit: make the calls once, then start the loops in begin_topology.

This removes various chain_topology struct members which only exist for
startup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It was a weird arbitrary bool passed through from the caller, which we no longer need.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: ElementsProject#6924
Changelog-Changed: lightningd: we wait for bitcoind if it has somehow gone backwards (as long as header height is still ok).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Handling half in main() and half here was a mess.  And the name
"max_blockheight" was poor: it was the max in the db, or UINT32_MAX,
but then we changed it depending on what height we wanted to start at.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
checkchain_timer is run by chaintopology, so why have the pointer in
bitcoind?

And since we free the timers, we don't need them to self-disable by
checking the stopped flag.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That way if you free the context, you free the call (i.e. the callback
will not be called).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…ile shutting down.

I've never seen this, but the race looks possible so we should close it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Found by very slow CI: the two io_breaks() can combine into one, so we
block waiting for the second initialization which never happens.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell rustyrussell force-pushed the guilt/bitcoind-backwards-wait branch from 7ab549d to 2e7f8e2 Compare June 24, 2024 01:57
@rustyrussell rustyrussell merged commit 80cde51 into ElementsProject:master Jun 24, 2024
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Core lightning fails on Umbrel restart if bitcoin is not synced
3 participants