@furszy furszy commented Feb 16, 2021

This patch adds an extra "head blocks" record to the chainstate, which gives the range of blocks for which writes may be incomplete. At the start of a flush, we write this record, then write the dirty dbcache entries in 16 MiB batches, and at the end we remove the head blocks record again. If the record is present at startup, it means we crashed during a flush, and we roll back/roll forward the blocks inside that range to get a consistent tip on disk before proceeding.

If a flush completes successfully, the resulting database is compatible with previous versions. If the node crashes in the middle of a flush, a version of the code with this patch is needed to recover.
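
The flush and recovery steps described above can be sketched roughly as follows. This is a minimal Python model, not the patch's C++ code: the database is a plain dict, the key names `head_blocks` and `best_block`, the `SimulatedCrash` exception, and all function names are hypothetical, batches are counted in entries rather than 16 MiB, and recovery only rolls forward (the rollback path for reorgs is left out).

```python
HEAD_BLOCKS = "head_blocks"  # hypothetical key for the marker record
BEST_BLOCK = "best_block"    # hypothetical key for the on-disk tip


class SimulatedCrash(Exception):
    """Stands in for the process dying in the middle of a flush."""


def apply_entries(db, entries):
    """Write (key, value) pairs into db; a value of None deletes the key."""
    for key, value in entries:
        if value is None:
            db.pop(key, None)
        else:
            db[key] = value


def flush(db, dirty, old_tip, new_tip, batch_size=2, crash_after=None):
    """Non-atomic flush: write marker, write batches, remove marker."""
    # Record the range of blocks whose writes may be incomplete.
    db[HEAD_BLOCKS] = (old_tip, new_tip)
    items = sorted(dirty.items())
    for start in range(0, len(items), batch_size):
        # Optionally "crash" before writing the batch with this index.
        if crash_after is not None and start // batch_size >= crash_after:
            raise SimulatedCrash()
        apply_entries(db, items[start:start + batch_size])
    # All batches are written: publish the new tip and drop the marker.
    db[BEST_BLOCK] = new_tip
    del db[HEAD_BLOCKS]


def recover(db, blocks):
    """At startup, replay the marked block range if a flush was interrupted.

    `blocks` maps height -> {key: value or None}. Returns True if a replay
    was needed. Re-applying a block is idempotent in this model.
    """
    if HEAD_BLOCKS not in db:
        return False
    old_tip, new_tip = db[HEAD_BLOCKS]
    for height in range(old_tip + 1, new_tip + 1):
        apply_entries(db, blocks[height].items())
    db[BEST_BLOCK] = new_tip
    del db[HEAD_BLOCKS]
    return True
```

Calling `flush(..., crash_after=1)` leaves the marker in the dict together with a partial set of writes; a later `recover(db, blocks)` replays the recorded range and removes the marker, which mirrors the startup check described in this PR.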

An adaptation of the following PRs with further modifications to the feature_dbcrash.py test to be up-to-date with upstream and solve RPC related bugs.

sipa and others added 5 commits February 16, 2021 11:33
This requires that we not access pcoinsTip in InitBlockIndex's
FlushStateToDisk (so we just skip it until later in AppInitMain)
and in the LoadChainTip call in LoadBlockIndex (there is already one
later in AppInitMain, after ReplayBlocks, so skipping it there is
fine).

Includes some simplifications by Suhas Daftuar and Pieter Wuille.
furszy and others added 9 commits February 18, 2021 10:03
>>> Adaptation of btc@176c021d085f5a45bc9e038e760942aa648dd797 up to the present.

Adds new functional test, dbcrash.py, which uses -dbcrashratio to exercise the
logic for recovering from a crash during chainstate flush.

dbcrash.py is added to the extended tests, as it may take ~10 minutes to run

Use _Exit() instead of exit() for crash simulation

This eliminates stderr output such as:
    terminate called without an active exception
or
    Assertion failed: (!pthread_mutex_destroy(&m)), function ~recursive_mutex, file /usr/local/include/boost/thread/pthread/recursive_mutex.hpp, line 104.

Eliminating the stderr output on crash simulation allows testing with
test_runner.py, which reports a test as failed if stderr is produced.
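
The difference between the two exit paths can be illustrated with a rough Python analogue. This is only a sketch under assumptions: `sys.exit()` stands in for C's `exit()` and `os._exit()` for `_Exit()`, an `atexit` handler stands in for the teardown code that produced the stderr noise, and the `run_child` helper is hypothetical.

```python
import subprocess
import sys

# Child program: registers a cleanup handler that writes to stderr,
# then exits through whichever call is substituted for {exit_call}.
CHILD = """
import atexit, os, sys
atexit.register(lambda: sys.stderr.write("cleanup ran\\n"))
{exit_call}
"""


def run_child(exit_call):
    """Run the child with the given exit statement and capture its output."""
    code = CHILD.format(exit_call=exit_call)
    return subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )


normal = run_child("sys.exit(1)")  # runs cleanup handlers on the way out
hard = run_child("os._exit(1)")    # terminates immediately, no handlers
```

Both children exit with status 1, but only the normal exit runs the handler and writes to stderr, which is the property a stderr-sensitive test runner cares about.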
This should fix a very rare travis failure in zapwallettxes, but
is also more correct, as you can currently race
ReacceptWalletTransactions with stop RPC calls to get bitcoind to
(IMO) erroneously return a non-0 exit code.
A rare race condition may trigger while awaiting the body of a message, see
upstream commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.
The bug was introduced in 2.1.6-beta; versions before that don't need the
workaround.
This prevents a potential race condition if control flow ends up in
`ShutdownHTTPServer` before the thread gets to `queue->Run()`,
deleting the work queue while workers are still going to use it.

Meant to fix bitcoin#12362.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
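
The ordering this commit relies on (interrupt the workers, join them, and only then tear down the queue) can be sketched in Python. This is an illustration under assumptions, not the HTTP server code: `WorkQueue`, `start_workers`, and `shutdown` are hypothetical names, and setting `closed = True` stands in for deleting the work queue.

```python
import queue
import threading


class WorkQueue:
    """Toy work queue; `closed` marks the point where it would be deleted."""

    def __init__(self):
        self._q = queue.Queue()
        self.closed = False

    def submit(self, task):
        self._q.put(task)

    def interrupt(self, n_workers):
        # One sentinel per worker tells each of them to leave run().
        for _ in range(n_workers):
            self._q.put(None)

    def run(self):
        # A worker that got here after teardown would be the bug.
        assert not self.closed, "worker used the queue after teardown"
        while True:
            task = self._q.get()
            if task is None:
                break
            task()


def start_workers(wq, n_workers):
    threads = [threading.Thread(target=wq.run) for _ in range(n_workers)]
    for t in threads:
        t.start()
    return threads


def shutdown(wq, threads):
    wq.interrupt(len(threads))
    for t in threads:
        t.join()      # wait for every worker, including late starters
    wq.closed = True  # safe now: no worker can still touch the queue
```

Because `shutdown` joins every thread before marking the queue closed, a worker that is scheduled late still finds a live queue, sees its sentinel, and exits cleanly.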
This function, which waits for all threads to exit, is no longer needed
now that threads are joined instead.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
The HTTP worker thread counter, as well as the RAII object that was used
to maintain it, is unused now, so can be removed.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Adaptation from btc@fa5b440971a0dfdd64c1b86748a573fcd7dc65d3
@furszy furszy force-pushed the 2020_feature_dbcrash branch from be3da45 to c76fa04 Compare February 18, 2021 13:03
@furszy furszy changed the title [WIP] Use non-atomic flushing with block replay Use non-atomic flushing with block replay Feb 18, 2021

furszy commented Feb 18, 2021

Added two more commits solving the RPC timeout; GA should be good now. Ready for review.

@furszy furszy added this to the 5.1.0 milestone Feb 18, 2021

@random-zebra random-zebra left a comment

Good stuff. Code review ACK with some points.

furszy commented Feb 19, 2021

Done @random-zebra, commit cherry-picked.

@furszy furszy force-pushed the 2020_feature_dbcrash branch from 45aa5b7 to aab15d7 Compare February 19, 2021 21:44

@random-zebra random-zebra left a comment

ACK aab15d7

@Fuzzbawls Fuzzbawls left a comment

ACK aab15d7

@random-zebra random-zebra merged commit ac52366 into PIVX-Project:master Feb 21, 2021
@furszy furszy deleted the 2020_feature_dbcrash branch November 29, 2022 14:19