@codablock
While working on enforcement of ChainLocks, I discovered a race condition in ActivateBestChain caused by it being called from multiple threads at the same time. This resulted in signals being invoked in undefined order, so signals for block 5 were sometimes called before those for block 4. This can cause a range of issues, one of which is test failures due to waitforblockheight waiting forever. I'm also pretty sure that there are other possible issues which are much more severe.

This type of parallel invocation of ActivateBestChain is very unlikely in plain Bitcoin code, as it is only called from the message handler thread and when manually calling invalidateblock or reconsiderblock. It becomes a lot more likely in Dash, where ChainLocks call ActivateBestChain from different threads.

This PR does not really solve the issue, but it prepares the code for the actual fix. This PR makes ActivateBestChain require that cs_main is NOT held on entry, which will later allow us to introduce a new mutex that prevents execution of ActivateBestChain in parallel. If cs_main were held on entry, taking that new mutex would regularly lead to deadlocks.
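
To illustrate the intended lock discipline, here is a minimal, self-contained sketch. It is not the Dash or Bitcoin implementation: `cs_main_sketch`, `activate_mutex_sketch`, `ActivateBestChainSketch` and `RunTwoThreads` are hypothetical stand-ins, and the "chain" is just an integer height. The assumption is that the future mutex is always taken first and cs_main only inside it, which is only safe if callers never hold cs_main on entry.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins, not the real declarations.
std::recursive_mutex cs_main_sketch;   // plays the role of cs_main
std::mutex activate_mutex_sketch;      // the "new mutex" anticipated above
int tip_height = 0;                    // guarded by cs_main_sketch
std::vector<int> signals;              // guarded by activate_mutex_sketch

// Must be entered WITHOUT cs_main_sketch held. Lock order is always
// activate_mutex_sketch first, then cs_main_sketch.
void ActivateBestChainSketch(int target_height) {
    std::lock_guard<std::mutex> serialize(activate_mutex_sketch);
    while (true) {
        int connected;
        {
            // Chain state is only touched while cs_main_sketch is held.
            std::lock_guard<std::recursive_mutex> lock(cs_main_sketch);
            if (tip_height >= target_height) return;
            connected = ++tip_height;
        }
        // cs_main_sketch is released here, but we are still serialized,
        // so the "signal" for height N is always recorded before N + 1.
        assert(connected == static_cast<int>(signals.size()) + 1);
        signals.push_back(connected);
    }
}

// Runs two concurrent callers and returns the order in which signals fired.
std::vector<int> RunTwoThreads(int target_height) {
    tip_height = 0;
    signals.clear();
    std::thread a(ActivateBestChainSketch, target_height);
    std::thread b(ActivateBestChainSketch, target_height);
    a.join();
    b.join();
    return signals;
}
```

If a caller entered this function while already holding `cs_main_sketch`, it would wait for `activate_mutex_sketch` while the thread owning that mutex waits for `cs_main_sketch`: a classic lock-order inversion.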

This PR mainly consists of a few cherry-picks from bitcoin#11824. Please ignore the title of bitcoin#11824, as the functionality it actually adds is not what we're interested in here. We're only interested in a few of the supporting commits.

Cherry-picking these commits led to merge conflicts in many places due to out-of-order backporting. I decided to backport other PRs as well so that cherry-picking the commits from bitcoin#11824 was more or less conflict-free. These other PRs are bitcoin#9665, bitcoin#11113 and bitcoin#11580. While backporting bitcoin#9665, I realized that a previous backport was erroneous, so I also added a fix in b8ac517d7dd4239f7bfb0d660e80045f19fb6f87.

codablock and others added 11 commits March 12, 2019 10:43
This seems to have been backported wrongly. In the Bitcoin code, there is a
condition on requested witness data, and we took the branch which
recreates the compact block. We should have taken the other branch because
we always send with witness data (there is no Segwit in Dash).
… messages

b49ad44 Add comment about cs_most_recent_block coverage (Matt Corallo)
c47f5b7 Cache witness-enabled state with recent-compact-block-cache (Matt Corallo)
efc135f Use cached [compact] blocks to respond to getdata messages (Matt Corallo)

Tree-SHA512: ffc478bddbf14b8ed304a3041f47746520ce545bdeffa9652eff2ccb25c8b0d5194abe72568c10f9c1b246ee361176ba217767af834752a2ca7263d292005e87
…de blocks

eff4bd8 [test] P2P functional test for certain fingerprinting protections (Jim Posen)
a2be3b6 [net] Ignore getheaders requests for very old side blocks (Jim Posen)

Pull request description:

  Sending a getheaders message with an empty locator and a stop hash is a request for a single header by hash. The node will respond with headers for blocks not in the main chain as well as those in the main chain. To avoid fingerprinting, the node should, however, ignore requests for headers on side branches that are too old. This replicates the logic that currently exists for `getdata` requests for blocks.
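
The rule described above can be sketched as a small predicate. This is a simplified illustration, not the code from the upstream commits: `HeaderInfoSketch`, `HeaderRequestAllowedSketch` and `SelfCheckSketch` are hypothetical names, and the 30-day cutoff is an assumed value chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a block header known to the node.
struct HeaderInfoSketch {
    bool in_main_chain;
    int64_t block_time;  // seconds since epoch
};

// Assumed cutoff for illustration only.
constexpr int64_t kStaleAgeLimitSketch = 30LL * 24 * 60 * 60;

// Main-chain headers are always served. Side-branch headers are served only
// if they are recent relative to the best known header, so that old side
// branches cannot be probed to fingerprint the node.
bool HeaderRequestAllowedSketch(const HeaderInfoSketch& requested,
                                int64_t best_header_time) {
    if (requested.in_main_chain) return true;
    return (best_header_time - requested.block_time) < kStaleAgeLimitSketch;
}

// Usage check: an old side-branch header is refused, a main-chain one is not.
inline void SelfCheckSketch() {
    const int64_t day = 24 * 60 * 60;
    assert(HeaderRequestAllowedSketch({true, 0}, 100 * day));
    assert(!HeaderRequestAllowedSketch({false, 0}, 31 * day));
}
```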

Tree-SHA512: e04ef61e2b73945be6ec5977b3c5680b6dc3667246f8bfb67afae1ecaba900c0b49b18bbbb74869f7a37ef70b6ed99e78ebe0ea0a1569369fad9e447d720ffc4
…ponse to getheaders

725b79a [test] Verify node doesn't send headers that haven't been fully validated (Russell Yanofsky)
3788a84 Do not send (potentially) invalid headers in response to getheaders (Matt Corallo)

Pull request description:

  Nowhere else in the protocol do we send headers which are for
  blocks we have not fully validated except in response to getheaders
  messages with a null locator. On my public node I have not seen any
  such request (whether for an invalid block or not) in at least two
  years of debug.log output, indicating that this should have minimal
  impact.

Tree-SHA512: c1f6e0cdcdfb78ea577d555f9b3ceb1b4b60eff4f6cf313bfd8b576c9562d797bea73abc23f7011f249ae36dd539c715f3d20487ac03ace60e84e1b77c0c1e1a
This should (marginally) speed up validationinterface queue
draining by avoiding a cs_main lock in one client.
This requires the removal of some very liberal (incorrect) cs_mains
sprinkled in some tests. It adds some chainActive.Tip() races, but
the tests are all single-threaded anyway.
@codablock force-pushed the pr_activatebestchainrace branch from d4607d7 to 152a78e March 12, 2019 09:43
@UdjinM6 UdjinM6 added this to the 14.0 milestone Mar 12, 2019
@UdjinM6 UdjinM6 left a comment

Looks good 👍

utACK

@UdjinM6 UdjinM6 merged commit 8955eb8 into dashpay:develop Mar 12, 2019
@UdjinM6 UdjinM6 mentioned this pull request Jan 22, 2020
@codablock codablock deleted the pr_activatebestchainrace branch March 24, 2020 09:51
4 participants