Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: wrong lock order, observed locally on top of #6467 with custom changes #6470

Merged
merged 1 commit into from
Dec 11, 2024

Conversation

knst
Copy link
Collaborator

@knst knst commented Dec 9, 2024

Issue being fixed or feature implemented

Assertion failed: detected inconsistent lock order for 'cs_main' in node/blockstorage.cpp:775 (in thread 'httpworker.2'), details in debug log.
   Previous lock order was:
    (2) 'cs_main' in rpc/net.cpp:666 (in thread 'httpworker.3')
    (1) 'm_nodes_mutex' in net.cpp:4754 (in thread 'httpworker.3')
   Current lock order is:
    (1) 'm_nodes_mutex' in net.cpp:5102 (in thread 'httpworker.2')
    (2) 'cs_main' in node/blockstorage.cpp:775 (in thread 'httpworker.2')
node0 2024-12-09T07:46:49.907123Z (mocktime: 2014-12-04T18:34:20Z) [httpworker.2] [stacktraces.cpp:528] [PrintCrashInfo] Posix Signal: Aborted

...
 15#: (0x5764EB18DE42) sync.cpp:123           - potential_deadlock_detected
 16#: (0x5764EB194A7E) sync.cpp:190           - push_lock<std::recursive_mutex>
 17#: (0x5764EB194A7E) sync.cpp:214           - void EnterCritical<std::recursive_mutex>(char const*, char const*, int, std::recursive_mutex*, bool)
 18#: (0x5764EA928642) unique_lock.h:150      - std::unique_lock<std::recursive_mutex>::try_lock()
 19#: (0x5764EA928642) sync.h:168             - UniqueLock<AnnotatedMixin<std::recursive_mutex>, std::unique_lock<std::recursive_mutex> >::Enter(char const*, char const*, int)
 20#: (0x5764EA928642) sync.h:190             - UniqueLock<AnnotatedMixin<std::recursive_mutex>, std::unique_lock<std::recursive_mutex> >::UniqueLock(AnnotatedMixin<std::recursive_mutex>&, char const*, char const*, int, bool)
 21#: (0x5764EAAD9C3D) chain.h:214            - CBlockIndex::GetBlockPos() const
 22#: (0x5764EAAD9C3D) blockstorage.cpp:775   - operator()
 23#: (0x5764EAAD9C3D) blockstorage.cpp:775   - ReadBlockFromDisk(CBlock&, CBlockIndex const*, Consensus::Params const&)
 24#: (0x5764EADB117A) stl_vector.h:1126      - std::vector<std::shared_ptr<CTransaction const>, std::allocator<std::shared_ptr<CTransaction const> > >::operator[](unsigned long)
 25#: (0x5764EADB117A) cbtx.cpp:467           - GetNonNullCoinbaseChainlock(CBlockIndex const*)
 26#: (0x5764EA9FE4F8) utils.cpp:84           - GetHashModifier
 27#: (0x5764EAA05291) utils.cpp:189          - ComputeQuorumMembers
 28#: (0x5764EAA05291) utils.cpp:167          - llmq::utils::GetAllQuorumMembers(Consensus::LLMQType, CDeterministicMNManager&, gsl::not_null<CBlockIndex const*>, bool)
 29#: (0x5764EA9906FC) stl_vector.h:1258      - std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >::data()
 30#: (0x5764EA9906FC) span.h:164             - Span<std::shared_ptr<CDeterministicMN const> >::Span<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > > >(std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&, std::enable_if<((!Span<std::shared_ptr<CDeterministicMN const> >::is_Span<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > > >::value)&&std::is_convertible<std::remove_pointer<decltype ((((declval<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&>)()).data)())>::type (*) [], std::shared_ptr<CDeterministicMN const> (*) []>::value)&&std::is_convertible<decltype ((((declval<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&>)()).size)()), unsigned long>::value, decltype(nullptr)>::type)
 31#: (0x5764EA9906FC) quorums.cpp:412        - llmq::CQuorumManager::BuildQuorumFromCommitment(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>, bool) const
 32#: (0x5764EA99198B) shared_ptr_base.h:1540 - std::__shared_ptr<llmq::CQuorum const, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<llmq::CQuorum, void>(std::__shared_ptr<llmq::CQuorum, (__gnu_cxx::_Lock_policy)2>&&)
 33#: (0x5764EA99198B) shared_ptr.h:369       - std::shared_ptr<llmq::CQuorum const>::shared_ptr<llmq::CQuorum, void>(std::shared_ptr<llmq::CQuorum>&&)
 34#: (0x5764EA99198B) quorums.cpp:672        - llmq::CQuorumManager::GetQuorum(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>, bool) const
 35#: (0x5764EA991BD6) shared_ptr_base.h:1070 - std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()
 36#: (0x5764EA991BD6) shared_ptr_base.h:1524 - std::__shared_ptr<llmq::CQuorum const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr()
 37#: (0x5764EA991BD6) shared_ptr.h:175       - std::shared_ptr<llmq::CQuorum const>::~shared_ptr()
 38#: (0x5764EA991BD6) quorums.cpp:494        - llmq::CQuorumManager::RequestQuorumData(CNode*, Consensus::LLMQType, CBlockIndex const*, unsigned short, uint256 const&) const
 ...

What was done?

Refactored call of GetQuorum out of llmq::CQuorumManager::RequestQuorumData. It also optimize CQuorumManager::StartQuorumDataRecoveryThread because avoid double call of GetQuorum

How Has This Been Tested?

Run with and without this changes.

Breaking Changes

You may observe error RPC_INVALID_PARAMETER, "quorum not found" while calling RPC quorum getdata instead returning false with log output only in case of request for non-existing quorum.
The output is not documented at the moment anyway.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone

@knst knst force-pushed the fix-deadlock-getquorumdata branch from a79e983 to 747fe34 Compare December 9, 2024 09:56
Copy link

github-actions bot commented Dec 9, 2024

This pull request has conflicts, please rebase.

@knst knst mentioned this pull request Dec 10, 2024
5 tasks
…ustom changes

    Assertion failed: detected inconsistent lock order for 'cs_main' in node/blockstorage.cpp:775 (in thread 'httpworker.2'), details in debug log.
       Previous lock order was:
        (2) 'cs_main' in rpc/net.cpp:666 (in thread 'httpworker.3')
        (1) 'm_nodes_mutex' in net.cpp:4754 (in thread 'httpworker.3')
       Current lock order is:
        (1) 'm_nodes_mutex' in net.cpp:5102 (in thread 'httpworker.2')
        (2) 'cs_main' in node/blockstorage.cpp:775 (in thread 'httpworker.2')
    node0 2024-12-09T07:46:49.907123Z (mocktime: 2014-12-04T18:34:20Z) [httpworker.2] [stacktraces.cpp:528] [PrintCrashInfo] Posix Signal: Aborted

    ...
     15#: (0x5764EB18DE42) sync.cpp:123           - potential_deadlock_detected
     16#: (0x5764EB194A7E) sync.cpp:190           - push_lock<std::recursive_mutex>
     17#: (0x5764EB194A7E) sync.cpp:214           - void EnterCritical<std::recursive_mutex>(char const*, char const*, int, std::recursive_mutex*, bool)
     18#: (0x5764EA928642) unique_lock.h:150      - std::unique_lock<std::recursive_mutex>::try_lock()
     19#: (0x5764EA928642) sync.h:168             - UniqueLock<AnnotatedMixin<std::recursive_mutex>, std::unique_lock<std::recursive_mutex> >::Enter(char const*, char const*, int)
     20#: (0x5764EA928642) sync.h:190             - UniqueLock<AnnotatedMixin<std::recursive_mutex>, std::unique_lock<std::recursive_mutex> >::UniqueLock(AnnotatedMixin<std::recursive_mutex>&, char const*, char const*, int, bool)
     21#: (0x5764EAAD9C3D) chain.h:214            - CBlockIndex::GetBlockPos() const
     22#: (0x5764EAAD9C3D) blockstorage.cpp:775   - operator()
     23#: (0x5764EAAD9C3D) blockstorage.cpp:775   - ReadBlockFromDisk(CBlock&, CBlockIndex const*, Consensus::Params const&)
     24#: (0x5764EADB117A) stl_vector.h:1126      - std::vector<std::shared_ptr<CTransaction const>, std::allocator<std::shared_ptr<CTransaction const> > >::operator[](unsigned long)
     25#: (0x5764EADB117A) cbtx.cpp:467           - GetNonNullCoinbaseChainlock(CBlockIndex const*)
     26#: (0x5764EA9FE4F8) utils.cpp:84           - GetHashModifier
     27#: (0x5764EAA05291) utils.cpp:189          - ComputeQuorumMembers
     28#: (0x5764EAA05291) utils.cpp:167          - llmq::utils::GetAllQuorumMembers(Consensus::LLMQType, CDeterministicMNManager&, gsl::not_null<CBlockIndex const*>, bool)
     29#: (0x5764EA9906FC) stl_vector.h:1258      - std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >::data()
     30#: (0x5764EA9906FC) span.h:164             - Span<std::shared_ptr<CDeterministicMN const> >::Span<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > > >(std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&, std::enable_if<((!Span<std::shared_ptr<CDeterministicMN const> >::is_Span<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > > >::value)&&std::is_convertible<std::remove_pointer<decltype ((((declval<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&>)()).data)())>::type (*) [], std::shared_ptr<CDeterministicMN const> (*) []>::value)&&std::is_convertible<decltype ((((declval<std::vector<std::shared_ptr<CDeterministicMN const>, std::allocator<std::shared_ptr<CDeterministicMN const> > >&>)()).size)()), unsigned long>::value, decltype(nullptr)>::type)
     31#: (0x5764EA9906FC) quorums.cpp:412        - llmq::CQuorumManager::BuildQuorumFromCommitment(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>, bool) const
     32#: (0x5764EA99198B) shared_ptr_base.h:1540 - std::__shared_ptr<llmq::CQuorum const, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<llmq::CQuorum, void>(std::__shared_ptr<llmq::CQuorum, (__gnu_cxx::_Lock_policy)2>&&)
     33#: (0x5764EA99198B) shared_ptr.h:369       - std::shared_ptr<llmq::CQuorum const>::shared_ptr<llmq::CQuorum, void>(std::shared_ptr<llmq::CQuorum>&&)
     34#: (0x5764EA99198B) quorums.cpp:672        - llmq::CQuorumManager::GetQuorum(Consensus::LLMQType, gsl::not_null<CBlockIndex const*>, bool) const
     35#: (0x5764EA991BD6) shared_ptr_base.h:1070 - std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()
     36#: (0x5764EA991BD6) shared_ptr_base.h:1524 - std::__shared_ptr<llmq::CQuorum const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr()
     37#: (0x5764EA991BD6) shared_ptr.h:175       - std::shared_ptr<llmq::CQuorum const>::~shared_ptr()
     38#: (0x5764EA991BD6) quorums.cpp:494        - llmq::CQuorumManager::RequestQuorumData(CNode*, Consensus::LLMQType, CBlockIndex const*, unsigned short, uint256 const&) const
     ...
@knst knst force-pushed the fix-deadlock-getquorumdata branch from 747fe34 to 1b0af99 Compare December 10, 2024 14:07
@UdjinM6 UdjinM6 added this to the 22.1 milestone Dec 10, 2024
Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 1b0af99

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 1b0af99

@PastaPastaPasta PastaPastaPasta merged commit 6d97441 into dashpay:develop Dec 11, 2024
21 checks passed
@knst knst deleted the fix-deadlock-getquorumdata branch January 14, 2025 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants