Long Running Governance RPC call Kills Sentinel #1118

@ryansd92

Description

Sentinel and, in particular, the python-bitcoinrpc library it relies on have hard-coded 30-second timeouts for receiving the results of Dash RPC calls. That default is probably fine, and maybe the timeouts should be configurable, but regardless, an RPC command may take a long or even very long time to run. If Sentinel times out waiting for the command to complete, it won't report in, and the masternode will eventually transition into WATCHDOG_EXPIRED even though it is trying to follow the rules.

In my test, dash-cli gobject list took 40 seconds to run. Over several hours, all three of my 12.1 testnet masternodes entered WATCHDOG_EXPIRED.

Increasing the timeout to 50 seconds resolved the problem, but that is a short-term fix. Testnet is tiny compared to mainnet. As governance objects increase in number, so too will the time it takes to run dash-cli gobject list to completion.

I am concerned that we may be looking at a scalability problem. How should we fix it? There are options. I propose improving the speed of the RPC command and permitting customizable timeouts.
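To make the customizable-timeout idea concrete, here is a minimal sketch, not Sentinel's actual code. It assumes a hypothetical RPC_TIMEOUT environment variable and hypothetical helper names (resolve_rpc_timeout, make_rpc_connection), and it assumes the installed python-bitcoinrpc version accepts a timeout keyword on AuthServiceProxy.

```python
import os

# Fallback in seconds; mirrors the 30-second default described in this issue.
DEFAULT_RPC_TIMEOUT = 30


def resolve_rpc_timeout(env=None, default=DEFAULT_RPC_TIMEOUT):
    """Return the RPC timeout in seconds.

    Reads a hypothetical RPC_TIMEOUT setting and falls back to the
    default when the value is missing, not an integer, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get("RPC_TIMEOUT")  # hypothetical variable name
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def make_rpc_connection(service_url, env=None):
    """Build an RPC proxy that uses the configured timeout.

    Assumption: AuthServiceProxy accepts a `timeout` keyword argument.
    The import is deferred so the helper above works without the library.
    """
    from bitcoinrpc.authproxy import AuthServiceProxy

    return AuthServiceProxy(service_url, timeout=resolve_rpc_timeout(env))
```

With something like this in place, the 50-second workaround above would amount to setting RPC_TIMEOUT=50 in the environment rather than editing the source. It only buys headroom, though; it does nothing to make gobject list itself faster.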
