Skip to content

Commit

Permalink
Merge PIVX-Project#2920: [Http] [Test] Fix test failures on shutdown
Browse files Browse the repository at this point in the history
05c30a8 Merge bitcoin#14413: tests: Allow closed rpc handler in assert_start_raises_init_error (MarcoFalke)
cb65d0a Merge bitcoin#14993: rpc: Fix data race (UB) in InterruptRPC() (Wladimir J. van der Laan)
cf6cb29 Merge bitcoin#14670: http: Fix HTTP server shutdown (Wladimir J. van der Laan)
d19d9fa Call RPCNotifyBlockChange on RPC stopped (ale)
af10755 Merge bitcoin#11006: Improve shutdown process (Wladimir J. van der Laan)

Pull request description:

  The aim of this PR is fixing the functional test errors that we get sometimes during shutdown, for example the failure of this action  https://github.com/PIVX-Project/PIVX/actions/runs/8429115630/job/23082877380

  In a few words, on HTTP server shutdown, sometimes 2 seconds are not enough time to finish handling all the RPC requests.

  `2024-03-26T01:35:26Z HTTP event loop did not exit within allotted time, sending loopbreak`

  So after this point all existing RPC requests (in particular the `stop()` one) does not receive any answer...  And this lead to functional tests failing at the end.

  For more info see the discussion in each backported PR.

  Backport bitcoin PRs bitcoin#11006 bitcoin#14670 bitcoin#14993 bitcoin#14413

ACKs for top commit: 05c30a8
  Liquid369:
    tACK 05c30a8
  Fuzzbawls:
    ACK 05c30a8

Tree-SHA512: 3fa2d7d9f11f851dbf3703a70e45a12905cb801156b32654f624487c386d76d4eebbc9d54418f4fbe0ac500afeed626636cda2dc942294ea991ff5cde47b07f4
  • Loading branch information
Fuzzbawls committed Apr 4, 2024
2 parents 7954d0a + 05c30a8 commit f428a74
Show file tree
Hide file tree
Showing 8 changed files with 65 additions and 33 deletions.
31 changes: 12 additions & 19 deletions src/httpserver.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
#include "netbase.h"
#include "rpc/protocol.h" // For HTTP status codes
#include "sync.h"
#include "shutdown.h"
#include "guiinterface.h"

#include <stdio.h>
Expand All @@ -21,7 +22,6 @@
#include <sys/types.h>
#include <sys/stat.h>
#include <signal.h>
#include <future>

#include <event2/thread.h>
#include <event2/buffer.h>
Expand Down Expand Up @@ -426,18 +426,14 @@ bool UpdateHTTPServerLogging(bool enable) {
}

std::thread threadHTTP;
std::future<bool> threadResult;
static std::vector<std::thread> g_thread_http_workers;

bool StartHTTPServer()
{
LogPrint(BCLog::HTTP, "Starting HTTP server\n");
int rpcThreads = std::max((long)gArgs.GetArg("-rpcthreads", DEFAULT_HTTP_THREADS), 1L);
LogPrintf("HTTP: starting %d worker threads\n", rpcThreads);

std::packaged_task<bool(event_base*, evhttp*)> task(ThreadHTTP);
threadResult = task.get_future();
threadHTTP = std::thread(std::move(task), eventBase, eventHTTP);
threadHTTP = std::thread(ThreadHTTP, eventBase, eventHTTP);

for (int i = 0; i < rpcThreads; i++) {
g_thread_http_workers.emplace_back(HTTPWorkQueueRun, workQueue);
Expand All @@ -449,9 +445,7 @@ void InterruptHTTPServer()
{
LogPrint(BCLog::HTTP, "Interrupting HTTP server\n");
if (eventHTTP) {
for (evhttp_bound_socket *socket : boundSockets) {
evhttp_del_accept_socket(eventHTTP, socket);
}
// Reject requests on current connections
evhttp_set_gencb(eventHTTP, http_reject_request_cb, nullptr);
}
if (workQueue)
Expand All @@ -475,18 +469,14 @@ void StopHTTPServer()

MilliSleep(500); // Avoid race condition while the last HTTP-thread is exiting

// Unlisten sockets, these are what make the event loop running, which means
// that after this and all connections are closed the event loop will quit.
for (evhttp_bound_socket *socket : boundSockets) {
evhttp_del_accept_socket(eventHTTP, socket);
}
boundSockets.clear();
if (eventBase) {
LogPrint(BCLog::HTTP, "Waiting for HTTP event thread to exit\n");
// Give event loop a few seconds to exit (to send back last RPC responses), then break it
// Before this was solved with event_base_loopexit, but that didn't work as expected in
// at least libevent 2.0.21 and always introduced a delay. In libevent
// master that appears to be solved, so in the future that solution
// could be used again (if desirable).
// (see discussion in https://github.com/bitcoin/bitcoin/pull/6990)
if (threadResult.valid() && threadResult.wait_for(std::chrono::milliseconds(2000)) == std::future_status::timeout) {
LogPrintf("HTTP event loop did not exit within allotted time, sending loopbreak\n");
event_base_loopbreak(eventBase);
}
// Guard threadHTTP
if (threadHTTP.joinable()) {
threadHTTP.join();
Expand Down Expand Up @@ -599,6 +589,9 @@ void HTTPRequest::WriteHeader(const std::string& hdr, const std::string& value)
void HTTPRequest::WriteReply(int nStatus, const std::string& strReply)
{
assert(!replySent && req);
if (ShutdownRequested()) {
WriteHeader("Connection", "close");
}
// Send event to main http thread to send reply message
struct evbuffer* evb = evhttp_request_get_output_buffer(req);
assert(evb);
Expand Down
3 changes: 2 additions & 1 deletion src/init.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -363,7 +363,8 @@ void OnRPCStarted()
void OnRPCStopped()
{
uiInterface.NotifyBlockTip.disconnect(RPCNotifyBlockChange);
//RPCNotifyBlockChange(0);
// TODO: remove unused parameter fInitialDownload
RPCNotifyBlockChange(false, nullptr);
g_best_block_cv.notify_all();
LogPrint(BCLog::RPC, "RPC stopped.\n");
}
Expand Down
1 change: 1 addition & 0 deletions src/rpc/client.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -174,6 +174,7 @@ static const CRPCConvertParam vRPCConvertParams[] = {
{ "spork", 1, "value" },
{ "startmasternode", 1, "lock_wallet" },
{ "startmasternode", 3, "reload_conf" },
{ "stop", 0, "wait" },
{ "submitbudget", 2, "npayments" },
{ "submitbudget", 3, "start" },
{ "submitbudget", 5, "montly_payment" },
Expand Down
18 changes: 12 additions & 6 deletions src/rpc/server.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
#include <memory> // for unique_ptr
#include <unordered_map>

static bool fRPCRunning = false;
static std::atomic<bool> g_rpc_running{false};
static bool fRPCInWarmup = true;
static std::string rpcWarmupStatus("RPC server started");
static RecursiveMutex cs_rpcWarmup;
Expand Down Expand Up @@ -263,13 +263,19 @@ UniValue help(const JSONRPCRequest& jsonRequest)

UniValue stop(const JSONRPCRequest& jsonRequest)
{
if (jsonRequest.fHelp || !jsonRequest.params.empty())
// Accept the hidden 'wait' integer argument (milliseconds)
// For instance, 'stop 1000' makes the call wait 1 second before returning
// to the client (intended for testing)
if (jsonRequest.fHelp || jsonRequest.params.size() > 1)
throw std::runtime_error(
"stop\n"
"\nStop PIVX server.");
// Event loop will exit after current HTTP requests have been handled, so
// this reply will get back to the client.
StartShutdown();
if (jsonRequest.params[0].isNum()) {
MilliSleep(jsonRequest.params[0].get_int());
}
return "PIVX server stopping";
}

Expand All @@ -283,7 +289,7 @@ static const CRPCCommand vRPCCommands[] =
// --------------------- ------------------------ ----------------------- ------ ----------
/* Overall control/query calls */
{ "control", "help", &help, true, {"command"} },
{ "control", "stop", &stop, true, {} },
{ "control", "stop", &stop, true, {"wait"} },
};

CRPCTable::CRPCTable()
Expand Down Expand Up @@ -322,7 +328,7 @@ bool CRPCTable::appendCommand(const std::string& name, const CRPCCommand* pcmd)
bool StartRPC()
{
LogPrint(BCLog::RPC, "Starting RPC\n");
fRPCRunning = true;
g_rpc_running = true;
g_rpcSignals.Started();
return true;
}
Expand All @@ -331,7 +337,7 @@ void InterruptRPC()
{
LogPrint(BCLog::RPC, "Interrupting RPC\n");
// Interrupt e.g. running longpolls
fRPCRunning = false;
g_rpc_running = false;
}

void StopRPC()
Expand All @@ -344,7 +350,7 @@ void StopRPC()

bool IsRPCRunning()
{
return fRPCRunning;
return g_rpc_running;
}

void SetRPCWarmupStatus(const std::string& newStatus)
Expand Down
28 changes: 28 additions & 0 deletions test/functional/feature_shutdown.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#!/usr/bin/env python3
# Copyright (c) 2018 The Bitcoin Core developers
# Distributed under the MIT software license, see the accompanying
# file COPYING or http://www.opensource.org/licenses/mit-license.php.
"""Test bitcoind shutdown."""

from test_framework.test_framework import PivxTestFramework
from test_framework.util import assert_equal, get_rpc_proxy
from threading import Thread

def test_long_call(node):
block = node.waitfornewblock()
assert_equal(block['height'], 0)

class ShutdownTest(PivxTestFramework):

def set_test_params(self):
self.setup_clean_chain = True
self.num_nodes = 1

def run_test(self):
node = get_rpc_proxy(self.nodes[0].url, 1, timeout=600, coveragedir=self.nodes[0].coverage_dir)
Thread(target=test_long_call, args=(node,)).start()
# wait 1 second to ensure event loop waits for current connections to close
self.stop_node(0, wait=1000)

if __name__ == '__main__':
ShutdownTest().main()
8 changes: 4 additions & 4 deletions test/functional/test_framework/test_framework.py
Original file line number Diff line number Diff line change
Expand Up @@ -320,16 +320,16 @@ def start_nodes(self, extra_args=None, *args, **kwargs):
for node in self.nodes:
coverage.write_all_rpc_commands(self.options.coveragedir, node.rpc)

def stop_node(self, i):
def stop_node(self, i, wait=0):
"""Stop a pivxd test node"""
self.nodes[i].stop_node()
self.nodes[i].stop_node(wait=wait)
self.nodes[i].wait_until_stopped()

def stop_nodes(self):
def stop_nodes(self, wait=0):
"""Stop multiple pivxd test nodes"""
for node in self.nodes:
# Issue RPC to stop nodes
node.stop_node()
node.stop_node(wait=wait)

for node in self.nodes:
# Wait for nodes to stop
Expand Down
8 changes: 5 additions & 3 deletions test/functional/test_framework/test_node.py
Original file line number Diff line number Diff line change
Expand Up @@ -169,7 +169,9 @@ def wait_for_rpc_connection(self):
if e.errno != errno.ECONNREFUSED: # Port not yet open?
raise # unknown IO error
except JSONRPCException as e: # Initialization phase
if e.error['code'] != -28: # RPC in warmup?
# -28 RPC in warmup
# -342 Service unavailable, RPC server started but is shutting down due to error
if e.error['code'] != -28 and e.error['code'] != -342:
raise # unknown JSON RPC exception
except ValueError as e: # cookie file not found and no rpcuser or rpcassword. pivxd still starting
if "No RPC credentials" not in str(e):
Expand All @@ -186,13 +188,13 @@ def get_wallet_rpc(self, wallet_name):
wallet_path = "wallet/%s" % wallet_name
return self.rpc / wallet_path

def stop_node(self):
def stop_node(self, wait=0):
"""Stop the node."""
if not self.running:
return
self.log.debug("Stopping node")
try:
self.stop()
self.stop(wait=wait)
except http.client.CannotSendRequest:
self.log.exception("Unable to stop node.")
del self.p2ps[:]
Expand Down
1 change: 1 addition & 0 deletions test/functional/test_runner.py
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,7 @@
'p2p_mempool.py', # ~ 46 sec
'rpc_named_arguments.py', # ~ 45 sec
'feature_help.py', # ~ 30 sec
'feature_shutdown.py',

# Don't append tests at the end to avoid merge conflicts
# Put them in a random line within the section that fits their approximate run-time
Expand Down

0 comments on commit f428a74

Please sign in to comment.