[YSQL] Processes killed while holding a spinlock stop new connections #18272

timothy-e · 2023-07-17T21:16:08Z

Jira Link: DB-7270

Description

Add a delay to a new process while it holds the ProcStructLock

	spinLockAcquire(ProcStructLock);

	set_spins_per_delay(ProcGlobal->spins_per_delay);
+ 	pg_usleep(1000 * 1000 * 10); // 10s
	MyProc = *procgloballist;

Create a new connection A and kill it while it is holding the lock.
The postmaster cleans up some aspects of the process, but doesn't touch the ProcStructLock or the proc struct list.
Create a new connection B. It will get stuck on the spinlock and terminate after 2 minutes:

ysqlsh: could not connect to server: Connection refused
	Is the server running on host "localhost" (::1) and accepting
	TCP/IP connections on port 5433?
PANIC:  stuck spinlock detected at InitProcess, ../../../../../../../src/postgres/src/backend/storage/lmgr/proc.c:334

This fatal error will kill backend B, but not affect the postmaster.

Solution:

Reduce the time spent until spinlock is determined to be stuck
When a stuck spinlock causes a process to die, the postmaster should be restarted.

This is related to #18265, but the difference is that this issue specifically addresses spinlocks. It only occurs when the process is killed while holding the spinlock, which is extremely rate. If this issue is encountered, it is immediately very bad and obvious.

#18265 addresses the dangling proc struct list entry, and occurs if the backend is killed anytime between claiming a proc struct and having a PID assigned to the proc struct. If that scenario happens enough times, it would result in a max connection error.

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

The text was updated successfully, but these errors were encountered:

…lding a spinlock Summary: === What's new in this diff === This diff tracks spinlock acquisition during process initialisation by using a `ybSpinLockAcquired` field on `MyProc` for processes that are already initialised. 1. If a process is killed while holding a spinlock, we restart the postmaster. 2. If a process is killed before it has a `MyProc` (during initilzation), we also restart the postmaster. We don't restart if the process exited with code 0 or 1. If the database is still initializing, new connections would be rejected with status 1. We don't want to restart the postmaster, because we might end up in a loop where the app always tries too early to connect, causing us to start the whole thing over again. 3. Reduce the spinlock deadlock detection time by 75%. We are more likely to have a bad spinlock state than PG, and a stuck spinlock could block any new connection. Not printing anything for 1 minute while the backend is frozen is a bad experience, and since most spinlocks are acquired for only a few instructions, there should never be a case that a process is waiting for 15-30s on a spinlock. === Background === Logic already existed to track if a process was holding (or acquiring) a LWLock when it was killed (see the variable `yLWLockAcquired` (previously `ybAnyLockAcquired`)). This diff adds logic to detect if a spinlock is held or being acquired when a process is killed. For example, if a process is killed while holding the `ProcStructLock`, any future connection will fail to acquire the ProcStructLock - so it cannot initialize itself. Since the postmaster typically doesn't acquire this lock, it remains blissfully unaware that no new children can be acquired. When a LWLock is acquired, it is guaranteed that the global variable `MyProc` is non-null. So it's straightforward to add a field to `MyProc` to indicate if this process is holding a lock. However, spinlocks are a much lower level construct, and are used to safely create and allocate the `MyProc` object. Some spinlocks are acquired before `MyProc` or `MyBackendId` are set. See uses of `ProcStructLock` in `proc.c::InitProcess`. There are three different stages in which we acquire spinlocks: - During Postmaster init. In this case, the acquiring process is the postmaster, so if the acquiring process is killed, we are not left in a bad state (because the postmaster is restarting) - After process init: MyProc, MyProcPid, MyBackendId, and ProcGlobal are all initialised and functional. It is easy to handle cases like this - we handle them similarly to LWLocks. - During process init. In this stage, we have a `ProcGlobal` with some various global fields set and `MyProcPid`. `MyProc` is null and `MyBackendId` is not yet set. This is the hard case, discussed next. === How to handle SpinLocks acquired during process initialisation === Since there is no way to communicate to the postmaster that a process died holding a spinlock, we simply restart if we find that a process has crashed / been killed without owning a PGPROC struct. Another idea (and a previous version of this diff) was to use buckets, writing to a large array that `crashed_pid % 1000` owns a spinlock. If a process died without a PGPROC, we could check the array to see if it was likely that it owned a spinlock. This is more complicated, not perfect, and covers a rare scenario. It is better to more conservatively crash than to introduce potential new bugs or gaps. Since the OOM killer is more likely to kill procs with high memory usage, and backends during initialization don't have high memory usage, it is very unlikely that they will be killed, so very unlikely that we'll have to restart the postmaster because a process with no PGPROC was killed. Jira: DB-7270, DB-7265 Test Plan: === Test Connections before database initialization === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Restart : ``` bin/yb-ctl destroy && bin/yb-ctl start ``` 3. Repeatedly make connections. The first ones will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5433? ``` And after a few seconds, they will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? FATAL: the database system is starting up ``` In the logs, notice ``` 2023-11-08 20:13:28.030 UTC [274016] FATAL: the database system is starting up 2023-11-08 20:13:28.035 UTC [273956] WARNING: server process (PID 274016) exited with exit code 1 ``` for each of the second type of failures. 4. Finally, the database will finish initializing: ``` 2023-11-08 20:13:31.669 UTC [273956] LOG: database system is ready to accept connections ``` === Test Partially Initialised Process === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Create a new connection A 3. Create a new connection B and kill it while it is holding the lock. (`kill -9 44650`) 4. Observe that: * Connection A must reconnect because the postmaster restarted ``` yugabyte=# select * from pg_class; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. yugabyte=# ``` * In the logs: ``` 2023-11-08 20:08:16.903 UTC [272686] WARNING: terminating active server processes due to backend crash of a partially initialized process 2023-11-08 20:08:16.903 UTC [272686] WARNING: server process (PID 272919) was terminated by signal 9: Killed 2023-11-08 20:08:16.903 UTC [272686] WARNING: could not find BEEntry for pid 272919 to add to yb_terminated_queries 2023-11-08 20:08:16.903 UTC [272866] WARNING: terminating connection because of crash of another server process 2023-11-08 20:08:16.903 UTC [272866] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-11-08 20:08:16.903 UTC [272866] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2023-11-08 20:08:16.907 UTC [272686] WARNING: server process (PID 272866) exited with exit code 2 2023-11-08 20:08:16.909 UTC [272686] LOG: all server processes terminated; reinitializing ``` (272866 is the existing connection, 272686 is the postmaster, and 272919 is the new process) === Test Initialized Process === 1. Add the lines ```lang=c slock_t* lock = NULL; lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); SpinLockInit(lock); SpinLockAcquire(lock); pg_usleep(1000 * 1000 * 10); SpinLockRelease(lock); ``` somewhere to `YbSeqNext` in `YbSeqScan.c`. 2. Create a new connection A 3. Create a new connection B. Run a seq scan and kill it while it is holding the lock. 4. Observe that: * Connection A must reconnect, exactly as in the previous case. * In the logs: ``` 2023-09-13 16:45:16.091 EDT [14387] WARNING: terminating active server processes due to backend crash while acquiring SpinLock 2023-09-13 16:45:16.091 EDT [14387] WARNING: server process (PID 14403) was terminated by signal 9: Killed 2023-09-13 16:45:16.091 EDT [14387] DETAIL: Failed process was running: SELECT * FROM t; 2023-09-13 16:45:16.091 EDT [14466] WARNING: terminating connection because of crash of another server process 2023-09-13 16:45:16.091 EDT [14466] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-09-13 16:45:16.091 EDT [14466] HINT: In a moment you should be able to reconnect to the database and repeat your command. ``` === Test Spinlock Deadlock Detection Timeout === 1. Apply the following changes to `YbSeqNext` in `nodeYbSeqScan.c`: ```lang=diff --- a/src/postgres/src/backend/executor/nodeYbSeqscan.c +++ b/src/postgres/src/backend/executor/nodeYbSeqscan.c @@ -71,6 +71,14 @@ YbSeqNext(YbSeqScanState *node) econtext = node->ss.ps.ps_ExprContext; slot = node->ss.ss_ScanTupleSlot; + slock_t* lock = NULL; + lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); + SpinLockInit(lock); + SpinLockAcquire(lock); + SpinLockAcquire(lock); // double acquire, will deadlock + pg_usleep(1000 * 1000 * 10); + SpinLockRelease(lock); + /* * Initialize the scandesc upon the first invocation. * The scan only needs its ybscan field, so eventually we may use it ``` 2. Create a session and run a sequential scan: `bin/ysqlsh -c "SELECT * FROM pg_class;"` 3. Observe in the logs that a session started: ``` I1120 13:14:40.311434 56496 pg_client.cc:165] Session id 3: Session id acquired. Postgres backend pid: 56496 ``` 4. Observe in the logs that the session was killed, 35 seconds later: ``` 2023-11-20 13:15:10.214 UTC [56496] PANIC: stuck spinlock detected at YbSeqNext, ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:78 ``` Reviewers: amartsinchyk, smishra Reviewed By: amartsinchyk Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D28471

…er when a process dies while holding a spinlock Summary: Original commit: 8e8cead / D28471 === What's new in this diff === This diff tracks spinlock acquisition during process initialisation by using a `ybSpinLockAcquired` field on `MyProc` for processes that are already initialised. 1. If a process is killed while holding a spinlock, we restart the postmaster. 2. If a process is killed before it has a `MyProc` (during initilzation), we also restart the postmaster. We don't restart if the process exited with code 0 or 1. If the database is still initializing, new connections would be rejected with status 1. We don't want to restart the postmaster, because we might end up in a loop where the app always tries too early to connect, causing us to start the whole thing over again. 3. Reduce the spinlock deadlock detection time by 75%. We are more likely to have a bad spinlock state than PG, and a stuck spinlock could block any new connection. Not printing anything for 1 minute while the backend is frozen is a bad experience, and since most spinlocks are acquired for only a few instructions, there should never be a case that a process is waiting for 15-30s on a spinlock. === Background === Logic already existed to track if a process was holding (or acquiring) a LWLock when it was killed (see the variable `yLWLockAcquired` (previously `ybAnyLockAcquired`)). This diff adds logic to detect if a spinlock is held or being acquired when a process is killed. For example, if a process is killed while holding the `ProcStructLock`, any future connection will fail to acquire the ProcStructLock - so it cannot initialize itself. Since the postmaster typically doesn't acquire this lock, it remains blissfully unaware that no new children can be acquired. When a LWLock is acquired, it is guaranteed that the global variable `MyProc` is non-null. So it's straightforward to add a field to `MyProc` to indicate if this process is holding a lock. However, spinlocks are a much lower level construct, and are used to safely create and allocate the `MyProc` object. Some spinlocks are acquired before `MyProc` or `MyBackendId` are set. See uses of `ProcStructLock` in `proc.c::InitProcess`. There are three different stages in which we acquire spinlocks: - During Postmaster init. In this case, the acquiring process is the postmaster, so if the acquiring process is killed, we are not left in a bad state (because the postmaster is restarting) - After process init: MyProc, MyProcPid, MyBackendId, and ProcGlobal are all initialised and functional. It is easy to handle cases like this - we handle them similarly to LWLocks. - During process init. In this stage, we have a `ProcGlobal` with some various global fields set and `MyProcPid`. `MyProc` is null and `MyBackendId` is not yet set. This is the hard case, discussed next. === How to handle SpinLocks acquired during process initialisation === Since there is no way to communicate to the postmaster that a process died holding a spinlock, we simply restart if we find that a process has crashed / been killed without owning a PGPROC struct. Another idea (and a previous version of this diff) was to use buckets, writing to a large array that `crashed_pid % 1000` owns a spinlock. If a process died without a PGPROC, we could check the array to see if it was likely that it owned a spinlock. This is more complicated, not perfect, and covers a rare scenario. It is better to more conservatively crash than to introduce potential new bugs or gaps. Since the OOM killer is more likely to kill procs with high memory usage, and backends during initialization don't have high memory usage, it is very unlikely that they will be killed, so very unlikely that we'll have to restart the postmaster because a process with no PGPROC was killed. Jira: DB-7270, DB-7265 Test Plan: === Test Connections before database initialization === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Restart : ``` bin/yb-ctl destroy && bin/yb-ctl start ``` 3. Repeatedly make connections. The first ones will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5433? ``` And after a few seconds, they will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? FATAL: the database system is starting up ``` In the logs, notice ``` 2023-11-08 20:13:28.030 UTC [274016] FATAL: the database system is starting up 2023-11-08 20:13:28.035 UTC [273956] WARNING: server process (PID 274016) exited with exit code 1 ``` for each of the second type of failures. 4. Finally, the database will finish initializing: ``` 2023-11-08 20:13:31.669 UTC [273956] LOG: database system is ready to accept connections ``` === Test Partially Initialised Process === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Create a new connection A 3. Create a new connection B and kill it while it is holding the lock. (`kill -9 44650`) 4. Observe that: * Connection A must reconnect because the postmaster restarted ``` yugabyte=# select * from pg_class; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. yugabyte=# ``` * In the logs: ``` 2023-11-08 20:08:16.903 UTC [272686] WARNING: terminating active server processes due to backend crash of a partially initialized process 2023-11-08 20:08:16.903 UTC [272686] WARNING: server process (PID 272919) was terminated by signal 9: Killed 2023-11-08 20:08:16.903 UTC [272686] WARNING: could not find BEEntry for pid 272919 to add to yb_terminated_queries 2023-11-08 20:08:16.903 UTC [272866] WARNING: terminating connection because of crash of another server process 2023-11-08 20:08:16.903 UTC [272866] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-11-08 20:08:16.903 UTC [272866] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2023-11-08 20:08:16.907 UTC [272686] WARNING: server process (PID 272866) exited with exit code 2 2023-11-08 20:08:16.909 UTC [272686] LOG: all server processes terminated; reinitializing ``` (272866 is the existing connection, 272686 is the postmaster, and 272919 is the new process) === Test Initialized Process === 1. Add the lines ```lang=c slock_t* lock = NULL; lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); SpinLockInit(lock); SpinLockAcquire(lock); pg_usleep(1000 * 1000 * 10); SpinLockRelease(lock); ``` somewhere to `YbSeqNext` in `YbSeqScan.c`. 2. Create a new connection A 3. Create a new connection B. Run a seq scan and kill it while it is holding the lock. 4. Observe that: * Connection A must reconnect, exactly as in the previous case. * In the logs: ``` 2023-09-13 16:45:16.091 EDT [14387] WARNING: terminating active server processes due to backend crash while acquiring SpinLock 2023-09-13 16:45:16.091 EDT [14387] WARNING: server process (PID 14403) was terminated by signal 9: Killed 2023-09-13 16:45:16.091 EDT [14387] DETAIL: Failed process was running: SELECT * FROM t; 2023-09-13 16:45:16.091 EDT [14466] WARNING: terminating connection because of crash of another server process 2023-09-13 16:45:16.091 EDT [14466] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-09-13 16:45:16.091 EDT [14466] HINT: In a moment you should be able to reconnect to the database and repeat your command. ``` === Test Spinlock Deadlock Detection Timeout === 1. Apply the following changes to `YbSeqNext` in `nodeYbSeqScan.c`: ```lang=diff --- a/src/postgres/src/backend/executor/nodeYbSeqscan.c +++ b/src/postgres/src/backend/executor/nodeYbSeqscan.c @@ -71,6 +71,14 @@ YbSeqNext(YbSeqScanState *node) econtext = node->ss.ps.ps_ExprContext; slot = node->ss.ss_ScanTupleSlot; + slock_t* lock = NULL; + lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); + SpinLockInit(lock); + SpinLockAcquire(lock); + SpinLockAcquire(lock); // double acquire, will deadlock + pg_usleep(1000 * 1000 * 10); + SpinLockRelease(lock); + /* * Initialize the scandesc upon the first invocation. * The scan only needs its ybscan field, so eventually we may use it ``` 2. Create a session and run a sequential scan: `bin/ysqlsh -c "SELECT * FROM pg_class;"` 3. Observe in the logs that a session started: ``` I1120 13:14:40.311434 56496 pg_client.cc:165] Session id 3: Session id acquired. Postgres backend pid: 56496 ``` 4. Observe in the logs that the session was killed, 35 seconds later: ``` 2023-11-20 13:15:10.214 UTC [56496] PANIC: stuck spinlock detected at YbSeqNext, ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:78 ``` Reviewers: amartsinchyk, smishra Reviewed By: amartsinchyk Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30375

…er when a process dies while holding a spinlock Summary: Original commit: 8e8cead / D28471 === What's new in this diff === This diff tracks spinlock acquisition during process initialisation by using a `ybSpinLockAcquired` field on `MyProc` for processes that are already initialised. 1. If a process is killed while holding a spinlock, we restart the postmaster. 2. If a process is killed before it has a `MyProc` (during initilzation), we also restart the postmaster. We don't restart if the process exited with code 0 or 1. If the database is still initializing, new connections would be rejected with status 1. We don't want to restart the postmaster, because we might end up in a loop where the app always tries too early to connect, causing us to start the whole thing over again. 3. Reduce the spinlock deadlock detection time by 75%. We are more likely to have a bad spinlock state than PG, and a stuck spinlock could block any new connection. Not printing anything for 1 minute while the backend is frozen is a bad experience, and since most spinlocks are acquired for only a few instructions, there should never be a case that a process is waiting for 15-30s on a spinlock. === Background === Logic already existed to track if a process was holding (or acquiring) a LWLock when it was killed (see the variable `yLWLockAcquired` (previously `ybAnyLockAcquired`)). This diff adds logic to detect if a spinlock is held or being acquired when a process is killed. For example, if a process is killed while holding the `ProcStructLock`, any future connection will fail to acquire the ProcStructLock - so it cannot initialize itself. Since the postmaster typically doesn't acquire this lock, it remains blissfully unaware that no new children can be acquired. When a LWLock is acquired, it is guaranteed that the global variable `MyProc` is non-null. So it's straightforward to add a field to `MyProc` to indicate if this process is holding a lock. However, spinlocks are a much lower level construct, and are used to safely create and allocate the `MyProc` object. Some spinlocks are acquired before `MyProc` or `MyBackendId` are set. See uses of `ProcStructLock` in `proc.c::InitProcess`. There are three different stages in which we acquire spinlocks: - During Postmaster init. In this case, the acquiring process is the postmaster, so if the acquiring process is killed, we are not left in a bad state (because the postmaster is restarting) - After process init: MyProc, MyProcPid, MyBackendId, and ProcGlobal are all initialised and functional. It is easy to handle cases like this - we handle them similarly to LWLocks. - During process init. In this stage, we have a `ProcGlobal` with some various global fields set and `MyProcPid`. `MyProc` is null and `MyBackendId` is not yet set. This is the hard case, discussed next. === How to handle SpinLocks acquired during process initialisation === Since there is no way to communicate to the postmaster that a process died holding a spinlock, we simply restart if we find that a process has crashed / been killed without owning a PGPROC struct. Another idea (and a previous version of this diff) was to use buckets, writing to a large array that `crashed_pid % 1000` owns a spinlock. If a process died without a PGPROC, we could check the array to see if it was likely that it owned a spinlock. This is more complicated, not perfect, and covers a rare scenario. It is better to more conservatively crash than to introduce potential new bugs or gaps. Since the OOM killer is more likely to kill procs with high memory usage, and backends during initialization don't have high memory usage, it is very unlikely that they will be killed, so very unlikely that we'll have to restart the postmaster because a process with no PGPROC was killed. Jira: DB-7270, DB-7265 Test Plan: === Test Connections before database initialization === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Restart : ``` bin/yb-ctl destroy && bin/yb-ctl start ``` 3. Repeatedly make connections. The first ones will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5433? ``` And after a few seconds, they will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? FATAL: the database system is starting up ``` In the logs, notice ``` 2023-11-08 20:13:28.030 UTC [274016] FATAL: the database system is starting up 2023-11-08 20:13:28.035 UTC [273956] WARNING: server process (PID 274016) exited with exit code 1 ``` for each of the second type of failures. 4. Finally, the database will finish initializing: ``` 2023-11-08 20:13:31.669 UTC [273956] LOG: database system is ready to accept connections ``` === Test Partially Initialised Process === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Create a new connection A 3. Create a new connection B and kill it while it is holding the lock. (`kill -9 44650`) 4. Observe that: * Connection A must reconnect because the postmaster restarted ``` yugabyte=# select * from pg_class; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. yugabyte=# ``` * In the logs: ``` 2023-11-08 20:08:16.903 UTC [272686] WARNING: terminating active server processes due to backend crash of a partially initialized process 2023-11-08 20:08:16.903 UTC [272686] WARNING: server process (PID 272919) was terminated by signal 9: Killed 2023-11-08 20:08:16.903 UTC [272686] WARNING: could not find BEEntry for pid 272919 to add to yb_terminated_queries 2023-11-08 20:08:16.903 UTC [272866] WARNING: terminating connection because of crash of another server process 2023-11-08 20:08:16.903 UTC [272866] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-11-08 20:08:16.903 UTC [272866] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2023-11-08 20:08:16.907 UTC [272686] WARNING: server process (PID 272866) exited with exit code 2 2023-11-08 20:08:16.909 UTC [272686] LOG: all server processes terminated; reinitializing ``` (272866 is the existing connection, 272686 is the postmaster, and 272919 is the new process) === Test Initialized Process === 1. Add the lines ```lang=c slock_t* lock = NULL; lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); SpinLockInit(lock); SpinLockAcquire(lock); pg_usleep(1000 * 1000 * 10); SpinLockRelease(lock); ``` somewhere to `YbSeqNext` in `YbSeqScan.c`. 2. Create a new connection A 3. Create a new connection B. Run a seq scan and kill it while it is holding the lock. 4. Observe that: * Connection A must reconnect, exactly as in the previous case. * In the logs: ``` 2023-09-13 16:45:16.091 EDT [14387] WARNING: terminating active server processes due to backend crash while acquiring SpinLock 2023-09-13 16:45:16.091 EDT [14387] WARNING: server process (PID 14403) was terminated by signal 9: Killed 2023-09-13 16:45:16.091 EDT [14387] DETAIL: Failed process was running: SELECT * FROM t; 2023-09-13 16:45:16.091 EDT [14466] WARNING: terminating connection because of crash of another server process 2023-09-13 16:45:16.091 EDT [14466] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-09-13 16:45:16.091 EDT [14466] HINT: In a moment you should be able to reconnect to the database and repeat your command. ``` === Test Spinlock Deadlock Detection Timeout === 1. Apply the following changes to `YbSeqNext` in `nodeYbSeqScan.c`: ```lang=diff --- a/src/postgres/src/backend/executor/nodeYbSeqscan.c +++ b/src/postgres/src/backend/executor/nodeYbSeqscan.c @@ -71,6 +71,14 @@ YbSeqNext(YbSeqScanState *node) econtext = node->ss.ps.ps_ExprContext; slot = node->ss.ss_ScanTupleSlot; + slock_t* lock = NULL; + lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); + SpinLockInit(lock); + SpinLockAcquire(lock); + SpinLockAcquire(lock); // double acquire, will deadlock + pg_usleep(1000 * 1000 * 10); + SpinLockRelease(lock); + /* * Initialize the scandesc upon the first invocation. * The scan only needs its ybscan field, so eventually we may use it ``` 2. Create a session and run a sequential scan: `bin/ysqlsh -c "SELECT * FROM pg_class;"` 3. Observe in the logs that a session started: ``` I1120 13:14:40.311434 56496 pg_client.cc:165] Session id 3: Session id acquired. Postgres backend pid: 56496 ``` 4. Observe in the logs that the session was killed, 35 seconds later: ``` 2023-11-20 13:15:10.214 UTC [56496] PANIC: stuck spinlock detected at YbSeqNext, ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:78 ``` Reviewers: amartsinchyk, smishra Reviewed By: amartsinchyk Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30374

…s dies while holding a spinlock Summary: Original commit: ef8433a90d470a2bc841e170bd2e70633f35273d / D28471 === What's new in this diff === This diff tracks spinlock acquisition during process initialisation by using a `ybSpinLockAcquired` field on `MyProc` for processes that are already initialised. 1. If a process is killed while holding a spinlock, we restart the postmaster. 2. If a process is killed before it has a `MyProc` (during initilzation), we also restart the postmaster. We don't restart if the process exited with code 0 or 1. If the database is still initializing, new connections would be rejected with status 1. We don't want to restart the postmaster, because we might end up in a loop where the app always tries too early to connect, causing us to start the whole thing over again. 3. Reduce the spinlock deadlock detection time by 75%. We are more likely to have a bad spinlock state than PG, and a stuck spinlock could block any new connection. Not printing anything for 1 minute while the backend is frozen is a bad experience, and since most spinlocks are acquired for only a few instructions, there should never be a case that a process is waiting for 15-30s on a spinlock. === Background === Logic already existed to track if a process was holding (or acquiring) a LWLock when it was killed (see the variable `yLWLockAcquired` (previously `ybAnyLockAcquired`)). This diff adds logic to detect if a spinlock is held or being acquired when a process is killed. For example, if a process is killed while holding the `ProcStructLock`, any future connection will fail to acquire the ProcStructLock - so it cannot initialize itself. Since the postmaster typically doesn't acquire this lock, it remains blissfully unaware that no new children can be acquired. When a LWLock is acquired, it is guaranteed that the global variable `MyProc` is non-null. So it's straightforward to add a field to `MyProc` to indicate if this process is holding a lock. However, spinlocks are a much lower level construct, and are used to safely create and allocate the `MyProc` object. Some spinlocks are acquired before `MyProc` or `MyBackendId` are set. See uses of `ProcStructLock` in `proc.c::InitProcess`. There are three different stages in which we acquire spinlocks: - During Postmaster init. In this case, the acquiring process is the postmaster, so if the acquiring process is killed, we are not left in a bad state (because the postmaster is restarting) - After process init: MyProc, MyProcPid, MyBackendId, and ProcGlobal are all initialised and functional. It is easy to handle cases like this - we handle them similarly to LWLocks. - During process init. In this stage, we have a `ProcGlobal` with some various global fields set and `MyProcPid`. `MyProc` is null and `MyBackendId` is not yet set. This is the hard case, discussed next. === How to handle SpinLocks acquired during process initialisation === Since there is no way to communicate to the postmaster that a process died holding a spinlock, we simply restart if we find that a process has crashed / been killed without owning a PGPROC struct. Another idea (and a previous version of this diff) was to use buckets, writing to a large array that `crashed_pid % 1000` owns a spinlock. If a process died without a PGPROC, we could check the array to see if it was likely that it owned a spinlock. This is more complicated, not perfect, and covers a rare scenario. It is better to more conservatively crash than to introduce potential new bugs or gaps. Since the OOM killer is more likely to kill procs with high memory usage, and backends during initialization don't have high memory usage, it is very unlikely that they will be killed, so very unlikely that we'll have to restart the postmaster because a process with no PGPROC was killed. Jira: DB-7270, DB-7265 Test Plan: === Test Connections before database initialization === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Restart : ``` bin/yb-ctl destroy && bin/yb-ctl start ``` 3. Repeatedly make connections. The first ones will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? could not connect to server: Connection refused Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5433? ``` And after a few seconds, they will fail with ``` ysqlsh: could not connect to server: Connection refused Is the server running on host "localhost" (::1) and accepting TCP/IP connections on port 5433? FATAL: the database system is starting up ``` In the logs, notice ``` 2023-11-08 20:13:28.030 UTC [274016] FATAL: the database system is starting up 2023-11-08 20:13:28.035 UTC [273956] WARNING: server process (PID 274016) exited with exit code 1 ``` for each of the second type of failures. 4. Finally, the database will finish initializing: ``` 2023-11-08 20:13:31.669 UTC [273956] LOG: database system is ready to accept connections ``` === Test Partially Initialised Process === 1. Add delay to a new process while it holds the ProcStructLock in proc.c: ```lang=diff SpinLockAcquire(ProcStructLock); - + pg_usleep(1000 * 1000 * 10); set_spins_per_delay(ProcGlobal->spins_per_delay); ``` 2. Create a new connection A 3. Create a new connection B and kill it while it is holding the lock. (`kill -9 44650`) 4. Observe that: * Connection A must reconnect because the postmaster restarted ``` yugabyte=# select * from pg_class; WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. yugabyte=# ``` * In the logs: ``` 2023-11-08 20:08:16.903 UTC [272686] WARNING: terminating active server processes due to backend crash of a partially initialized process 2023-11-08 20:08:16.903 UTC [272686] WARNING: server process (PID 272919) was terminated by signal 9: Killed 2023-11-08 20:08:16.903 UTC [272686] WARNING: could not find BEEntry for pid 272919 to add to yb_terminated_queries 2023-11-08 20:08:16.903 UTC [272866] WARNING: terminating connection because of crash of another server process 2023-11-08 20:08:16.903 UTC [272866] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-11-08 20:08:16.903 UTC [272866] HINT: In a moment you should be able to reconnect to the database and repeat your command. 2023-11-08 20:08:16.907 UTC [272686] WARNING: server process (PID 272866) exited with exit code 2 2023-11-08 20:08:16.909 UTC [272686] LOG: all server processes terminated; reinitializing ``` (272866 is the existing connection, 272686 is the postmaster, and 272919 is the new process) === Test Initialized Process === 1. Add the lines ```lang=c slock_t* lock = NULL; lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); SpinLockInit(lock); SpinLockAcquire(lock); pg_usleep(1000 * 1000 * 10); SpinLockRelease(lock); ``` somewhere to `YbSeqNext` in `YbSeqScan.c`. 2. Create a new connection A 3. Create a new connection B. Run a seq scan and kill it while it is holding the lock. 4. Observe that: * Connection A must reconnect, exactly as in the previous case. * In the logs: ``` 2023-09-13 16:45:16.091 EDT [14387] WARNING: terminating active server processes due to backend crash while acquiring SpinLock 2023-09-13 16:45:16.091 EDT [14387] WARNING: server process (PID 14403) was terminated by signal 9: Killed 2023-09-13 16:45:16.091 EDT [14387] DETAIL: Failed process was running: SELECT * FROM t; 2023-09-13 16:45:16.091 EDT [14466] WARNING: terminating connection because of crash of another server process 2023-09-13 16:45:16.091 EDT [14466] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2023-09-13 16:45:16.091 EDT [14466] HINT: In a moment you should be able to reconnect to the database and repeat your command. ``` === Test Spinlock Deadlock Detection Timeout === 1. Apply the following changes to `YbSeqNext` in `nodeYbSeqScan.c`: ```lang=diff --- a/src/postgres/src/backend/executor/nodeYbSeqscan.c +++ b/src/postgres/src/backend/executor/nodeYbSeqscan.c @@ -71,6 +71,14 @@ YbSeqNext(YbSeqScanState *node) econtext = node->ss.ps.ps_ExprContext; slot = node->ss.ss_ScanTupleSlot; + slock_t* lock = NULL; + lock = (slock_t *) ShmemAlloc(sizeof(slock_t)); + SpinLockInit(lock); + SpinLockAcquire(lock); + SpinLockAcquire(lock); // double acquire, will deadlock + pg_usleep(1000 * 1000 * 10); + SpinLockRelease(lock); + /* * Initialize the scandesc upon the first invocation. * The scan only needs its ybscan field, so eventually we may use it ``` 2. Create a session and run a sequential scan: `bin/ysqlsh -c "SELECT * FROM pg_class;"` 3. Observe in the logs that a session started: ``` I1120 13:14:40.311434 56496 pg_client.cc:165] Session id 3: Session id acquired. Postgres backend pid: 56496 ``` 4. Observe in the logs that the session was killed, 35 seconds later: ``` 2023-11-20 13:15:10.214 UTC [56496] PANIC: stuck spinlock detected at YbSeqNext, ../../../../../../src/postgres/src/backend/executor/nodeYbSeqscan.c:78 ``` Reviewers: amartsinchyk, smishra Reviewed By: amartsinchyk Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D30374

timothy-e added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jul 17, 2023

timothy-e self-assigned this Jul 17, 2023

yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Jul 17, 2023

yugabyte-ci added kind/bug This issue is a bug and removed kind/enhancement This is an enhancement of an existing feature labels Nov 15, 2023

yugabyte-ci closed this as completed Nov 28, 2023

yugabyte-ci reopened this Nov 29, 2023

sushantrmishra closed this as completed Dec 6, 2023

sushantrmishra reopened this Dec 6, 2023

yugabyte-ci closed this as completed Dec 6, 2023

nyndyny mentioned this issue Apr 6, 2024

[Snyk] Security upgrade eslint from 4.19.1 to 9.0.0 nyndyny/yugabyte-db#239

Open

ryan-ally mentioned this issue Apr 7, 2024

[Snyk] Security upgrade eslint from 4.19.1 to 9.0.0 ryan-ally/yugabyte-db#265

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] Processes killed while holding a spinlock stop new connections #18272

[YSQL] Processes killed while holding a spinlock stop new connections #18272

timothy-e commented Jul 17, 2023 •

edited by jira bot

Loading

[YSQL] Processes killed while holding a spinlock stop new connections #18272

[YSQL] Processes killed while holding a spinlock stop new connections #18272

Comments

timothy-e commented Jul 17, 2023 • edited by jira bot Loading

Description

Warning: Please confirm that this issue does not contain any sensitive information

timothy-e commented Jul 17, 2023 •

edited by jira bot

Loading