Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: removal failure of freeConnection causing infinite loop #2171

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ksdaylight
Copy link

Description

This pull request addresses a critical deadlock issue that arose from incorrect removal of connections in the _freeConnections list. The deadlock was caused by having different indices for the same connection instance, resulting in an infinite loop and CPU consumption reaching 100%.

Evidence of Issue

During debugging, I discovered that some different indices in the queue were pointing to the same connection, such as indices 0 and 3, 5 and 4, etc.(The provided text, omitting irrelevant repetitions:):

> this._freeConnections._list
(1024) [PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, …]
length: 1024
[[Prototype]]: Array(0)
[0..99]:
0: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
_authPlugin: data => {\n    const authPluginData1 = data.slice(0, 8);\n    const authPluginData2 = data.slice(8, 20);\n    let authToken;\n    if (passwordSha1) {\n      authToken = auth41.calculateTokenFromPasswordSha(\n        passwordSha1,\n        authPluginData1,\n        authPluginData2\n      );\n    } else {\n      authToken = auth41.calculateToken(\n        password,\n        authPluginData1,\n        authPluginData2\n      );\n    }\n    return authToken;\n  }
_command: undefined
_commands: Denque {_capacity: undefined, _head: 3, _tail: 3, _capacityMask: 31, _list: Array(32)}
_events: {end: ƒ, error: Array(2)}
_eventsCount: 2
_fatalError: null
_handshakePacket: Handshake {protocolVersion: 10, serverVersion: '8.0.33', capabilityFlags: 3758096383, connectionId: 16, authPluginData1: Buffer(8), …}
_internalId: 6
_maxListeners: undefined
_outOfOrderPackets: (0) []
_paused: false
_paused_packets: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_pool: Pool {_events: {…}, _eventsCount: 0, _maxListeners: undefined, config: PoolConfig, _allConnections: Denque, …}
_protocolError: null
_statements: LRUCache {#initializeTTLTracking: ƒ, #initializeSizeTracking: ƒ, #indexes: ƒ, #rindexes: ƒ, #isValidIndex: ƒ, …}
authorized: true
clientEncoding: 'utf8'
compressedSequenceId: 0
config: ConnectionConfig {isServer: undefined, stream: undefined, host: '127.0.0.1', port: 3306, localAddress: undefined, …}
connectionId: 16
connectTimeout: null
fatalError: ƒ fatalError() {\n    return this._fatalError;\n  }
lastActiveTime: 1691933977412
packetParser: PacketParser {buffer: Array(0), bufferLength: 0, packetHeaderLength: 4, headerLen: 0, length: 5, …}
sequenceId: 35
serverCapabilityFlags: 3758096383
serverEncoding: 'utf8'
stream: Socket {connecting: false, _hadError: false, _parent: null, _host: null, _closeAfterHandlingError: false, …}
Symbol(kCapture): false
threadId: 16
[[Prototype]]: Connection
1: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
2: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
_authPlugin: data => {\n    const authPluginData1 = data.slice(0, 8);\n    const authPluginData2 = data.slice(8, 20);\n    let authToken;\n    if (passwordSha1) {\n      authToken = auth41.calculateTokenFromPasswordSha(\n        passwordSha1,\n        authPluginData1,\n        authPluginData2\n      );\n    } else {\n      authToken = auth41.calculateToken(\n        password,\n        authPluginData1,\n        authPluginData2\n      );\n    }\n    return authToken;\n  }
_command: undefined
_commands: Denque {_capacity: undefined, _head: 5, _tail: 5, _capacityMask: 31, _list: Array(32)}
_events: {end: ƒ, error: Array(2)}
_eventsCount: 2
_fatalError: null
_handshakePacket: Handshake {protocolVersion: 10, serverVersion: '8.0.33', capabilityFlags: 3758096383, connectionId: 15, authPluginData1: Buffer(8), …}
_internalId: 4
_maxListeners: undefined
_outOfOrderPackets: (0) []
_paused: false
_paused_packets: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_pool: Pool {_events: {…}, _eventsCount: 0, _maxListeners: undefined, config: PoolConfig, _allConnections: Denque, …}
_protocolError: null
_statements: LRUCache {#initializeTTLTracking: ƒ, #initializeSizeTracking: ƒ, #indexes: ƒ, #rindexes: ƒ, #isValidIndex: ƒ, …}
authorized: true
clientEncoding: 'utf8'
compressedSequenceId: 0
config: ConnectionConfig {isServer: undefined, stream: undefined, host: '127.0.0.1', port: 3306, localAddress: undefined, …}
connectionId: 15
connectTimeout: null
fatalError: ƒ fatalError() {\n    return this._fatalError;\n  }
lastActiveTime: 1691933977408
packetParser: PacketParser {buffer: Array(0), bufferLength: 0, packetHeaderLength: 4, headerLen: 0, length: 5, …}
sequenceId: 69
serverCapabilityFlags: 3758096383
serverEncoding: 'utf8'
stream: Socket {connecting: false, _hadError: false, _parent: null, _host: null, _closeAfterHandlingError: false, …}
Symbol(kCapture): false
threadId: 15
[[Prototype]]: Connection
3: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
_authPlugin: data => {\n    const authPluginData1 = data.slice(0, 8);\n    const authPluginData2 = data.slice(8, 20);\n    let authToken;\n    if (passwordSha1) {\n      authToken = auth41.calculateTokenFromPasswordSha(\n        passwordSha1,\n        authPluginData1,\n        authPluginData2\n      );\n    } else {\n      authToken = auth41.calculateToken(\n        password,\n        authPluginData1,\n        authPluginData2\n      );\n    }\n    return authToken;\n  }
_command: undefined
_commands: Denque {_capacity: undefined, _head: 3, _tail: 3, _capacityMask: 31, _list: Array(32)}
_events: {end: ƒ, error: Array(2)}
_eventsCount: 2
_fatalError: null
_handshakePacket: Handshake {protocolVersion: 10, serverVersion: '8.0.33', capabilityFlags: 3758096383, connectionId: 16, authPluginData1: Buffer(8), …}
_internalId: 6
_maxListeners: undefined
_outOfOrderPackets: (0) []
_paused: false
_paused_packets: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_pool: Pool {_events: {…}, _eventsCount: 0, _maxListeners: undefined, config: PoolConfig, _allConnections: Denque, …}
_protocolError: null
_statements: LRUCache {#initializeTTLTracking: ƒ, #initializeSizeTracking: ƒ, #indexes: ƒ, #rindexes: ƒ, #isValidIndex: ƒ, …}
authorized: true
clientEncoding: 'utf8'
compressedSequenceId: 0
config: ConnectionConfig {isServer: undefined, stream: undefined, host: '127.0.0.1', port: 3306, localAddress: undefined, …}
connectionId: 16
connectTimeout: null
fatalError: ƒ fatalError() {\n    return this._fatalError;\n  }
lastActiveTime: 1691933977412
packetParser: PacketParser {buffer: Array(0), bufferLength: 0, packetHeaderLength: 4, headerLen: 0, length: 5, …}
sequenceId: 35
serverCapabilityFlags: 3758096383
serverEncoding: 'utf8'
stream: Socket {connecting: false, _hadError: false, _parent: null, _host: null, _closeAfterHandlingError: false, …}
Symbol(kCapture): false
threadId: 16
[[Prototype]]: Connection
4: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
5: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
_authPlugin: data => {\n    const authPluginData1 = data.slice(0, 8);\n    const authPluginData2 = data.slice(8, 20);\n    let authToken;\n    if (passwordSha1) {\n      authToken = auth41.calculateTokenFromPasswordSha(\n        passwordSha1,\n        authPluginData1,\n        authPluginData2\n      );\n    } else {\n      authToken = auth41.calculateToken(\n        password,\n        authPluginData1,\n        authPluginData2\n      );\n    }\n    return authToken;\n  }
_command: undefined
_commands: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_events: {end: ƒ, error: Array(2)}
_eventsCount: 2
_fatalError: null
_handshakePacket: Handshake {protocolVersion: 10, serverVersion: '8.0.33', capabilityFlags: 3758096383, connectionId: 17, authPluginData1: Buffer(8), …}
_internalId: 5
_maxListeners: undefined
_outOfOrderPackets: (0) []
_paused: false
_paused_packets: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_pool: Pool {_events: {…}, _eventsCount: 0, _maxListeners: undefined, config: PoolConfig, _allConnections: Denque, …}
_protocolError: null
_statements: LRUCache {#initializeTTLTracking: ƒ, #initializeSizeTracking: ƒ, #indexes: ƒ, #rindexes: ƒ, #isValidIndex: ƒ, …}
authorized: true
clientEncoding: 'utf8'
compressedSequenceId: 0
config: ConnectionConfig {isServer: undefined, stream: undefined, host: '127.0.0.1', port: 3306, localAddress: undefined, …}
connectionId: 17
connectTimeout: null
fatalError: ƒ fatalError() {\n    return this._fatalError;\n  }
lastActiveTime: 1691933976243
packetParser: PacketParser {buffer: Array(0), bufferLength: 0, packetHeaderLength: 4, headerLen: 0, length: 5, …}
sequenceId: 9
serverCapabilityFlags: 3758096383
serverEncoding: 'utf8'
stream: Socket {connecting: false, _hadError: false, _parent: null, _host: null, _closeAfterHandlingError: false, …}
Symbol(kCapture): false
threadId: 17
[[Prototype]]: Connection
6: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
...
99: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
[[Prototype]]: Object
[100..199]:
...
[1000..1023]:
> this._freeConnections._list[0]=== this._freeConnections._list[3]
true
> this._freeConnections._list[5]=== this._freeConnections._list[4]
true

When instance 0 called the following code:

//pool.js
this._freeConnections.get(0).destroy();

//pool_connection.js
destroy() {
  this._removeFromPool();
  super.destroy();
}

_removeFromPool() {
  if (!this._pool || this._pool._closed) {
    return;
  }
  const pool = this._pool;
  this._pool = null;
  pool._removeConnection(this);
}

This logic set the connection's this._pool to null. Consequently, this also caused the _pool attribute of _freeConnections[3] to become null.

When _removeIdleTimeoutConnectionsTimer is running with index 3, execute the same code as mentioned above.

Here's where the problem occurred: since [3] and the original [0] were the same instance and [3]'s this._pool was null, it returned directly, leaving [3] in the _freeConnections list. This led to an infinite loop, causing the CPU consumption to spike to 100%.

> this._freeConnections
Denque {_capacity: undefined, _head: 1, _tail: 578, _capacityMask: 1023, _list: Array(1024)}
_capacity: undefined
_capacityMask: 1023
_head: 1
_list: (1024) [undefined, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, PoolConnection, …]
length: 1024
[[Prototype]]: Array(0)
[0..99]:
0: undefined
1: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
2: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
3: PoolConnection {_events: {…}, _eventsCount: 2, _maxListeners: undefined, config: ConnectionConfig, stream: Socket, …}
_authPlugin: data => {\n    const authPluginData1 = data.slice(0, 8);\n    const authPluginData2 = data.slice(8, 20);\n    let authToken;\n    if (passwordSha1) {\n      authToken = auth41.calculateTokenFromPasswordSha(\n        passwordSha1,\n        authPluginData1,\n        authPluginData2\n      );\n    } else {\n      authToken = auth41.calculateToken(\n        password,\n        authPluginData1,\n        authPluginData2\n      );\n    }\n    return authToken;\n  }
_closing: true
_command: undefined
_commands: Denque {_capacity: undefined, _head: 3, _tail: 3, _capacityMask: 31, _list: Array(32)}
_events: {end: ƒ, error: Array(2)}
_eventsCount: 2
_fatalError: null
_handshakePacket: Handshake {protocolVersion: 10, serverVersion: '8.0.33', capabilityFlags: 3758096383, connectionId: 16, authPluginData1: Buffer(8), …}
_internalId: 6
_maxListeners: undefined
_outOfOrderPackets: (0) []
_paused: false
_paused_packets: Denque {_capacity: undefined, _head: 0, _tail: 0, _capacityMask: 3, _list: Array(4)}
_pool: null
_protocolError: null
_statements: LRUCache {#initializeTTLTracking: ƒ, #initializeSizeTracking: ƒ, #indexes: ƒ, #rindexes: ƒ, #isValidIndex: ƒ, …}
addCommand: ƒ _addCommandClosedState(cmd) {\n    const err = new Error(\n      "Can't add new command when connection is in closed state"\n    );\n    err.fatal = true;\n    if (cmd.onResult) {\n      cmd.onResult(err);\n    } else {\n      this.emit('error', err);\n    }\n  }
authorized: true
clientEncoding: 'utf8'
compressedSequenceId: 0
config: ConnectionConfig {isServer: undefined, stream: undefined, host: '127.0.0.1', port: 3306, localAddress: undefined, …}
connectionId: 16
connectTimeout: null
fatalError: ƒ fatalError() {\n    return this._fatalError;\n  }
lastActiveTime: 1691933977412
packetParser: PacketParser {buffer: Array(0), bufferLength: 0, packetHeaderLength: 4, headerLen: 0, length: 5, …}
sequenceId: 35
serverCapabilityFlags: 3758096383
serverEncoding: 'utf8'
stream: Socket {connecting: false, _hadError: false, _parent: null, _host: null, _closeAfterHandlingError: false, …}
Symbol(kCapture): false
threadId: 16
[[Prototype]]: Connection
...

Changes Made

  • Added checks to validate connection identity before removal.
  • Fixed the removal logic for idle connections to prevent duplicate instances.

Testing

I have run all the tests locally, and all of them passed successfully without any errors or failures. The detailed output of the tests is as follows:

[0:01:02 0 125/126 99.2% node test/integration/connection/encoding/test-track-encodings.js] 
  
  Debugger listening on ws://127.0.0.1:53103/46f1aa84-9854-4226-8073-8e58ae7912c9
  For help, see: https://nodejs.org/en/docs/inspector
  Debugger attached.
  Waiting for the debugger to disconnect...
  
[0:01:02 0 126/126 100.0% node test/integration/connection/encoding/test-track-encodings.js]
About to exit with code: 0
Waiting for the debugger to disconnect...

@wellwelwel
Copy link
Collaborator

wellwelwel commented Sep 6, 2023

Thanks @ksdaylight, 🙋🏻‍♂️

Is there a way to reproduce this error (through tests or a basic repro)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants