Bluetooth: Host: ATT: Zephyr host is deadlocked due to the use of the low-priority system work queue to block the high-priority BT RX thread. #78761

@pin-zephyr

Description

Preconditions:

  1. No use of EATT on the device Peripheral_P1
  2. CONFIG_BT_ATT_TX_COUNT=8 on Peripheral_P1 (req_slab available slot = 8)

Steps to reproduce:

  1. Peripheral_P1 sends a "READ request" to Phone_A, and Phone_A does not reply right away. There is therefore an ongoing transaction on the attribute channel, and Peripheral_P1 cannot send any GATT indication before it receives the "READ response" from Phone_A.
    (req_slab available slot = 7)

  2. Phone_A writes to a characteristic on Peripheral_P1, which causes Peripheral_P1 to try to send 8 GATT indications to Phone_A.

  3. The first 7 outgoing GATT indications are accepted and queued in a system list, since Phone_A has not yet replied to the "READ request". (req_slab available slot = 0)

  4. The 8th outgoing GATT indication blocks the system workqueue inside the gatt_req_alloc() call for 30 seconds (BT_ATT_TIMEOUT), because there are no free slots left in req_slab.

  5. The next incoming ATT read/write request is received by the BT RX thread. Before processing any new data for this connection, the BT RX thread notifies any pending TX, and that notification submits a work item to the system workqueue.

However, the system workqueue is already blocked for 30 seconds in step 4, so the Zephyr host deadlocks here: the high-priority BT RX thread ends up waiting on work that the low-priority system workqueue cannot run. The deadlock resolves after the 30-second ATT timeout, but by then the link is unusable anyway, because the BT spec allows no further ATT communication on it. As a result, Peripheral_P1 sees one or more disconnects, each preceded by 30 seconds of unresponsiveness.
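
The slot accounting in steps 1 to 5 can be replayed with a small standalone model. This is only a sketch for illustration, not Zephyr code: the struct and every `model_*` function are hypothetical names, and the model only counts free req_slab slots and records whether the system workqueue would be blocked. It does not reproduce the real threads, priorities, or the 30-second timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not Zephyr code. Mirrors CONFIG_BT_ATT_TX_COUNT=8. */
#define ATT_TX_COUNT 8

struct att_model {
    int free_slots;        /* free entries in the modelled req_slab */
    bool sysworkq_blocked; /* set once an allocation would have to wait */
    int rx_work_pending;   /* work submitted by RX that cannot run yet */
};

static void model_init(struct att_model *m)
{
    m->free_slots = ATT_TX_COUNT;
    m->sysworkq_blocked = false;
    m->rx_work_pending = 0;
}

/* Models a request allocation issued from the system workqueue.
 * Returns true if a slot was taken, false if the caller would block. */
static bool model_req_alloc(struct att_model *m)
{
    assert(m->free_slots >= 0);
    if (m->free_slots == 0) {
        m->sysworkq_blocked = true; /* step 4: workqueue stuck waiting */
        return false;
    }
    m->free_slots--;
    return true;
}

/* Models the RX thread submitting its "notify pending TX" work item.
 * Returns true if the work could run, false if it is stuck behind
 * the blocked workqueue. */
static bool model_rx_submit_work(struct att_model *m)
{
    if (m->sysworkq_blocked) {
        m->rx_work_pending++;
        return false;
    }
    return true;
}

/* Replays steps 1-5 and returns how many indications were queued
 * before an allocation would have blocked. */
static int model_replay(struct att_model *m, int indications)
{
    int queued = 0;

    model_init(m);
    (void)model_req_alloc(m); /* step 1: outstanding READ request */

    for (int i = 0; i < indications; i++) { /* steps 2-4 */
        if (!model_req_alloc(m)) {
            break;
        }
        queued++;
    }

    (void)model_rx_submit_work(m); /* step 5 */
    return queued;
}
```

Calling `model_replay(&m, 8)` on this model queues 7 indications, leaves `free_slots` at 0, and marks the workqueue as blocked with one RX work item pending. With 7 indications instead of 8, the model never blocks, which matches the report's observation that the problem appears only once the pending READ request plus the indications exceed the 8 available slots.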
