Skip to content

Conversation

@zhongzhijie1
Copy link
Contributor

@zhongzhijie1 zhongzhijie1 commented Oct 29, 2025

Introduce bt_gatt_cb_unregister() to unregister callbacks.

Replace callback_list traversal with SYS_SLIST_FOR_EACH_CONTAINER_SAFE()

@alwa-nordic
Copy link
Contributor

This follows the established pattern in the Host, so it might as well go in.

But I am concerned about this pattern, so I would like to take this opportunity to ask both the author and anyone else: Looking at the documentation of this function, do you expect cb to not be invoked after unregister returns? Please vote with 🎉 for no, and 🚀 for yes now 🙂.

Both 'no' and 'yes' are sometimes desirable. The solution may be to offer both non-blocking removal and a drain operation. Linux calls this RCU, I believe.

I am also tempted to insist we make this function thread-safe at least. I think the Bluetooth WG committed at some point that new code has to be thread-safe. (And we fix other code as we go.) Does that sound right @jhedberg?

@alwa-nordic alwa-nordic requested a review from Copilot October 31, 2025 17:24
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds a new API function bt_gatt_cb_unregister() to allow applications to unregister previously registered GATT callbacks, providing symmetry with the existing bt_gatt_cb_register() function.

  • Adds bt_gatt_cb_unregister() function implementation in gatt.c
  • Adds public API declaration with documentation in gatt.h

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
subsys/bluetooth/host/gatt.c Implements the unregister function using sys_slist_find_and_remove()
include/zephyr/bluetooth/gatt.h Adds public API declaration and documentation for the unregister function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Thalley
Copy link
Contributor

Thalley commented Nov 4, 2025

This follows the established pattern in the Host, so it might as well go in.

But I am concerned about this pattern, so I would like to take this opportunity to ask both the author and anyone else: Looking at the documentation of this function, do you expect cb to not be invoked after unregister returns? Please vote with 🎉 for no, and 🚀 for yes now 🙂.

Both 'no' and 'yes' are sometimes desirable. The solution may be to offer both non-blocking removal and a drain operation. Linux calls this RCU, I believe.

I am also tempted to insist we make this function thread-safe at least. I think the Bluetooth WG committed at some point that new code has to be thread-safe. (And we fix other code as we go.) Does that sound right @jhedberg?

This is a fair point. cb may indeed be called after the function returns. If the unregistered callback structure is dynamically allocated and free'd immediately after return, this may have fatal and hard-to-debug issues. Ideally we'd "drain" the processing of the callback lists, but then we'd also need to handle the case where bt_gatt_cb_unregister is called in one of the callbacks.

@jhedberg jhedberg added this to the v4.4.0 milestone Nov 4, 2025
@zephyrbot zephyrbot added the Release Notes To be mentioned in the release notes label Nov 5, 2025
@zephyrbot zephyrbot requested review from cfriedt and kartben November 5, 2025 07:19
@zhongzhijie1 zhongzhijie1 requested a review from Thalley November 5, 2025 07:26
Comment on lines +1637 to +1638
if (!sys_slist_find_and_remove(&callback_list, &cb->node)) {
return -ENOENT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that callback_list is only used in bt_gatt_att_max_mtu_changed as

void bt_gatt_att_max_mtu_changed(struct bt_conn *conn, uint16_t tx, uint16_t rx)
{
	struct bt_gatt_cb *cb;

	SYS_SLIST_FOR_EACH_CONTAINER(&callback_list, cb, node) {
		if (cb->att_mtu_updated) {
			cb->att_mtu_updated(conn, tx, rx);
		}
	}
}

I think we should probably change that to use SYS_SLIST_FOR_EACH_CONTAINER_SAFE in the case that bt_gatt_cb_unregister is called while in a callback.
Alternatively we should reject calls to bt_gatt_cb_unregister while we are iterating on callback_list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alwa-nordic how do you suggest we handle thread safetyness here?

Guard all access with a mutex or semaphore? (which is always a bugger when application callbacks are involved...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int bt_gatt_cb_register(struct bt_gatt_cb *cb)
{
...

    k_mutex_lock(&callback_list_lock, K_FOREVER);
    sys_slist_append(&callback_list, &cb->node);
    k_mutex_unlock(&callback_list_lock);

    return 0;
}
int bt_gatt_cb_unregister(struct bt_gatt_cb *cb)
{
...

    k_mutex_lock(&callback_list_lock, K_FOREVER);
    bool removed = sys_slist_find_and_remove(&callback_list, &cb->node);
    k_mutex_unlock(&callback_list_lock);

    return removed ? 0 : -ENOENT;
    
...
}

Something like this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few issue with the above suggestion:

  1. We also need to guard the list while iterating on it for the callbacks to prevent the list from changing while iterating on it, and to fix the issue where the callbacks may be called after bt_gatt_cb_unregister due to a race condition as mentioned in host: Implement bt_gatt_cb_unregister method for sys_slist_find_and_remove. #98458 (comment)
  2. In this specific case, the list is very simple, but if we want to apply the same pattern to other callback lists, we may need to consider that the callbacks may be called in both the BT RX thread and system workqueue thread, and using a mutex/semaphore in these cases could cause deadlocks
  3. In the case that bt_gatt_cb_unregister is called while in a callback (assuming 1) has been implemented), then it would also trigger a deadlock with k_mutex_lock(&callback_list_lock, K_FOREVER);

We have several of these callback lists in the existing codebase, and while I agree with @alwa-nordic that we should do what we can to be thread safe, I would also be OK postponing implementing the solution, as it's not specific to what you are adding here, but rather a generic and unsolved issue in the existing codebase

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it is better to implement the proper thread-safe solution later with a generic consensus of how to solve this. There are quite few places that need to be fixed as well.

Copy link
Contributor

@alwa-nordic alwa-nordic Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of a way: The iterating function adds a zipper element to the front of the list. The zipper element is then moved forwards each iteration. At the end of the loop the zipper element falls off. The zipper element is a sure way for the iterating function to have a 'current' element that will not be removed by someone else. It is now safe for the iterating function to release the list mutex while invoking the callback. It's also completely safe and non-blocking for any threads, even the callback itself, to add or remove elements on the list. Patents pending 😉

The zipper element can be very minimal: We can use sys_sflist_t and the zipper element can be just a sys_sfnode_t. Tiny, so it can be allocated on the stack of the iterating function, no matter the size of the actual elements on the list. The zippers are flagged. This allows the iterating functions to know that an element on the list is not real, but just a zipper from concurrently iterating thread that must be skipped. Naturally, since the zipper is a bookmark in the list, the iteration can be performed by asynchronous functions. The iterating function can of course also end early and remove the zipper from the middle.

Of course, instead of a mutex we can use any other type of lock. The choice depends what properties we want the lock to have.

I can make a demo for this. Should we go for it?

Copy link
Contributor

@alwa-nordic alwa-nordic Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I also question the need for truly dynamic lists in embedded. We would in many cases would be better served by a compile-time-collected array and only a runtime bitfield that marks elements active. This would be a very RAM-efficient way to have 'subsets' of a compile-time-collected set.

Callback lists is one of the areas which I think would be better served by 'subsets'. Rarely do you not know at compile-time that a particular function may be registered as a callback. And rarely are there so many of these functions that it becomes inefficient to traverse a bitfield that includes the disabled ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a point w.r.t. runtime lists, and we could probably rely on solutions like

/**
 *  @brief Register a callback structure for connection events.
 *
 *  @param _name Name of callback structure.
 */
#define BT_CONN_CB_DEFINE(_name)					\
	static const STRUCT_SECTION_ITERABLE(bt_conn_cb,		\
						_CONCAT(bt_conn_cb_,	\
							_name))

For callback structs, as the overhead is minimum. It becomes a bit more tricky when the callback list is part of another struct, which is the case for e.g. L2CAP and ISO that have ops for the application-allocated channels, but they are rarely/never slists, so they are not iterated on in the same way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of a way: The iterating function adds a zipper element to the front of the list. The zipper element is then moved forwards each iteration. At the end of the loop the zipper element falls off. The zipper element is a sure way for the iterating function to have a 'current' element that will not be removed by someone else. It is now safe for the iterating function to release the list mutex while invoking the callback. It's also completely safe and non-blocking for any threads, even the callback itself, to add or remove elements on the list. Patents pending 😉

The zipper element can be very minimal: We can use sys_sflist_t and the zipper element can be just a sys_sfnode_t. Tiny, so it can be allocated on the stack of the iterating function, no matter the size of the actual elements on the list. The zippers are flagged. This allows the iterating functions to know that an element on the list is not real, but just a zipper from concurrently iterating thread that must be skipped. Naturally, since the zipper is a bookmark in the list, the iteration can be performed by asynchronous functions. The iterating function can of course also end early and remove the zipper from the middle.

Of course, instead of a mutex we can use any other type of lock. The choice depends what properties we want the lock to have.

I can make a demo for this. Should we go for it?

I'm not sure I fully follow this, or understand how it differs from the _SAFE versions of the linked list iterators. Do you have any resources or similar to help explain?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Thalley , if I understand @alwa-nordic correctly, _SAFE won't save you from removing an entry by another thread and we still need to "wrap" the loop with a critical section which will include wrapping the callback. With the zipper you can release the lock when calling the callback because zipper itself can't be removed by any other threads.

@zhongzhijie1 zhongzhijie1 force-pushed the gatt_unreg branch 2 times, most recently from 4deea23 to 3580adf Compare November 6, 2025 02:55
@zhongzhijie1 zhongzhijie1 requested a review from Thalley November 6, 2025 02:59
Copy link
Contributor

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your commit message title should be something like
bluetooth: GATT: Implement bt_gatt_cb_unregister - The existing commit title is failing our compliance check

@zhongzhijie1 zhongzhijie1 force-pushed the gatt_unreg branch 2 times, most recently from 210025d to 39f155d Compare November 6, 2025 10:12
@zhongzhijie1 zhongzhijie1 requested a review from Thalley November 6, 2025 11:26
Copy link
Contributor

@Thalley Thalley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with the current changes. It follows existing patterns (which have been pointed out to be less than ideal), but it doesn't introduce any new issues that we don't already have.

One request I do want to make, is that we use this new API somewhere in the tree; either in a sample or test (or perhaps both) so that we can easily verify that it works as intended. Ideally it would be added with unittests as well to provide test coverage

…acks

New API bt_gatt_cb_unregister, use _SAFE iteration for callback list.

Signed-off-by: Zhijie Zhong <zhongzhijie1@xiaomi.com>
call unregister func, Mark TODO for additional cleanup.

Signed-off-by: Zhijie Zhong <zhongzhijie1@xiaomi.com>
@sonarqubecloud
Copy link

@zhongzhijie1
Copy link
Contributor Author

I will reproduce the CI failures locally and fix them accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: Bluetooth Audio area: Bluetooth Host Bluetooth Host (excluding BR/EDR) area: Bluetooth area: Tests Issues related to a particular existing or missing test Release Notes To be mentioned in the release notes

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

6 participants