Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bluetooth: Mesh: Add support for provisioning remote devices #17729

Merged
merged 4 commits into from
Oct 30, 2019

Conversation

tsvehagen
Copy link
Collaborator

@tsvehagen tsvehagen commented Jul 23, 2019

Known limitations:

  • Only AUTH_METHOD_NO_OOB is supported currently. I think this is good enough for now and support for OOB can be added in another PR.
  • Add a mesh_provisioner sample to samples/bluetooth I think mesh_shell is good enough.
  • Add documentation
  • Only PB-ADV is supported. I think this is also good enough for now and support for PB-GATT can be added in another PR

Questions:

  1. Start and PubKey messages should be sent right after each other and currently the sending of the PubKey is triggered from gen_prov_ack() when the ACK for the Start message is received. This feels like a weird placement.. any other ideas for handling this?

  2. When provisioning is complete, the specification states that a LINK_CLOSE message should be sent at least three times. Any ideas how to handle this in a nice way? I guess we should not reset the link on Complete but rather when all the close messages have been sent. FIXED

  3. Might it be a good idea to rename bt_mesh.dev_key to bt_mesh.local_dev_key and bt_mesh.dev_keys to bt_mesh.remote_dev_keys? FIXED

  4. Currently there is no way to build provisioner support but not device support. It could be possible to keep BT_MESH_PROV option and then have a BT_MESH_PROV_DEVICE default to y and a BT_MESH_PROV_PROVISIONER default to n. BT_MESH_PROV would include the common stuff while the others would be specific things for device and provisioner. Is this necessary or should device support always be enabled when BT_MESH_PROV is enabled?

  5. Storing of AppKey and NetKey supports being written by a timer. I guess it is to minimize flash operations. This is currently not implemented for DevKey storage. Should this be done and if so should I try to modify and reuse the key_update struct for DevKeys as well? FIXED

Test it:

I have used the mesh shell together with a nrf52840_pca10056 board and a nrf52_adafruit_feather board for testing. Simply build one with the BT_MESH_PROVISIONER option enabled.

On device:
mesh init
mesh uuid 1234
mesh pb-adv on

On provisioner:
mesh init
mesh provision 0 1
mesh provision-adv 1234 0 2 0
mesh get-comp

@zephyrbot
Copy link
Collaborator

zephyrbot commented Jul 23, 2019

All checks are passing now.

Review history of this comment for details about previous failed status.
Note that some checks might have not completed yet.

@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch 2 times, most recently from c515342 to c1161b2 Compare July 24, 2019 06:36
@trond-snekvik
Copy link
Contributor

Some input:

I think provision_remote might be a bad name for provisioning other devices with PB-adv. The SIG wants to reserve this keyword for devices that are somewhere else in the mesh network, instead of right next to the provisioner, so that in the future, we can tunnel provisioning through the mesh. Ideally, bt_mesh_provision should probably be bt_mesh_provision_self(), so that this new API could take its place, but I guess it's a bit late for that. Perhaps bt_mesh_provision_other, _by_uuid or _node could be a good compromise?

IMO question 3 should be handled by changing the new bt_mesh_dev_key structure to something like bt_mesh_node. It should probably align with the bt_mesh_provision_remote API name. The dev key is just one part of the structure, which actually represents a whole node, so I found the current naming a bit confusing. I suppose this structure could also replace the two first input parameters of the config client API functions, to avoid leaking the netkey-to-addr binding.

+1 for Q5.

Works very well on my nrf52_pca10040 setup. Confirmed working with the Nordic Mesh SDK as well.

@tsvehagen
Copy link
Collaborator Author

Good input!

I think provision_remote might be a bad name for provisioning other devices with PB-adv. The SIG wants to reserve this keyword for devices that are somewhere else in the mesh network, instead of right next to the provisioner, so that in the future, we can tunnel provisioning through the mesh. Ideally, bt_mesh_provision should probably be bt_mesh_provision_self(), so that this new API could take its place, but I guess it's a bit late for that. Perhaps bt_mesh_provision_other, _by_uuid or _node could be a good compromise?

Totally agree on the bt_mesh_provision_self() part but like you say, it might be a bit too late for that change :/ I think I like the _node name best but any of the above is fine by me. I'll wait for some more feedback.

IMO question 3 should be handled by changing the new bt_mesh_dev_key structure to something like bt_mesh_node. It should probably align with the bt_mesh_provision_remote API name. The dev key is just one part of the structure, which actually represents a whole node, so I found the current naming a bit confusing.

Yes, that name is a bit confusing and I like the bt_mesh_node name. Should probably change the setting DevKey to Node as well then. Do you think the bt_mesh_dev_key_find should be renamed to bt_mesh_node_find as well?

I suppose this structure could also replace the two first input parameters of the config client API functions, to avoid leaking the netkey-to-addr binding.

Do you mean passing in a struct bt_mesh_node to functions like bt_mesh_cfg_comp_data_get?

+1 for Q5.

I'll look into that then.

Works very well on my nrf52_pca10040 setup. Confirmed working with the Nordic Mesh SDK as well.

Nice to hear!

@jhedberg
Copy link
Member

Ideally, bt_mesh_provision should probably be bt_mesh_provision_self(), so that this new API could take its place, but I guess it's a bit late for that

@trond-snekvik I don’t think it’s too late. That API has mainly been for internal use (called by PB-ADV or PB-GATT) and only used directly by apps for demo purposes. So I wouldn’t be against renaming it. That said, the new API for provisioning other devices should probably still get its own suffix. Btw, IIRC the spec uses “device” for unprovisioned entities and “node” for provisioned ones. I.e the act of provisioning makes a device into a node.

I haven’t thought too deeply about this, but wouldn’t it make sense to name the APIs based on provisioning bearer? So we’d have provision_gatt() provision_adv() and then the current provision() would become provision_self() or provision_ local().

@trond-snekvik
Copy link
Contributor

Do you think the bt_mesh_dev_key_find should be renamed to bt_mesh_node_find as well?

IMO yes, this'll make more sense when the usage of the node structure expands in the future (to config client or the configuration database @jhedberg mentioned in #17454, for instance).

Do you mean passing in a struct bt_mesh_node to functions like bt_mesh_cfg_comp_data_get?

Yes, the cfg_cli.h signatures could look like int bt_mesh_cfg_comp_data_get(struct bt_mesh_node * node, u8_t page, u8_t *status, struct net_buf_simple *comp);.
Since the first two parameters of the cfg_cli functions always have to be used with the correct combination of netkey and address, the application will have to either store them as a pair after they come out of the provisioning_complete callback, or pick them out of the dev_key array with the bt_mesh_dev_key_find() function. Either way, the current cfg_cli API leaks this binding, increasing complexity in the user app and opening up for more bugs, and the bt_mesh_dev_key structure could alleviate this. It could also help make it less awkward to configure its own server.
This would change would break the API, and it's probably out of scope for this PR. I'm not confident that it's worth doing, but I thought it might be worth discussing :)

Related: The provisioning_complete callback is now serving two purposes: being provisioned and provisioning others. This should probably either be reflected in its documentation, or a separate callback should be made for the provisioner, preferably with the struct bt_mesh_dev_key as a parameter instead of the current netkey-address tuple. The need for the new is_provisioned flag in shell.c highlights this problem, I think.

wouldn’t it make sense to name the APIs based on provisioning bearer?

I like this, it'll future proof the API for new bearers, and fits well with the spec.

@tsvehagen
Copy link
Collaborator Author

Do you think the bt_mesh_dev_key_find should be renamed to bt_mesh_node_find as well?

IMO yes, this'll make more sense when the usage of the node structure expands in the future (to config client or the configuration database @jhedberg mentioned in #17454, for instance).

Ok I'll go for that.

Do you mean passing in a struct bt_mesh_node to functions like bt_mesh_cfg_comp_data_get?

Yes, the cfg_cli.h signatures could look like int bt_mesh_cfg_comp_data_get(struct bt_mesh_node * node, u8_t page, u8_t *status, struct net_buf_simple *comp);.

Sounds like a plan but like you say, it probably belongs in another PR.

Related: The provisioning_complete callback is now serving two purposes: being provisioned and provisioning others. This should probably either be reflected in its documentation, or a separate callback should be made for the provisioner, preferably with the struct bt_mesh_dev_key as a parameter instead of the current netkey-address tuple. The need for the new is_provisioned flag in shell.c highlights this problem, I think.

Yes that is the reason for the is_provisioned flag in the shell. Either adding another callback or a bool local flag to the existing one sounds like a good idea. Is it better with a new callback?

wouldn’t it make sense to name the APIs based on provisioning bearer?

I like this, it'll future proof the API for new bearers, and fits well with the spec.

Yea I also think that makes sense. Good idea @jhedberg.

@trond-snekvik
Copy link
Contributor

Either adding another callback or a bool local flag to the existing one sounds like a good idea. Is it better with a new callback?

That's what I would do, but I'm still taking in The Zephyr Way™️ of doing things, so your judgement is probably better than mine 😄

@tsvehagen
Copy link
Collaborator Author

Either adding another callback or a bool local flag to the existing one sounds like a good idea. Is it better with a new callback?

That's what I would do, but I'm still taking in The Zephyr Waytm of doing things, so your judgement is probably better than mine

Haha I doubt that! Let's wait and see what @jhedberg has to say about it :)

@jhedberg
Copy link
Member

jhedberg commented Aug 13, 2019

Either adding another callback or a bool local flag to the existing one sounds like a good idea. Is it better with a new callback?

That's what I would do, but I'm still taking in The Zephyr Waytm of doing things, so your judgement is probably better than mine

Haha I doubt that! Let's wait and see what @jhedberg has to say about it :)

Definitely a new callback 😄 It could also be an ephemeral one given as a parameter to the provision() API rather than being registered separately.

@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch 2 times, most recently from 92f0cff to dea1b35 Compare August 20, 2019 20:40
@tsvehagen
Copy link
Collaborator Author

tsvehagen commented Aug 20, 2019

Either adding another callback or a bool local flag to the existing one sounds like a good idea. Is it better with a new callback?

That's what I would do, but I'm still taking in The Zephyr Waytm of doing things, so your judgement is probably better than mine

Haha I doubt that! Let's wait and see what @jhedberg has to say about it :)

Definitely a new callback It could also be an ephemeral one given as a parameter to the provision() API rather than being registered separately.

I added a complete_node callback for now at least. I did not change the old callback but maybe that should be called complete_self if we also choose to rename bt_mesh_provision to bt_mesh_provision_self.

@carlescufi
Copy link
Member

@tsvehagen thanks for the PR. We are going to release 2.0 in a few days, and after that the merge window will be open again for merging PRs that are not bugfixes.
Do you have an estimate as to when you will be able to continue the work here? I am setting the milestone for 2.1 tentatively in the meantime.

@carlescufi carlescufi added this to the v2.1.0 milestone Aug 29, 2019
@tsvehagen
Copy link
Collaborator Author

@tsvehagen thanks for the PR. We are going to release 2.0 in a few days, and after that the merge window will be open again for merging PRs that are not bugfixes.
Do you have an estimate as to when you will be able to continue the work here? I am setting the milestone for 2.1 tentatively in the meantime.

I'm back from vacation and can kind of continue whenever. I think I need some more feedback from @jhedberg on what is needed to get it merged.

Milestone 2.1 sounds good to me.

Copy link
Member

@jhedberg jhedberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this looks very promising, thanks! It seems we may indeed be able to do everything necessary in a single prov.c file.

subsys/bluetooth/mesh/prov.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/prov.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/main.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/Kconfig Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/Kconfig Outdated Show resolved Hide resolved
include/bluetooth/mesh/main.h Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/shell.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/prov.c Outdated Show resolved Hide resolved
@jhedberg jhedberg added the Enhancement Changes/Updates/Additions to existing features label Sep 4, 2019
@tsvehagen
Copy link
Collaborator Author

@tsvehagen could you please rebase this PR and reply to the unanswered review comments? Thanks!

I'm working on it, will hopefully find some more time tomorrow.

@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch from ee08823 to 0d403f0 Compare October 12, 2019 07:00
@tsvehagen
Copy link
Collaborator Author

@trond-snekvik I changed the bt_mesh_app_key_get() that you recently added. I added an addr argument in order to know what device key to fetch. Can you please check that unseg_app_sdu_unpack() still makes sense. is it possible to forward friend-messages that are encrypted with a device-key?

@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch 2 times, most recently from b046c76 to cda4559 Compare October 12, 2019 13:10
@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch from cda4559 to 0b28f72 Compare October 15, 2019 10:12
Copy link
Contributor

@trond-snekvik trond-snekvik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trond-snekvik I changed the bt_mesh_app_key_get() that you recently added. I added an addr argument in order to know what device key to fetch. Can you please check that unseg_app_sdu_unpack() still makes sense. is it possible to forward friend-messages that are encrypted with a device-key?

That'll work nicely, it's only called on packets from the friend to the LPN 🙂

subsys/bluetooth/mesh/nodes.c Outdated Show resolved Hide resolved
subsys/bluetooth/mesh/net.c Outdated Show resolved Hide resolved
@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch from 0b28f72 to de8ad0e Compare October 23, 2019 12:23
@trond-snekvik
Copy link
Contributor

@tsvehagen What's left to do in this PR?
To me, the node_added call on start is the only thing I think has to be addressed before we can merge, the rest is mostly nitpicking and stuff that can be postponed.

Adds the unprovisioned_beacon callback to the bt_mesh_prov structure.

Signed-off-by: Tobias Svehagen <tobias.svehagen@gmail.com>
@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch 2 times, most recently from 5ef7412 to fa1513d Compare October 25, 2019 12:38
Make it possible to provision devices over advertising bearer (PB-ADV).
Many messages in the provisioning protocol are the same for provisioner
and device so much of the code could be reused by only changing when
they are expected to arrive.

This introduces to concept of local and remote device keys. The models
for cfg_cli and cfg_srv have been updated to reflect this concept. Both
the send and receive path in the transport layer have been updated to
support encrypting/decrypting with local and remote device keys.

When a node has been provisioned it is stored in bt_mesh_net.nodes. If
CONFIG_BT_SETTINGS is enabled, they are also saved to settings. If the
callback node_added in bt_mesh_prov has been set, it will be called for
every node that gets provisioned. This includes when they are retrieved
from settings.

The configuration CONFIG_BT_MESH_NODE_COUNT controls how many nodes that
can be provisioned.

Signed-off-by: Tobias Svehagen <tobias.svehagen@gmail.com>
Add API for supporting provisioning of devices over PB-ADV.

Signed-off-by: Tobias Svehagen <tobias.svehagen@gmail.com>
Adds the command provision-adv to the mesh shell.

Signed-off-by: Tobias Svehagen <tobias.svehagen@gmail.com>
@tsvehagen tsvehagen force-pushed the bt_mesh_provisioner branch from fa1513d to 168d089 Compare October 25, 2019 14:26
@tsvehagen
Copy link
Collaborator Author

@tsvehagen What's left to do in this PR?
To me, the node_added call on start is the only thing I think has to be addressed before we can merge, the rest is mostly nitpicking and stuff that can be postponed.

Yea I'm quite happy with it and I think it is ready for merge.

Comment on lines +94 to +100
if (!atomic_test_bit(bt_mesh.flags, BT_MESH_VALID)) {
return -EINVAL;
}

if (bt_mesh_subnet_get(net_idx) == NULL) {
return -EINVAL;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things that has been stressed at many Mesh WG meetings I've attended is that the Provisioner is not required to be a node on the network. It's only if we also support configuration client that we need to be a node. The above two checks seem to require that we're both provisioned into a network and that we've been configured with the net_idx we intend to give to the to-be-provisioned node.

What you should be checking here (I think) is if the net_idx is available in the provisioner's database.

While we could just say that we don't support separating configuration client and provisioner roles, we should still have separation of the provisioner's database and the parts of it that have been configured for the local node. We could e.g. be adding app & net keys to other nodes that the local node (configuration client) hasn't configured for its own use as a node (beyond passing values as parameters in configuration client messages).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a very good idea to have that sort of database that could have its own API for adding keys and such. The nodes added should probably also be put in that database instead of in the struct bt_mesh_net.

What you should be checking here (I think) is if the net_idx is available in the provisioner's database.

Agree, but there is no notion of a provisioner's database currently. Do you think that is required before merging this PR?

The checks in the API call is basically just a "fail fast" but I'm pretty sure they could be removed. The real problem is that send_prov_data() uses the local node as database for the net_key, iv_index and flags. I think this is where the separation should happen.

For know I think it would be okay to say that separation is not supported but I see no problem of creating the separation in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsvehagen as long as there's no major impact on the public API I think we can do the separation (provisioner database) in a later PR, but it'd be good to be done sooner than later, ideally before the 2.1 merge window closes in a couple of weeks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so regarding the API I think we should then have a short discussion before merging.

Currently the API only takes a net_idx. This won't be enough if for example a provisioner can work with multiple networks as a the same net_idx could appear in both. In that case there must also be some sort of network id parameter. Is that something we want to support?

Copy link
Collaborator Author

@tsvehagen tsvehagen Oct 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things that has been stressed at many Mesh WG meetings I've attended is that the Provisioner is not required to be a node on the network. It's only if we also support configuration client that we need to be a node. The above two checks seem to require that we're both provisioned into a network and that we've been configured with the net_idx we intend to give to the to-be-provisioned node.

@jhedberg
One thing that I don't really understand is that during provisioning, like I commented above, the provisioning data includes flags and iv_index. If the provisioner device is separate from the configuration client node, how will the provisioner know these things? Is there anything specified for how a provisioner can get new network information?

Also in the Mesh Profile spec under 3.10.1 Mesh Network Creation procedure, it says that a Provisioner shall allocate a unicast address and assign it to its primary element. Doesn't this make it a node? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the provisioner device is separate from the configuration client node, how will the provisioner know these things? Is there anything specified for how a provisioner can get new network information?

This is left as a vendor-specific solution for the provisioner implementation. There's a spec coming out soon with a standard format for encoding such information, which can be used to synchronise multiple provisioners of the same network (this e.g. lets provisioners get subsets of the unicast address range so they don't go and provision different devices with the same addresses).

Also in the Mesh Profile spec under 3.10.1 Mesh Network Creation procedure, it says that a Provisioner shall allocate a unicast address and assign it to its primary element. Doesn't this make it a node?

That's indeed quite odd, and in contradiction to what I've learned from listening and talking to Mesh WG members. I'll check if upcoming spec versions have this clarified/corrected.

@trond-snekvik do you have any thoughts on these issues?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trond-snekvik do you have any thoughts on these issues?

I'm also under the impression that the WG wants provisioners to be some omnipresent non-node entity, but I think the most natural way to implement it is to combine it with the configurator role.

I'm fine with the current API and its checks. Forcing the users to provide the netkey value in addition to the net_idx just introduces awkwardness for the normal case, and the current approach isn't really a blocker for the non-node provisioner.

Copy link
Member

@jhedberg jhedberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, but with the expectation that the separation of provisioner database happens fairly soon (i.e. that the provisioner is not required to be a node with all keys etc that get provisioned & configured to other nodes)

@jhedberg
Copy link
Member

Once @trond-snekvik has also approved I think we can merge this.

@tsvehagen
Copy link
Collaborator Author

Approving, but with the expectation that the separation of provisioner database happens fairly soon (i.e. that the provisioner is not required to be a node with all keys etc that get provisioned & configured to other nodes)

For me it's not a rush to merge it. I can continue to work a a provision database and try to have that it place before moving forward with this. It might not be done before 2.1 merge window closes though.

@joerchan
Copy link
Contributor

@jhedberg Looks like you have the approval you needed. :)

@jhedberg jhedberg merged commit 3282ff2 into zephyrproject-rtos:master Oct 30, 2019
@WilliamGFish
Copy link
Collaborator

Just fetched these changes and the BBC Microbit is not happy. Mesh demo went from almost working to:

***** Booting Zephyr OS build zephyr-v2.0.0-1610-g3282ff22c4aa *****
Initializing...
Unicast address: 0x0003
Bluetooth init failed (err -22)
[00:00:00.015,136] [1B][1;31m bt_settings: settings_subsys_init failed (err -22)[1B][0m

I will raise an issue

@tsvehagen tsvehagen deleted the bt_mesh_provisioner branch January 3, 2020 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: Bluetooth Mesh area: Bluetooth Enhancement Changes/Updates/Additions to existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants