Extensibility of netdev_driver_t
API
#12469
Comments
@RIOT-OS/maintainers: Please have a look at this and discuss :-) Please feel invited to directly add alternative ideas as subsections to the "Collecting ideas" section in the description above. |
Hi @maribu. This is aligned with the PHY/MAC/netif rework. There was a presentation for the RIOT summit that unfortunately couldn't be presented because of lack of time. We discussed some insights with @PeterKietzmann, @miri64, @bergzand, @haukepetersen and @kaspar030, but the idea was to have a wide discussion about this topic. I will try to record the original session asap so we are all in sync. |
Hi @maribu After opening #12741, I think it's easier to explain my point of view here. I think considering However, I would like to point out the problem the netif rework is trying to resolve (in the mid-to-long term, third phase of #12688) (sorry for the wall of text).
The problem
Netdev is documented as a network interface:
But in practice it is closer to a PHY layer (the MAC layer is on top, e.g. it's called by the IEEE802.15.4 MAC in But it also abstracts some transceiver operations (read packet, set hardware acceleration params, etc.). So it would be ideal to have different layers: a MAC, a PHY and whatever layer is needed in between (proposed in #7736; this is partially shown in #12741). E.g. for the proposed actions:
In fact, Note that having MAC/PHY specific APIs doesn't interfere with users sending data directly using the low layers:
However, I'm fine with continuing with this proposed approach, since the netif rework will take some time and it doesn't collide with these principles. I hope everything is clear! Please let me know if I missed something |
I fail to fully understand this. Could you sketch the API of e.g. an IEEE 802.15.4 Radio HAL? How would it differ (except for names) from the proposed
Hardly. The PHY layer send function will need some API to implement the send feature. Something like |
Let's say the different HALs are implemented with the three basic That would certainly make the compiler's job easier to generate efficient machine code for a |
Without any cross layer optimizations, a pure theoretical IEEE802.15.4 HAL would need:
A pure PHY layer on top would:
(I'm not assuming it's the smartest idea to have pure theoretical PHY and HAL because most radios provide some components of PHY, but it's just to show the differences) Indeed it's possible to encapsulate everything under We could benefit from type checks. E.g
instead of
But yes, this is just an implementation decision :)
It depends on the scope of the transceiver HAL. If we add a FIFO interface, then it's possible to implement the "send" function on top.
As said before, for sure it would match. The only question that comes to my mind is, if it makes sense to encapsulate under these three basics when the actions are well known for each technology. We could think about vtables. But again, this is just an implementation detail and it shouldn't interfere with the architecture.
That's for sure! |
I just found this: https://github.com/RIOT-OS/RIOT/blob/master/drivers/at86rf2xx/at86rf2xx_netdev.c#L489 This check is not device dependent (the PHY channels are PHY dependent. IEEE802.15.4 2.4GHz always uses page 0). |
Reducing the number of "actual" functions in the HAL has proven to reduce the ROM requirements significantly. The benefit of proper type checks and range checks could be provided upon a slim API via From the perspective of an implementer of a device driver it is actually quite nice to have few functions like |
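As a minimal sketch of the idea of layering type and range checks on top of a slim, generic API: the names `mock_phy_t`, `mock_opt_t`, `mock_phy_set()` and `mock_phy_set_channel()` below are hypothetical stand-ins, not RIOT's actual HAL. The only external fact used is that the IEEE 802.15.4 2.4 GHz O-QPSK PHY uses channels 11 to 26.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option identifiers and device descriptor for this sketch. */
typedef enum { MOCK_OPT_CHANNEL } mock_opt_t;
typedef struct { uint8_t channel; } mock_phy_t;

/* The single "actual" function: a generic, untyped setter. */
static int mock_phy_set(mock_phy_t *dev, mock_opt_t opt,
                        const void *value, size_t len)
{
    assert(dev != NULL && value != NULL);
    if (opt == MOCK_OPT_CHANNEL && len == sizeof(uint8_t)) {
        dev->channel = *(const uint8_t *)value;
        return 0;
    }
    return -ENOTSUP;
}

/* Typed wrapper: the compiler checks the argument type, the wrapper checks
 * the range, and the call can be inlined down to one mock_phy_set() call. */
static inline int mock_phy_set_channel(mock_phy_t *dev, uint8_t channel)
{
    if (channel < 11 || channel > 26) {
        return -EINVAL;
    }
    return mock_phy_set(dev, MOCK_OPT_CHANNEL, &channel, sizeof(channel));
}
```

The range check lives in the wrapper rather than in every driver, which is the same concern raised about the device-independent channel check in at86rf2xx_netdev.c.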
I like the idea of having static inline functions. In fact, it would be nice to have them beforehand so we can measure performance differences between different implementations of HAL.
Agreed. That's the point of separating these sub-layers |
Going back to the interface, I think I would also remove the The Check e.g the AT86RF2xx test. Note that:
|
Will that still work if an interrupt is shared between two interfaces? |
Yes. In fact, it's easier because the driver is the one that knows what to do with the interrupt. |
I proposed to define the PHY and HAL layer for each device type with Does this sound reasonable? |
I never fully understood why there were two functions for doing the initialization. To me, this makes the process more complex and increases the ROM size, as function calls add (depending on calling convention) some boilerplate instructions. But maybe there was some specific reason for that, so it might be a good idea to ask why this was chosen that way. (And ideally we can find a better solution to that, rather than splitting the setup phase into two functions.) |
maybe @kaspar030 or @haukepetersen can say something about it |
The idea was that at some point, the network stack needs to know "this device can be used". That time has come when "init()" has been called. It is basically the initialization function of the netdev layer (which might be independent of the underlying device). |
I have three alternatives to the two-step initialisation that might work:
I'm personally in favour of option 3. But maybe this discussion leads to more and better ideas than those three :-) |
+1 to this approach. |
There are even cases where a network interface is not associated with a device driver (e.g. the loopback interface) |
where? How? Is the network stack already "up"? netdev->init() is the initialization synchronization point between netdev and a network stack. |
First, Second, this is something that is totally unique in RIOT. All other drivers have a
Let's look e.g. at a sensor driver for an I2C sensor. That driver needs the I2C bus to work. But there is no i2c_dec->init() as synchronisation point. The proper initialisation is done by enforcing that
To me, netdevs behave towards the network stack very similarly to how sensor drivers behave towards SAUL, when only looking at the initialisation requirements. So why have different approaches there? If one solution was preferred over the other for good reasons, those reasons should apply in the other context as well. Also: a consistent look and feel was stated as an explicit design goal of RIOT. So to me, there should be sound arguments to justify the inconsistency here; if not, the inconsistency should be addressed.
Finally, the split of the initialisation increases complexity and ROM requirements. But so far no one could explain to me the benefit of that split. A simple callback like the one used in SAUL is enough to let the network stack know that initialisation succeeded. (Let's be honest about the synchronisation: It depends on init functions being called in the correct order with zero concurrency, even with the current approach. But for an OS without SMP support, that is a sane design choice, especially when low complexity and footprint are design goals.) |
I totally agree. All synchronization should be done in the boot sequence without an init interface. |
yes. a network device is by definition asynchronous and triggers events by itself. A random saul driver will not trigger asynchronous events before being configured (initialized by the application and configured to trigger events). So either the network stack initializes the network device, or the network interface is being initialized before and then being told by the network stack that it is now "active". Without this extra activation / synchronization step, the driver's ISR doesn't know what to do in case of an ISR (as e.g., event_callback is not set up). The idea is that once "init()" has been called, the driver is from that point on free to send events, e.g., about arriving packets, link-state, ...
Exactly. But a netdev without "network stack" cannot handle its own ISRs. |
Your description of the sensor matches quite perfectly the behavior of a network device: It will not generate any event until the driver configures it to do so (upon driver initialization). So why not bring up the network stack, with only the loop back device configured. Then bring up the network devices one by one, which call back into the network stack to register themselves (just like in SAUL) right after the driver is successfully initialized. After that registration, interrupts of the netdev can be enabled and served. |
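The SAUL-like bring-up order described in this comment could be sketched as follows. Everything here is hypothetical (`mock_iface_t`, `mock_stack_register()`, `mock_driver_init()` and the fixed-size registry are illustration only, not GNRC or SAUL API); the point is only the ordering: hardware setup, registration with the stack, then enabling interrupts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_IFACES 4 /* arbitrary limit for the sketch */

/* Hypothetical network device descriptor. */
typedef struct {
    const char *name;
    int irq_enabled;
} mock_iface_t;

/* Hypothetical registry owned by the network stack. */
static mock_iface_t *registry[MOCK_MAX_IFACES];
static size_t registry_len;

/* Called by a driver once its hardware is ready, similar in spirit to how
 * SAUL devices are added to a registry after their driver is initialized. */
static int mock_stack_register(mock_iface_t *dev)
{
    assert(dev != NULL);
    if (registry_len >= MOCK_MAX_IFACES) {
        return -ENOMEM;
    }
    registry[registry_len++] = dev;
    return 0;
}

/* Single init function: set up hardware, register, then enable interrupts. */
static int mock_driver_init(mock_iface_t *dev)
{
    /* device-specific hardware setup would happen here */
    int res = mock_stack_register(dev);
    if (res < 0) {
        return res;
    }
    /* only now may the device raise events, as the stack knows about it */
    dev->irq_enabled = 1;
    return 0;
}
```

This assumes the stack itself is brought up before any driver, which is exactly the ordering constraint the following comments debate.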
How? With a function called
That might work with gnrc. A more specialized stack might not want to dynamically add new network devices, just expect some to be there and use them. I'm sure there is a way to get rid of |
The common approach is to call
Please elaborate on how this would work with the current API. Summarizing only your own words, the constraints are:
Dropping
Start reading from #12469 (comment), and I'm sure you'll find the arguments you're looking for ;-) But showing the improvement in ROM size with a prototype might actually be more convincing ;-) |
The network stack shouldn't initialize the device driver. The OS should.
This is strictly a responsibility of the PHY layer. It's not even close to the network interface IMO.
In fact, I would expect |
An extra note. The device driver shouldn't receive interrupts out of nothing (we are doing something wrong if so). The device driver will receive events only if:
All of them can be handled if the device driver is initialized before the PHY layer. |
Strong +1 to this. After all, a network device is just another device. We can use those devices for other things that are not even close to networking (RNG, temperature sensor, crypto acceleration, etc.) |
This discussion seems to be getting stuck around assumptions on both sides. Is there a sequence diagram available somewhere explaining the proposed idea and stating the responsibilities of all functions? |
Yes, nothing wrong with that. But a netdev (logically) sits on top of that. The mentioned calls say nothing about netdev->event_callback. Those are only usable when a network stack takes the properly set up device descriptor and calls its
It works perfectly well.
1.1. the network stack initializes its event handling mechanism (set up thread with message loop, ...)
Yes, calling
No, the network stack expects a bunch of "netdev_t" that are ready to use after calling their If there's no explicit "init()", it is implicitly assumed that the network device does not do anything requiring e.g., SPI communication (after its setup function) until the network stack issues an action that enables it (setting to RX state). Please, treat network drivers as two-stage: their actual implementation and the netdev part. Just because most drivers in RIOT are netdev only doesn't mean we should assume or even force that. |
I fail to understand something here. What do you call a network stack in this context? Are you considering PHY and MAC a part of the network stack? |
Yes. |
I refer to @jia200x's post above and cite the image of the architecture: What we want to discuss is how a network device driver interfaces with the PHY layer on top. Do we now all agree that a single initialization function is sufficient? |
No, unless we want to change netdev->init() so it takes device-specific parameters ( Which IMO, we don't want. Why are you so opposed to having an explicit function for hooking up a netdev to its users? Network drivers get set up as regular devices (as in, It is a simple two-stage setup procedure: one stage that has all device-specific information, and one that doesn't but can expect the upper layer to be ready. Of course we can fold that together somehow, but that would only tie the layers together unnecessarily. |
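The two-stage procedure argued for here could be sketched like this. All names (`mock_params_t`, `mock_twostage_t`, `mock_setup()`, `mock_init()`, `mock_stack_event_cb()`) are hypothetical, and refusing to initialize without an event callback is an assumption of this sketch, not documented netdev behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device-specific parameters (bus, pins, ...). */
typedef struct {
    int spi_bus;
    int cs_pin;
} mock_params_t;

typedef struct mock_twostage mock_twostage_t;
typedef void (*mock_event_cb_t)(mock_twostage_t *dev, int event);

struct mock_twostage {
    mock_params_t params;
    mock_event_cb_t event_callback; /* set by the upper layer */
    int active;                     /* non-zero once events may be raised */
};

/* Stage 1: device-specific, needs the parameters, knows no upper layer. */
static void mock_setup(mock_twostage_t *dev, const mock_params_t *params)
{
    assert(dev != NULL && params != NULL);
    dev->params = *params;
    dev->event_callback = NULL;
    dev->active = 0;
}

/* Stage 2: device-independent signature, called by the upper layer once it
 * is ready to handle events. */
static int mock_init(mock_twostage_t *dev)
{
    assert(dev != NULL);
    if (dev->event_callback == NULL) {
        return -EINVAL; /* assumption: nobody is listening yet */
    }
    dev->active = 1;
    return 0;
}

/* Example upper-layer callback that ignores all events. */
static void mock_stack_event_cb(mock_twostage_t *dev, int event)
{
    (void)dev;
    (void)event;
}
```

The second stage takes no parameters beyond the descriptor, which is what keeps it usable from a stack that knows nothing about the concrete device.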
Also, this shows that the only one that requires the radios to be initialized is the PHY layer. Thus, initialization should be carried out by the boot sequence and not by the network stack. |
Of course we don't want that, but I don't see why we need to pass device specific params there |
Because the device cannot be initialized without its Should we set up a meeting to discuss this? IMO there are limits of github comments to discuss complex things with differing definitions... |
Could you please define network stack and in which layer would you put netdev in terms of:
I think that would help to focus the discussion |
Yes, I think it's a good idea. |
Now is a little late in the day, but maybe sometime tomorrow? maybe 11am (CET)? |
Ok for me. Is it ok for you @maribu ? |
Sorry, I won't be able to make it. But being honest, in addition to implement most parts of the interface between network device driver and the PHY layer via |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions. |
The Issue
In PR #12294 the feature to let the transceiver sleep and wake up again is developed. I strongly think that there should be a common API for this functionality, so this feature can be used in the same way for all transceivers supported by RIOT.
There seems, however, to be no easy way to extend the API, except for the netdev_driver_t::get() and netdev_driver_t::set() functions; those have proven to be very extensible without causing compatibility issues.
Using netdev_driver_t::set() for sleeping
One could use netdev_driver_t::set(dev, NETOPT_STATE, NETOPT_STATE_SLEEP, sizeof(NETOPT_STATE_SLEEP)) as a canonical way to get the device sleeping. But what should be the way to wake it up again? One of the following? Any of the following?
netdev_driver_t::set(dev, NETOPT_STATE, NETOPT_STATE_IDLE, sizeof(NETOPT_STATE_IDLE))
netdev_driver_t::set(dev, NETOPT_STATE, NETOPT_STATE_RX, sizeof(NETOPT_STATE_RX))
netdev_driver_t::set(dev, NETOPT_STATE, NETOPT_STATE_RESET, sizeof(NETOPT_STATE_RESET))
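To make the ambiguity concrete, here is a minimal sketch of a driver-side set() handler. The enum names mirror the ones used above, but the types are reduced stand-ins rather than RIOT's real headers, and `mock_dev_t` and `mock_set()` are hypothetical. Returning the value size on success is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for RIOT's netopt types, reduced to what the sketch needs. */
typedef enum { NETOPT_STATE } netopt_t;
typedef enum {
    NETOPT_STATE_SLEEP,
    NETOPT_STATE_IDLE,
    NETOPT_STATE_RX,
    NETOPT_STATE_RESET,
} netopt_state_t;

/* Hypothetical device descriptor: only tracks whether the radio sleeps. */
typedef struct {
    int asleep;
} mock_dev_t;

/* Hypothetical handler in the style of netdev_driver_t::set(). */
static int mock_set(mock_dev_t *dev, netopt_t opt,
                    const void *value, size_t len)
{
    assert(dev != NULL && value != NULL);
    if (opt != NETOPT_STATE || len != sizeof(netopt_state_t)) {
        return -ENOTSUP;
    }
    switch (*(const netopt_state_t *)value) {
    case NETOPT_STATE_SLEEP:
        dev->asleep = 1;
        break;
    case NETOPT_STATE_IDLE:
    case NETOPT_STATE_RX:
    case NETOPT_STATE_RESET:
        /* This driver treats all three as "wake up"; nothing in the API
         * forces another driver to make the same choice. */
        dev->asleep = 0;
        break;
    default:
        return -ENOTSUP;
    }
    return (int)sizeof(netopt_state_t);
}
```

Note that the value is passed by pointer here; the one-line calls in the text are shorthand.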
TX Preloading
A similar approach is used for TX preloading: When preloading is active, netdev_driver_t::send() will not send but preload the given frame. And netdev_driver_t::set(dev, NETOPT_STATE, NETOPT_STATE_TX, sizeof(NETOPT_STATE_TX))
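A rough sketch of the preloading behaviour follows, using a hypothetical `mock_radio_t` with a software frame buffer. It assumes that the separate trigger step (here `mock_trigger_tx()`) is what a state change to NETOPT_STATE_TX would map to in a driver; that mapping is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_FRAME_MAX 127 /* arbitrary buffer size for the sketch */

/* Hypothetical radio with a frame buffer and a preloading switch. */
typedef struct {
    unsigned char fifo[MOCK_FRAME_MAX];
    size_t fifo_len;
    int preloading;     /* non-zero: send() only fills the buffer */
    unsigned tx_count;  /* number of frames "put on air" */
} mock_radio_t;

/* Transmits the loaded frame; here this only counts transmissions. */
static void mock_trigger_tx(mock_radio_t *dev)
{
    assert(dev != NULL);
    if (dev->fifo_len > 0) {
        dev->tx_count++;
        dev->fifo_len = 0;
    }
}

/* Loads the frame and transmits it right away unless preloading is on. */
static int mock_send(mock_radio_t *dev, const void *frame, size_t len)
{
    assert(dev != NULL && frame != NULL);
    if (len == 0 || len > MOCK_FRAME_MAX) {
        return -1;
    }
    memcpy(dev->fifo, frame, len);
    dev->fifo_len = len;
    if (!dev->preloading) {
        mock_trigger_tx(dev);
    }
    return (int)len;
}
```

The same send() call thus has two meanings depending on a configuration flag, which is part of what the next section objects to.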
The issue with netdev_driver_t::set() for actions
To me, the fact that netdev_driver_t::set() and netdev_driver_t::get() have proven to be so versatile shows that this API is well designed. But it doesn't seem to reflect actions well.
E.g. with the state of the driver: Normally, this is an implementation detail. The internal state machine of the driver can either reflect the states in netopt_state_t very well, or could be vastly more complex, or anything in between. Similarly, the transitions between the states greatly depend on the properties of both the specific hardware used and the design decisions made in the driver. Providing an API that looks like force-setting the internal state seems very wrong to me.
The road forward
So, what should we do?
Should we simply keep using get()/set() not only for things like configuration knobs (for which the API, at least in my personal opinion, has served us very well), but also for actions?
Or should we try to add a second API that should ideally be as extensible as the get()/set() API while also allowing netdev_driver_ts to only support a subset of the features?
Very likely there are more options that I haven't thought about
Collecting ideas
Adding an action API
How about adding a specific action API, so that netdev_driver_t becomes:
with
(The details of NETDEV_ACTION_RECV are purposely left out, as e.g. just using value as the destination buffer and value_len as the buffer size would lack the feature of the RX info; so either a struct that can be used to pass the buffer and retrieve the RX info should be used, or a second action or a get() would be needed to get the RX info. But this detailed discussion is misplaced here; the general idea of this approach should be agreed upon first.)
This proposal tries to learn some lessons from previous discussions and ideas:
- The get()/set() API has proved to be very flexible and extensible. This proposed API tries to copy this feature
- The recv() function has been overloaded to support the drop() and get_size() functionality by providing magic values as arguments. This led to confusion about the API and a lot of bugs in the implementations (see Misleading API in netdev driver #9805). The explicit enum value provided here should be more obvious
- send() and recv() functions with a more generic function
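The code of the proposed action API did not survive in this copy of the issue, so the following is only a guess at its general shape, not the original proposal. `NETDEV_ACTION_RECV`, `value` and `value_len` are named in the text; every other identifier (`NETDEV_ACTION_SEND`, `NETDEV_ACTION_DROP`, `NETDEV_ACTION_GET_SIZE`, `NETDEV_ACTION_SLEEP`, `NETDEV_ACTION_WAKE_UP`, `mock_netdev_t`, `mock_driver_t`, `mock_action()`) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Explicit action identifiers instead of magic argument values. */
typedef enum {
    NETDEV_ACTION_SEND,      /* hypothetical */
    NETDEV_ACTION_RECV,      /* named in the text, details left open there */
    NETDEV_ACTION_DROP,      /* hypothetical */
    NETDEV_ACTION_GET_SIZE,  /* hypothetical */
    NETDEV_ACTION_SLEEP,     /* hypothetical */
    NETDEV_ACTION_WAKE_UP,   /* hypothetical */
} netdev_action_t;

typedef struct mock_netdev mock_netdev_t;

/* Hypothetical driver vtable with one generic action entry point. */
typedef struct {
    int (*action)(mock_netdev_t *dev, netdev_action_t action,
                  void *value, size_t value_len);
} mock_driver_t;

struct mock_netdev {
    const mock_driver_t *driver;
    int asleep;
    size_t pending_len; /* size of a received frame waiting to be read */
};

/* A driver may implement only a subset and reject the rest. */
static int mock_action(mock_netdev_t *dev, netdev_action_t action,
                       void *value, size_t value_len)
{
    assert(dev != NULL);
    switch (action) {
    case NETDEV_ACTION_SLEEP:
        dev->asleep = 1;
        return 0;
    case NETDEV_ACTION_WAKE_UP:
        dev->asleep = 0;
        return 0;
    case NETDEV_ACTION_GET_SIZE:
        if (value == NULL || value_len != sizeof(size_t)) {
            return -EINVAL;
        }
        *(size_t *)value = dev->pending_len;
        return 0;
    case NETDEV_ACTION_DROP:
        dev->pending_len = 0;
        return 0;
    default:
        return -ENOTSUP; /* e.g. SEND and RECV are not sketched here */
    }
}

static const mock_driver_t mock_driver = { .action = mock_action };
```

Like get()/set(), new actions can be added to the enum without changing the function signature, and unsupported ones are rejected at run time with -ENOTSUP.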