Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

at86rf2xx radio driver not (reliably) sending ACKs #21659

Closed
markus-becker-tridonic-com opened this issue Jan 3, 2020 · 7 comments · Fixed by #21753
Closed

at86rf2xx radio driver not (reliably) sending ACKs #21659

markus-becker-tridonic-com opened this issue Jan 3, 2020 · 7 comments · Fixed by #21753
Assignees
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@markus-becker-tridonic-com
Copy link
Contributor

Describe the bug
The newly introduced at86rf2xx driver is not sending ACKs.

@nandojve, many thanks for developing and upstreaming the at86rf2xx radio driver! I have been trying to use the driver on a custom hardware with an STM32 uC and the RF233 using OpenThread.

However, I am not (reliably) getting ACKs, as can be seen in the following capture:
image

The debug output indicated that the correct address filter is being set (ieee802154_rf2xx.rf2xx_set_ieee_addr: IEEE address 8e:f3:5a:10:a3:7d:64:e1) and works as the packets sent to the 64bit address are received.

*** Booting Zephyr OS build v2.1.0-rc3-43-g957ab95b0d29  ***
[00:00:00.000,000] <dbg> ieee802154_rf2xx.rf2xx_init: Initialize RF2XX Transceiver
[00:00:00.000,000] <inf> ieee802154_rf2xx: Optional instance of GPIOB device activated
[00:00:00.000,000] <inf> ieee802154_rf2xx: Optional instance of GPIOA device activated
[00:00:00.000,000] <dbg> ieee802154_rf2xx.configure_spi: SPI GPIO CS configured on GPIOB:12
[00:00:00.000,000] <dbg> ieee802154_rf2xx.rf2xx_init: GPIO and SPI configured
[00:00:00.001,000] <dbg> ieee802154_rf2xx.power_on_and_setup: Transceiver is old and unstable release
[00:00:00.001,000] <dbg> ieee802154_rf2xx.rf2xx_thread_main: RF2XX main thread
[00:00:00.003,000] <dbg> ieee802154_rf2xx.rf2xx_filter: Applying filter 0
[00:00:00.003,000] <dbg> ieee802154_rf2xx.rf2xx_set_ieee_addr: IEEE address 06:af:0f:3d:9a:23:65:78
[00:00:00.003,000] <dbg> ieee802154_rf2xx.rf2xx_filter: Applying filter 1
[00:00:00.003,000] <dbg> ieee802154_rf2xx.rf2xx_set_short_addr: Short Address: 0xfffe
[00:00:00.005,000] <inf> net_openthread: -CORE----: Notifier: StateChanged (0x00000001) [Ip6+]
[00:00:00.005,000] <inf> net_l2_openthread: Ipv6 address added
[00:00:00.005,000] <inf> net_openthread: Idle -> Discover
[00:00:00.005,000] <dbg> ieee802154_rf2xx.rf2xx_filter: Applying filter 0
[00:00:00.005,000] <dbg> ieee802154_rf2xx.rf2xx_set_ieee_addr: IEEE address 8e:f3:5a:10:a3:7d:64:e1
[00:00:00.005,000] <inf> net_openthread: -CORE----: Notifier: StateChanged (0x00000001) [Ip6+]
[00:00:00.005,000] <inf> net_l2_openthread: Ipv6 address removed
[00:00:00.005,000] <inf> net_openthread: -CORE----: Notifier: StateChanged (0x00000001) [Ip6+]
[00:00:00.005,000] <inf> net_l2_openthread: Ipv6 address added
[00:00:00.005,000] <inf> net_openthread: -CORE----: Notifier: StateChanged (0x00000001) [Ip6+]
[00:00:00.006,000] <dbg> ieee802154_rf2xx.rf2xx_filter: Applying filter 2
[00:00:00.006,000] <dbg> ieee802154_rf2xx.rf2xx_set_pan_id: Pan Id: 0x75ce
[00:00:00.006,000] <inf> net_openthread: -CORE----: Notifier: StateChanged (0x00000001) [Ip6+]
[00:00:00.007,000] <inf> net_openthread: ▒▒
[00:00:00.007,000] <inf> net_openthread: -MESH-CP-: JoinerState: Idle -> Discover
[00:00:00.014,000] <inf> net_config: Initializing network
[00:00:00.014,000] <inf> net_openthread:
[00:00:00.014,000] <inf> net_l2_openthread: Ipv6 address added
[00:00:00.016,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.040,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.076,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.101,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.125,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.160,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.170,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.179,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:00.203,000] <err> log: argument 0 in log message "%s" missing log_strdup().
[00:00:03.014,000] <dbg> net_echo_client_sample.main: Run echo client
[00:00:03.747,000] <dbg> ieee802154_rf2xx.rf2xx_trx_rx: Caught a packet (59) (LQI: ff, RSSI: -54, ED: 28)
[00:00:03.747,000] <inf> net_core: net (0x20000040): RF2XX Rx stack stack real size 800 unused 392 usage 408/800 (51 %)
[00:00:03.748,000] <inf> net_openthread: er network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no
[00:00:03.748,000] <inf> net_openthread: -MESH-CP-: Joiner discover network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no
[00:00:03.754,000] <dbg> ieee802154_rf2xx.rf2xx_trx_rx: Caught a packet (59) (LQI: ff, RSSI: -53, ED: 29)
[00:00:03.754,000] <inf> net_core: net (0x20000040): RF2XX Rx stack stack real size 800 unused 392 usage 408/800 (51 %)
[00:00:03.755,000] <inf> net_openthread: er network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no
[00:00:03.755,000] <inf> net_openthread: -MESH-CP-: Joiner discover network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no
[00:00:03.769,000] <dbg> ieee802154_rf2xx.rf2xx_trx_rx: Caught a packet (59) (LQI: ff, RSSI: -54, ED: 28)
[00:00:03.769,000] <inf> net_core: net (0x20000040): RF2XX Rx stack stack real size 800 unused 392 usage 408/800 (51 %)
[00:00:03.770,000] <inf> net_openthread: er network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no
[00:00:03.770,000] <inf> net_openthread: -MESH-CP-: Joiner discover network: ba7240e1a6047e86, pan:0xd01d, port:49153, chan:23, rssi:-54, allow-any:no

To Reproduce
Build echo_client with ot-overlay.conf and commission against a Thread BR, e.g. based on nrf52840.

Expected behavior
ACKs are being sent.

Impact
Device cannot be joined onto Thread network, because DTLS gets multiple unexpected link-layer retransmitted packets.

Environment (please complete the following information):

  • OS: Linux
  • Toolchain GNU ARM GCC
  • Commit SHA: 17299f0
@markus-becker-tridonic-com markus-becker-tridonic-com added the bug The issue is a bug, or the PR is fixing a bug label Jan 3, 2020
@markus-becker-tridonic-com
Copy link
Contributor Author

It seems sending FORCE_TRX_OFF stops the transmission of ACKs.

Also mentioned in the datasheet http://ww1.microchip.com/downloads/en/devicedoc/atmel-8351-mcu_wireless-at86rf233_datasheet.pdf:

RX_AACK - Receive with Automatic Acknowledgement
A state transition to RX_AACK_ON is initiated by writing the RX_AACK_ON command
to the register bits TRX_CMD. On success, reading register bits TRX_STATUS
(register 0x01, TRX_STATUS) returns RX_AACK_ON or BUSY_RX_AACK. The latter
one is returned when a frame is being received.
The RX_AACK Extended Operating Mode is left by writing a new command to the
register bits TRX_CMD. If the Atmel AT86RF233 is within a frame receive or
acknowledgment procedure (BUSY_RX_AACK), the state change is executed after
finishing. Alternatively, the commands FORCE_TRX_OFF or FORCE_PLL_ON can be
used to cancel the RX_AACK transaction and switch to TRX_OFF or PLL_ON state
respectively

Commenting the following, makes ACK work for me:

diff --git a/drivers/ieee802154/ieee802154_rf2xx.c b/drivers/ieee802154/ieee802154_rf2xx.c
index 876352857d..6fa9e2b7b8 100644
--- a/drivers/ieee802154/ieee802154_rf2xx.c
+++ b/drivers/ieee802154/ieee802154_rf2xx.c
@@ -69,12 +69,12 @@ static void trx_isr_timeout(struct k_timer *timer_id)
 static void rf2xx_trx_set_state(struct device *dev,
                                enum rf2xx_trx_state_cmd_t state)
 {
-       do {
-               rf2xx_iface_reg_write(dev, RF2XX_TRX_STATE_REG,
-                                     RF2XX_TRX_PHY_STATE_CMD_FORCE_TRX_OFF);
-       } while (RF2XX_TRX_PHY_STATUS_TRX_OFF !=
-                (rf2xx_iface_reg_read(dev, RF2XX_TRX_STATUS_REG) &
-                 RF2XX_TRX_PHY_STATUS_MASK));
+       // do {
+       //      rf2xx_iface_reg_write(dev, RF2XX_TRX_STATE_REG,
+       //                            RF2XX_TRX_PHY_STATE_CMD_FORCE_TRX_OFF);
+       // } while (RF2XX_TRX_PHY_STATUS_TRX_OFF !=
+       //       (rf2xx_iface_reg_read(dev, RF2XX_TRX_STATUS_REG) &
+       //        RF2XX_TRX_PHY_STATUS_MASK));
 
        do {
                rf2xx_iface_reg_write(dev, RF2XX_TRX_STATE_REG, state);
@@ -85,8 +85,8 @@ static void rf2xx_trx_set_state(struct device *dev,
 
 static void rf2xx_trx_set_rx_state(struct device *dev)
 {
-       rf2xx_trx_set_state(dev, RF2XX_TRX_PHY_STATE_CMD_TRX_OFF);
-       rf2xx_iface_reg_read(dev, RF2XX_IRQ_STATUS_REG);
+       // rf2xx_trx_set_state(dev, RF2XX_TRX_PHY_STATE_CMD_TRX_OFF);
+       // rf2xx_iface_reg_read(dev, RF2XX_IRQ_STATUS_REG);
        /**
         * Set extended RX mode
         * Datasheet: chapter 7.2 Extended Operating Mode
@@ -214,8 +214,8 @@ static void rf2xx_thread_main(void *arg)
                                /* Set PLL_ON to avoid transceiver receive
                                 * new data until finish reading process
                                 */
-                               rf2xx_trx_set_state(dev,
-                                                   RF2XX_TRX_PHY_STATE_CMD_PLL_ON);
+                               // rf2xx_trx_set_state(dev,
+                               //                  RF2XX_TRX_PHY_STATE_CMD_PLL_ON);
                                k_timer_stop(&ctx->trx_isr_timeout);
                                rf2xx_trx_rx(dev);
                                rf2xx_trx_set_state(dev,

@nandojve
Copy link
Member

nandojve commented Jan 3, 2020

@markus-becker-tridonic-com is nice to see that someone is really using this driver. Please, do not change rf2xx_trx_set_state method. This is how we change transceiver FSM states. The RF2XX_TRX_PHY_STATE_CMD_PLL_ON before rf2xx_trx_rx is currently used to avoid buffer overrun. You can disable it but be aware it can happen once RX is enabled. I hope people like you can help with precious data to improve the driver.

I'm planning add a RX buffer queue to improve in general.

@nandojve
Copy link
Member

nandojve commented Jan 4, 2020

Hi @markus-becker-tridonic-com, could you check if the below idea works, please? I'm without radios until Jan 13.

The system is currently configured with Dynamic Frame Buffer Protection enabled by RX_SAFE_MODE at (register 0x0C, TRX_CTRL_2). However, we need execute 2 rf2xx_iface_frame_read at rf2xx_trx_rx and Dynamic Frame Buffer Protection is released on the rising edge of pin 23 (/SEL) during the first Frame Buffer read access. I'll improve this situation using SRAM access in near future.

The below snip will ensure that ACK will be sent since transceiver is on RF2XX_TRX_PHY_STATUS_BUSY_RX_AACK state until finish all RX/ACK procedures, Figure 7-12. Flow Diagram of RX_AACK. After that, we ensures that new air frames don't override current buffer until finish RX procedure.

Is this helpfull?

diff --git a/drivers/ieee802154/ieee802154_rf2xx.c b/drivers/ieee802154/ieee802154_rf2xx.c
index 876352857d..a0f6caf453 100644
--- a/drivers/ieee802154/ieee802154_rf2xx.c
+++ b/drivers/ieee802154/ieee802154_rf2xx.c
@@ -211,6 +211,11 @@ static void rf2xx_thread_main(void *arg)
                        k_timer_start(&ctx->trx_isr_timeout, K_MSEC(10), 0);
                } else if (isr_status & (1 << RF2XX_TRX_END)) {
                        if (ctx->trx_state == RF2XX_TRX_PHY_BUSY_RX) {
+                               /* Ensures that automatically ACK will be sent when
+                                * requested */
+                               while (rf2xx_iface_reg_read(dev, RF2XX_TRX_STATUS_REG) ==
+                                          RF2XX_TRX_PHY_STATUS_BUSY_RX_AACK);
+
                                /* Set PLL_ON to avoid transceiver receive
                                 * new data until finish reading process
                                 */

@markus-becker-tridonic-com
Copy link
Contributor Author

The snippet above works as well. Thanks very much.

@markus-becker-tridonic-com
Copy link
Contributor Author

@nandojve do you think this could still make it into the 2.2 release?

@nandojve
Copy link
Member

@nandojve do you think this could still make it into the 2.2 release?

Hi @markus-becker-tridonic-com, on my point of view yes. The solution is ready at #21753. If you have conditions to share some test results may help close the issue faster.

nandojve added a commit to nandojve/zephyr that referenced this issue Jan 25, 2020
The RF2XX driver not always sent ACK when a RX frame requests. This
happen because RF2XX transceiver asserts TRX_END interrupt after
confirm that the FCS is valid. The driver can now decode the frame
but in parallel the radio still processing the ACK frame. This will
sync the radio FSM state to ensure that ACK will be send.

Fixes zephyrproject-rtos#21659

Signed-off-by: Gerson Fernando Budke <nandojve@gmail.com>
@markus-becker-tridonic-com
Copy link
Contributor Author

@nandojve solution is ACKing fine now:
image

complete OpenThread stack works in combination with #21763 .

nashif pushed a commit that referenced this issue Jan 28, 2020
The RF2XX driver not always sent ACK when a RX frame requests. This
happen because RF2XX transceiver asserts TRX_END interrupt after
confirm that the FCS is valid. The driver can now decode the frame
but in parallel the radio still processing the ACK frame. This will
sync the radio FSM state to ensure that ACK will be send.

Fixes #21659

Signed-off-by: Gerson Fernando Budke <nandojve@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Networking bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants