Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: driver: uart: stm32: implement async api #14916

Closed
wants to merge 7 commits into from

Conversation

jli157
Copy link
Contributor

@jli157 jli157 commented Mar 26, 2019

By using DMA transfer, implement UART Async API for STM32F4 boards. So far, it is still under development and some issues still exist:

  • Test TX
  • Implement missing APIs.
  • validate with /tests/drivers/uart/uart_async_api
  • Remove self test application
  • Rebase and clean up

So, the current one is just for review and to get suggestions.

The test vehicle is a NUCLEO F429ZI board.

The PR is trying to resolve the issue: #13955, and base on PR #25459 and #25492
Also, parts of the implementation referred to the implementation from @StefJar.

@zephyrbot
Copy link
Collaborator

zephyrbot commented Mar 26, 2019

All checks are passing now.

Review history of this comment for details about previous failed status.
Note that some checks might have not completed yet.

@erwango erwango added area: UART Universal Asynchronous Receiver-Transmitter platform: STM32 ST Micro STM32 labels Mar 26, 2019
@erwango erwango added this to the v1.15.0 milestone Mar 26, 2019
@jli157
Copy link
Contributor Author

jli157 commented Mar 26, 2019

@StefJar Please review. Thanks! RX is working well. TX still has some issues.

@jli157 jli157 force-pushed the dma_uart_stm32 branch 3 times, most recently from 62186f0 to 5726f64 Compare March 27, 2019 01:22
@erwango erwango added the area: DMA Direct Memory Access label Mar 27, 2019
@jli157 jli157 changed the title [DNM] driver: uart: stm32: implement async api driver: uart: stm32: implement async api Apr 1, 2019
@erwango
Copy link
Member

erwango commented Apr 11, 2019

@jli157, please use #13364 to complete your work

@jli157
Copy link
Contributor Author

jli157 commented Apr 11, 2019

@jli157, please use #13364 to complete your work

Sure, will work on that.

@erwango
Copy link
Member

erwango commented Apr 12, 2019

@jli157: Can you clarify difference between this PR and #15375?

@jli157
Copy link
Contributor Author

jli157 commented Apr 12, 2019

@jli157: Can you clarify difference between this PR and #15375?

This one is still based on the existing STM32 DMA driver and could be dropped off and the PR #15375 will be the replacement for this one if the PR #13364 is finally merged. Basically, both are same.

@jli157 jli157 force-pushed the dma_uart_stm32 branch from 7b1e970 to c5c253d Compare May 6, 2019 21:03
@jli157 jli157 requested a review from nashif as a code owner May 6, 2019 21:03
@jli157 jli157 force-pushed the dma_uart_stm32 branch from c5c253d to 8c50058 Compare May 6, 2019 21:40
@zephyrproject-rtos zephyrproject-rtos deleted a comment from codecov-io May 7, 2019
@jli157 jli157 force-pushed the dma_uart_stm32 branch 4 times, most recently from 71015e6 to cc1f3fe Compare May 7, 2019 23:14
@github-actions github-actions bot added has-conflicts Issue/PR has conflicts with another issue/PR and removed has-conflicts Issue/PR has conflicts with another issue/PR labels Jul 2, 2020
jli157 added 2 commits July 9, 2020 07:58
Enable UART DMA configurations for all STM32F4 SoCs.

Signed-off-by: Jun Li <jun.r.li@intel.com>
enable two dmas to support async uart.

Signed-off-by: Jun Li <jun.r.li@intel.com>
jli157 added 4 commits July 9, 2020 08:19
Implement asycn UART APIs by enabling DMA transmission
between UART and the data buffer.

* Enabled replacing RX buffer
* If the replacing buffer is same as the current one, no replacement
* Fixed RX buffer counter issue
* Reformatted event function names.
* Make sure the buffer replacement happens only when the current one
  is full.
* Use an inline `async_timer_start` to do timer starting job.
* rebased with c99 types

Signed-off-by: Jun Li <jun.r.li@intel.com>
Define UART_DEV_NAME as "UART_6" if the board
nucleo_f429zi is selected.

Signed-off-by: Jun Li <jun.r.li@intel.com>
chained read buffer needs to rotate.

Signed-off-by: Jun Li <jun.r.li@intel.com>
Signed-off-by: Jun Li <jun.r.li@intel.com>
@giansta
Copy link
Contributor

giansta commented Jul 21, 2020

Hello @jli157
I'm trying to use this PR code to run it on our stm32l462 platform. We have four serial lines for which we defined the dma resource in dtsi files in a similar way as you did. Just a minor change was needed in dma_stm32.c, as the L4 family has two distinct TX and RX data registers.
I just started testing, yet something didn't work. I'm running the tests/drivers/uart/uart_async_api and this is showing logs forever from the drivers and never hitting the comparison of TX and RX buffer in first test test_single_read(). Moreover, the logs show only RX events forever instead of the TX events I was expecting first.

START - test_single_read DMA 5/7 0 0 1 1 rx_rdy: (0 1) 00 DMA 5/7 0 0 1 1 rx_rdy: (1 5) 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (5 5) 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 4) 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 2) 00 00 DMA 5/7 0 0 1 1 rx_rdy: (2 8) 00 00 00 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 3) 00 00 00

Here I tried to add some info in the prints. Lines with "DMA" report id=5 max_streams=7 source_periph=0 (mem) busy=0 src_size=dst_size=1; the RX lines print data that don't seem real.
I'm using a serial line that is connected to another microcontroller which implements the loopback, yet this is not receiving at moment.
Most probably I'm doing some mistake, I need some more basic test to run and understand better the code. Yet maybe you
could have the right hint. Thanks.

For testing purpose, it could be removed after the PR
is finished.

Signed-off-by: Jun Li <jun.r.li@intel.com>
@jli157
Copy link
Contributor Author

jli157 commented Jul 21, 2020

Hello @jli157
I'm trying to use this PR code to run it on our stm32l462 platform. We have four serial lines for which we defined the dma resource in dtsi files in a similar way as you did. Just a minor change was needed in dma_stm32.c, as the L4 family has two distinct TX and RX data registers.
I just started testing, yet something didn't work. I'm running the tests/drivers/uart/uart_async_api and this is showing logs forever from the drivers and never hitting the comparison of TX and RX buffer in first test test_single_read(). Moreover, the logs show only RX events forever instead of the TX events I was expecting first.

START - test_single_read DMA 5/7 0 0 1 1 rx_rdy: (0 1) 00 DMA 5/7 0 0 1 1 rx_rdy: (1 5) 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (5 5) 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 4) 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 2) 00 00 DMA 5/7 0 0 1 1 rx_rdy: (2 8) 00 00 00 00 00 00 00 00 DMA 5/7 0 0 1 1 rx_rdy: (0 3) 00 00 00

Here I tried to add some info in the prints. Lines with "DMA" report id=5 max_streams=7 source_periph=0 (mem) busy=0 src_size=dst_size=1; the RX lines print data that don't seem real.
I'm using a serial line that is connected to another microcontroller which implements the loopback, yet this is not receiving at moment.
Most probably I'm doing some mistake, I need some more basic test to run and understand better the code. Yet maybe you
could have the right hint. Thanks.

Hi The loopback test is conducted actually on the same port of the same board. You need to connect the TX pin to the RX pin of the same UART port before running the tests.

Anyway, this PR has not fully gone through the uart async test cases. Some cases fails without obvious reasons.

Another test it could be helpful to you is the newest patch I just uploaded (7e1609f): you can use a FTDI USB-UART cable to connect your computer and the board, and run the python script inside the test directory to see if it works.

@giansta
Copy link
Contributor

giansta commented Jul 23, 2020

Hi @jli157
Thank you for your additional test, I believe it will be very helpful.
In the meantime I was testing both with your uart_async_api test and with our application, where I could start the TX either in interrupt or DMA mode.
Using two apps I could find some mistakes I did in porting to STM32L4 and did a small progress. I'm reporting now few observations.

  1. Using our app I have now debugging issues, both using eclipse and gdb. This didn't happen before using the zephyr branch with async uart. Yet debugging works when using uart_async_api test, I have to find out why.
  2. Using our app I can see the waveform of the 5-byte transmission, yet it's bit different in the last part and a bit shorter than when transmitting with interrupt.
  3. I'm keeping using the other processor as receiver, yet we're using HW flow control in the BSP. Using our app I see that the RTS is always high (inactive) when using DMA, preventing the other processor to reply; instead it's well handled when using interrupt and the other processor replies well with loopback echo.
  4. Using the uart_async_api test with HW flow control, the RTS is always low (active), but nothing appears on the TX pin, even if the TX_DONE event comes.

I'm investigating on these points and in parallel I'm going now to remove the HW flow control to test a simplified setup.
Any your suggestion is welcome. Thanks!

@giansta
Copy link
Contributor

giansta commented Jul 29, 2020

@QSQCaito I think the implementation is almost done except some bugfix work is left. You can proceed to do the tests on your platform.

I found very strange bug when running uart_async_test as follows:

In the chained reading test, there are six reading operations to read 10 bytes every time. However, on the round 3 right after the last buffer was replaced, uart RX mistakenly reported to have read 20 bytes instead of the normal 10 bytes, which is really puzzling me. I guess maybe the DMA RX is not well reset after the buffer replacement happens. But I can't find the root cause. Maybe you can do the same test on your platform to see if the behaviors are same?

By the way, the test is a UART self loop test. You need to connect the TX pin of the UART port you are choosing to its RX pin.

Thank you!

Hi @jli157
I believe to have found the reason for the chained reading test fails. It's because of usage of "number of data to transfer register" (DMA_CNDTRx) as the number of received bytes: indeed the dma_stm32_get_status() (and LL_DMA_GetDataLength()) are returning the number of bytes "to be transmitted".
So, in the round 3, the 20 bytes are the remaining bytes, while received bytes are 30-20=10.
In the previous rounds the things work the same, either because the circular buffer mode reloads the value zero with the buffer length or because the received and remaining bytes are same (e.g. 5=10-5).
At moment I'm trying to fix this without circular buffer mode (testing the L4 family platform that @QSQCaito was mentioning). I still need to check the other test cases.
Then we should figure out if there is a way to keep using the circular buffer mode.
I'll put my changes in another branch as son as they work better. In the meantime could you please check if this diagnosis is right or not? Thanks.

@jli157
Copy link
Contributor Author

jli157 commented Jul 30, 2020

@QSQCaito I think the implementation is almost done except some bugfix work is left. You can proceed to do the tests on your platform.
I found very strange bug when running uart_async_test as follows:
In the chained reading test, there are six reading operations to read 10 bytes every time. However, on the round 3 right after the last buffer was replaced, uart RX mistakenly reported to have read 20 bytes instead of the normal 10 bytes, which is really puzzling me. I guess maybe the DMA RX is not well reset after the buffer replacement happens. But I can't find the root cause. Maybe you can do the same test on your platform to see if the behaviors are same?
By the way, the test is a UART self loop test. You need to connect the TX pin of the UART port you are choosing to its RX pin.
Thank you!

Hi @jli157
I believe to have found the reason for the chained reading test fails. It's because of usage of "number of data to transfer register" (DMA_CNDTRx) as the number of received bytes: indeed the dma_stm32_get_status() (and LL_DMA_GetDataLength()) are returning the number of bytes "to be transmitted".
So, in the round 3, the 20 bytes are the remaining bytes, while received bytes are 30-20=10.
In the previous rounds the things work the same, either because the circular buffer mode reloads the value zero with the buffer length or because the received and remaining bytes are same (e.g. 5=10-5).
At moment I'm trying to fix this without circular buffer mode (testing the L4 family platform that @QSQCaito was mentioning). I still need to check the other test cases.
Then we should figure out if there is a way to keep using the circular buffer mode.
I'll put my changes in another branch as son as they work better. In the meantime could you please check if this diagnosis is right or not? Thanks.

@giansta Thanks for the findings! Yes, please upload your branch so that I can take a look. And sorry for late reply. Got busy on other stuffs recently.

@giansta
Copy link
Contributor

giansta commented Jul 31, 2020

@giansta Thanks for the findings! Yes, please upload your branch so that I can take a look. And sorry for late reply. Got busy on other stuffs recently.

Hi @jli157
I have pushed the main changes in my branch https://github.com/giansta/zephyr/commits/dma_uart_stm32_gs
The fix I was talking about is in commit b4c30f7. It would be nice if you could check if this works the same also for F4 family.
I'm cleaning my code and I'll push few other commits.
Currently the test_chained_read is passing, but the test_double_buffer and following ones are still failing.
Anyway I'm checking another fix that I believe should work at least for test_double_buffer case.

@giansta
Copy link
Contributor

giansta commented Jul 31, 2020

@jli157
I've pushed few other fixes in my branch.
The latest one, 64fd666, fixes the double buffer test. Unfortunately the following ones are still failing, hopefully we'll fix them soon.
The commit 94aa7e3 is intented to fix a minor bug and simplify the code. I found it while debugging, as the following line looked wrong to me.

stream->source_periph = stream->direction == MEMORY_TO_PERIPHERAL;

I changed it into PERIPHERAL_TO_MEMORY and my test were running the same, most probably because source_burst_length or dest_burst_length are normally the same in my use cases. Anyway, it would be nice if you could confirm that it's not breaking the test in your platform.

@jli157
Copy link
Contributor Author

jli157 commented Jul 31, 2020

@jli157
I've pushed few other fixes in my branch.
The latest one, 64fd666, fixes the double buffer test. Unfortunately the following ones are still failing, hopefully we'll fix them soon.
The commit 94aa7e3 is intented to fix a minor bug and simplify the code. I found it while debugging, as the following line looked wrong to me.

stream->source_periph = stream->direction == MEMORY_TO_PERIPHERAL;

I changed it into PERIPHERAL_TO_MEMORY and my test were running the same, most probably because source_burst_length or dest_burst_length are normally the same in my use cases. Anyway, it would be nice if you could confirm that it's not breaking the test in your platform.

Thank you! I'll take a look at your patches? Can I merge parts of them (or maybe full) into this PR if they do fix the same issues on F4 as well? I'd like to ask you to be the co-authors of this PR.

@giansta
Copy link
Contributor

giansta commented Jul 31, 2020

Thank you! I'll take a look at your patches? Can I merge parts of them (or maybe full) into this PR if they do fix the same issues on F4 as well? I'd like to ask you to be the co-authors of this PR.

It would be perfect, please merge and/or amend them if needed. Thanks!

@giansta
Copy link
Contributor

giansta commented Aug 13, 2020

@jli157 I have rebased my branch and added some commits.

Now six tests of eight are passing on our platform.
About the two failing test, I don't understand why the test_long_buffers expects to send 1000 bytes and have the first RX ready event after 524 bytes and the other at the end. This seems a behaviour of some given platform, not the general case, in my opinion.
I still have to investigate why the last one, test_forever_timeout, is failing, I gave up at moment.

I rather tried to use again the Circular buffer mode.
My conclusion is that it's not compatible with the double buffer handling, because, when the buffer is filled, the DMA uses the buffer circularly without any ISR that can allow a buffer replace. So we should rather use a single buffer and don't generate any UART_RX_BUF_REQUEST and UART_RX_BUF_RELEASED events.
In my opinion the solution should imply a new config option or a change in the API.

Another point is the errors handling. Now our board is using the HW flow control and everything seems to work fine.
Yet, without flow control and using the Chained buffer mode, when the buffer is filled and the replace buffer process is happening (as you were mentioning, using Circular buffer mode would be more efficient and avoid this), an overrun error happens.
I patched this clearing the UART flag, but most probably we need to pass whatever error (overrun, parity, frame) to the uart_stm32_dma_rx_cb() to have a more robust driver.
This change shouldn't be difficult, I can try it out later on.

@MaureenHelm MaureenHelm removed this from the v2.4.0 milestone Sep 5, 2020
@github-actions
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Boards area: Devicetree area: DMA Direct Memory Access area: Samples Samples area: Tests Issues related to a particular existing or missing test area: UART Universal Asynchronous Receiver-Transmitter has-conflicts Issue/PR has conflicts with another issue/PR platform: STM32 ST Micro STM32 Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.