Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TW#16347] Calling uart_flush(2) corrupts data thereafter (IDFGH-3370) #1202

Closed
edkerekes opened this issue Oct 31, 2017 · 19 comments
Closed

Comments

@edkerekes
Copy link

Description:
Calling uart_flush(2) seems to corrupt the fifo or ring data.
It seems that once corrupted, a hard reset is the only fix.

Steps to reproduce:

  1. I am using the latest esp-idf master 4048db3.

  2. Connect TX2 to RX2 with a jumper wire.

  3. Run the provided code.
    The sample code will reproduce the issue reliably within a few seconds.
    All received data from UART2 after the loop “count == 5” is then corrupted.

The sample code performs the following basic logic:

    Initialize UART2
    Count = 0
    while (true) {
    	If (count == 5)  {
    		uart_flush(2).
    	}
    	Send 1 line of text to UART2 “abcdefghijklmnopqrst_(xxx)”
    	Read the line of text back from UART2 with RX timeout of 1 second.	
    	Log the received data to the console.
    	Count = Count + 1
    }

The board that I am using a esp-wroom-32 dev kit, but I doubt that it is board dependent.

Code to Reproduce

    /* uart_flush_test
     * This task demonstrates that after calling uart_flush(2) one time, 
     * any additional data received from UART2 is corrupted. 
     * 
     * Requirements:
     * 1. UART2:
     *    Requires a single jumper wire to connect TX2 to RX2 so data 
     *    sent on UART2 can be received back.
     * 2. Received test data is Logged to the standard console 
     *    using ESP_LOGx().
     */
    #include <stdio.h>
    #include <string.h>
    #include "esp_log.h"
    #include "freertos/FreeRTOS.h"
    #include "freertos/task.h"
    #include "driver/uart.h"
    
    static const char *TAG = "uart_flush_test";
    
    #define UART_TO_TEST  UART_NUM_2	/* Uart to test. */
    
    /* Use any 2 available I/O pins and connect them together with a 
     * jumper wire. Note pins 34 and higher can only be inputs. 
     */
    #define TXD_PIN  (GPIO_NUM_33)
    #define RXD_PIN  (GPIO_NUM_34)
    
    #define BUF_SIZE (1024)
    
    /* Configure parameters of UART driver,
     * communication pins and install the driver 
     */
    void Configure_Uart(int uart_num, int tx_pin, int rx_pin) {
        uart_config_t uart_config = { 
                .baud_rate = 115200, 
                .data_bits = UART_DATA_8_BITS, 
                .parity = UART_PARITY_DISABLE, 
                .stop_bits = UART_STOP_BITS_1, 
                .flow_ctrl = UART_HW_FLOWCTRL_DISABLE 
        };
        if (uart_param_config(uart_num, &uart_config) < 0) {
            ESP_LOGE(TAG, "uart_param_config() error!");
        }
        if (uart_set_pin(uart_num, tx_pin, rx_pin, UART_PIN_NO_CHANGE,
                UART_PIN_NO_CHANGE) < 0) {
            ESP_LOGE(TAG, "uart_set_pin() error!");
        }
        if (uart_driver_install(uart_num, BUF_SIZE * 2, 0, 0, NULL, 0) < 0) {
            ESP_LOGE(TAG, "uart_driver_install() error!");
        }
    }
    
    void app_main() {
        int len;
        uint8_t buf[BUF_SIZE];
        char s[200];
        memset(buf, 0, sizeof(buf));
        memset(s, 0, sizeof(s));
        Configure_Uart(UART_TO_TEST, TXD_PIN, RXD_PIN);
        ESP_LOGI(TAG, "UART_TO_TEST = %d", UART_TO_TEST);
    
        int count = 0;
        while (true) {
            if (count == 5) {
                /* Flush only 1 time (after sending some data). */
                ESP_LOGI(TAG, "Calling uart_flush(%d)", UART_TO_TEST);
                uart_flush (UART_TO_TEST);
            }
    
            /* Write data to the UART. */
            sprintf(s, "abcdefghijklmnopqrst_(%03d)",count);
            len = uart_write_bytes(UART_TO_TEST, s, strlen(s));
            if (len < 0) {
                ESP_LOGE(TAG, "Could not write bytes! (%d)", len);
            }
            
            /* Read data back from the UART */
            len = uart_read_bytes(UART_TO_TEST, buf, BUF_SIZE - 1,
                    1000 / portTICK_RATE_MS);
            if (len > 0) {
                /* Log received data to the console. */
                buf[len] = '\0';
                ESP_LOGI(TAG, ">%s<", buf);
            } else {
                ESP_LOGE(TAG, "RX Timeout (check tx pin is connected to rx pin");
            }
    
            count++;
        }
    }

Captured Log

    MONITOR
    --- idf_monitor on /dev/ttyUSB0 115200 ---
    --- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
    ets Jun  8 2016 00:22:57
    
    rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
    configsip: 0, SPIWP:0xee
    clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
    mode:DIO, clock div:2
    load:0x3fff0018,len:4
    load:0x3fff001c,len:5604
    load:0x40078000,len:0
    load:0x40078000,len:12532
    entry 0x40078f84
    I (28) boot: ESP-IDF v3.0-dev-1057-g4048db35 2nd stage bootloader
    I (28) boot: compile time 22:40:28
    I (28) boot: Enabling RNG early entropy source...
    I (35) boot: SPI Speed      : 40MHz
    I (39) boot: SPI Mode       : DIO
    I (43) boot: SPI Flash Size : 4MB
    I (47) boot: Partition Table:
    I (50) boot: ## Label            Usage          Type ST Offset   Length
    I (58) boot:  0 nvs              WiFi data        01 02 00009000 00006000
    I (65) boot:  1 phy_init         RF data          01 01 0000f000 00001000
    I (73) boot:  2 factory          factory app      00 00 00010000 00100000
    I (80) boot: End of partition table
    I (84) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x0476c ( 18284) map
    I (97) esp_image: segment 1: paddr=0x00014794 vaddr=0x3ffb0000 size=0x016e4 (  5860) load
    I (103) esp_image: segment 2: paddr=0x00015e80 vaddr=0x40080000 size=0x00400 (  1024) load
    0x40080000: _iram_start at /cust/ipsolutions/smartplug/esp32/esp-idf/components/freertos/./xtensa_vectors.S:1685
    
    I (111) esp_image: segment 3: paddr=0x00016288 vaddr=0x40080400 size=0x07e70 ( 32368) load
    I (127) esp_image: segment 4: paddr=0x0001e100 vaddr=0x400c0000 size=0x00000 (     0) load
    I (129) esp_image: segment 5: paddr=0x0001e108 vaddr=0x00000000 size=0x01f08 (  7944) 
    I (139) esp_image: segment 6: paddr=0x00020018 vaddr=0x400d0018 size=0x14840 ( 84032) map
    0x400d0018: _stext at ??:?
    
    I (164) boot: Loaded app from partition at offset 0x10000
    I (164) boot: Disabling RNG early entropy source...
    I (165) cpu_start: Pro cpu up.
    I (168) cpu_start: Single core mode
    I (173) heap_init: Initializing. RAM available for dynamic allocation:
    I (180) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
    I (186) heap_init: At 3FFB1E50 len 0002E1B0 (184 KiB): DRAM
    I (192) heap_init: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
    I (198) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
    I (205) heap_init: At 40088270 len 00017D90 (95 KiB): IRAM
    I (211) cpu_start: Pro cpu start user code
    I (5) cpu_start: Starting scheduler on PRO CPU.
    I (6) uart_flush_test: UART_TO_TEST = 2
    I (1006) uart_flush_test: >abcdefghijklmnopqrst_(000)<
    I (2006) uart_flush_test: >abcdefghijklmnopqrst_(001)<
    I (3006) uart_flush_test: >abcdefghijklmnopqrst_(002)<
    I (4006) uart_flush_test: >abcdefghijklmnopqrst_(003)<
    I (5006) uart_flush_test: >abcdefghijklmnopqrst_(004)<
    I (5006) uart_flush_test: Calling uart_flush(2)
    I (6006) uart_flush_test: >efghijklmnopqrst_(005)abab<      <<<<< CORRUPTION STARTS HERE
    I (7006) uart_flush_test: >efghijklmnopqrst_(006)abab<
    I (8006) uart_flush_test: >efghijklmnopqrst_(007)abab<
    I (9006) uart_flush_test: >efghijklmnopqrst_(008)abab<
    I (10006) uart_flush_test: >efghijklmnopqrst_(009)cdef<
    I (11006) uart_flush_test: >efghijklmnopqrst_(010)cdef<
    I (12006) uart_flush_test: >efghijklmnopqrst_(011)cdef<
    I (13006) uart_flush_test: >efghijklmnopqrst_(012)cdef<
    I (14006) uart_flush_test: >efghijklmnopqrst_(013)cdef<
    I (15006) uart_flush_test: >efghijklmnopqrst_(014)cdef<
    I (16006) uart_flush_test: >efghijklmnopqrst_(015)cdef<
@llewellynnu
Copy link

I had reported this issues a while back on the forums and didn't get any response for it
https://esp32.com/viewtopic.php?f=13&t=2804&p=13403&hilit=uart2#p13403.

I didn't get a solution for this so far. I have written a flush function myself just to read the remaining data from the buffer as a temporary fix.

@edkerekes
Copy link
Author

Thanks for the link.
I also observe the other issue you mention: Calling uart_flush(1) corrupts the data received on uart2. By the way, in this case the uart2 corruption will not happen unless some characters have been first transmitted to uart1 before calling uart_flush(1). Thereafter the data received from uart2 will be corrupted.

llewellyn wrote:
A uart flush on UART1 also affects UART2 read.

ESP_Angus wrote:
"I can't explain this, unless there is also a stray UART2 flush somewhere in the code. If you're sure this is happening then it points to bug in the driver."

That response to your observation makes me think I should provide a simple piece of code showing that calling uart_flush(1) affects uart2. Maybe that might attract some interest in this issue?

@edkerekes edkerekes reopened this Nov 2, 2017
@edkerekes
Copy link
Author

Sorry - reopening - I closed this by mistake.

@negativekelvin
Copy link
Contributor

This problem seems related to

uart_reset_fifo(uart_num);

Does not occur when commented out. Also do we want to reset both tx & rx fifo in flush?

@edkerekes
Copy link
Author

@negativekelvin wrote:

Does not occur when commented out. Also do we want to reset both tx & rx fifo in flush?

Yes, after calling uart_reset_fifo(), the ring buffer seems "out of sync" with any new incoming characters. So maybe something is missing in the above ring buffer code?

It does seem reasonable to have individual control over rx and tx flush. Maybe uart_tx_flush() and uart_rx_flush()? For backwards compatibility uart_flush() could call both.
Though on second look, the code appears to ONLY operate on the rx ring buffer. So I'm not sure what the original goal was...

@negativekelvin
Copy link
Contributor

It doesn't seem to be related to ringbuf but to the FIFO and FIFO reset is a hardware function

@negativekelvin
Copy link
Contributor

If we simply empty the FIFO by reading out the bytes instead of resetting it, it seems to work

@edkerekes
Copy link
Author

Interesting.
However, I'm not sure reading the fifo will return any characters in this test loop. If you notice my loop waits 1 second when calling uart_read_bytes(). So I believe that the fifo will already be empty at the time the uart_flush() routine gets called in this simple test case.

As a side note, in my actual situation (not this test loop), the rx fifo has overflowed and I want to recover from the fifo overflow condition. So I was hoping the flush would empty all rx data and reset the hardware including it's overflow flag so that I could recover without having to issue a hard system reset.

@negativekelvin
Copy link
Contributor

negativekelvin commented Nov 3, 2017

I know, I changed the test by putting the flush between write and read.

That is how fifo overflow is handled by ISR:

} else if(uart_intr_status & UART_RXFIFO_OVF_INT_ST_M) {
UART_ENTER_CRITICAL_ISR(&uart_spinlock[uart_num]);
uart_reg->conf0.rxfifo_rst = 1;
uart_reg->conf0.rxfifo_rst = 0;
uart_reg->int_clr.rxfifo_ovf = 1;
UART_EXIT_CRITICAL_ISR(&uart_spinlock[uart_num]);
uart_event.type = UART_FIFO_OVF;

@edkerekes
Copy link
Author

Emptying the fifo might be a good solution. I don't know if that would clear an overflow condition or not. I'm not sure how to force a fifo overflow for this simple test environment.

However, in my actual application program I'm polling another device. If I set a breakpoint right after the poll message goes completely out, the system stops at the breakpoint and then the other device responds with more than 128 characters. At that point since irq's and the ring buffer are stopped (by my breakpoint), the hardware fifo overflows. So I'm looking for a way to recover from situations where the fifo has overflowed.

One obvious solution would be to use hardware flow control, but that isn't an option due to the limited pin count.

@negativekelvin
Copy link
Contributor

@costaud any thoughts?

@negativekelvin
Copy link
Contributor

@Spritetm @ginkgm can you confirm whether this is issue with fifo internal reset function?

@llewellynnu
Copy link

@igrr Thoughts?

@igrr
Copy link
Member

igrr commented Nov 8, 2017

Yes, this is caused by a hardware issue with FIFO reset. The software fix is in review now and should be merged soon.

@FayeY FayeY changed the title Calling uart_flush(2) corrupts data thereafter [TW#16347] Calling uart_flush(2) corrupts data thereafter Nov 10, 2017
igrr pushed a commit that referenced this issue Nov 13, 2017
Reported from github:
#1219
#1202

After providing a simple code to digital team, they confirmed that this is a hardware bug.

Root cause:
The fifo reset signal is incorrectly connected
If we want to reset tx fifo of UART2, we have to set txfifo_rst bit of both UART1 and UART2
If we want to reset rx fifo of UART2, we have to set rxfifo_rst bit of both UART1 and UART2

Workaround:
we don't use fifo rst bit in driver.

Documentation:
Digital team would update TRM and give more explanation about this issue.
@FayeY FayeY closed this as completed Dec 1, 2017
@jeremyherbert
Copy link
Contributor

@igrr does this hardware bug apply to the TX fifo as well? I'm getting old data coming out of the TX pin on UART2 after a software reset, but after changing to UART1 everything works fine. Full power cycle fixes the problem once (until software reset, then it's back to normal). There doesn't seem to be any workaround code for clearing the TX fifo, it's just toggling the reset bit.

@github-actions github-actions bot changed the title [TW#16347] Calling uart_flush(2) corrupts data thereafter [TW#16347] Calling uart_flush(2) corrupts data thereafter (IDFGH-3370) May 25, 2020
@koobest
Copy link
Contributor

koobest commented May 25, 2020

Hi, @jeremyherbert
Yes, txfifo reset also have a hardware bug, a fix is comming. by the way, can you tell your IDF version?

@jeremyherbert
Copy link
Contributor

quite recent: v4.2-dev-1575-ga3520970f with xtensa-esp32-elf-gcc (crosstool-NG esp-2020r1) 8.2.0

Is there some hack I can put in place in the meantime? This one is a bit of a showstopper for me.

@koobest
Copy link
Contributor

koobest commented May 25, 2020

Change this line in components/esp32/system_api_esp32.c

//esp_restart_noos
DPORT_SET_PERI_REG_MASK(DPORT_PERIP_RST_EN_REG, DPORT_TIMERS_RST | DPORT_SPI01_RST | DPORT_UART_RST | DPORT_UART1_RST | DPORT_UART2_RST | DPORT_UART_MEM_RST);

and remove this line in components/driver/uart.c

//uart_param_config
uart_hal_txfifo_rst(&(uart_context[uart_num].hal));

@jeremyherbert
Copy link
Contributor

That works for me; now my issue is gone with UART2. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants