Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dfu-util: Error during download (LIBUSB_ERROR_TIMEOUT) (IDFGH-5227) #6999

Closed
3 tasks done
majodi opened this issue May 8, 2021 · 32 comments
Closed
3 tasks done

dfu-util: Error during download (LIBUSB_ERROR_TIMEOUT) (IDFGH-5227) #6999

majodi opened this issue May 8, 2021 · 32 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@majodi
Copy link

majodi commented May 8, 2021

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

Environment

  • Development Kit: ESP32S2 Saola
  • IDF version: v4.4-dev-1183-g9d34a1cd4
  • Build System: idf.py
  • Compiler version: 8.4.0
  • Operating System: macOS
  • Power Supply: USB

Problem Description

Running idf.py dfu-flash (after creating flash using idf.py dfy) gives the following output:

Executing action: dfu-flash
Running ninja in directory /Users/majodi/esp/ccfw/build
Executing "ninja dfu-flash"...
[0/1] cd /Users/majodi/esp/ccfw/build && /usr/local/Cellar/cmake/3.19.2/bin/cmake -D ESP_DFU_BIN="/Users/majodi/esp/ccfw/build/dfu.bin" -D ESP_DFU_PID="2" -P /Users/majodi/esp/esp-idf/tools/cmake/run_dfu_util.cmake
Command list: dfu-util;-d;303a:2;-D;/Users/majodi/esp/ccfw/build/dfu.bin
dfu-util 0.10

Copyright 2005-2009 Weston Schmidt, Harald Welte and OpenMoko Inc.
Copyright 2010-2020 Tormod Volden and Stefan Schmidt
This program is Free Software and has ABSOLUTELY NO WARRANTY
Please report bugs to http://sourceforge.net/p/dfu-util/tickets/

Opening DFU capable USB device...
ID 303a:0002
Run-time device DFU version 0110
Claiming USB DFU Runtime Interface...
Determining device status: state = appIDLE, status = 0
Device really in Runtime Mode, send DFU detach request...
dfu-util: Device will detach and reattach...
error detaching
Opening DFU USB Device...
Claiming USB DFU Interface...
Setting Alternate Setting #0 ...
Determining device status: state = dfuIDLE, status = 0
dfuIDLE, continuing
DFU mode device DFU version 0110
Device returned transfer size 64
Copying data from PC to DFU device
Download [ ] 0% 0 bytesdfu-util: Error during download (LIBUSB_ERROR_TIMEOUT)
Done

This error seems to appear only with 'larger' firmwares. In this case the following partition table:

Name, Type, SubType, Offset, Size, Flags

nvs, data, nvs, 0x9000, 0x6000,
phy_init, data, phy, 0xf000, 0x1000,
factory, app, factory, 0x10000, 0x120000,

If I remove one of the embedded files (COMPONENT_EMBED_TXTFILES) the firmware fits in the standard partition table and I get no error. Seems it takes too long to download the firmware if larger than normal.

So, I know dfu-util is an 'external' tool but I'm not sure how much the process can be tuned at the (rom?) counterpart side (if any) by Espressif. Or if there is some setting to increase the timout. All I know is that I'm stuck if my custom PCB (later on) doesn't have a uart bridge and I have to rely on flashing using the native usb. Is there another way (or tool) to use perhaps?

@espressif-bot espressif-bot added the Status: Opened Issue is new label May 8, 2021
@github-actions github-actions bot changed the title dfu-util: Error during download (LIBUSB_ERROR_TIMEOUT) dfu-util: Error during download (LIBUSB_ERROR_TIMEOUT) (IDFGH-5227) May 8, 2021
@dobairoland
Copy link
Collaborator

Hi @majodi. Thank you for pointing out a possible issue but I'm having difficulties to reproduce it. I have filled up the partition with an embedded file. I still can flash the large image with DFU. I'm running Linux so maybe the issue is with the driver of macOS. I'll try to find someone who will double check this on macOS.

Is there another way (or tool) to use perhaps?

The options are outlined on this page: https://docs.espressif.com/projects/esp-idf/en/latest/esp32s2/api-guides/usb-otg-console.html. You still can use USB CDC (you can entirely avoid DFU and UART). You can manually put the chip into ROM bootloader mode to do the initial upload.

@majodi
Copy link
Author

majodi commented May 10, 2021

You still can use USB CDC

Yes, I know, I'm using that now in a cumbersome way. I have to do a DFU with a small fw and no wifi because of this issue:

Monitor no access (error) over native USB after esp_wifi_init() (IDFGH-5196) #6969

Then I can use CDC with all embedded files and wifi on. I do loose my CDC port (because of this #6969 issue) but at least I'm up and running. For monitoring I have to use the Saola which has a uart bridge. Not ideal, so was hoping for another tool or something.

I'm running Linux

Okay, I tested on Linux too. This is what I get (maybe it is something stupid I do that I overlook somehow):

Executing action: dfu-flash
Running ninja in directory /home/majodi/dev/esp/ccfw/build
Executing "ninja dfu-flash"...
[0/1] cd /home/majodi/dev/esp/ccfw/bui...esp-idf/tools/cmake/run_dfu_util.cmake
Command list: dfu-util;-d;303a:2;-D;/home/majodi/dev/esp/ccfw/build/dfu.bin
dfu-util 0.9

Copyright 2005-2009 Weston Schmidt, Harald Welte and OpenMoko Inc.
Copyright 2010-2016 Tormod Volden and Stefan Schmidt
This program is Free Software and has ABSOLUTELY NO WARRANTY
Please report bugs to http://sourceforge.net/p/dfu-util/tickets/

Opening DFU capable USB device...
ID 303a:0002
Run-time device DFU version 0110
Claiming USB DFU Runtime Interface...
Determining device status: state = appIDLE, status = 0
Device really in Runtime Mode, send DFU detach request...
Device will detach and reattach...
dfu-util: error detaching
Opening DFU USB Device...
Claiming USB DFU Interface...
Setting Alternate Setting #0 ...
Determining device status: state = dfuIDLE, status = 0
dfuIDLE, continuing
DFU mode device DFU version 0110
Device returned transfer size 64
Copying data from PC to DFU device
Download [ ] 0% 1024 bytesdfu-util: Error during download
Done

@dobairoland
Copy link
Collaborator

@majodi Could you prepare a simple project which you can share with us where you reproduce the DFU flashing issue?

I have modified the blink example. I've added your partition table and added a large text file which is printed in the beginning before turning the LED on/off. I increased the size of the text file until it filled up the partition. I couldn't reproduce the issue in this setup.

@majodi
Copy link
Author

majodi commented May 10, 2021

@dobairoland I played around with the blink example but could not get it to fail (as with you). So I created a failing project (stripped as much as possible). For this example project "idf.py set-target esp32s2" and then "idf.py dfu" works but then "idf.py dfu-flash" gives the error. The project is here: https://github.com/majodi/issue6999

I'm sure it must be something stupid I overlook. Maybe you can spot it with a fresh mind. Hope I'm not wasting your time...

@dobairoland
Copy link
Collaborator

@majodi Thank you for taking the time and preparing a project. No, you are not wasting my time. I'm happy if I can help.

But I'm afraid my answer won't make you happy. I cloned your project and checked out the exact same version of ESP-IDF you are using. I'm on Linux with dfu-util 0.10 and using the Saola board v1.0.

I run:

idf.py build dfu
idf.py -p /dev/ttyACM0 dfu-flash

... and I was able to flash successfully:

Executing action: dfu-flash
Running ninja in directory /tmp/issue6999/build
Executing "ninja dfu-flash"...
[0/1] cd /tmp/issue6999/build && /usr/bin/cmake -D ESP_DFU_BIN="/tmp/issue6999/build/dfu.bin" -D ESP_DFU_PID="2" -P /home/dragon/esp/esp-idf/tools/cmake/run_dfu_util.cmake
Command list: dfu-util;-d;303a:2;-D;/tmp/issue6999/build/dfu.bin
dfu-util 0.10

Copyright 2005-2009 Weston Schmidt, Harald Welte and OpenMoko Inc.
Copyright 2010-2020 Tormod Volden and Stefan Schmidt
This program is Free Software and has ABSOLUTELY NO WARRANTY
Please report bugs to http://sourceforge.net/p/dfu-util/tickets/

Opening DFU capable USB device...
ID 303a:0002
Run-time device DFU version 0110
Claiming USB DFU Runtime Interface...
Determining device status: state = appIDLE, status = 0
Device really in Runtime Mode, send DFU detach request...
Device will detach and reattach...
dfu-util: error detaching
Opening DFU USB Device...
Claiming USB DFU Interface...
Setting Alternate Setting #0 ...
Determining device status: state = dfuIDLE, status = 0
dfuIDLE, continuing
DFU mode device DFU version 0110
Device returned transfer size 64
Copying data from PC to DFU device
Download        [=========================] 100%      1126400 bytes
Download done.
state(2) = dfuIDLE, status(0) = No error condition is present
Done!
Done

I can even repeat the process and I'm getting success every time.

Are you still using the Saola board? Do you have anything else connected? Can you check your USB cable?

@majodi
Copy link
Author

majodi commented May 11, 2021

Are you still using the Saola board?

Yes, and also other custom board (see below)

Do you have anything else connected?

No external peripherals connected to the board if that is what you mean.

Can you check your USB cable?

Used 3 different USB-C cables for the custom board (as it has a USB-C connector) and made a new USB cable with dupont pinouts for the Saola.

Okay. I'm stuck. This is what I have and have done:

I have a Saola board and I have my custon pcb with ESP32-S2-WROVER module on it.

  1. I get the same error on both macos (Big Sur 11.3.1) and ubuntu (20.04)
  2. I get the same error on both the Saola and the custom pcb
  3. To be sure I created a new USB cable with dupont connectors for the Saola (good shielded cable with ferrite core on it) --> no difference, still same error
  4. I use an USB-C cable and connector on the custom pcb, doesn't make any difference
  5. On macos dfu-util is version 0.10
  6. On ubuntu it was 0.9, I upgraded to 0.10 and newer version libusb --> no difference
  7. If I comment out the xterm file, the error is gone
  8. Uncomment --> error is back

Below you find dfu-util logs for totally different conditions.

I could buy a new Saola (maybe different chip version? although the modules on the cust.pcb are recent), I could try Windows10. Or maybe you know other debugging options to check what is going on or what is different with my setup compared to yours. I could make a screencast if you like.

Thnx.

So procedure:

clone issue6999
cd to directory
idf.py dfu (or as you do idf.py build dfu)
idf.py dfu-flash (or as you do with port)
--> error
edit (using vscode with code .)
comment "COMPONENT_EMBED_TXTFILES += html/xterm.js" in file "component.mk"
comment " "html/xterm.js" " in file CMakeLists.txt
comment:

httpd_resp_set_type(req, "application/javascript; charset=UTF-8");
extern const char xterm_js_start[] asm("_binary_xterm_js_start");
extern const char xterm_js_end[]   asm("_binary_xterm_js_end");
httpd_resp_send(req, xterm_js_start, xterm_js_end - xterm_js_start);

in file main.c
idf.py dfu (or as you do idf.py build dfu)
idf.py dfu-flash (or as you do with port)
--> error is gone

uncomment above --> error is back

same on macos/ubuntu/Saola/cust.pcb/usb/usb-c/different cables

Two logs from "dfu-util -vvv -d 303a:2 -D /Users/majodi/esp/issue6999/build/dfu.bin" (-vvv for max verbose output)

This is the verbose output (on macos, custom pcb, usb-c dfu-util v0.10, esp-idf v4.4-dev-1183-g9d34a1cd4) only the last bit:

[ 1.074107] [0021700b] libusb: debug [darwin_handle_transfer_completion] handling transfer completion type control with kernel status 0
[ 1.074109] [0021700b] libusb: debug [usbi_handle_transfer_completion] transfer 0x7feec9c08728 has callback 0x104eebd4c
[ 1.074110] [0021700b] libusb: debug [sync_transfer_cb] actual_length=6
[ 1.074112] [0021700b] libusb: debug [libusb_free_transfer] transfer 0x7feec9c08728
[ 1.074114] [0021700b] libusb: debug [libusb_alloc_transfer] transfer 0x7feec9c08728
[ 1.074115] [0021700b] libusb: debug [libusb_submit_transfer] transfer 0x7feec9c08728
[ 1.074127] [0021700b] libusb: debug [libusb_get_next_timeout] no URB with timeout or all handled by OS; no timeout!
[ 1.074129] [0021700b] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 1.074130] [0021700b] libusb: debug [usbi_wait_for_events] poll() 1 fds with timeout in 60000ms
[ 1.074353] [00217011] libusb: debug [darwin_async_io_callback] an async io operation has completed
[ 1.074360] [0021700b] libusb: debug [usbi_wait_for_events] poll() returned 1
[ 1.074363] [0021700b] libusb: debug [handle_event_trigger] event triggered
[ 1.074364] [0021700b] libusb: debug [darwin_handle_transfer_completion] handling transfer completion type control with kernel status 0
[ 1.074365] [0021700b] libusb: debug [usbi_handle_transfer_completion] transfer 0x7feec9c08728 has callback 0x104eebd4c
[ 1.074367] [0021700b] libusb: debug [sync_transfer_cb] actual_length=64
[ 1.074369] [0021700b] libusb: debug [libusb_free_transfer] transfer 0x7feec9c08728
[ 1.074370] [0021700b] libusb: debug [libusb_alloc_transfer] transfer 0x7feec9c08728
[ 1.074372] [0021700b] libusb: debug [libusb_submit_transfer] transfer 0x7feec9c08728
[ 1.074383] [0021700b] libusb: debug [libusb_get_next_timeout] no URB with timeout or all handled by OS; no timeout!
[ 1.074385] [0021700b] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 1.074386] [0021700b] libusb: debug [usbi_wait_for_events] poll() 1 fds with timeout in 60000ms
[ 1.074495] [00217011] libusb: debug [darwin_async_io_callback] an async io operation has completed
[ 1.074503] [0021700b] libusb: debug [usbi_wait_for_events] poll() returned 1
[ 1.074505] [0021700b] libusb: debug [handle_event_trigger] event triggered
[ 1.074506] [0021700b] libusb: debug [darwin_handle_transfer_completion] handling transfer completion type control with kernel status 0
[ 1.074507] [0021700b] libusb: debug [usbi_handle_transfer_completion] transfer 0x7feec9c08728 has callback 0x104eebd4c
[ 1.074509] [0021700b] libusb: debug [sync_transfer_cb] actual_length=6
[ 1.074510] [0021700b] libusb: debug [libusb_free_transfer] transfer 0x7feec9c08728
[ 1.074512] [0021700b] libusb: debug [libusb_alloc_transfer] transfer 0x7feec9c08728
[ 1.074514] [0021700b] libusb: debug [libusb_submit_transfer] transfer 0x7feec9c08728
[ 1.074524] [0021700b] libusb: debug [libusb_get_next_timeout] no URB with timeout or all handled by OS; no timeout!
[ 1.074526] [0021700b] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 1.074528] [0021700b] libusb: debug [usbi_wait_for_events] poll() 1 fds with timeout in 60000ms
[ 6.149950] [00217011] libusb: debug [darwin_async_io_callback] an async io operation has completed
[ 6.150017] [0021700b] libusb: debug [usbi_wait_for_events] poll() returned 1
[ 6.150028] [0021700b] libusb: debug [handle_event_trigger] event triggered
[ 6.150032] [0021700b] libusb: debug [darwin_handle_transfer_completion] handling transfer completion type control with kernel status -536854447
[ 6.150035] [0021700b] libusb: warning [darwin_transfer_status] transfer error: timed out
[ 6.150038] [0021700b] libusb: debug [usbi_handle_transfer_completion] transfer 0x7feec9c08728 has callback 0x104eebd4c
[ 6.150041] [0021700b] libusb: debug [sync_transfer_cb] actual_length=0
[ 6.150048] [0021700b] libusb: debug [libusb_free_transfer] transfer 0x7feec9c08728
dfu-util: Error during download (LIBUSB_ERROR_TIMEOUT)
[ 6.150068] [0021700b] libusb: debug [libusb_close]
[ 6.150073] [0021700b] libusb: debug [libusb_release_interface] interface 2
[ 6.150320] [0021700b] libusb: debug [libusb_exit]
[ 6.150329] [0021700b] libusb: debug [libusb_exit] destroying default context
[ 6.150332] [0021700b] libusb: debug [libusb_get_next_timeout] no URBs, no timeout!
[ 6.150334] [0021700b] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 6.150337] [0021700b] libusb: debug [usbi_wait_for_events] poll() 1 fds with timeout in 0ms
[ 6.150361] [0021700b] libusb: debug [usbi_wait_for_events] poll() returned 0
[ 6.150369] [0021700b] libusb: debug [usbi_remove_event_source] remove fd 3
[ 6.150450] [00217011] libusb: debug [darwin_event_thread_main] darwin event thread exiting

This is the verbose output (on ubuntu, Saola, usb dfu-util v0.9, esp-idf v4.4-dev-1254-g639e7ad49 ) so totally different setup only the last bit again:

[ 6.060580] [000103e1] libusb: debug [handle_control_completion] handling completion status 0
[ 6.060587] [000103e1] libusb: debug [disarm_timerfd]
[ 6.060594] [000103e1] libusb: debug [usbi_handle_transfer_completion] transfer 0x55efd5ebed80 has callback 0x7fd1f5d22fc0
[ 6.060601] [000103e1] libusb: debug [sync_transfer_cb] actual_length=6
[ 6.060609] [000103e1] libusb: debug [libusb_free_transfer] transfer 0x55efd5ebed80
[ 6.060617] [000103e1] libusb: debug [libusb_alloc_transfer] transfer 0x55efd5ec2580
[ 6.060623] [000103e1] libusb: debug [libusb_submit_transfer] transfer 0x55efd5ec2580
[ 6.060629] [000103e1] libusb: debug [add_to_flying_list] arm timerfd for timeout in 5000ms (first in line)
[ 6.060642] [000103e1] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 6.060649] [000103e1] libusb: debug [handle_events] poll() 3 fds with timeout in 60000ms
[ 6.060881] [000103e1] libusb: debug [handle_events] poll() returned 1
[ 6.060892] [000103e1] libusb: debug [reap_for_handle] urb type=2 status=0 transferred=64
[ 6.060897] [000103e1] libusb: debug [handle_control_completion] handling completion status 0
[ 6.060902] [000103e1] libusb: debug [disarm_timerfd]
[ 6.060910] [000103e1] libusb: debug [usbi_handle_transfer_completion] transfer 0x55efd5ec2580 has callback 0x7fd1f5d22fc0
[ 6.060917] [000103e1] libusb: debug [sync_transfer_cb] actual_length=64
[ 6.060924] [000103e1] libusb: debug [libusb_free_transfer] transfer 0x55efd5ec2580
[ 6.060929] [000103e1] libusb: debug [libusb_alloc_transfer] transfer 0x55efd5ec1480
[ 6.060936] [000103e1] libusb: debug [libusb_submit_transfer] transfer 0x55efd5ec1480
[ 6.060942] [000103e1] libusb: debug [add_to_flying_list] arm timerfd for timeout in 5000ms (first in line)
[ 6.060953] [000103e1] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 6.060960] [000103e1] libusb: debug [handle_events] poll() 3 fds with timeout in 60000ms
[ 6.061105] [000103e1] libusb: debug [handle_events] poll() returned 1
[ 6.061116] [000103e1] libusb: debug [reap_for_handle] urb type=2 status=0 transferred=6
[ 6.061121] [000103e1] libusb: debug [handle_control_completion] handling completion status 0
[ 6.061126] [000103e1] libusb: debug [disarm_timerfd]
[ 6.061135] [000103e1] libusb: debug [usbi_handle_transfer_completion] transfer 0x55efd5ec1480 has callback 0x7fd1f5d22fc0
[ 6.061142] [000103e1] libusb: debug [sync_transfer_cb] actual_length=6
[ 6.061150] [000103e1] libusb: debug [libusb_free_transfer] transfer 0x55efd5ec1480
[ 6.061157] [000103e1] libusb: debug [libusb_alloc_transfer] transfer 0x55efd5ebcac0
[ 6.061164] [000103e1] libusb: debug [libusb_submit_transfer] transfer 0x55efd5ebcac0
[ 6.061171] [000103e1] libusb: debug [add_to_flying_list] arm timerfd for timeout in 5000ms (first in line)
[ 6.061183] [000103e1] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 6.061190] [000103e1] libusb: debug [handle_events] poll() 3 fds with timeout in 60000ms
[11.061307] [000103e1] libusb: debug [handle_events] poll() returned 1
[11.061354] [000103e1] libusb: debug [handle_events] timerfd triggered
[11.061365] [000103e1] libusb: debug [libusb_cancel_transfer] transfer 0x55efd5ebcac0
[11.061485] [000103e1] libusb: debug [disarm_timerfd]
[11.061516] [000103e1] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[11.061529] [000103e1] libusb: debug [handle_events] poll() 3 fds with timeout in 60000ms
[11.061550] [000103e1] libusb: debug [handle_events] poll() returned 1
[11.061569] [000103e1] libusb: debug [reap_for_handle] urb type=2 status=-2 transferred=64
[11.061580] [000103e1] libusb: debug [handle_control_completion] handling completion status -2
[11.061591] [000103e1] libusb: debug [usbi_handle_transfer_cancellation] detected timeout cancellation
[11.061604] [000103e1] libusb: debug [disarm_timerfd]
[11.061618] [000103e1] libusb: debug [usbi_handle_transfer_completion] transfer 0x55efd5ebcac0 has callback 0x7fd1f5d22fc0
[11.061638] [000103e1] libusb: debug [sync_transfer_cb] actual_length=64
[11.061655] [000103e1] libusb: debug [libusb_free_transfer] transfer 0x55efd5ebcac0
dfu-util: Error during download
[11.061691] [000103e1] libusb: debug [libusb_close]
[11.061711] [000103e1] libusb: debug [usbi_remove_pollfd] remove fd 9
[11.061767] [000103e1] libusb: debug [libusb_exit]
[11.061779] [000103e1] libusb: debug [libusb_exit] destroying default context
[11.061789] [000103e1] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[11.061798] [000103e1] libusb: debug [handle_events] poll fds modified, reallocating
[11.061815] [000103e1] libusb: debug [handle_events] poll() 2 fds with timeout in 0ms
[11.061831] [000103e1] libusb: debug [handle_events] poll() returned 0
[11.061844] [000103e1] libusb: debug [usbi_remove_pollfd] remove fd 6
[11.061883] [000103e1] libusb: debug [usbi_remove_pollfd] remove fd 8
[11.061964] [000103e4] libusb: debug [linux_udev_event_thread_main] udev event thread exiting

please let me know if you need more.

@dobairoland
Copy link
Collaborator

@majodi The Ubuntu and Mac are two different machines (computers), right?

I've got a bad feeling about USB-C. I cannot try with that. Are you able to try it with good old USB 2.0? For example, through an USB bridge, Raspberry Pi, ...

@majodi
Copy link
Author

majodi commented May 12, 2021

@dobairoland yes, one is an iMac (tried both the normal USB and USB-C it has) and the other (Ubuntu) is an old Dell Optiplex with normal USB (2.0).

Also, the flashing works okay on all setups when leaving off one embedded file.

@dobairoland
Copy link
Collaborator

I will ask someone else to look at this issue because I'm out of ideas. I'm sorry.

@majodi
Copy link
Author

majodi commented May 12, 2021

Thnx. Just let me know what you need.

@Spritetm
Copy link
Member

Hm, I'm also unsure of what's going on here... It's almost like the device writes to flash but gets hung up there. The decoding and writing of a DFU file is somewhat complicated (an ESP32 DFU file actually is a CPIO archive, for instance) so perhaps you managed to trigger a bug there that causes the code to go out to lunch on an edge case (triggered by e.g. a specific file size or offset). Just to confirm this, can you upload a dfu file that fails? (You should be able to find it in the build/ directory of your project).

@majodi
Copy link
Author

majodi commented May 13, 2021

Good idea. This (attached) is the dfu that fails to flash. It's this code: https://github.com/majodi/issue6999 and commenting the following lines will reduce the bin's file size to 840 KB and make it flash again. There's no need to edit component.mk and CMakeLists.txt just comment these lines to make it flash:

httpd_resp_set_type(req, "application/javascript; charset=UTF-8");
extern const char xterm_js_start[] asm("_binary_xterm_js_start");
extern const char xterm_js_end[] asm("_binary_xterm_js_end");
httpd_resp_send(req, xterm_js_start, xterm_js_end - xterm_js_start);

I'm eager to hear if this flashes or not on your side. If not, there must be something going on here with the build process and we narrowed it down. In that case we should compare stuff.

dfu.bin.zip

@Spritetm
Copy link
Member

Yep, that DFU replicates the issue on my side (Debian Linux, kernel 5.10.0-6-amd64). Time to start debugging, I guess.

@majodi
Copy link
Author

majodi commented May 13, 2021

Good news! I will think about what would be good items to compare. Let me know what your thoughts are on candidates to compare. Thnx.

@dobairoland
Copy link
Collaborator

dobairoland commented May 13, 2021

I can flash successfully the attached dfu.bin. (Archlinux 5.12.2-arch1-1, dfu-util 0.10, Saola v1.0)

@Spritetm
Copy link
Member

That makes it more confusing... perhaps it's still a timing thing? Let me check...

@majodi
Copy link
Author

majodi commented May 13, 2021

@dobairoland auch! That's crazy.

@Spritetm
Copy link
Member

Okay, I think I found it. The issue is that the ROM will erase the region a partition is in as soon as it receives the first bit of the data that is in the partition. This means that for larger partitions, larger regions of flash will need to be erased - which takes more time. In your specific case, because you embed a fairly large javascript file in the flash, the main partition is large enough to take >5 seconds to erase. Unfortunately, dfu-utils has a hardcoded 5 second timeout, and for you and me, that gets triggered. My guess is that @dobairoland is sitting in a colder room than we are (or has a less worn-out flash chip, or just got lucky) and his chip managed to erase within this timeout.

The trivial solution is to get the sources to dfu-util and change the timeout (src/dfu.c, line 33) to something longer. Longer term, we may need to change our dfu image generator to break up partitions that are >1MiB into multiple split partition files so they don't hit the timeout.

@majodi
Copy link
Author

majodi commented May 13, 2021

That's a known issue, but I didn't dig into it because it was about erasing, but now it makes sense. I will try to find the link.

@majodi
Copy link
Author

majodi commented May 13, 2021

Ah, just look for "dfu-util timeout erasing" and you'll see multiple cases. Stupid of me to not mention this before (just didn't make the connection as it was about erasing).

@Spritetm
Copy link
Member

That looks all somewhat related indeed; most of these point to some DFU variation that does erasing as a separate (not technically DFU compliant) step, so I can imagine you didn't initially think of this.

@majodi
Copy link
Author

majodi commented May 14, 2021

@Spritetm @dobairoland Happy to confirm that recompiling dfu-util with a timeout of 20sec did indeed do the job. Thank you. Should I close this issue?

@dobairoland
Copy link
Collaborator

Great @majodi. I'm happy. Nice catch by @Spritetm!

Please keep this open. We need a proper solution instead of a workaround.

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels May 14, 2021
@dobairoland
Copy link
Collaborator

@majodi Could you please try to flash the following binary with the unmodified version of dfu-util? ccfw.bin inside is split up into a 1MB file and another small one.

dfu.zip

@majodi
Copy link
Author

majodi commented May 14, 2021

@dobairoland With the unmodified dfu-util it didn't flash (timeout), with the modified (20sec) it did flash ok.

@dobairoland
Copy link
Collaborator

Thank you @majodi for your help. I've prepared another two files. Would you please be kind and try them? I hope I've got it right this time. The file was split up so the max. partition size is 1MiB and 512 kB, respectively.

dfu.zip

@majodi
Copy link
Author

majodi commented May 15, 2021

@dobairoland Unmodified dfu-util:

dfu_1Mi.bin does NOT flash
dfu_512k.bin flash WORKS (it looks like it stops for a second at 48% but then carries on)

I don't know if you have CONFIG_ESP_CONSOLE_USB_CDC=y, but after reset I don't get a port with this one either (see #6969). But that's another thing.

@dobairoland
Copy link
Collaborator

@majodi Thank you very much for your help. I'll adjust the partition size according to your results.

The CDC should be enabled in these binaries because I have used your settings: https://github.com/majodi/issue6999/blob/master/sdkconfig.defaults#L9

@majodi
Copy link
Author

majodi commented May 16, 2021

@dobairoland Okay. That means that issue #6969 manifests itself with this bin too, as I don't get a CDC port after flashing. However it does appear when turning off wifi init code.

@dobairoland
Copy link
Collaborator

I can see the CDC port and connect to it after flashing your project wit dfu-util. It shows a warning about the wifi password being too short and aborts in loop.

@majodi
Copy link
Author

majodi commented May 17, 2021

@dobairoland Okay, maybe not the right place here in this issue to go on about CDC port not appearing ( should be in #6969 perhaps) but this is what I get (repeatable):

Your dfu_512k.bin (posted here in dfu.zip) flashed with normal unmodified dfu-util --> no CDC port after flashing/reset

code from https://github.com/majodi/issue6999 with only line (main) 41-44 commented (to be able to use unmodified dfu-util) flashed with normal unmodified dfu-util --> CDC port appears after reset

code from https://github.com/majodi/issue6999, nothing commented, flashed with modified dfu-util (now have to use modified dfu-util with longer timeout) --> no CDC port after flashing/reset

code from https://github.com/majodi/issue6999, only line (main) 97 (wifi) commented, flashed with modified dfu-util (now have to use modified dfu-util with longer timeout) --> CDC port appears after reset

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: In Progress Work is in progress labels May 31, 2021
projectgus pushed a commit that referenced this issue Jun 18, 2021
ROM will erase the region a partition is in as soon as it receives the
first bit of the data that is in the partition. For large partitions it
takes more than 5 seconds to erase which is a hard-coded limit in
dfu-utils.

This splits large binaries and adds them by chunks which should avoid
timing-out during flashing.

Closes #6999
projectgus pushed a commit that referenced this issue Jun 30, 2021
ROM will erase the region a partition is in as soon as it receives the
first bit of the data that is in the partition. For large partitions it
takes more than 5 seconds to erase which is a hard-coded limit in
dfu-utils.

This splits large binaries and adds them by chunks which should avoid
timing-out during flashing.

Closes #6999
@tormodvolden
Copy link

Unfortunately, dfu-utils has a hardcoded 5 second timeout

The 5 seconds timeout in dfu-util is for USB requests, which always should be replied to immediately. The bootloader should use the DFU mechanism to tell how long dfu-util should wait before issuing the /next/ USB request. The bootloader should definitely not get busy doing something that might take long time before replying to this USB request.

The real solution is to make the bootloader DFU compliant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

5 participants