Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA update Fails with [TB: preparing for update fails, Attributes might be NULL] #183

Closed
ChrSchultz opened this issue Dec 30, 2023 · 33 comments · Fixed by #180
Closed

OTA update Fails with [TB: preparing for update fails, Attributes might be NULL] #183

ChrSchultz opened this issue Dec 30, 2023 · 33 comments · Fixed by #180

Comments

@ChrSchultz
Copy link
Contributor

I've looked for all failures but I didn't get any idea for fixing this. On demo.thingsboard.io i have Version: esp8266_pt1k 0.0.2 and on my device ( D1 mini) there is version esp8266_pt1k 0.0.1.. the code is like yours.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Dec 30, 2023

This seems to be a case of the OTA_Update_Callback, being configured incorrectly. Can you attach your script, and are you sure your instance looks like this.

const OTA_Update_Callback callback(&progressCallback, &updatedCallback, CURRENT_FIRMWARE_TITLE, CURRENT_FIRMWARE_VERSION, &updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE);

Especially important are the CURRENT_FIRMWARE_TITLE and CURRENT_FIRMWARE_VERSION which need to be passed and seem to be missing or are empty strings in your case.

Ensure you pass valid const char* to the methods and ensure they live on for the complete lifetime until the OTA Update is started. Variables need to be in global scope or static.

@ChrSchultz
Copy link
Contributor Author

ChrSchultz commented Dec 30, 2023

ok that was the first failure , corrected and now it works, but the device didn't download the Version 0.0.2 .
Telemetry-Key: fw_state = "UPDATED" ... with actual timestamp.

If I use the example I got the failure: not for us: Title different... but I use in device and thingsboard-cloud the same title... eg: "Test"

@MathewHDYT
Copy link
Contributor

Are you really sure the title are the same. The title configured in the OTA_Update_Callback has to be the same as from the OTA package assigned in ThingsBoard.

To check can you add a print to Firmware_Shared_Attribute_Received method in ThingsBoard.h after fw_title and curr_fw_title has been initalized.

Serial.printf("Received fw title: (%s), curr_fw_title: (%s)\n", fw_title, curr_fw_title);

@ChrSchultz
Copy link
Contributor Author

curr_fw_title : �!@
fw_title: TEST
but in demo.thingsboard.org is and in the code is all the same... i use the orginal Example file. I modified Thingsboard.h with in the .pio directory with printf("",,) to see curr_fw_title and fw_title. after failed comparision.
It seem that curr_fw_title wasn't read correct. I can give you my credentials for demo.thingsboard.io via mail so you can see.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Dec 31, 2023

As mentioned above in less detail, because the OTA_Updater_Callback does not save the actual memory to the title and version. It simply copies the pointer, meaning if the string is deleted while we are waiting to actually start downloading then the issue we can see in your case is occurring.

More simply the memory to the string has been erased already and it is now garbage and is then compared with from the cloud received JSON data. This results in the comparison failing, because the 2 strings can not be the same.

If you need further help I would need your main file and especially the location your FW_TITLE is located in the main file. Because the user utilising the library needs to ensure the string is kept alive until the OTA udpate has started. To achieve that either put the variable into global scope or make it static.

@ChrSchultz
Copy link
Contributor Author

I use the orgiginal example 0009-esp8266-esp32-proceess_ota.ino without StreamUtil support.

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Dec 31, 2023

Okay good to know, perhaps there is some issue with flash strings, can you add this line on top of your main file.

#define THINGSBOARD_ENABLE_PROGMEM 0

@ChrSchultz
Copy link
Contributor Author

Okay good to know, perhaps there is some issue with flash strings, can you add this line on top of your main file.

#define THINGSBOARD_ENABLE_PROGMEM 0

is by default, because i use platformio and board=d1_mini that has #define ESP8266... i see it higlighted in editor

@MathewHDYT
Copy link
Contributor

Ah good to know you are on ESP8266. For now as a dirty fix can you go into the OTA_Update_Callback and change the line.

const char      *m_fwTitel;      // Current firmware title of device

to

std::string      m_fwTitel;      // Current firmware title of device

and additionally add an include to #include <string> at the top of the file under the include guards.

@ChrSchultz
Copy link
Contributor Author

ChrSchultz commented Dec 31, 2023

i tried to alter m_fwTitle but i got an compiler_error:

#include <string>
...
std:string m_fwTitle;
...
void OTA_Update_Callback::Set_Firmware_Title(const char *currFwTitle) {
    m_fwTitel = currFwTitle;
}

.pio\libdeps\d1_mini\ThingsBoard\src\OTA_Update_Callback.cpp: In member function 'void OTA_Update_Callback::Set_Firmware_Title(const char*)':
.pio\libdeps\d1_mini\ThingsBoard\src\OTA_Update_Callback.cpp:40:17: error: no match for 'operator=' (operand types are 'const string' {aka 'const std::__cxx11::basic_string'} and 'const char*')
40 | m_fwTitel = currFwTitle;

@MathewHDYT
Copy link
Contributor

Sorry my bad simply change

void OTA_Update_Callback::Set_Firmware_Title(const char *currFwTitle) {
    m_fwTitel = currFwTitle;
}

to this instead

void OTA_Update_Callback::Set_Firmware_Title(std::string currFwTitle) {
    m_fwTitel = currFwTitle;
}

@ChrSchultz
Copy link
Contributor Author

I altered all depending functions on std::string currFwTitle , and pushed to my fork in branch "testing". What now happens is that my ESP8266 abort due to some runtime errors. compilation works fine...
Here my exception log from esp8266_exeption_decoder:

0x4022179c in std::__throw_length_error(char const*) at ??:?
0x40222d22 in std::iterator_traits<char*>::difference_type std::__distance<char*>(char*, char*, std::random_access_iterator_tag) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/stl_iterator_base_funcs.h:104
 (inlined by) std::iterator_traits<char*>::difference_type std::distance<char*>(char*, char*) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/stl_iterator_base_funcs.h:141
 (inlined by) void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:217
0x4020a456 in OTA_Update_Callback::Get_Firmware_Title[abi:cxx11]() const at ??:?
0x40208375 in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::Firmware_Shared_Attribute_Received(ArduinoJson::V6214PB2::JsonObjectConst const&) at ??:?  
0x40222e2e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_limit(unsigned int, unsigned int) const at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:333
 (inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:484
0x40222e7a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::substr(unsigned int, unsigned int) const at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:2837
0x40206f6e in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::process_attribute_request_message(char*, ArduinoJson::V6214PB2::JsonObjectConst&) at ??:?  
0x401001e4 in ets_post at ??:?
0x40208c5a in _ZN11ArduinoJson8V6214PB26detail16JsonDeserializerINS1_13BoundedReaderIPhvEENS1_11StringMoverEE12parseVariantINS1_14AllowAllFilterEEENS0_20DeserializationError4CodeERNS1_11VariantDataET_NS0_21DeserializationOption12NestingLimitE$isra$0 at main.cpp:?
0x4020966f in PubSubClient::readByte(unsigned char*) at ??:?
0x4020ad68 in WiFiClient::available() at ??:?
0x40208ffc in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::onMQTTMessage(char*, unsigned char*, unsigned int) at ??:?
0x40208839 in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::Start_Firmware_Update(OTA_Update_Callback const&) at ??:?
0x40105d85 in ets_timer_arm_new at ??:?
0x4020b3e6 in WiFiClient::read() at ??:?
0x402096a0 in PubSubClient::readByte(unsigned char*) at ??:?

and:

0x4010303a in wDev_MacTim1Arm at ??:?
0x401030b1 in wDev_ProcessFiq at ??:?
0x40103054 in wDev_ProcessFiq at ??:?
0x40219a05 in _vsnprintf_r at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:71 (discriminator 4)
0x40219a05 in _vsnprintf_r at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:71 (discriminator 4)
0x40219a41 in vsnprintf at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:41
0x40219a05 in _vsnprintf_r at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:71 (discriminator 4)
0x40219a41 in vsnprintf at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:41
0x4020d7b5 in ets_printf_P at core_esp8266_postmortem.cpp:?
0x4020d7a8 in ets_printf_P at core_esp8266_postmortem.cpp:?
0x40219a41 in vsnprintf at /workdir/repo/newlib/newlib/libc/stdio/vsnprintf.c:41
0x4026f100 in etharp_output at ??:?
0x4021d68c in __ssputs_r at /workdir/repo/newlib/newlib/libc/stdio/nano-vfprintf.c:179
0x4020d7b5 in ets_printf_P at core_esp8266_postmortem.cpp:?
0x4020d7a8 in ets_printf_P at core_esp8266_postmortem.cpp:?
0x40102e78 in rcReachRetryLimit at ??:?
0x4020d802 in print_stack at core_esp8266_postmortem.cpp:?
0x4020daa6 in postmortem_report at core_esp8266_postmortem.cpp:?
0x4020d247 in operator new(unsigned int) at ??:?
0x4020db1a in raise_exception at core_esp8266_postmortem.cpp:?
0x4020db2c in __unhandled_exception at ??:?
0x4022179c in std::__throw_length_error(char const*) at ??:?
0x40222d22 in std::iterator_traits<char*>::difference_type std::__distance<char*>(char*, char*, std::random_access_iterator_tag) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/stl_iterator_base_funcs.h:104
 (inlined by) std::iterator_traits<char*>::difference_type std::distance<char*>(char*, char*) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/stl_iterator_base_funcs.h:141
 (inlined by) void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.tcc:217
0x4020a456 in OTA_Update_Callback::Get_Firmware_Title[abi:cxx11]() const at ??:?
0x40208375 in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::Firmware_Shared_Attribute_Received(ArduinoJson::V6214PB2::JsonObjectConst const&) at ??:?  
0x40222e2e in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_limit(unsigned int, unsigned int) const at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:333
 (inlined by) std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int) at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:484
0x40222e7a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::substr(unsigned int, unsigned int) const at /workdir/arena.x86_64/gcc-gnu/xtensa-lx106-elf/libstdc++-v3/include/bits/basic_string.h:2837
0x40206f6e in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::process_attribute_request_message(char*, ArduinoJson::V6214PB2::JsonObjectConst&) at ??:?  
0x401001e4 in ets_post at ??:?
0x40208c5a in _ZN11ArduinoJson8V6214PB26detail16JsonDeserializerINS1_13BoundedReaderIPhvEENS1_11StringMoverEE12parseVariantINS1_14AllowAllFilterEEENS0_20DeserializationError4CodeERNS1_11VariantDataET_NS0_21DeserializationOption12NestingLimitE$isra$0 at main.cpp:?
0x4020966f in PubSubClient::readByte(unsigned char*) at ??:?
0x4020ad68 in WiFiClient::available() at ??:?
0x40208ffc in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::onMQTTMessage(char*, unsigned char*, unsigned int) at ??:?
0x40208839 in ThingsBoardSized<8u, ThingsBoardDefaultLogger>::Start_Firmware_Update(OTA_Update_Callback const&) at ??:?
0x40105d85 in ets_timer_arm_new at ??:?
0x4020b3e6 in WiFiClient::read() at ??:?
0x402096a0 in PubSubClient::readByte(unsigned char*) at ??:?
0x402097ba in PubSubClient::readPacket(unsigned char*) at ??:?
0x40209c6c in PubSubClient::loop() at ??:?
0x4020e458 in uart_write at ??:?
0x40210134 in std::_Function_handler<void (unsigned int const&, unsigned int const&), void (*)(unsigned int const&, unsigned int const&)>::_M_invoke(std::_Any_data const&, unsigned int const&, unsigned int const&) at ??:?
0x4020d664 in loop_wrapper() at core_esp8266_main.cpp:?
0x40100ec5 in cont_wrapper at ??:?

@MathewHDYT
Copy link
Contributor

I am not 100% sure but perhaps it is an issue with the constness.

const std::string Get_Firmware_Title() const;

Will return a const copy-by-value. Which means the one calling the method can not make the received value non-const.

To fix that please adjust the method like this.

std::string Get_Firmware_Title() const;

For the second error I am unsure, but for now simply go into the Configuration.h file and hardcode THINGSBOARD_ENABLE_PROGMEM to 0.

@ChrSchultz
Copy link
Contributor Author

ChrSchultz commented Jan 1, 2024

now it seems as fixed, OTA-Update works
created a Pull-request, will be glad, if all will be merged.
Thanks, and a very good new year :-)

@MathewHDYT
Copy link
Contributor

The problem is the std::string workaround is a dirty fix to copy the strings you pass in the constructor so the user can not mangle them up anymore.

But the underlying issue of the string being mangled even tough it should be in global scope still happens and that is what shouldn't occur. The problem is that I can't really reproduce your issue and as you mentioned you use the OTA example from the library so I am also not really sure were the error could be.


For now can you check out your branch without the std::string changes and go into the Configuration.h file and hardcode THINGSBOARD_ENABLE_PROGMEM to 0.

And then retry if it works, because I am assuming it has to do something with this. I currently can't imagine anything else, because that is the only real difference between the ESP8266 and the ESP32 version of the example.

@jortega11
Copy link

I was facing the same issue and could get it working with the changes proposed by the two of you, thank you very much! 😊

When I solved this, I got the same issue that I got in the past (see #159) where I was getting the error Given firmaware was NULL, then solved it in the same way by doing the initialization of the OTA_Update_Callback in the global scope and eventually I could get OTA Updates working again.

@tvanesse
Copy link

I am also getting the same error but can't fix it either by changing char* to std::string in the various places of OTA_Update_Callback.cpp/h, nor by defining

const OTA_Update_Callback callback(&progressCallback, &updatedCallback, CURRENT_FIRMWARE_TITLE, CURRENT_FIRMWARE_VERSION, &updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE);

in the global scope.

Still getting the same error message:

[TB] Preparing for OTA firmware updates failed, attributes might be NULL

Although I made sure the value of CURRENT_FIRMWARE_TITLE is exactly the same as the one used in ThingsBoard. Any advice?

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Jun 21, 2024

What are you using Subscribe_Firmware, or Update_Firmware. Because to start the udpate the device needs to have certain attributes set server side. The attributes found with this link in the INITIATED state need to be set.

Additionally which version of the library are you using?

@tvanesse
Copy link

@MathewHDYT I'm using Start_Firmware_Update(callback) and the version is thingsboard/ThingsBoard@^0.13.0.

Thanks for the tip regarding the required attributes, I'll double check they are present in ThingsBoard and see if that helps.

@tvanesse
Copy link

I confirm the shared attributes were already set correctly on the device
image

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Jun 22, 2024

Okay good to know. Can you set THINGSBOARD_ENABLE_DEBUG to 1 in Configuration.h. It should print additional information which might help debug the issue.

@tvanesse
Copy link

Enabling debug, I'm getting one more line of log. Here is what Start_Firmware_Update spits out:

[TB] Sending data to server over topic (v1/devices/me/telemetry) with data ({"current_fw_title":"<myproject>-fw","current_fw_version":"1.0"})
[TB] Preparing for OTA firmware updates failed, attributes might be NULL

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Jun 23, 2024

It seems that the Star_Firmware_Update method fails because Prepare_Firmware_Settings method returns false. That method checks if the received callback is valid and tries to send the content as telemetry to the cloud as can be seen from the log messages.

If everything worked it should return true, that does not seem to be the case tough.

Can you add additional log messages in the Prepare_Firmware_Settings method.

So it looks something like this. The likely case is that sending the telemtry did not work correctly.
Which should only occur if there is not an established MQTT connection.

bool Prepare_Firmware_Settings(OTA_Update_Callback const & callback) {
    char const * const currFwTitle = callback.Get_Firmware_Title();
    char const * const currFwVersion = callback.Get_Firmware_Version();


    if (Helper::stringIsNullorEmpty(currFwTitle) || Helper::stringIsNullorEmpty(currFwVersion)) {
        Logger::printfln("Invalid callback attributes");
        return false;
    }
    else if (!Firmware_Send_Info(currFwTitle, currFwVersion)) {
        Logger::printfln("Sending failed :("); // This message will probably be printed.
        return false;
    }

    m_fw_callback = callback;
    return true;
}

If the aforementioned message is printed can you ensure that you called the connect method on the ThingsBoard Instance, before trying to start the OTA firmware update. And additionally check if the connection is established with the connected() method and only then start the OTA firmware update.

@tvanesse
Copy link

Many thanks @MathewHDYT ! I feel so stupid, I indeed forgot to call tb.connect before starting the firmware update...

Anyway, now that I make sure the MQTT connexion is established before attempting to flash OTA, the problem seems to arise from the partition scheme of the ESP32 not being set properly. Here is the log (there is not additional log set in Prepare_Firmware_Settings because this now seems to be OK):

Sending firmware update request to ThingsBoard
[TB] Sending data to server over topic (v1/devices/me/telemetry) with data ({"current_fw_title":"<my-project>-fw","current_fw_version":"1.0"})
[TB] Sending data to server over topic (v1/devices/me/attributes/request/1) with data ({"sharedKeys":"fw_checksum,fw_checksum_algorithm,fw_size,fw_title,fw_version,"})
[TB] Received data from server over topic (v1/devices/me/attributes/response/1)
[TB] Calling subscribed callback for request with response id (1)
[TB] =================================
[TB] A new Firmware is available:
[TB] (1.0) => (0.3)
[TB] Attempting to download over MQTT...
[TB] Sending data to server over topic (v1/devices/me/telemetry) with data ({"fw_state":"DOWNLOADING"})
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Receive chunk (0), with size (4096) bytes
[TB] Failed to initalize flash updater, ensure that the partition scheme has two app sections
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Receive chunk (0), with size (4096) bytes
[TB] Failed to initalize flash updater, ensure that the partition scheme has two app sections

Two things:

  1. not sure why it times out during the download of chunk 0, because there is no problem with the internet connection
  2. clearly, the problem is more likely to come from the partition scheme because I'm using Platform.io with
board_build.partitions = min_spiffs.csv

and the project is using the Arduino framework (no ESP-IDF).

I think you might need the full picture to understand the context: I used to perform OTA flashing with this project by having the ESP32 broadcasting a Wifi SSID and running a webserver that allows user to upload a firmware file through a form that is being served by the ESP32.

There is also a config file in SPIFFS that holds a few variables allowing to tweak the firmware at runtime and that config file can also be updated through the form served by the ESP32. Now we want to move to ThingsBoard, mainly because we need to be able to flash a lot of devices with a new firmware at once.

With that background story in mind, I'm not sure what to do with the partition scheme. Maybe I can get ridd of the config file in SPIFFS and rely on Shared Attributes in ThingsBoard instead. That would be neat. But is there a nicely_crafted_scheme.csv I could provide in platformio.ini to get the right partition scheme for OTA to work with ThingsBoard?

Many thanks already for your support, that's really appreciated!

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Jun 24, 2024

The method fails because the call to the specific implementation of fw_update.begin() fails. In your case I am assuming it uses the Espressif_Updater.

Mainly because that is the one recommended in the examples, because the Arduino Update version is not implemented very well.


Additonally I do not think it is because of the app partitions because the Arduino Framework min_spiffs.csv should contain 2 OTA app portions which is correct.

To actually find the underlying issue, can you adjust the begin method in the EspressifUpdater.cpp file like shown below:

bool Espressif_Updater::begin(size_t const & firmware_size) {
    printf("Attempting to start flashing binary data");
    esp_partition_t const * const running = esp_ota_get_running_partition();
    esp_partition_t const * const configured = esp_ota_get_boot_partition();

    if (configured != running) {
        printf("Configured boot partition does not contain a valid app, running partition was choosen as a fallback. We do not want to overwrite fallback partition, so we do not update");
        return false;
    }

    esp_partition_t const * const update_partition = esp_ota_get_next_update_partition(nullptr);

    if (update_partition == nullptr) {
        printf("Invalid OTA data partition, or no eligible OTA app slot partition was found");
        return false;
    }

    // Temporary handle is used, because it allows using a void* as the actual ota_handle,
    // allowing us to only include the esp_ota_ops header in the defintion (.cpp) file,
    // instead of also needing to declare it in the declaration (.h) header file
    esp_ota_handle_t ota_handle;
    esp_err_t const error = esp_ota_begin(update_partition, firmware_size, &ota_handle);

    if (error != ESP_OK) {
        printf("Initializing OTA update failed, because of error (%s)", esp_err_to_name(error));
        return false;
    }

    printf("Starting to flash binary data successfull");
    m_ota_handle = ota_handle;
    m_update_partition = update_partition;
    return true;
}

@tvanesse
Copy link

tvanesse commented Jun 26, 2024

I realise I'm actually using the SDCard_Updater. I switched to Espressif_Updater but now OTA_Update_Callback complains that it doesn't expect an updater of type Espressif_Updater.

Here is the relevant code:

#include <Espressif_Updater.h>
...
Espressif_Updater updater();
...

// FIXME: compilation error here
const OTA_Update_Callback callback(&progressCallback, &updatedCallback, "myproject-fw", "1.0", &updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE);

More specifically, OTA_Update_Callback expects anything that implements IUpdater as the updater argument, but Espressif_Updater doesn't seem to? Strange because it does implement all the methods defined in IUpdater and is declared as such in Espressif_Updater.h

class Espressif_Updater : public IUpdater
...

Full compilation error:

no instance of constructor "OTA_Update_Callback::OTA_Update_Callback" matches the argument listC/C++(289)
main.cpp(731, 36): argument types are: (void (*)(const size_t &currentChunk, const size_t &totalChuncks), void (*)(const bool &success), const char [12], const char [4], Espressif_Updater (*)(), const uint8_t, const uint16_t)

@MathewHDYT
Copy link
Contributor

MathewHDYT commented Jun 26, 2024

EspressifUpdater implements IUpdater it's weird I had this problem reported a few times already.

But I don't get why probably some problems depending on the compiler used, because normally it should convert from the class to its interface wihtout any problem, because it is public.

Fow now to fix it you can simply add this line to make sure it has the expected type.

const OTA_Update_Callback callback(&progressCallback, &updatedCallback, "myproject-fw", "1.0", (IUpdater *)&updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE);

@tvanesse
Copy link

Ok now I'm getting an error from the linker

Linking .pio/build/esp32dev/firmware.elf
.pio/build/esp32dev/src/main.cpp.o:(.literal._Z41__static_initialization_and_destruction_0ii+0x24): undefined reference to `updater()'

@MathewHDYT
Copy link
Contributor

Does the code look like this?

Espressif_Updater updater;
const OTA_Update_Callback callback(&progressCallback, &updatedCallback, "myproject-fw", "1.0", (IUpdater *)&updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE);

@tvanesse
Copy link

tvanesse commented Jun 26, 2024

Ok super cool, there is some progress :-)
Now it's downloading the first chunk after a lot of timeouts and it fails to go through more than chunk 0.

Here is the complete log

Sending firmware update request to ThingsBoard
[TB] Sending data to server over topic (v1/devices/me/telemetry) with data ({"current_fw_title":"myproject-fw","current_fw_version":"1.0"})
[TB] Sending data to server over topic (v1/devices/me/attributes/request/1) with data ({"sharedKeys":"fw_checksum,fw_checksum_algorithm,fw_size,fw_title,fw_version,"})
[TB] Received data from server over topic (v1/devices/me/attributes/response/1)
[TB] Calling subscribed callback for request with response id (1)
[TB] =================================
[TB] A new Firmware is available:
[TB] (1.0) => (0.3)
[TB] Attempting to download over MQTT...
[TB] Sending data to server over topic (v1/devices/me/telemetry) with data ({"fw_state":"DOWNLOADING"})
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (0) in (5000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Receive chunk (0), with size (4096) bytes
Downloading firmware progress 0.35%
Attempting to start flashing binary dataStarting to flash binary data successfull[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Received chunk (0), not the same as requested chunk (1)
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Failed to receive requested chunk (1) in (5000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Received chunk (0), not the same as requested chunk (1)

and it goes on.

I am going to investigate any access restriction on the server side.

@MathewHDYT
Copy link
Contributor

Can you increase the timeout? Simply adjust the constructor like this. I've increased it to 30 seconds now, perhaps this helps.

Espressif_Updater updater;
const OTA_Update_Callback callback(&progressCallback, &updatedCallback, "myproject-fw", "1.0", (IUpdater *)&updater, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE, 30000000UL);

@tvanesse
Copy link

tvanesse commented Jun 26, 2024

Apparently that helps, but this is extremely slow and we still experience timeouts on some chunks, even with the 30s value.

[TB] Received data from server over topic (v2/fw/response/0/chunk/0)
[TB] Receive chunk (0), with size (4096) bytes
Downloading firmware progress 0.35%
Attempting to start flashing binary dataStarting to flash binary data successfull[TB] Failed to receive requested chunk (1) in (30000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/1)
[TB] Receive chunk (1), with size (4096) bytes
Downloading firmware progress 0.69%
[TB] Received data from server over topic (v2/fw/response/0/chunk/1)
[TB] Received chunk (1), not the same as requested chunk (2)
[TB] Failed to receive requested chunk (2) in (30000000) us. Internet connection might have been lost
[TB] Received data from server over topic (v2/fw/response/0/chunk/2)
[TB] Receive chunk (2), with size (4096) bytes
Downloading firmware progress 1.04%
[TB] Received data from server over topic (v2/fw/response/0/chunk/2)
[TB] Received chunk (2), not the same as requested chunk (3)
[TB] Received data from server over topic (v2/fw/response/0/chunk/3)
[TB] Receive chunk (3), with size (4096) bytes
Downloading firmware progress 1.39%
[TB] Received data from server over topic (v2/fw/response/0/chunk/4)
[TB] Receive chunk (4), with size (4096) bytes
Downloading firmware progress 1.74%
[TB] Received data from server over topic (v2/fw/response/0/chunk/5)
[TB] Receive chunk (5), with size (4096) bytes
Downloading firmware progress 2.08%
[TB] Received data from server over topic (v2/fw/response/0/chunk/6)
[TB] Receive chunk (6), with size (4096) bytes
Downloading firmware progress 2.43%
[TB] Received data from server over topic (v2/fw/response/0/chunk/7)
[TB] Receive chunk (7), with size (4096) bytes
Downloading firmware progress 2.78%

At this pace it would require 30 min to flash a single device...

I wonder if this is not related to the ESP32 broadcasting a Wifi SSID while running the update, hence putting a lot of load on the RF unit. I'll try to switch off the Wifi broadcast before starting the update and see if that helps.

EDIT: I disabled the Wifi SSID from the ESP but that doesn't help reduce the download time.

@MathewHDYT
Copy link
Contributor

I'm not sure I can help with this, because if the underlying connection is slow this library can't do anything about it.

Are you doing anything else besides updating OTA with the same device and it's WiFi connection and is it possible to pause that behaviour while the udpate is ongoing?

Perhaps reducing or increasing the packet size changes something. Simply change this attribute FIRMWARE_PACKET_SIZE to 1024 then 8192 and 16384 and see if the speed increases or decreases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants