-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTA redesign discussion (once again) #346
Comments
I vote 1 (@brodtm) |
I vote 2 (@boneskull) |
I vote current |
I vote for my own 🙂 Is firmware stored on a broker (or traveling multiple times over a LAN) really a concern? Because if the target is home automation, users are unlikely to have 100s or 1000s of devices to flash. Also, if we don't set the retain bit (and use, say, QoS 2 instead), it won't be stored any longer than it takes to send an ACK, right? |
my use case: I was trying to automate OTA updates with Node-RED, which did not work well due to the RPC. |
@boneskull Could you please use the But without retained message, deep sleep devices won't be able to accept OTA. With retained messages, moreover, the device will receive ~350kB of useless data on each boot. Which is a lot considering the bandwidth of the ESP8266. Two solutions:
That might sound like a good compromise, and I prefer the ESP8266 cleans the firmware approach. What about you? |
Yes, good idea. Agree. (Mobile interface doesn't have "reaction" functionality) |
I was just thinking the same on gitter! |
But--shouldn't QoS 2 work? It guarantees delivery so it'll hang on to the firmware until it's delivered regardless of whether the device is online. It won't resend that data once it's been received. |
Doesn't the ESP disconnect properly on deep sleep and hence unsubscribes? In that case it won't receive the update. |
@bertmelis depends if 'clean session' is set or not, I think |
@boneskull are you sure about that? In QoS 2, the broker keeps the message until it is received once? Even if not retained? I will try, but if it is the case that would be great! @bertmelis It subscribes to the OTA topic on |
yes, it will (afaik), if you’re not using a clean session. good info here:
http://www.hivemq.com/blog/mqtt-essentials-part–7-persistent-session-queuing-messages
|
@boneskull true, on persistent connections (clean session flag to false), the broker keeps the messages untill delivered once as per OASIS standard. Could it interfere with the normal way of work? |
I voted for approach 2 (@boneskull) assuming that some sort of clean-up is happening after flashing the device. I don't have concerns with the payload being (temporarily) stored on the broker, because I can't think of a situation where someone has more than say 3-5 devices of the exact same type at home which would accept the same firmware. |
Alright, indeed the QoS 2 approach would work great here. One message to flash a firmware (no need to clean it as it will be delivered only once). Sounds great! |
@bertmelis I think of one way it could cause an issue. If we keep a persistent session across firmwares, then we will potentially have stale subscriptions hanging around. So, if we can ensure the persistent session is not retained after a firmware upgrade, we should be fine. I'm not sure of the best way to do this, but a dynamic client ID could be one of them. An API should be exposed to reset the session (clean session after next restart or disconnect & reconnect w/ new session, then unsubscribe and re subscribe to any topics) in case of unforeseen issues. |
Keep in mind that the ESP8266 must subscribe ar QoS 2 for this to work and the publishing client must do the same. |
Which stale subs? Do you have an example? There's also a problem with the AsyncMqttClient, it is not spec compliant when session is enabled (see the docs of the client, in the limitations page). |
if esp8266 w firmware 1.0 subscribes to topic A then you upgrade to firmware 2.0 which subscribes to topic B instead, it will still be subscribed to A if the session is persisted |
(subscribed to both) |
So it'll have to unsubscribe just before disconnecting. Don't we have to unsubscribe to all topics if we use a persistent connection? Otherwise a lot of stale subscriptions are potentially left behind. |
I'm still learning about firmware updates and how they're done today, but I've also been using Homie-OTA. I really like the simplicity of the web interface, but ultimately want to get things automated via Node-RED. Not sure which of the three approaches listed here would be best for that. Thinking out loud, ideally I'd like a way to control firmware updates as either push or pull. I could see a set of devices that I'd want to have automatically check for updated firmware and update themselves (aka automatic pull). There could be an option to have an adjustable update delay - that way if you had many devices they all wouldn't update at once. Some type of random prime number delay or a broadcast flag that tells the other devices to wait while a node is updating. A manual pull would require some type of interaction with a device to engage an update. For example, a critical device may not update until a scheduled maintenance window. The other option, a push, would be triggered by the server. This would effectively force devices to update. The use case I'm thinking of would be for a security update - you'd want all the devices to update ASAP. After writing this, I imagine much of this could be coded into the device firmware, or would have to be since there's no "server" to speak of. Unless you're using a setup like Homie-OTA or maybe Node-RED. Really enjoying the Homie project! I've been using Particle devices, which are great, but I appreciate the Homie approach where everything is "in-house". |
@boneskull @bertmelis to avoid this added complexity of stale subscriptions, and the fact that persistent sessions are not spec compliant on the async-mqtt-client (and will never be as we would have to store entire messages in memory...), let's go with the retained approach? @Raiderj the most straightforward way is the one we're going to implement (@boneskull's one). |
I'll forward to it! Shall we have the ESP publish an empty message to clear the retained firmware after flashing? |
@bertmelis it is already implemented on my side, and I refactored a lot of code so it is now cleaner. I'll soon push. 😉 |
I haven't been following this discussion in detail but I would be careful about any approach that depends on QOS 2 since some mqtt brokers and clients don't implement it. |
Chiming in with a different subject concerning OTA. I'd like to hook into the OTA progress, like it is possible with ArduinoOTA.onProgress(...), to show the progress on an attached OLED display. |
A remark tough, as this is a breaking change, you need to upgrade your nodes with care, as you're unable to distinguish the different OTA-procedures based on the Homie-version. |
Actually, that's why the v2 is not yet considered final and released as
such, because the API/procedures might still change.
2017-06-12 13:27 GMT+02:00 Bert Melis <notifications@github.com>:
… A remark tough, as this is a breaking change, you need to upgrade your
nodes with care, as you're unable to distinguish the different
OTA-procedures based on the Homie-version.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#346 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AA8eNSo0rMOdu81Da_ANy764Pb5Qgq-Lks5sDSCLgaJpZM4NtYRn>
.
|
@marvinroger That sounds nice! |
* 🔥 Remove hardcoded keepalive - fix homieiot#301 (homieiot#314) Remove hardcoded KeepAlive for MQTT connection. Default value of 15sec is already present in AsyncMQTT library * 🐎 Improve uptime accuracy (homieiot#315) improve uptime accuracy by storing milliseconds and only rounding when publishing value drawback: less time before rollover, but still long enough * 🐛 Fix truncated IP (homieiot#318) published IP is truncated. * 🎨 Fix warning with parenthesis * 🐛 Fix pio library.json bad dependency name * ✨ Add support for static IP and BSSID/MAC, channel of AP (homieiot#327) * Support for static IP and BSSID/MAC, Channel of AP - added parameters to config.json which allow to define static ip, mask, gateway, dns - added parameters to config.json which allow to define BSSID and channel of AP To run device with defined static IP you have to define ip, mask and gateway together. To point device to connect to specific BSSID and channel you haveto define bssid and channel together. ``` { "name": "The kitchen light", "device_id": "kitchen-light", "wifi": { "ssid": "Network_1", "password": "I'm a Wi-Fi password!", "bssid": "DE:AD:BE:EF:BA:BE", "channel": 1, "ip": "192.168.1.5", "mask": "255.255.255.0", "gw": "192.168.1.1" }, "mqtt": { "host": "192.168.1.10", "port": 1883, "base_topic": "devices/", "auth": true, "username": "user", "password": "pass" i }, "ota": { "enabled": true }, "settings": { "param1": 55, "param2": "abcdefghijklm", "param3": true, "param4": false, "param5": 2147483647, "param6": -2147483647, "param7": 55, "param8": "abcdefghijklm", "param9": true, "param10": false } } ``` * Addjustments for travis * Another addjustments for travis * Fix problem with prepareToSleep * 🐎 Pass all callbacks by reference Still store it by value * 🐎 Make isActive const * 🎨 Add custom settings value to initial log * ✨ Abort if default setting value does not pass validator function Fix homieiot#324 * 📝 Clarify ISSUE_TEMPLATE docs location for homieiot#331 * 📝 Implement new versioned Git docs (homieiot#341) * ✨ Add versionned in-repo docs * 🐛 Attempt to fix encrypted key * 🐛 Fix permission issue on python exec * 🐛 Add mission import * 🐛 Fix relative path * 🐛 Add missing commit * ✨ Add index * 🎨 Change index design * 📝 Update all URLs to new docs * 🎨 Fix broken doc link in README * 📝 Add edit link to docs * 📝 Move firmware_parser.py to scripts folder * 🐛 Fix out of limits abort message not showing * 👕 Be less strict on whitespace in comments * 📝 Add dummy OTA updater script (TODO) * 🎨 🐎 Use CircleCi instead of Travis CI (homieiot#348) * ✨ Add Circle CI build * 🐛 Add working_directory * 🐛 Attempt to fix perm issue * ✨ Generate docs from CircleCI * 🐛 Fix path * 🐛 Fix chmod permission * 🐛 Try to run sudo * 🐛 Attempt to use remote docker * 🐛 Change bad directories * 🐛 Remove dependency on Docker * 🐛 Fix perm problem * 🐛 Chamge tmp folder * 🐎 Use CircleCI instead of Travis CI * 🎨 Add SonoffDualShutters example * 🐛 Install current lib to platformio * 🐛 Actually ignore gh-pages in CI * 🐛 Add missing SonoffDual dep * 🐛 Fix ArduinoJSON 5.11.0 (homieiot#363) * ⬆️ Update pio dependencies * 🎨 Cleanup code a bit * 🐛 Make sure every announcements packet are sent - closes homieiot#345 * 🎨 Refactor code and implement new OTA system Closes homieiot#346 * 👕 Fix lint * ✅ Use new workflow feature from Circle * ✅ Ignore gh-pages at workflow level * ✨ Add OTA_PROGRESS event * 📝 Update docs for new OTA system * ⬆️ Upgrade AsyncMqttClient dependency to 0.8.0 * 🐛 IHomieSetting::settings first, HomieNode-settings second! (homieiot#335) * IHomieSetting::settings first, HomieNode-settings second! Set initialization priority of IHomieSetting::settings to value of highest allowed priority (101). This allows other static variables to be of type HomieSetting<T>. See https://github.com/euphi/HomieNodeCollection/blob/master/src/RGBWNode.cpp for example. Note: As shown in the example it makes sense to have a static HomieSetting member, if you have a class that may be instantiated multiple times. (e.g. two LED Strips connected to the same ESP8266, as shown in https://github.com/euphi/ESP-LEDCtrl). * Removed extra whitespaces * Removed another whitespace * 🐛 Fix not returning a value in setConfigurationApPassword (homieiot#378) `HomieClass& HomieClass::setConfigurationApPassword(const char* password)` did not returned a reference to the Homie instance. This PR fixes this. * 🐛 Fix crash when starting up without any defined node (homieiot#379) * Fix crash when starting up without any defined node Skip node publication if HomieNode::nodes.size() == 0 * Update BootNormal.cpp * 🐛 Fix topic check for OTA upload (homieiot#375) * Change topic check for OTA upload * Change firmware topics to remove the 's' * 🐛 Rename last OTA topic instance * ✨ Add OTA updater script (homieiot#384) * 🐍Add python ota updater script * 💼 Update documentation of ota update script * 😑 Add comments to ota updater script * 🔮 Use 127.0.0.1:1883 as default broker setting For the ota updater script * 📝 Add details on how to interact with range property (homieiot#393) Add hint to help people figure out how to interact with range properties. Especially document the `_` separator. * 📝 Add warning to input-handlers.md about concurrency (homieiot#400) * 🐛 Interpret firmware file as an bytearray (homieiot#403) Fix homieiot#397 * 📝 Update input-handlers.md (homieiot#401) * Update input-handlers.md Sorry, it seems that I used the `!!! warning` block in a wrong way. * Update input-handlers.md * Update input-handlers.md * ✨ Use AsyncWebServer + Refactoring (homieiot#425) * Initial AsyncWebServer * Fixed Proccessing Body Requests (JSON) * Doc Fixes + typo * Added Missing Method in Timer.cpp * Lots of Refactoring + Moved Reset Button to its own helper class for boots to use * 🎨 Update BootNormal.cpp (homieiot#426) Solve "else" errors from https://circleci.com/gh/marvinroger/homie-esp8266/136#tests/containers/0 * ⬆️ Update dependency to "ESP Async WebServer" (homieiot#434) * 📝 Add instruction for @platformio (homieiot#435) * Instruction for @platformio * Explain how to used tagged version with @platformio * Explain how to use tagged version with @platformio * 🐛 Use v2.0.0-beta.2 as a working tagged example for @platformio (homieiot#437) * 🎨 Simplify CI with @platformio (homieiot#438) * 🐛 Install library via @platformio with all dependencies (homieiot#439) * 💚 CI: Install staging version of Arduino Core for ESP8266 & @platformio (homieiot#440) * 🐛 Pin Shutters dep version * 🔥 Don't fail on CI docs step when testing a fork * 🎨 More Refactoring + Deep Sleep + Prevent WiFi Reconnect when reboot (homieiot#432) * Initial AsyncWebServer * Fixed Proccessing Body Requests (JSON) * Doc Fixes + typo * Added Missing Method in Timer.cpp * Lots of Refactoring + Moved Reset Button to its own helper class for boots to use * Lots of Refactoring. * Refactored ResetHandler + More Refactoring + Testing * Added deep sleep function + Prevent Wifi reconnect before reboot (homieiot#380)[homieiot#380] * Small typo fixes * Small Rearange of code in BootConfig + Minor Refactor of function names * Minior Commit to Triger a Github Action * circleci build fix * 🎨 First pass to fix linting * 🎨 Second pass of lint * 🎨 Final lint fix * 📝 Update docs deps * 📝 🎨 Update docs manifest * 📝 🎨 Adjust HTTP JSON API doc * 📝 Add configurators on website * 📝 Update links to configurator * Fix links under Features (homieiot#452) * Set Device Stats Interval (homieiot#451) (homieiot#455) * Add last step to uibundle README (homieiot#460) * Add last step to uibundle README * Added Arduino Support for doc * Update updater script addressing quirks (homieiot#461) * Update API for /wifi/connect from GET to PUT (homieiot#468) Docs show `/wifi/connect` as `GET` when it should actually be `PUT` * Proposal to optionally run HomieNode::loop() also in disconnected state * Show Homie version * Fix homieiot#446 homieiot#477 (homieiot#501) * Fix homieiot#446 CORS Issue * Fix for homieiot#477 * Fix Lint * Fix Safari not displaying the config bundle HTML page (Fix homieiot#476) (homieiot#502) * Fix Safari not displaying the config bundle HTML page Safari cannot deal with gzip files that have a "*.gz" file extension. Simply faking the filename solves the problem though. * Update Readme Homie Version * Fix Warnings (homieiot#503) * Update Readme - Homie Convention * Update to Homie Convention v2.0.1 (homieiot#507)
There are 2 OTA proposals, so I guess there's some interest in changing the OTA design. As discussed with some of you, MQTT is not meant to handle RPC style communication. The current implementation does RPC.
Please understand that the v2.0.0 is not yet final, so it's "legit" to change the way OTA works before final release.
Here are the choices:
1. As @brodtm proposed (simplified a bit, see #344):
homie/$esp8266_firmwares/<fw checksum>
$implementation/ota/target
its target firmware checksumhomie/$esp8266_firmwares/<fw checksum>
and flash the firmwareBenefits
Drawbacks
2. As @boneskull proposed (modified a bit, see #325):
$implementation/ota/firmware/<fw checksum>
its target firmware checksumBenefits
One message to flash a firmware (two if we want to cleanup the firmware after the OTA)
Drawbacks
3. Currently (simplified):
$implementation/ota/checksum
$implementation/ota/firmware
, and the firmware updatesBenefits
No need to store the firmware beforehand on the broker.
Drawbacks
Comparison
Thinking about it, I think I still prefer the current approach... But let me know what you think. You can of course propose something else.
If you don't want to be involved in the discussion (even if you do actually), please vote below:
The text was updated successfully, but these errors were encountered: