Error in BearSSL when attempting to connect to secure MQTT broker #7801
Comments
Can you please enable full debugging in the IDE and get the logs? That often gives a better idea about handshake difficulties and such. There were also 3.0.0 changes to fix reporting of some errors that you may profit from, so consider trying the master, too. Your crash is pretty obviously a nullptr deref inside MQTTClient (the execVADDR = 0x0000 + some offset). You would need to check the library code for that, it's not part of our stuff. Could be an OOM error on a |
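To illustrate the "0x0000 + some offset" reasoning above, here is a minimal sketch that flags a faulting address as a probable null-pointer dereference. The helper name `looksLikeNullDeref` and the 4 KiB threshold are assumptions made for illustration; they are not part of the ESP8266 core or the exception decoder.

```cpp
#include <cstdint>

// Assumption: any faulting address below this limit is treated as
// "null pointer plus a small struct/array offset". The 4 KiB value is
// an illustrative choice, not something defined by the core.
constexpr std::uint32_t kNullPageLimit = 0x1000;

// Hypothetical helper: returns true when the faulting address reported in
// the exception dump looks like a member access through a null pointer.
bool looksLikeNullDeref(std::uint32_t faultAddr) {
    return faultAddr < kNullPageLimit;
}
```

A faulting address such as 0x00000004 would be flagged, while an address in normal data RAM would not. It is only a heuristic for reading crash dumps.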
On the logs I will give it a go and post them. Thanks for the pointers. |
Removing JSON didn't change anything except it loads a lot quicker. |
IDE Build Log
|
Failing output log with all debug enabled. This log has an example of failure (the exception is decoded above).
|
Successful output log. This is an example of when it works with full debug info. Embedded in the log below is this line:
which is the result of an MQTT message sent via AWS IoT Core.
|
Sorry, but that doesn't look like it has BearSSL debugging enabled. The certificates sent and received (they're public certs and no private cert/key dumped so should be safe to post) as well as more handshake info is generally dumped. Make sure you're using the 2nd to last setting (longest one) as that also catches malloc fails and full logs elsewhere. For example:
|
Quickly looking at the code, it's related to the BearSSL x509 context info where the x509 cert is null. W/o code, can't really say how that could happen. If you can carve out an MCVE w/o the 3rd party libs it would be helpful. W/o it, you're going to need to be self-sufficient, as it's using a private client cert, to a private AWS instance, using 3rd party code, which means nobody but you can repro your runs. :( |
I'm also using the Basic SSL package (Low Memory) |
Yeah, I hear you :( When you say "third party libs", which ones are you talking about, so I am clear about what I need to do? I'll give it a try. Thanks for the encouragement. |
The JSON lib and MQTT libs aren't part of the core. Something simple and minimal that does an SSL connection attempt and gets to your failed state is what we'd need. That said, I still think you're just out of memory and not catching it properly somewhere or in a library. It's not stack usage related: the present amount is fine for every case we know of, your reported usage is small, and when an overflow occurs it is reported as an SSL stack overflow via canaries or via massive memory corruption (not nulls). |
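As a rough illustration of "not catching it properly", here is a minimal sketch of the pattern a library needs in order to survive an out-of-memory condition: check the allocation result and report failure instead of dereferencing a null pointer later. `CertBuffer`, `copyCert`, and `freeCert` are hypothetical names invented for this example; they are not BearSSL or MQTT library APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical holder for a heap-allocated copy of some certificate bytes.
struct CertBuffer {
    std::uint8_t* data = nullptr;
    std::size_t len = 0;
};

// Copies `len` bytes from `src` into a freshly allocated buffer.
// Returns false and leaves `out` empty if the input is invalid or the
// allocation fails, so the caller never sees a half-initialized buffer.
bool copyCert(const std::uint8_t* src, std::size_t len, CertBuffer& out) {
    out.data = nullptr;
    out.len = 0;
    if (src == nullptr || len == 0) {
        return false;
    }
    auto* p = static_cast<std::uint8_t*>(std::malloc(len));
    if (p == nullptr) {
        return false;  // out of memory: report it, do not dereference
    }
    std::memcpy(p, src, len);
    out.data = p;
    out.len = len;
    return true;
}

// Releases the buffer and resets it to the empty state.
void freeCert(CertBuffer& buf) {
    std::free(buf.data);
    buf.data = nullptr;
    buf.len = 0;
}
```

Code that skips the `p == nullptr` check tends to fail later with a load from address zero plus a small offset, which matches the kind of crash described in this thread.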
I woke up with a similar thought. What if that transaction is processing things too slowly? The evidence is this:
And if I had turned on timestamps, you would see that this timeout is always 15 seconds. What if that transaction sometimes takes less than 15 seconds and succeeds, but most of the time it doesn't because of inadequate hardware performance? No, I like your explanation better: a low memory issue sends program execution out into the sticks (exception) and it can't get back in time. Okay, the logs from above were without the JSON lib. Next I'll try pulling the MQTT lib to get at the raw SSL connection. Thanks again. |
After a few days and a broken leg I still haven't figured this out. I tried to decipher how to test BearSSL without the MQTT library, and after looking at openssl I can't figure that out. Is there an example that anyone can point me to? The other complication is doing it without a network connection. |
Sorry to hear about the leg. I guess on the bright side, more time inside to hack? FWIW, there was a very minor change to catch a malloc() failure while reading an EC private key. Most keys are RSA, so it may not be related at all, but see #7823. Root problem would still be OOM, and the old code would crash in the EC key reader immediately anyway. For MQTT testing, you can set up a local mosquitto instance (that's how I debugged the client cert code while writing the BearSSL stuff) and come close to simulating AWS...while being able to see both sides of the conversation w/Wireshark and having both encryption keys. |
@s-hadinger has modified/optimized bearssl (lower footprint) so it can be used with Tasmota and AWS IoT. It works stable :-) |
Doc is here, I've been using for 18 months and it's super stable. https://tasmota.github.io/docs/AWS-IoT/ Implementation details can be found here |
While I have reservations about Tasmota dynamically allocating the BSSL stack (heap fragmentation could easily cause a failure, in my experience, when trying to get a 6kb block), the other stuff looks interesting. If you or the Tasmota team would be interested in adding that option to the IDE menus (like the "minimal codes" option already) I think it would be a good addition! |
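The fragmentation concern can be made concrete with a small model: an allocation needs one contiguous free block, so total free heap is not the number that matters. This is a minimal sketch under stated assumptions: `HeapSnapshot` and its helpers are hypothetical, the block sizes are invented, and "6kb" is taken to mean 6144 bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model of a heap: just the sizes (in bytes) of its free regions.
struct HeapSnapshot {
    std::vector<std::size_t> freeBlocks;
};

// Sum of all free regions, i.e. the "free heap" figure.
std::size_t totalFree(const HeapSnapshot& h) {
    return std::accumulate(h.freeBlocks.begin(), h.freeBlocks.end(),
                           std::size_t{0});
}

// Size of the largest single free region (0 if there are none).
std::size_t largestFree(const HeapSnapshot& h) {
    if (h.freeBlocks.empty()) {
        return 0;
    }
    return *std::max_element(h.freeBlocks.begin(), h.freeBlocks.end());
}

// A request can only succeed if one contiguous region is large enough.
bool canAllocate(const HeapSnapshot& h, std::size_t request) {
    return largestFree(h) >= request;
}
```

With free regions of 4096, 4096, 4096 and 2048 bytes, 14336 bytes are free in total, yet a 6144-byte request fails because the largest region is 4096 bytes. On the device, `ESP.getFreeHeap()` and `ESP.getMaxFreeBlockSize()` should give the two corresponding figures.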
I understand your reservation, but RAM is so precious that I need to spare it. Based on experience, TLS handshakes are rare with MQTT, while they would be common with HTTPS. This makes it sustainable on Tasmota and actually does not hurt fragmentation. |
We don't use the Arduino IDE, we use PlatformIO. I think I can speak for the whole Tasmota team: we don't know how to add stuff to the Arduino IDE. Sorry, we can't contribute. |
Okay this works but... It is not stable but... The while loop in 'connectToMqtt' will generally run until it connects. And then I can control my robot over the internet. I will pull the latest and see if the fixes above make it more stable.
Here are the other files.
|
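For a loop like the one in 'connectToMqtt' that runs until it connects, a bounded retry with backoff is one way to avoid spinning forever when the handshake keeps failing. This is a minimal sketch, not the code from the sketch files: `connectWithRetry`, `RetryResult`, and the delay values are assumptions, and the connect and wait steps are passed in as callbacks so the logic can be tried off-device.

```cpp
#include <algorithm>
#include <functional>

// Outcome of the retry loop: whether a connection was made and how many
// attempts were actually performed.
struct RetryResult {
    bool connected;
    int attempts;
};

// Calls `tryConnect` up to `maxAttempts` times. Between failed attempts it
// calls `waitMs` with a delay that doubles each time, capped at `maxDelayMs`.
// The default delays are illustrative assumptions.
RetryResult connectWithRetry(const std::function<bool()>& tryConnect,
                             int maxAttempts,
                             const std::function<void(int)>& waitMs,
                             int initialDelayMs = 500,
                             int maxDelayMs = 8000) {
    int delayMs = initialDelayMs;
    int attempts = 0;
    while (attempts < maxAttempts) {
        ++attempts;
        if (tryConnect()) {
            return {true, attempts};
        }
        if (attempts < maxAttempts) {
            waitMs(delayMs);  // no wait after the final failed attempt
            delayMs = std::min(delayMs * 2, maxDelayMs);
        }
    }
    return {false, attempts};
}
```

On the device, `tryConnect` would wrap the MQTT client's connect call and `waitMs` would wrap `delay()`. After the final failure the caller can log the free heap or restart instead of retrying indefinitely.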
In case it helps, I'm getting a similar crash in a RESTful client with much more memory available (typically > 28K heap free, >20K largest available block) which seems to point away from the MQTT client or low memory. However, I am using an EC certificate. Target: ESP12E, Core:3.1.1 SDK:2.2.2-dev(38a443e) LWIP:2.1.3 The crashes (which are intermittent), always seem to occur around the same place; see a few samples below: It looks like the exceptions are happening during the SSL connection phase. The first attempt to contact the server is an HTTPS GET request for current time (the GET time... debug output). The crash appears to happen before the http.GET() returns. Note that most of the time this code works; the exceptions are intermittent. More of the traces:
|
Basic Infos
Platform
Hardware: ESP-12
Core Version: SDK:2.2.2-dev(38a443e)/Core:2.7.4=20704000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-30-g92add50/BearSSL:5c771be and with straight from master.
Development Env: Arduino IDE
Operating System: Ubuntu & MacOS
Settings in IDE
Problem Description
Attempting to connect to AWS IoT. The MQTT client is rarely successful; most of the time it errors out in BearSSL. I know the AWS IoT setup works, as the same credentials are used on other platforms and, as I said, it sometimes works. The same setup is used on the ESP8266 with occasional success. When it succeeds, it works as expected. When it fails, the WiFi connects, SNTP gets the time, the MQTT client gets an IP address from AWS, then attempts to connect and fails, which causes an exception. The exception is either a 28 or a 9 depending on the configuration, but it is always associated with the same line of code (br_ssl_hs_client_run at src/ssl/ssl_hs_client.c line 1871).
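For readers unfamiliar with the exception numbers, here is a small lookup sketch. The names follow the Xtensa exception causes as listed in the ESP8266 Arduino core's exception documentation. Only a handful of common causes are included, and the function name `exceptionCauseName` is an assumption made for this example.

```cpp
#include <string>

// Maps a few common ESP8266 exception cause numbers to short names.
// Causes not listed here are reported as "Unknown".
std::string exceptionCauseName(int cause) {
    switch (cause) {
        case 0:  return "IllegalInstruction";
        case 3:  return "LoadStoreError";
        case 6:  return "IntegerDivideByZero";
        case 9:  return "LoadStoreAlignment";   // unaligned load or store
        case 28: return "LoadProhibited";       // load from an invalid address
        case 29: return "StoreProhibited";      // store to an invalid address
        default: return "Unknown";
    }
}
```

Both 28 and 9 are load-related faults, which is consistent with reading through a bad or null pointer at that line.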
This version uses the MQTT client, but the PubSubClient has the same behavior, so I don't think it has to do with these libraries.
I have built this on both my Mac PowerBook and Ubuntu 18.04 to the same effect.
I have attempted to use GDB with some success but tracing this problem usually results in a segfault on target and then GDB exits abruptly. This eliminates the ability to do a backtrace.
I have looked at the max stack usage and this doesn't look like the problem. One interesting thing is that when it fails the max has like three different levels. When successful it is always the same value.
When it fails it looks like this:
When successful it is always:
When successful the code will print out messages sent via the AWS IOT Core test client as would be expected.
What I show is with 2.7.4, but I have pulled master and it has very similar behavior, except that it doesn't print out the exception stack, so I didn't include that.
MCVE Sketch
Debug Messages
Decoded Exception