HTTP module and SSL don't play nice on dev #1707
I'm not sure if this is the same issue, but I'm unable to make any https requests at all on the 2.0.0 firmware. This worked on the previous version. Any time I try a GET or POST to a https URL I get HTTP client errors. Am I missing something here?
NodeMCU version:
|
I'm quite convinced it is. My table with URLs also contains a Google URL. |
@marcelstoer I'm relieved that I'm not the only one having issues with the |
Sorry Nate, I do pretty much everything around here - except firmware coding 😞 The last substantial contributions to the module were from @pjsg and @luismfonseca.
Well, the Lua veneer for the HTTP library is here: https://github.com/nodemcu/nodemcu-firmware/blob/dev/app/modules/http.c. The library itself is here: https://github.com/nodemcu/nodemcu-firmware/tree/dev/app/http. The last commit that tinkered with HTTPS is a592af7 from @djphoenix but it still looks fine to me. |
@marcelstoer Thanks for the pointers. I looked around in the code and see that SSL support is flagged by the presence of the How does your cloud build service work? Is there some process that un-comments this definition when the SSL/TLS package is enabled? Are we sure that's working? I noticed in the |
Correct.
Yes, with near certainty. I'm quite convinced that I tested with images built from my cloud builder and with manually built images (Docker). I will test again though just to be sure.
Almost. This is triggered from |
I tried building the firmware myself tonight with your docker build project and had the same problems. So it seems there is an actual bug somewhere (not in the build process).
😢 |
I captured some output from debug mode that I wanted to share. My app is doing a https POST to a https-only API,
I'm able to reproduce the problem with a simple GET to https://google.com:
Here you can see that error 0x7880 is "The peer notified us that the connection is going to be closed." I'm not really sure where to look from here. Any ideas? |
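For anyone decoding these numbers by hand, here is a minimal sketch of a lookup for the two mbedTLS SSL error codes that come up in this thread (0x7880 and 0x7200). The helper name `describe_tls_error` and the table are hypothetical, not part of NodeMCU or mbedTLS; the constant names are the ones mbedTLS uses for these values.

```lua
-- Hypothetical helper, not part of NodeMCU or mbedTLS.
-- Maps the two mbedTLS SSL error codes quoted in this thread to their names.
local MBEDTLS_SSL_ERRORS = {
  [0x7880] = "MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY", -- peer announced it will close the connection
  [0x7200] = "MBEDTLS_ERR_SSL_INVALID_RECORD",    -- an invalid SSL record was received
}

-- mbedTLS returns negative values (e.g. -0x7880), while logs often show them
-- without the sign, so the lookup is done on the absolute value.
function describe_tls_error(code)
  if code < 0 then code = -code end
  local name = MBEDTLS_SSL_ERRORS[code]
  return name or string.format("unknown (0x%04X)", code)
end

print(describe_tls_error(-0x7880))
```

Other codes fall through to the "unknown" branch; the table would need to be extended from the mbedTLS headers to cover them.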
Is there documentation somewhere about the espconn error codes that are passed to http_error_callback? |
I'm afraid this isn't an appropriate place to ask this question, but I'm getting desperate. |
@Kaiser442 The simplest thing to do to get an older firmware is use @marcelstoer's Docker build container: https://hub.docker.com/r/marcelstoer/nodemcu-build/ You can check out whatever branch or tag of the firmware that you want from git and then edit the |
The I'm completely not in ESP{8266,32}-land at the moment, but if someone has traces I can probably find some to have a quick look. @marcelstoer espconn error codes can be found in Another idea would be for someone to play with |
@jmattsson there are debugging flags in |
@djphoenix Have you found something interesting? |
This is a key issue for us. Can anyone help resolve this? I don't know if this is appropriate, but would a bounty help improve the priority of getting this fixed? |
I have been using the frozen 1.5.4.1 branch for now due to this problem. Even sending REST api calls via TCP connections will fail (in many cases) using this branch. So there isn't really an alternative. Both TCP and HTTP connections don't work well for secured connections (it's a hit & miss depending on the server you connect to) |
I would contribute $$ to a bounty to get this fixed. I'm also stuck on the 1.5.4.1 branch for my project due to this issue. |
I will put up a $500 bounty to get this fixed. This is really important to us. I have to believe that it is a serious issue for anyone who needs to send secure data. |
@heythisisnate will you add something to the bounty to see if we can get this issue resolved? |
Wow @Jonathan411 that's quite a generous bounty! I can't afford that much right now, but I'd be willing to chip in 0.035 BTC (~$40 USD) to the developer that gets a PR merged that fixes this issue. |
Hey guys, I will also contribute $50 USD to the bounty for anyone who can solve this issue, though I am hoping that maybe the next SDK will address it. Refer to issue #1810. |
@heythisisnate , @dtran123 thanks for the bounty support, every bit helps! |
Hi, I've looked into the issue a little bit because I was experiencing similar errors. In fact I couldn't use the tls module in a consistent fashion. I tried the dev branch of the firmware mostly (with some tests with 'master' branch and some tests with the 2.0.0.0 tag from February) and I used the docker build process. First of all, to do a decent SSL certificate chain verification, I needed the correct time. This requires the rtctime, rtcmem and sntp modules. After compiling and flashing, the below commands can set up time (with a 1000 second recurring sync according to the documentation):
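The commands from the original comment were not preserved in this copy of the thread. What follows is only a rough sketch of what such a setup might look like, assuming the documented `sntp.sync(server, callback, errcallback, autorepeat)` signature and a firmware built with the sntp and rtctime modules. The wrapper `start_time_sync` is hypothetical, and the sntp module is passed in as an argument so the flow can also be tried with a stub off-device.

```lua
-- Sketch only; not the original poster's code.
-- `start_time_sync` is a hypothetical wrapper around sntp.sync.
-- `sntp_mod` is expected to behave like NodeMCU's sntp module.
function start_time_sync(sntp_mod, server, on_synced)
  sntp_mod.sync(server,
    function(sec, usec, srv)              -- called after a successful sync
      print("Seconds: " .. sec)
      if on_synced then on_synced(sec) end
    end,
    function(err)                         -- called when the sync fails
      print("SNTP sync failed: " .. tostring(err))
    end,
    1)                                    -- non-nil autorepeat: keep re-syncing periodically
end

-- On a device this might be called as (server left nil so the module picks one):
-- start_time_sync(sntp, nil)
```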
After running this command, you should see the "Seconds: ..." line filled with the current GMT time in epoch format. The "rtctime.get()" command should give the same result. (Use epochconverter.com to convert to a readable format.)
After this, I tried to add the certificate authorities to the trusted certificates list using tls.cert.verify, like this: tls.cert.verify([[
Note that the documentation states that multiple CA certificates can be added by comma-separated strings. This might be essential because I didn't know how mbedtls validates the certificates. (Here's an interesting entry about this here.)
Afterwards you can use the tls module to open a secure connection. I hooked the "receive" and "connection" events, but none of them got called because the connection attempt fails at the SSL handshake. Namely the
At this point I simply moved over to MicroPython, to try to do SSL validation there. I can't believe there's no working example on the Internet about this critical feature (provided you want to develop something that hooks to a network). Off the top of my head, I might need to run the floating point firmware instead of the integer one, but it's a weak argument at this point. (It should be noted in the docs if this was the case.) If this sparks any ideas for anyone, please share.
EDIT1: Notably, I'm interested in whether you have to add all intermediate certificates to the TLS trusted certificates store, or whether mbedtls will go through the chain received from the server so that storing the root CA is enough.
EDIT2: So my current suspect is the mbedtls library itself. The http module might have issues too, but unless mbedtls is consistent, there's no point in trying to fix the http module.
Regards, |
@greg-szabo I agree with your EDIT2. that there is no point to look into http issue before we address secured TLS tcp socket connections. I have many projects that work well (socket connection is setup successfully) with branch 1.5.4.1 (based on previous SDK) but same code will not work on master or dev branches. |
I finally had a chance to try this again. I tested bumping the Here's a very simple example (tested both on
As you can see, after issuing the I'm able to reproduce the same behavior with several other API hosts (not just github.com). Anyone have an idea of what this |
Well, it originates from httclient's Out of curiosity I ported the espconn apps for lwip and mbedtls from SDK's third_party in master. @heythisisnate's testcase passes smoothly now:
That might be a good indication - I haven't done any further tests, though. |
I did some testing with the examples mentioned in this issue. There appears to be no regression for |
Can anyone else reproduce this failure? A
I can't figure out what changed. Is it just me? |
@heythisisnate What version, exactly, are you testing against? If you're able to produce your own images, debug information would likely be illuminating. |
@nwf I'm on branch 1.5.4.1-final. I rebuilt the fw with debug enabled. Here's the error:
It looks like |
On the current dev build it is working though:
|
@heythisisnate If you're experiencing issues with |
@FrankX0 it is kinda working on the
|
@heythisisnate IIRC then I tried your command on #2269 and the |
Ok awesome. I'm cheering for #2269 to be merged soon then so we can put this issue to rest. As of now my program that uses the Github API isn't working properly on either 1.5.4.1 or dev 😢 |
If the browser accesses then the response comes quickly. If the request from NodeMCU
then HTTP client: hostname=api.telegram.org How to search for this error? |
Just a thought, but looking at the code, it does a DNS resolve in the loop, so to speak, and if this takes too long then your client might time out. However, the espconn stack caches the last 4 resolutions, so have you tried doing a net.dns.resolve('api.telegram.org', funcToDoTheGet)? That way the DNS name is resolved and cached before you start the HTTP dialogue. |
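The suggestion above could be sketched roughly as follows. This is an illustration under assumptions, not tested on a device: `get_after_resolve` is a hypothetical helper, `dns` is expected to behave like `net.dns` (callback receives a socket argument and the IP, with a nil IP on failure), and `httpc` like the `http` module (callback receives the status code and body, with a negative code on error). Both are passed in so the flow can be exercised with stubs.

```lua
-- Sketch only: `get_after_resolve` is a hypothetical helper, not a NodeMCU API.
-- `dns` stands in for net.dns and `httpc` for the http module.
function get_after_resolve(host, path, dns, httpc, on_done)
  dns.resolve(host, function(_, ip)
    if not ip then
      on_done(nil, "DNS resolution failed for " .. host)
      return
    end
    -- The name has been resolved at this point, so the HTTP request
    -- should no longer have to wait on the DNS lookup.
    httpc.get("https://" .. host .. path, nil, function(code, body)
      if code < 0 then
        on_done(nil, "HTTP request failed")
      else
        on_done(code, body)
      end
    end)
  end)
end

-- On a device this might be called as:
-- get_after_resolve("api.telegram.org", "/", net.dns, http, print)
```

Note that this only takes DNS latency out of the picture; it does not address failures in the TLS handshake itself.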
@TerryE It does not work this way either. |
@AlexSmok, @heythisisnate : I can confirm that the above example is indeed fixed by #2269.
This will again make things better, but there are still secure websites which cannot be accessed correctly.
This even results in a reset of the device. |
There are a few references in the mbedTLS forums for error -0x7200: https://tls.mbed.org/discussions/generic/0x7200-error, https://tls.mbed.org/discussions/bug-report-issues/handshake-error-an-invalid-ssl-record-was-received-error-7200, https://tls.mbed.org/discussions/generic/0x7200-mbedtls_err_ssl_invalid_record-on-aws-iot-mqtt |
0x7200 is, in particular, MBEDTLS_ERR_SSL_INVALID_RECORD and is the return code when a TLS message is longer than the buffer allocated. See the chatter above around MBEDTLS_SSL_MAX_CONTENT_LEN . It would be great if you could crank up the SSL debug level as well, though, to confirm. The reset of the device seems more likely a bug in the http module than anything SSL-specific. The 8192 and 0 chunk sizes sure seem suspicious. |
Strange. If I call the
code before the HTTP server starts (immediately after receiving the IP address), then the code works like this:
If I call it after the HTTP server has started, I get a timeout. UPD: I ran it with tmr.create():alarm(7000, tmr.ALARM_AUTO, telegram) and watched. I did not notice any dependence. The code either works many times in a row or times out. |
@nwf: the 0x7200 error when accessing nodemcu-build.com is indeed caused by the SSL buffer being too small. Increasing it to at least 8192 results in (without resetting):
But I guess this shows that the limited amount of RAM is too small for this website (sorry @marcelstoer). Maybe now is the time to close this topic and merge SDK 2.2? One addition. |
I have no idea how to pin the maximum version... I would have guessed that version negotiation would have done the right thing here. This suggests something broken about mbedTLS, espressif's use of it, or the remote server. What is the remote, speaking of? It may be useful to point https://www.ssllabs.com/ssltest/ at it and see what it says? Please open a separate bug for the device reset induced by the http client. I would be in favor of this particular bug being closed; I think it has outlived its usefulness. |
The mbed TLS developers confirmed that the issue I found (minor version mismatch) is due to a problem in the client. So my assumption is that the received data is somehow malformed. |
Frank the LFS patch will be out in a week or so, after our next staging of It's either that or lots of |
@FrankX0 |
Some issues are fixed with SDK 2.2, some can't be fixed on this platform, and for all others there should be dedicated issues -> closing
When I happened to see another (or a new) HTTP & SSL issue the other day I went "Grrr, why me again?". Some of you may remember 😉
TL;DR
Many HTTPS requests from the http module fail while connecting to the same resources with the net/TLS module usually succeeds.
Test code
Test result
Then I checked heap: 25312. That was suspiciously low, I started with ~44k, so I ran test("raw.githubusercontent.com", "/espressif/esptool/master/MANIFEST.in") again and got -> no successful feedback from net/TLS code anymore.
Does "HTTP client: Disconnected with error: 46" indicate that the client was still maintaining the previous (failed) connection which it tried to kill first? I have my doubts because I sometimes also see this when the test runs after a clean reboot.
I tested a few more URLs, each after a clean reboot with both http and net modules.
NodeMCU version
Hardware
NodeMCU devkit v2