Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP module and SSL don't play nice on dev #1707

Closed
marcelstoer opened this issue Jan 6, 2017 · 127 comments
Closed

HTTP module and SSL don't play nice on dev #1707

marcelstoer opened this issue Jan 6, 2017 · 127 comments
Labels

Comments

@marcelstoer
Copy link
Member

marcelstoer commented Jan 6, 2017

When I happened to see another (or a new) HTTP & SSL issue the other day I went "Grrr, why me again?". Some of you may remember 😉

TL;DR

Many HTTPS requests from the http module fail while connecting to the same resources with the net/TLS module usually succeeds.

Test code

function test(host, path)
  local url = "https://" .. host .. path;
  http.get(url, nil, function(code, data)
    if (code < 0) then
      print("HTTP request to " .. url .. " failed")
    else
      print("HTTP request to " .. url .. " succeeded")
    end
  end)
  local srv = tls.createConnection(net.TCP, 0)
  srv:on("receive", function(sck, c) print("net/TLS to " .. url .. " succeeded") end)
  srv:on("connection", function(sck, c)
    sck:send("GET " .. path .. " HTTP/1.1\r\nHost: " .. host .. "\r\nConnection: keep-alive\r\nAccept: */*\r\n\r\n")
  end)
  srv:connect(443, host)
end
test("raw.githubusercontent.com", "/espressif/esptool/master/MANIFEST.in")

Test result

net/TLS to https://raw.githubusercontent.com/espressif/esptool/master/MANIFEST.in succeeded
HTTP client: Connection timeout

Then I checked heap: 25312. That was suspiciously low, I started with ~44k, so I ran test("raw.githubusercontent.com", "/espressif/esptool/master/MANIFEST.in") again and got

E:M 528
E:M 272
HTTP client: Disconnected with error: 46
HTTP client: Connection timeout
HTTP client: Connection timeout

-> no successful feedback from net/TLS code anymore.

Does "HTTP client: Disconnected with error: 46" indicate that the client was still maintaining the previous (failed) connection which it tried to kill first? I have my doubts because I sometimes also see this when the test runs after a clean reboot.

I tested a few more URLs, each after a clean reboot with both http and net modules.

URL http net/TLS
https://raw.githubusercontent.com/espressif/esptool/master/MANIFEST.in
https://httpbin.org/ip ❌ no output at all, not even error
https://nodemcu-build.com
https://clients5.google.com/pagead/drt/dn/

NodeMCU version

NodeMCU custom build by frightanic.com
	branch: dev
	commit: 5425adefff62f9ea2094e3e4581a79f1424e4433
	SSL: true
	modules: file,gpio,http,net,node,tmr,uart,wifi,tls
 build 	built on: 2017-01-06 19:39
 powered by Lua 5.1.4 on SDK 2.0.0(656edbf)

Hardware

NodeMCU devkit v2

@heythisisnate
Copy link
Contributor

I'm not sure if this is the same issue, but I'm unable to make any https requests at all on the 2.0.0 firmware. This worked on the previous version. Any time I try a GET or POST to a https URL I get HTTP client errors. Am I missing something here?

=http.get('https://google.com')
> HTTP client: Disconnected with error: 8
HTTP client: Connection timeout
HTTP client: Connection timeout

NodeMCU version:

NodeMCU custom build by frightanic.com
	branch: master
	commit: b96e31477ca1e207aa1c0cdc334539b1f7d3a7f0
	SSL: true
	modules: file,gpio,http,net,node,tmr,uart,wifi,tls
 build 	built on: 2017-02-20 23:43
 powered by Lua 5.1.4 on SDK 2.0.0(656edbf)

@marcelstoer
Copy link
Member Author

I'm not sure if this is the same issue

I'm quite convinced it is. My table with URLs also contains a Google URL.

@heythisisnate
Copy link
Contributor

@marcelstoer I'm relieved that I'm not the only one having issues with the http library and HTTPS requests. I have not had any success at all on the 2.0.0 SDK (built from master) but it does work ok on 1.5.4.1. Do you have any indication where the problem may be? I'm happy to help as much as I can but I'm not sure of the bug reporting/fixing process for this project. It looks like you've contributed a ton to NodeMCU development (thanks for this!!) so can you let me know how else I can help get this resolved?

@marcelstoer
Copy link
Member Author

It looks like you've contributed a ton to NodeMCU development (thanks for this!!) so can you let me know how else I can help get this resolved?

Sorry Nate, I do pretty much everything around here - except firmware coding 😞 The last substantial contributions to the module were from @pjsg and @luismfonseca.

I'm happy to help as much as I can but I'm not sure of the bug reporting/fixing process for this project.

Well, the Lua veneer for the HTTP library is here: https://github.com/nodemcu/nodemcu-firmware/blob/dev/app/modules/http.c. The library itself is here: https://github.com/nodemcu/nodemcu-firmware/tree/dev/app/http. The last commit that tinkered with HTTPS is a592af7 from @djphoenix but it still looks fine to me.
If you want to dig into code and start building firmware have a look at http://nodemcu.readthedocs.io/en/latest/en/build/ (Linux/Linux VM or Docker from yours truly).

@heythisisnate
Copy link
Contributor

@marcelstoer Thanks for the pointers. I looked around in the code and see that SSL support is flagged by the presence of the CLIENT_SSL_ENABLE constant. It seems like maybe this constant is not getting defined when building but I haven't verified this yet. It looks like the constant is commented out here: https://github.com/nodemcu/nodemcu-firmware/blob/master/app/include/user_config.h#L67

How does your cloud build service work? Is there some process that un-comments this definition when the SSL/TLS package is enabled? Are we sure that's working? I noticed in the tools/pr-build.sh script there's something that uncomments the line, but maybe that's only for dev builds?

@marcelstoer
Copy link
Member Author

Is there some process that un-comments this definition when the SSL/TLS package is enabled?

Correct.

Are we sure that's working?

Yes, with near certainty. I'm quite convinced that I tested with images built from my cloud builder and with manually built images (Docker). I will test again though just to be sure.

I noticed in the tools/pr-build.sh script there's something that uncomments the line, but maybe that's only for dev builds?

Almost. This is triggered from .travis.yml when someone creates a pull request. Having an automatic CI build with all modules and SSL enabled gives us greater confidence that a PR won't break stuff even though we don't have any unit or integration tests.

@heythisisnate
Copy link
Contributor

I tried building the firmware myself tonight with your docker build project and had the same problems. So it seems there is an actual bug somewhere (not in the build process).

even though we don't have any unit or integration tests.

😢

@heythisisnate
Copy link
Contributor

I captured some output from debug mode that I wanted to share. My app is doing a https POST to a https-only API, https://graph-na02-useast1.api.smartthings.com. This is the result every time:

client handshake ok!
client's data invalid protocol
Reason:[-0x7880]
HTTP client: Disconnected with error: 8
HTTP client: Connection timeout

I'm able to reproduce the problem with a simple GET to https://google.com:

=http.get("https://google.com")
> client handshake start.
please start sntp first !
please start sntp first !
client handshake ok!
client's data invalid protocol
Reason:[-0x7880]
HTTP client: Disconnected with error: 8
HTTP client: Connection timeout
HTTP client: Connection timeout

Here you can see that error 0x7880 is "The peer notified us that the connection is going to be closed." I'm not really sure where to look from here. Any ideas?

@marcelstoer
Copy link
Member Author

Is there a documentation somewhere about espconn error codes that are passed to http_error_callback?

@Kaiser442
Copy link

I'm afraid this isn't an appropriate place to ask this question, but I'm getting desperate.
I'm having this same issue - https requests aren't working. I'd like to just downgrade and get a build using 1.5.1 if I can, because the scripts I'm using were developed under that SDK, but I am not sure how to do that. This product was originally made using a build from nodemcu-build.com, but I don't have access to that bin file and I'm not sure how to recreate it. It looks like that site doesn't let you specify anything other than the current master or dev branch. I found a "how to compile NodeMCU" page, but those instructions use Linux (which I have access to if required, but it's not one of my strengths). Help!

@heythisisnate
Copy link
Contributor

@Kaiser442 The simplest thing to do to get an older firmware is use @marcelstoer's Docker build container: https://hub.docker.com/r/marcelstoer/nodemcu-build/

You can check out whatever branch or tag of the firmware that you want from git and then edit the app/include/user_modules.h and app/include/user_config.h files to choose the packages/options you want. Then build the firmware using the Docker image.

@jmattsson
Copy link
Member

The client's data invalid protocol appears to come from the SDKs libssl.a. Does anyone have the ability to get a packet trace of an attempted connection? Ideally in a way that gives you the ephemeral keys so we can look inside the encrypted stream. It might also be an idea to double-check the available cipher suites against what the relevant servers are configured to use/accept.

I'm completely not in ESP{8266,32}-land at the moment, but if someone has traces I can probably find some to have a quick look.

@marcelstoer espconn error codes can be found in sdk/*/include/espconn.h, though they're not very informative. In this case, 8 = connection aborted. The Reason: -0x7880 is from mbedTLS, and is simply the notification from the remote side that it's going to terminate the connection.

Another idea would be for someone to play with openssl s_server in seriously verbose mode, and have NodeMCU connect to it - that might give some good clues.

@djphoenix
Copy link
Contributor

@jmattsson there are debugging flags in user_mbedtls.h so you can enable each log level that you want. Can't explain it carefully right now. So I suspect there are fatal architecture issues in http module because it's "don't play nice" for plain requests too. I can debug it carefully some later (maybe in this weekend).

@Josepdal
Copy link

Josepdal commented Mar 26, 2017

@djphoenix Have you found something interesting?

@Jonathan411
Copy link

This is a key issue for us. Can anyone help resolve this? I don't know if this is appropriate, but would a bounty help improve the priority of getting this fixed?

@dtran123
Copy link

dtran123 commented Apr 7, 2017

I have been using the frozen 1.5.4.1 branch for now due to this problem. Even sending REST api calls via TCP connections will fail (in many cases) using this branch. So there isn't really an alternative. Both TCP and HTTP connections don't work well for secured connections (it's a hit & miss depending on the server you connect to)

@heythisisnate
Copy link
Contributor

I would contribute $$ to a bounty to get this fixed. I'm also stuck on the 1.5.4.1 branch for my project due to this issue.

@Jonathan411
Copy link

I will put up a $500 bounty to get this fixed. This is really important to us. I have to believe that it is a serious issue for anyone who needs to send secure data.

@Jonathan411
Copy link

@heythisisnate will you add something to the bounty to see if we can get this issue resolved?

@heythisisnate
Copy link
Contributor

Wow @Jonathan411 that's quite a generous bounty! I can't afford that much right now, but I'd be willing to chip in 0.035 BTC (~$40 USD) to the developer that gets a PR merged that fixes this issue.

@dtran123
Copy link

dtran123 commented Apr 8, 2017

Hey guys, I will also contribute to the bounty with a 50$ USD for anyone who can solve this issue...though I am hoping that maybe the next SDK will address it. Refer to issue #1810.

@Jonathan411
Copy link

@heythisisnate , @dtran123 thanks for the bounty support, every bit helps!

@greg-szabo
Copy link

greg-szabo commented Apr 16, 2017

Hi,

I've looked into the issue a little bit because I was experiencing similar errors. In fact I couldn't use the tls module in a consistent fashion. I tried the dev branch of the firmware mostly (with some tests with 'master' branch and some tests with the 2.0.0.0 tag from February) and I used the docker build process. DEVELOP_VERSION was set to true I've added some extra os_printf's to the mbedtls_parse_internal function in mbedtls's espconn_mbedtls.c. I also commented out line 88 and 89 in mbedtls's debug.c (this enables verbose debugging for mbedtls, which seems to be badly implemented - or I didn't use MBEDTLS_DEBUG_C properly somehow.). I flashed the integer binary.

First of all, to do a decent SSL certificate chain verification, I needed the correct time. This requires the rtctime, rtcmem and sntp modules. After compiling and flashing, the below commands can set up time (with a 1000 second recurring sync according to the documentation):

sntp.sync("pool.ntp.org",function(sec,us,server,info) print ("Seconds: "..sec.." Server: "..server.." Stratum: "..info.stratum) end, function(errorcode,info) print ("SNTP errorcode: "..errorcode.." Info: "..info) end, true)

After running this command, you should see the "Seconds: ..." line filled with the current GMT time in EPOCH format. The "rtctime.get()" command should give the same result. (Use epochconverter.com to convert to a readable format.)

After this, I tried to add the certificate authorities to the trusted certificates list using tls.cert.verify, like this:

tls.cert.verify([[
-----BEGIN CERTIFICATE-----
MIImyfirstcertificatedata...
-----END CERTIFICATE-----
]],[[
-----BEGIN CERTIFICATE-----
MIImysecondcertificatedata...
-----END CERTIFICATE-----
]])

Note that the documentation states that multiple CA certificates can be added by comma-separated strings. This might be essential because I didn't know how mbedtls is validating the certificates. (Here's an interesting entry about this here.)

Afterwards you can use the tls module to open a secure connection. I hooked the "receive" and "connection" events, but none of them got called because the connection attempt fails at the SSL handshake. Namely the mbedtls_ssl_handshake(&TLSmsg->ssl) request in line 880 in espconn_mbedtls.c (within the mbedtls_parse_internal function) returns with an error message. Before I set up time, I usually received a MBEDTLS_ERR_SSL_INVALID_RECORD (0x7200) and when I realized that time might be essential and set up sntp and the rtc clock, it changed into a MBEDTLS_ERR_SSL_ALLOC_FAILED (0x7F00). So, by enabling all those required modules, I might have run out of memory... Savage...

At this point I simply moved over to MicroPython, to try to do SSL validation there. I can't believe there's no working example on the Internet about this critical feature (provided you want to develop something that hooks to a network).
I might come back to this, because it intrigues me (and MicroPython doesn't have an ntp library for time-keeping or any examples either...)

From the top of my head, I might need to run the floating point firmware instead of the integer, but it's a weak argument at this point. (It should be noted in the docs if this was the case.) If I sprung any ideas for anyone, please share.

EDIT1: Notably, I'm interested if you have to add all intermediate certificates to the TLS trusted certificates store or if mbedtls will go through the chain received from the server and it's enough to store the root CA.

EDIT2: So my current suspect is the mbedtls library itself. The http module might have issues too, but unless mbedtls is consistent, there's no point in trying to fix the http module.

Regards,
Greg

@dtran123
Copy link

@greg-szabo I agree with your EDIT2. that there is no point to look into http issue before we address secured TLS tcp socket connections. I have many projects that work well (socket connection is setup successfully) with branch 1.5.4.1 (based on previous SDK) but same code will not work on master or dev branches.

@nwf nwf mentioned this issue Dec 29, 2017
4 tasks
@heythisisnate
Copy link
Contributor

I finally had a chance to try this again. I tested bumping the MBEDTLS_SSL_MAX_CONTENT_LEN value to 5120 as suggested. This might be helping because the test request is successful, but not before spewing two errors and about a 1-second delay.

Here's a very simple example (tested both on dev and master) with the same results:

=http.get("https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/latest", {}, function(code,data) print(code, data) end)
> HTTP client: Disconnected with error: 8
HTTP client: Connection timeout
200	{"url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562","assets_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets","upload_url":"https://uploads.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets{?name,label}","html_url":"https://github.com/nodemcu/nodemcu-firmware/releases/tag/2.1.0-master_20170824","id":7514562,"tag_name":"2.1.0-master_20170824","target_commitish":"master","name":"2.1.0-master_20170824","draft":false,"author":{"login":"marcelstoer","id":624195,"avatar_url":"https://avatars0.githubusercontent.com/u/624195?v=4","gravatar_id":"","url":"https://api.github.com/users/marcelstoer","html_url":"https://github.com/marcelstoer","followers_url":"https://api.github.com/users/marcelstoer/followers","following_url":"https://api.github.com/users/marcelstoer/following{/other_user}","gists_url":"https://api.github.com/users/marcelstoer/gists{/gist_id}","starred_url":"https://api.github.com/users/marcelstoer/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/marcelstoer/subscriptions","organizations_url":"https://api.github.com/users/marcelstoer/orgs","repos_url":"https://api.github.com/users/marcelstoer/repos","events_url":"https://api.github.com/users/marcelstoer/events{/privacy}","received_events_url":"https://api.github.com/users/marcelstoer/received_events","type":"User","site_admin":false},"prerelease":false,"created_at":"2017-08-24T19:26:33Z","published_at":"2017-08-24T19:32:27Z","assets":[],"tarball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/tarball/2.1.0-master_20170824","zipball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/zipball/2.1.0-master_20170824","body":"Primarily a patch release to fix a fundamental bug in the cron module.\r\n- Fix for #2080\r\n- New native module for DS18B20, #2003 \r\n- Rewrite of the Lua DS18B20 module, #1996 \r\n\r\nTake a look at the ['2.1 follow-up patch' milestone](https://github.com/nodemcu/nodemcu-firmware/milestone/7) for the full list of all issues included in this release."}

As you can see, after issuing the http.get, I always get the HTTP client: Disconnected with error: 8 error. Then, after about a second pause, the request is successful and the callback is executed. I see that @FrankX0 observed the same in testing #2214 (related to this issue).

I'm able to reproduce the same behavior with several other API hosts (not just github.com).

Anyone have an idea of what this HTTP client: Disconnected with error: 8 really means and where it is coming from. I'm confused about what the error means when the request is ultimately successful.

@devsaurus
Copy link
Member

Anyone have an idea of what this HTTP client: Disconnected with error: 8 really means and where it is coming from.

Well, it originates from httclient's http_error_callback() that's invoked as the reconnection callback. The error code 8 is not a defined value/parameter in espconn.h and is probably nonsense, I presume.

Out of curiosity I ported the espconn apps for lwip and mbedtls from SDK's third_party in master. @heythisisnate's testcase passes smoothly now:

http.get("https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/latest", {}, function(code,data) print(code, data) end)
> 200	{"url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562","assets_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets","upload_url":"https://uploads.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets{?name,label}","html_url":"https://github.com/nodemcu/nodemcu-firmware/releases/tag/2.1.0-master_20170824","id":7514562,"tag_name":"2.1.0-master_20170824","target_commitish":"master","name":"2.1.0-master_20170824","draft":false,"author":{"login":"marcelstoer","id":624195,"avatar_url":"https://avatars0.githubusercontent.com/u/624195?v=4","gravatar_id":"","url":"https://api.github.com/users/marcelstoer","html_url":"https://github.com/marcelstoer","followers_url":"https://api.github.com/users/marcelstoer/followers","following_url":"https://api.github.com/users/marcelstoer/following{/other_user}","gists_url":"https://api.github.com/users/marcelstoer/gists{/gist_id}","starred_url":"https://api.github.com/users/marcelstoer/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/marcelstoer/subscriptions","organizations_url":"https://api.github.com/users/marcelstoer/orgs","repos_url":"https://api.github.com/users/marcelstoer/repos","events_url":"https://api.github.com/users/marcelstoer/events{/privacy}","received_events_url":"https://api.github.com/users/marcelstoer/received_events","type":"User","site_admin":false},"prerelease":false,"created_at":"2017-08-24T19:26:33Z","published_at":"2017-08-24T19:32:27Z","assets":[],"tarball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/tarball/2.1.0-master_20170824","zipball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/zipball/2.1.0-master_20170824","body":"Primarily a patch release to fix a fundamental bug in the cron module.\r\n- Fix for #2080\r\n- New native module for DS18B20, #2003 \r\n- Rewrite of the Lua DS18B20 module, #1996 \r\n\r\nTake a look at the ['2.1 follow-up patch' milestone](https://github.com/nodemcu/nodemcu-firmware/milestone/7) for the full list of all issues included in this release."}

That might be a good indication - I haven't done any further tests, though.
If anybody is interested: you'll find this quick&dirty hack at https://github.com/devsaurus/nodemcu-firmware/tree/sdk_v2.2_preview. For convenience the http module is already enabled and MBEDTLS_SSL_MAX_CONTENT_LEN bumped to 5120. Start make from a clean source tree.

@devsaurus
Copy link
Member

I did some testing with the examples mentioned in this issue. There appears to be no regression for http, just the HTTP client: Disconnected with error: 8 is fixed.
I'll start a dedicated PR for this update as it improves things and is independent from any SDK release.

@heythisisnate
Copy link
Contributor

Can anyone else reproduce this failure? A http.getl that was working fine before on 1.5.4.1 is now not working at all. I can't connect to any Github api from the NodeMCU. For example this request that worked fine before now is not:

http.get("https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/latest", {}, function(code,data) print(code, data) end)
> -1 nil

I can't figure out what changed. Is it just me?

@nwf
Copy link
Member

nwf commented Mar 13, 2018

@heythisisnate What version, exactly, are you testing against? If you're able to produce your own images, debug information would likely be illuminating.

@heythisisnate
Copy link
Contributor

@nwf I'm on branch 1.5.4.1-final. I rebuilt the fw with debug enabled. Here's the error:

=http.get("https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/latest", nil, function(code,data) print(code, data) end)
HTTP client: hostname=api.github.com
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/repos/nodemcu/nodemcu-firmware/releases/latest
HTTP client: DNS request
HTTP client: DNS pending
> HTTP client: DNS found api.github.com 192.30.253.117
net_socket_received is called.
client handshake start.
client handshake failed
Error: SSL error 70
HTTP client: Disconnected
http_status=-1
-1	nil

It looks like Error: SSL error 70 is the important part here. I wonder if something changed on Github's infrastructure. I don't think that this is related directly to the topic of this issue, but it's interesting because the Github API was one of our test cases above and it was working fine on 1.5.4.1 until recently.

@FrankX0
Copy link
Contributor

FrankX0 commented Mar 13, 2018

On the current dev build it is working though:

200 {"url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562","assets_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets","upload_url":"https://uploads.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets{?name,label}","html_url":"https://github.com/nodemcu/nodemcu-firmware/releases/tag/2.1.0-master_20170824","id":7514562,"tag_name":"2.1.0-master_20170824","target_commitish":"master","name":"2.1.0-master_20170824","draft":false,"author":{"login":"marcelstoer","id":624195,"avatar_url":"https://avatars0.githubusercontent.com/u/624195?v=4","gravatar_id":"","url":"https://api.github.com/users/marcelstoer","html_url":"https://github.com/marcelstoer","followers_url":"https://api.github.com/users/marcelstoer/followers","following_url":"https://api.github.com/users/marcelstoer/following{/other_user}","gists_url":"https://api.github.com/users/marcelstoer/gists{/gist_id}","starred_url":"https://api.github.com/users/marcelstoer/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/marcelstoer/subscriptions","organizations_url":"https://api.github.com/users/marcelstoer/orgs","repos_url":"https://api.github.com/users/marcelstoer/repos","events_url":"https://api.github.com/users/marcelstoer/events{/privacy}","received_events_url":"https://api.github.com/users/marcelstoer/received_events","type":"User","site_admin":false},"prerelease":false,"created_at":"2017-08-24T19:26:33Z","published_at":"2017-08-24T19:32:27Z","assets":[],"tarball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/tarball/2.1.0-master_20170824","zipball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/zipball/2.1.0-master_20170824","body":"Primarily a patch release to fix a fundamental bug in the cron module.\r\n- Fix for #2080\r\n- New native module for DS18B20, #2003 \r\n- Rewrite of the Lua DS18B20 module, #1996 \r\n\r\nTake a look at the ['2.1 follow-up patch' milestone](https://github.com/nodemcu/nodemcu-firmware/milestone/7) for the full list of all issues included in this release."}

@nwf
Copy link
Member

nwf commented Mar 13, 2018

@heythisisnate If you're experiencing issues with 1.5.4.1-final, please open a new ticket. This one is for the dev branch.

@heythisisnate
Copy link
Contributor

@FrankX0 it is kinda working on the dev branch ... returns successfully, but not until after a Disconnected with error: 8 message and a timeout. This is still what's holding me back from moving to the dev branch.

=http.get("https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/latest", nil, function(code,data) print(code, data) end)
HTTP client: hostname=api.github.com
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/repos/nodemcu/nodemcu-firmware/releases/latest
HTTP client: DNS request
HTTP client: DNS pending
> HTTP client: DNS found api.github.com 192.30.253.117
client handshake start.
please start sntp first !
please start sntp first !
client handshake ok!
HTTP client: Connected
HTTP client: Sending request header
HTTP client: All sent
Reason:[-0x7880]
HTTP client: Disconnected with error: 8
HTTP client: Connection timeout
HTTP client: Calling disconnect
HTTP client: manually Calling disconnect callback due to error -12
HTTP client: Disconnected
http_status=200
strlen(full_response)=3123
response={"url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562","assets_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets","upload_url":"https://uploads.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets{?name,label}","html_url":"https://github.com/nodemcu/nodemcu-firmware/releases/tag/2.1.0-master_20170824","id":7514562,"tag_name":"2.1.0-master_20170824","target_commitish":"master","name":"2.1.0-master_20170824","draft":false,"author":{"login":"marcelstoer","id":624195,"avatar_url":"https://avatars0.githubusercontent.com/u/624195?v=4","gravatar_id":"","url":"https://api.github.com/users/marcelstoer","html_url":"https://github.com/marcelstoer","followers_url":"https://api.github.com/users/marcelstoer/followers","following_url":"https://api.github.com/users/marcelstoer/following{/other_user}","gists_url":"https://api.github.com/users/marcelstoer/gists{/gist_id}","starred_url":"https://api.github.com/users/marcelstoer/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/marcelstoer/subscriptions","organizations_url":"https://api.github.com/users/marcelstoer/orgs","repos_url":"https://api.github.com/users/marcelstoer/repos","events_url":"https://api.github.com/users/marcelstoer/events{/privacy}","received_events_url":"https://api.github.com/users/marcelstoer/received_events","type":"User","site_admin":false},"prerelease":false,"created_at":"2017-08-24T19:26:33Z","published_at":"2017-08-24T19:32:27Z","assets":[],"tarball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/tarball/2.1.0-master_20170824","zipball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/zipball/2.1.0-master_20170824","body":"Primarily a patch release to fix a fundamental bug in the cron module.\r\n- Fix for #2080\r\n- New native module for DS18B20, #2003 \r\n- Rewrite of the Lua DS18B20 module, #1996 \r\n\r\nTake a look at the ['2.1 follow-up patch' milestone](https://github.com/nodemcu/nodemcu-firmware/milestone/7) for the full list of all issues included in this release."}<EOF>
200	{"url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562","assets_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets","upload_url":"https://uploads.github.com/repos/nodemcu/nodemcu-firmware/releases/7514562/assets{?name,label}","html_url":"https://github.com/nodemcu/nodemcu-firmware/releases/tag/2.1.0-master_20170824","id":7514562,"tag_name":"2.1.0-master_20170824","target_commitish":"master","name":"2.1.0-master_20170824","draft":false,"author":{"login":"marcelstoer","id":624195,"avatar_url":"https://avatars0.githubusercontent.com/u/624195?v=4","gravatar_id":"","url":"https://api.github.com/users/marcelstoer","html_url":"https://github.com/marcelstoer","followers_url":"https://api.github.com/users/marcelstoer/followers","following_url":"https://api.github.com/users/marcelstoer/following{/other_user}","gists_url":"https://api.github.com/users/marcelstoer/gists{/gist_id}","starred_url":"https://api.github.com/users/marcelstoer/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/marcelstoer/subscriptions","organizations_url":"https://api.github.com/users/marcelstoer/orgs","repos_url":"https://api.github.com/users/marcelstoer/repos","events_url":"https://api.github.com/users/marcelstoer/events{/privacy}","received_events_url":"https://api.github.com/users/marcelstoer/received_events","type":"User","site_admin":false},"prerelease":false,"created_at":"2017-08-24T19:26:33Z","published_at":"2017-08-24T19:32:27Z","assets":[],"tarball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/tarball/2.1.0-master_20170824","zipball_url":"https://api.github.com/repos/nodemcu/nodemcu-firmware/zipball/2.1.0-master_20170824","body":"Primarily a patch release to fix a fundamental bug in the cron module.\r\n- Fix for #2080\r\n- New native module for DS18B20, #2003 \r\n- Rewrite of the Lua DS18B20 module, #1996 \r\n\r\nTake a look at the ['2.1 follow-up patch' milestone](https://github.com/nodemcu/nodemcu-firmware/milestone/7) for the full list of all issues included in this release."}

@devsaurus
Copy link
Member

@heythisisnate IIRC then I tried your command on #2269 and the Disconnected with error: 8 is gone with this PR.

@heythisisnate
Copy link
Contributor

Ok awesome. I'm cheering for #2269 to be merged soon then so we can put this issue to rest. As of now my program that uses the Github API isn't working properly on either 1.5.4.1 or dev 😢
Is there anything I can do to help with that?

@AlexSmok
Copy link

  • NodeMCU 2.1.0 built with Docker provided by frightanic.com
  • branch: dev
  • SSL: true

If the browser accesses

https://api.telegram.org/bot446398269:AAF7jqBWZhj9oJlWT2f8mCS8_X5_LxJGLRA/sendMessage?chat_id=211698604&text=Hello!

then the response comes quickly.

If the request from NodeMCU

http.get(text_req, nil, function(code, data)
                if (code < 0) then 
                    print("telegram bot failed")
                    print(code, data) 
                else 
                    print(code, data) 
                end
            end) 

then

HTTP client: hostname=api.telegram.org
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/bot446398269:AAF7jqBWZhj9oJlWT2f8mCS8_X5_LxJGLRA/sendMessage?chat_id=211698604&text=Hello!
HTTP client: DNS request
HTTP client: DNS pending
HTTP client: DNS found api.telegram.org 149.154.167.220
HTTP client: Connection timeout
HTTP client: Calling disconnect
HTTP client: manually Calling disconnect callback due to error -12
HTTP client: Disconnected

How to search for this error?

@TerryE
Copy link
Collaborator

TerryE commented Mar 14, 2018

Just a though, but looking the code, it does a dns resolve in the loop so to speak, and if this takes too long then your client might timeout. However, the esconn stack caches the last 4 resolutions so have you tried doing a

  net.dns.resolve('api.telegram.org', funcToDoTheGet)

and that way the DNS name is resolved and cached before you start the HTTP dialogue.

@heythisisnate
Copy link
Contributor

@AlexSmok this is the exact same issue I'm seeing. Allegedly it's fixed in SDK 2.2 which is waiting on this PR: #2269

@AlexSmok
Copy link

@TerryE This way the same does not work

@FrankX0
Copy link
Contributor

FrankX0 commented Mar 14, 2018

@AlexSmok, @heythisisnate : I can confirm that the above example is indeed fixed by #2269.

HTTP client: hostname=api.telegram.org
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/bot446398269:AAF7jqBWZhj9oJlWT2f8mCS8_X5_LxJGLRA/sendMessage?chat_id=211698604&text=Hello!
HTTP client: DNS request
HTTP client: DNS pending
HTTP client: DNS found api.telegram.org 149.154.167.220
client handshake start.
client handshake ok!
HTTP client: Connected
HTTP client: Sending request header
HTTP client: All sent
Reason:[-0x7880]
HTTP client: Disconnected
http_status=200
strlen(full_response)=698
response={"ok":true,"result":{"message_id":91,"from":{"id":446398269,"is_bot":true,"first_name":"\u041c\u043e\u0438 \u0434\u0430\u0442\u0447\u0438\u043a\u0438","username":"smok_sensors_bot"},"chat":{"id":211698604,"first_name":"Alex","last_name":"Smok","username":"alexsmok","type":"private"},"date":1521057955,"text":"Hello!"}}<EOF>

This will again make things better, but there are still secure websites which cannot be accessed correctly.
Example:

HTTP client: hostname=nodemcu-build.com
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/
HTTP client: DNS request
HTTP client: DNS pending
HTTP client: DNS found nodemcu-build.com 217.26.52.40
client handshake start.
client handshake ok!
HTTP client: Connected
HTTP client: Sending request header
HTTP client: All sent
client's data invalid protocol
Reason:[-0x7200]
HTTP client: Disconnected with error: -114
HTTP client: Connection timeout
HTTP client: Calling disconnect
HTTP client: manually Calling disconnect callback due to error -12
HTTP client: Disconnected
HTTP client: Chunk Size:8192
HTTP client: Chunk Size:0

This even results in a reset of the device.

@nwf
Copy link
Member

nwf commented Mar 15, 2018

0x7200 is, in particular, MBEDTLS_ERR_SSL_INVALID_RECORD and is the return code when a TLS message is longer than the buffer allocated. See the chatter above around MBEDTLS_SSL_MAX_CONTENT_LEN . It would be great if you could crank up the SSL debug level as well, though, to confirm.

The reset of the device seems more likely a bug in the http module than anything SSL-specific. The 8192 and 0 chunk sizes sure seem suspicious.

@AlexSmok
Copy link

AlexSmok commented Mar 15, 2018

Strange.

If I call the

http.get(text_req, nil, function(code, data)
                if (code < 0) then 
                    print("telegram bot failed")
                    print(code, data) 
                else 
                    print(code, data) 
                end
            end) 

code before the http server starts (Immediately after receiving the IP address.) , then the code works like this:

HTTP client: hostname=api.telegram.org
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/bot446398269:AAF7jqBWZhj9oJlWT2f8mCS8_X5_LxJGLRA/sendMessage?chat_id=211698604&text=Hello!
HTTP client: DNS request
HTTP client: DNS pending
HTTP client: DNS found api.telegram.org 149.154.167.220
client handshake start.
please start sntp first !
please start sntp first !
client handshake ok!
HTTP client: Connected
HTTP client: Sending request header
HTTP client: Sending request body
Reason:[-0x7880]
HTTP client: Disconnected with error: 8
HTTP client: Connection timeout
HTTP client: Calling disconnect
HTTP client: manually Calling disconnect callback due to error -12
HTTP client: Disconnected
http_status=200
strlen(full_response)=699
response={"ok":true,"result":{"message_id":102,"from":{......

If after the start of the HTTP server, I get a timeout.

UPD.
Http server is not to blame. I wrote the script:
Immediately after receiving the ip

tmr.create (): alarm (7000, tmr.ALARM_AUTO, telegram)

  and watched.

I did not notice any dependence.

The code either works many times in a row or a timeout.

@FrankX0
Copy link
Contributor

FrankX0 commented Mar 15, 2018

@nwf: the 0x7200 error when accessing nodemcu-build.com is indeed caused by the ssl buffer being too small. Increasing it to a minimum of 8192, results in (without resetting):

HTTP client: hostname=nodemcu-build.com
HTTP client: port=443
HTTP client: method=GET
HTTP client: path=/
HTTP client: DNS request
HTTP client: DNS pending
> HTTP client: DNS found nodemcu-build.com 217.26.52.40
client handshake start.
client handshake ok!
HTTP client: Connected
HTTP client: Sending request header
HTTP client: All sent
E:M 6320
HTTP client: Response too long (6311)
E:M 6320
HTTP client: Response too long (6311)
E:M 5752
HTTP client: Response too long (5743)
client's data invalid protocol
Reason:[-0x45]
HTTP client: Disconnected with error: -69
HTTP client: Connection timeout
HTTP client: Calling disconnect
HTTP client: manually Calling disconnect callback due to error -12
HTTP client: Disconnected
http_status=-1

But I guess this shows that the limited amount of RAM is too small for this website (sorry @marcelstoer).

Maybe now is the time to close this topic and merge SDK 2.2?

One addition.
I encountered a site which also results in a 0x7200 error, but this is due to a minor version mismatch (with debugging enabled). The version reported: [3:4]. Does this suggest TLS v1.3?
If this is true, how can I set the maximum version supported to v1.2?

@nwf
Copy link
Member

nwf commented Mar 15, 2018

I have no idea how to pin the maximum version... I would have guessed that version negotiation would have done the right thing here. This suggests something broken about mbedTLS, espressif's use of it, or the remote server. What is the remote, speaking of? It may be useful to point https://www.ssllabs.com/ssltest/ at it and see what it says?

Please open a separate bug for the device reset induced by the http client.

I would be in favor of this particular bug being closed; I think it has outlived its usefulness.

@FrankX0
Copy link
Contributor

FrankX0 commented Mar 28, 2018

I got confirmed by the developers of mbed TLS that the issue I found (minor version mismatch) is due to a problem in de client. So my assumption is that somehow the received data is malformed.
I would like to dig into this further, but to do so I need te get access to the data actually received by the ESP8266.
Is there some debug option I can enable so I can get access to the network data received during setting up of the secure connection?

@TerryE
Copy link
Collaborator

TerryE commented Mar 28, 2018

Frank the LFS patch will be out in a week or so, after our next staging of dev->`master, and it inculdes an enhanced remote gebug facility. This gist gives you an idea of the sorts of things that you can do with this. Basically its the standard GDB debugger but with one arm chopped off: annoying but usable. In your case debugging set up stacks, you will need to up the timeouts into minutes.

It's either that or lots of lua_debug() print statements through the code.

@nwf
Copy link
Member

nwf commented Mar 28, 2018

@FrankX0 tcpdump is always available and not ESP-specific, if all you need is the TLS records and encrypted payloads (which, I suspect, is enough for this case). There's also, after #2267, tls.setDebug() to engage the existing mbedTLS debug infrastructure. If you call tls.setDebug(4), you should get really quite verbose chatter about the library's operation, which should include the raw bytes exchanged over the wire as well as information of utility to mbedTLS upstream.

@marcelstoer
Copy link
Member Author

Some issues are fixed with SDK 2.2, some can't be fix on this platform and for all others there should be dedicated issues -> closing

@nodemcu nodemcu locked as resolved and limited conversation to collaborators Mar 28, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests