Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WiFiClientSecure invalid read buffer #3002

Closed
diegopx opened this issue Feb 28, 2017 · 6 comments
Closed

WiFiClientSecure invalid read buffer #3002

diegopx opened this issue Feb 28, 2017 · 6 comments

Comments

@diegopx
Copy link

diegopx commented Feb 28, 2017

Basic Infos

Hardware

Hardware: ESP-8266 in a Sonoff Switch
Core Version: 2.3.0

Description

WiFiClientSecure may read from an invalid buffer, causing any number of undefined behaviours.

To compute available(), the internal SSLContext in WiFiClientSecure calls axTLS's ssl_read(SSL* ssl, uint8_t** in_data), which gives access to a temporary internal buffer through in_data (See notes which indicate to use in_data before performing any other ssl_* operations). SSLContext then proceeds to save a reference to the internal buffer in its _read_ptr member field, as a cache for subsequent calls to SSLContext.read().

However, unless the read() calls are performed immediately after calling available(), the reference saved in _read_ptr may not be valid anymore. In particular, if the context is used for writing in between, the reference will be obsolete and calls to WiFiClientSecure.read() will return garbage (or access invalid memory).
The problem is aggravated by the fact that many functions call available(), including connected().

In particular, I'm using a modified PubSubClient to implement a MQTT client (modified to use an internal WiFiClientSecure as its underlying channel). Using a simple loop as shown below halts the program, since by the time the program actually reads the buffer, the MQTT packet's "message length" is an arbitrary number (typically larger than the real message length, which is very small) and the readPacket() function tries to read beyond what the broker has sent, blocking forever (or until the socket times out).

void loop() {
  // ...
  // The broker has sent a message to the device (waiting on the ssl socket)
  if (mqtt.connected()) {
    // connected() -> available() -> ssl_read()
    // WiFiClientSecure.read() would return 'device, I love you'
    
    if (!credentials) {
      // ask for credentials
      mqtt.publish(...) // -> WiFiClientSecure.write() -> ssl_write()
    }
    
    // The reference is invalid, WiFiClientSecure.read() may return the complete
    // works of Shakespeare (modulo FLASH_MEMORY_SIZE);
    
    mqtt.loop();  // -> readPacket() -> WiFiClientSecure.read()
    // this line may never be reached, because loop() halted
  }
  else {
    // connect mqtt
  }
}

If the order of the internal operations is reversed, everything works fine.

void loop() {
  // ...
  // The broker has sent a message to the device (waiting on the ssl socket)
  if (mqtt.connected()) {
    // connected() -> available() -> ssl_read()
    
    mqtt.loop();  // -> readPacket() -> WiFiClientSecure.read()
    // loop() has no problems, since nobody has called
    // an ssl_*() function before reading the buffer.
    // The device has gotten the message of love from the broker;
    // it may now work in peace
    
    if (!credentials) {
      // ask for credentials
      mqtt.publish(...) // -> WiFiClientSecure.write() -> ssl_write()
    }
    
    // The reference is invalid, but nobody cares
  }
  else {
    // connect mqtt
  }
}

Some possible solutions:

  1. Always make a copy of the ssl_read() output. (May be too memory hungry).
  2. Do not use ssl_read() for anything but actually reading the complete stream. (Would require changes to the axTLS library to expose functions to get available bytes and reading into a fixed-size buffer).
  3. Forbid write access (i.e. return error code) until all remaining bytes in the ssl_read buffer have been read. (Easiest to implement: if (available > 0) { return READ_BLOCK; }, but it seems too harsh).
  4. Modify axTLS to lift the requirement, i.e. ensure in_data is valid until the next call to ssl_read. (I don't know how entrenched is this into the library).

Settings in IDE

Module: Generic ESP8266 Module
Flash Size: 1MB
CPU Frequency: 80Mhz
Flash Mode: DIO
Flash Frequency: 40Mhz
Upload Using: SERIAL
Reset Method: ck

@igrr
Copy link
Member

igrr commented Feb 28, 2017

This is a very valid issue. The pointer will not be invalid (axTLS uses same buffer for reading and writing, and the buffer is not re-allocated), but if you do a write after an incomplete read, then the next read operation will happily return data from the buffer which now contains encrypted transmitted data. AxTLS uses one buffer for reading and writing, so is essentially half-duplex. You must read out all data at the application layer before writing data into axTLS, otherwise you get this issue. This doesn't happen with e.g. HTTP and SMTP due to half-duplex nature of these protocols. This does however happen with MQTT.

Of all the options listed, option 2 looks like it has highest chances of being implemented, even though I wouldn't like to deviate too much from the original axTLS API to make future updates easier.

@diegopx
Copy link
Author

diegopx commented Mar 3, 2017

I've implemented the solutions in options 2 and 3 in #3019.

@mtnbrit
Copy link

mtnbrit commented Mar 19, 2017

I am having trouble with MQTT over TLS and I suspect this is the issue, I'm using 2.3.0 stable and lmroy's pubsubclient.

My esp is connected to the broker on port 8883 and subscribed to a couple of topics (a command channel for example) and is also publishing data at intervals on other topics. Im finding the subscriptions are not receiving published messages from the broker if the esp itself is actively publishing, it basically ignores commands sent to it, and eventually crashes some time later with exception 9 and others, I assume due to corrupt buffers. If I switch to non-tls port 1883, I have no such issues, no crashes and the subscriptions always hear the published messages. I appear to have about 23k of heap free under tls.

What would be the shortest path to making this work properly over TLS? Can I deploy the #3019 patch over my 2.3.0? Or wait for a new version? or simply do the mqtt client.loop before the publish? (Im not sure that will solve the issue fully).

I appreciate your expertise and suggestions.

@diegopx
Copy link
Author

diegopx commented Mar 24, 2017

You should definitely try the #3019 and knolleary/pubsubclient#251 patches.
And try to publish outside the MQTT loop, in case you get multiple messages in one TCP packet.

After you set that up, tell us how it went ;)

@vshymanskyy
Copy link

We're experiencing the same problem when using Blynk (https://github.com/blynkkk/blynk-library) with WiFiClientSecure. Blynk is also a Full-Duplex protocol.

@igrr
Copy link
Member

igrr commented Dec 26, 2017

Workaround implemented in #4024.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants