2.4.1 memory leak #4497

Closed
ghost opened this issue Mar 10, 2018 · 36 comments

Comments

@ghost

ghost commented Mar 10, 2018

Today I noticed 2.4.1 was released for my Arduino IDE, so I gave it a try. Unfortunately my app crashed after a few minutes. Memory pressure is always a good first guess, so I added calls to ESP.getFreeHeap() and, sure enough, the heap was smaller after each WiFi client reconnection. It started at about 32000 bytes and dropped by a few kB each time. When the heap reached a few hundred bytes it panicked.

This was using lwip 2 for higher bandwidth. lwip 2 for lower memory acts the same way. I tried using lwip 1.4 for bandwidth but can't tell about the leak because it just flat fails to connect after a few connections.

My app is about 850 kB program and 45 kB RAM and not easy to post here. Basically it opens several web connections in a round-robin fashion, calling ::stop after finishing with each. I noticed in the 2.4.1 release notes that ::stop no longer destroys ClientContext, so I wonder if this might be related.

2.4.0 with lwip 1.4 precompiled is still the golden combo for me, it runs beautifully forever.

Thanks.

@bobybobybob

bobybobybob commented Mar 11, 2018

Yes, that code https://blog.thesen.eu/wp-content/uploads/2017/05/ESP_TH_Logger_V2.zip - a temperature logger for ESP8266 ( https://blog.thesen.eu/wlan-lufttemperatur-und-feuchte-logger-mit-grafischer-darstellung-fuer-esp8266/ ) works perfectly with 2.3.0, but with 2.4.0 it does not work: every time the page is loaded it eats free RAM, until the RAM runs out and nothing works any more.

With 2.4.1 it is worse: the pages are not always reachable, and every page refresh eats RAM, so it dies faster.

But when I add
delay(1000);
before client.stop(); the memory leak goes away for a while, though there is still trouble.

2.3.0 works best.

@ghost
Author

ghost commented Mar 11, 2018

Sorry, I meant to say 2.3.0 is the best, not 2.4.0.

@reaper7
Contributor

reaper7 commented Mar 12, 2018

I have similar observations in my code with two clients: the first client is for the ThingSpeak lib and the second for pubsubclient.
Every minute I upload data to ThingSpeak and to a local MQTT server, and this operation eats 112 bytes (56 bytes per client).
It does not matter whether lwIP 2 or 1.4 is used. I think some change was made to the WiFi client in the second half of February, because before that nothing bad was happening.

So I moved the ThingSpeak client declaration from global to local (available only inside the upload procedure and, I hope, destroyed at its end) and replaced pubsubclient with esp-mqtt-arduino (by i-n-g-o);
these two steps stopped the memory from disappearing.

But clearly something has changed... maybe a client resource is not released?
Can one of the specialists say something more? (56 bytes per client: this is the trail.)

Half-hour chart (with ThingSpeak fixed but still on pubsubclient, as explained above); 1 minute = -56 bytes from the heap:
[image: mqtt]

@mikekgr

mikekgr commented Mar 12, 2018

No, @reaper7 in my case this does not improve anything... unfortunately.

@reaper7
Contributor

reaper7 commented Mar 12, 2018

Please check this simple sketch, which shows the difference between a global and a local client declaration, and the memory leak when the client is global.
Simply comment or uncomment #define GLOBALCLIENT (and change the SSID/password and server IP!)

#include <ESP8266WiFi.h>

//#define GLOBALCLIENT

const uint32_t defaultwait = 2000;

const uint16_t port = 80;
const char * host = "192.168.0.254";

uint32_t memcurr = 0;
uint32_t memlast = 0;
uint32_t counter = 0;

#ifdef GLOBALCLIENT
WiFiClient client;
#endif

void setup() {
  Serial.begin(115200);
  delay(10);

  WiFi.mode(WIFI_STA);
  WiFi.begin("SOMESSID", "SOMEPASS");

  Serial.println();
  Serial.println();
  Serial.print("Wait for WiFi...");

  while( WiFi.status() != WL_CONNECTED ) {                                      //  Wait for WiFi connection
    delay(50);
    Serial.print(".");
  }

  Serial.println("");
  Serial.print("WiFi connected, local IP: ");
  Serial.println(WiFi.localIP());

#ifdef GLOBALCLIENT
  Serial.println("Example with global defined client");
#else
  Serial.println("Example with local defined client");
#endif

  memcurr = ESP.getFreeHeap();
  Serial.printf("FREEHeap: %d; DIFF %d\n", memcurr, memcurr - memlast);
  memlast = memcurr;

  delay(500);
}


void loop() {
  Serial.printf("PROBE: %d -----------------\n", ++counter);

  Serial.print("connecting to ");
  Serial.println(host);

#ifndef GLOBALCLIENT
  WiFiClient client;
#endif

  if (!client.connect(host, port)) {
    Serial.println("connection failed");
    Serial.println("wait 5 sec...");
    delay(5000);
    return;
  }

  // This will send the request to the server
  client.println("Send this data to server");

  //read back one line from server
  String line = client.readStringUntil('\r');
  Serial.println(line);

  Serial.println("closing connection");
  client.stop();

  memcurr = ESP.getFreeHeap();
  Serial.printf("FREEHeap: %d; DIFF %d\n", memcurr, memcurr - memlast);
  memlast = memcurr;

  Serial.printf("wait %d ms...\n\n", defaultwait);
  delay(defaultwait);
}

If the client is declared inside loop(), free memory shrinks on each of the first 17 connections,
but after that it stabilizes at 42424 bytes and does not decrease any further.

WiFi connected, local IP: 192.168.0.118
Example with local defined client
FREEHeap: 46024; DIFF 46024
PROBE: 1 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 45616; DIFF -408
wait 2000 ms...

PROBE: 2 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 44944; DIFF -672
wait 2000 ms...

PROBE: 3 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 44776; DIFF -168
wait 2000 ms...
.
.
.
PROBE: 17 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 42424; DIFF -168
wait 2000 ms...

PROBE: 18 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 42424; DIFF 0
wait 2000 ms...
.
.
.
PROBE: 24 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 42424; DIFF 0
wait 2000 ms...

But if the client is declared globally at the top of the sketch, free memory shrinks by more on each of the first 17 connections, and from connection 18 onward the loss never stops: it consumes 56 bytes per loop.

WiFi connected, local IP: 192.168.0.118
Example with global defined client
FREEHeap: 46008; DIFF 46008
PROBE: 1 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 45600; DIFF -408
wait 2000 ms...

PROBE: 2 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 44872; DIFF -728
wait 2000 ms...

PROBE: 3 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 44648; DIFF -224
wait 2000 ms...
.
.
.
PROBE: 17 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 41512; DIFF -224
wait 2000 ms...

PROBE: 18 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 41456; DIFF -56
wait 2000 ms...
.
.
.
PROBE: 24 -----------------
connecting to 192.168.0.254

closing connection
FREEHeap: 41120; DIFF -56
wait 2000 ms...
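
The steady-state loss can be read straight off the two logs above by comparing probe 18 with probe 24. The following is a minimal sketch in plain C++ (so it can be run off-device); `HeapSample` and `bytesLostPerProbe` are hypothetical names introduced here for illustration, and only values that appear in the logs are used.

```cpp
#include <cassert>
#include <cstdint>

// One (probe number, free heap) pair copied from a serial log.
// Hypothetical helper type, not part of the ESP8266 core.
struct HeapSample {
    uint32_t probe;
    uint32_t freeHeap;
};

// Average number of bytes lost per probe between two log samples.
// A positive result means the heap is shrinking; 0 means it is stable.
int32_t bytesLostPerProbe(HeapSample earlier, HeapSample later) {
    assert(later.probe > earlier.probe);
    int32_t lost = static_cast<int32_t>(earlier.freeHeap) -
                   static_cast<int32_t>(later.freeHeap);
    int32_t probes = static_cast<int32_t>(later.probe - earlier.probe);
    return lost / probes;
}
```

With the global-client log, (41456 - 41120) / (24 - 18) gives 56 bytes per probe; with the local-client log both samples are 42424, so the result is 0.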

@mikekgr

mikekgr commented Mar 12, 2018

@reaper7 Dear friend, I followed your suggestion to put the "WiFiClient client;" inside the related function so that it is not global, and the result is way better. In this case I lose 4*168 bytes + 424 bytes, but then the leakage stops... My project is still under testing, but your suggestion and analysis pointed me in the right direction. Thanks a lot.
P.S. My project updates values at ThingSpeak, among other things.

@devyte
Collaborator

devyte commented Mar 12, 2018

Note: this sounds like something not being freed in WiFiClient::operator=()

@igrr
Member

igrr commented Mar 13, 2018

Will make a PR later today.

@mikekgr

mikekgr commented Mar 14, 2018

Dear @igrr
when can we expect this PR? Please give us a time frame if possible.
Many thanks for your nice work.

@s0170071

hm... 56 bytes. This may be related to #4487 and #4488 then as I was also losing 56 bytes at a time.

@igrr
Member

igrr commented Mar 14, 2018

Created #4516 with a fix.

Workaround for 2.4.1:

client = WiFiClient();    // add this line for 2.4.1
client.connect(host, port);
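
Why assigning a fresh client can help is easier to see in a toy model. The sketch below is not the ESP8266 core's code: `MockContext` and `MockClient` are hypothetical stand-ins, and the leak mechanism (a `connect()` that overwrites a context left behind by `stop()`) is an assumption based on the hints in this thread, namely the release note about `::stop` and the suspicion about `operator=`. It is plain C++ so it can be run off-device.

```cpp
#include <cassert>

// Toy model only. Counts how many contexts are currently allocated.
struct MockContext {
    static int live;
    MockContext()  { ++live; }
    ~MockContext() { --live; }
};
int MockContext::live = 0;

class MockClient {
public:
    MockClient() = default;
    MockClient(const MockClient&) = delete;
    ~MockClient() { delete ctx_; }

    // Assigning from a fresh client releases the old context.
    // In this model, that is the step the workaround relies on.
    MockClient& operator=(MockClient&& other) noexcept {
        if (this != &other) {
            delete ctx_;
            ctx_ = other.ctx_;
            other.ctx_ = nullptr;
        }
        return *this;
    }

    // ASSUMED bug: overwrites the pointer without freeing the old context.
    void connect() { ctx_ = new MockContext(); }

    // Models "stop no longer destroys the context": nothing is freed here.
    void stop() {}

private:
    MockContext* ctx_ = nullptr;
};

// Runs `cycles` connect/stop rounds on one long-lived client, destroys the
// client, and returns the number of contexts that were never freed.
int leakedContextsAfter(int cycles, bool useWorkaround) {
    assert(cycles >= 0);
    MockContext::live = 0;
    {
        MockClient client;  // plays the role of the global client
        for (int i = 0; i < cycles; ++i) {
            if (useWorkaround) client = MockClient();  // the workaround line
            client.connect();
            client.stop();
        }
    }  // client destroyed: frees only the context it still points to
    return MockContext::live;
}
```

In this model, five rounds without the workaround leave four contexts behind (one per reconnect), while five rounds with it leave none.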

@igrr
Member

igrr commented Mar 14, 2018

Milestoned as 2.5.0, but this might justify releasing 2.4.2 along with some other fixes.

@reaper7
Contributor

reaper7 commented Mar 14, 2018

@igrr, I can add this:

client = WiFiClient();    // add this line for 2.4.1
client.connect(host, port);

to my own code

But how do I do this when an external library like pubsubclient is used,
where the client is passed to the library at the top of the sketch:

#include <ESP8266WiFi.h>
#include <PubSubClient.h>

WiFiClient espClient;
PubSubClient client(espClient);

@igrr
Member

igrr commented Mar 14, 2018

Do you have a line in your sketch where you call client.connect? Then add espClient = WiFiClient(); before that line, I think it should work. The only case when it doesn't work is when the library reconnects automatically. Yes, this is not a perfect workaround.

@reaper7
Contributor

reaper7 commented Mar 14, 2018

So, at the moment, completely correct "client" operation is not possible without editing external libs?

@igrr
Member

igrr commented Mar 14, 2018

In the general case, no, it is not possible.

However if you need to use 2.4.1 with PubSubClient, the workaround is applicable.
E.g. in mqtt_esp8266 example:

void loop() {

  if (!client.connected()) {
    espClient = WiFiClient();  // workaround
    reconnect();
  }
  client.loop();

If you are using git version, you can also do git fetch origin && git checkout pull/4516/head to get the PR branch.

@reaper7
Contributor

reaper7 commented Mar 14, 2018

My last question :)
As I understand it, this is a temporary solution for 2.4.1 and current git, and will not be needed in the future?
Or is it the general rule to follow from now on?

@igrr
Member

igrr commented Mar 14, 2018

Yes, this is just a workaround if you are having this issue and need to use 2.4.1.

@debsahu

debsahu commented Mar 14, 2018

@igrr Does this also fix WiFiClientSecure? It looks like the edits are only to WiFiClient.

Edit: WiFiClientSecure does call WiFiClient. Just making sure.

@earlephilhower
Collaborator

WiFiClientSecure should have the same issue (and same fix) because it uses the WiFiClient::connect() to initiate the actual TCP-level connection. It's worth trying his fix, if you can do so...

@s0170071

s0170071 commented Mar 15, 2018

How would I apply this fix for a re-entrant function, e.g.


rxtx{
 switch case (state){
 case tx: 
  wificlient c; 
  c.connect();
  c.write(...);
  state= rx; 
 break;
 case rx:
   c ??? <- gone !
  break;
 }
}

edit: maybe do it with pointers ? Create a client with new() before TX and delete it after RX ?
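
One possible shape for the pointer idea is to let the state machine own a heap-allocated client, created before TX and destroyed after RX. This is only a sketch under assumptions: `FakeClient` is a hypothetical stand-in for WiFiClient so the code runs off-device, and `RxTx`, `step()` and `hasClient()` are made-up names. It uses `reset(new ...)` rather than `std::make_unique` so that it also compiles as C++11.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for WiFiClient; counts live instances.
struct FakeClient {
    static int live;
    bool connected = false;
    FakeClient()  { ++live; }
    ~FakeClient() { --live; }
    void connect() { connected = true; }
    void write(const char*) {}
    void stop() { connected = false; }
};
int FakeClient::live = 0;

enum class State { Tx, Rx };

class RxTx {
public:
    // Call repeatedly. The client survives between calls because it is
    // owned by this object, not by a local variable inside a case block.
    void step() {
        switch (state_) {
        case State::Tx:
            client_.reset(new FakeClient());  // fresh client per exchange
            client_->connect();
            client_->write("request");
            state_ = State::Rx;
            break;
        case State::Rx:
            assert(client_ && client_->connected);  // still available here
            client_->stop();
            client_.reset();                        // destroy it after RX
            state_ = State::Tx;
            break;
        }
    }

    bool hasClient() const { return client_ != nullptr; }

private:
    State state_ = State::Tx;
    std::unique_ptr<FakeClient> client_;
};
```

Because each exchange gets a new client object that is destroyed afterwards, this behaves like the local-client case earlier in the thread rather than the global-client one.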

@rexlee666

rexlee666 commented May 18, 2018

FWIW, after getting rid of all Strings in my code and only using char arrays, I got down to having the 56-byte leak per call, which got me to this thread.
I then added the suggested workaround, which got around the problem. No leaks for hours!
Thanks for the tip!

client = WiFiClient(); // add this line for 2.4.1
if (client.connect("api.thingspeak.com" , 80 ))
{
...
}

@d-a-v d-a-v modified the milestones: 2.5.0, 2.4.2 May 31, 2018
@d-a-v d-a-v reopened this May 31, 2018
@paai

paai commented Jun 7, 2018

I had the same problem. My program connects every minute to a server and writes some data (temperature and time) to it. Kept running out of memory and crashing.

After declaring the client locally, a test run of 1500 iterations shows me that free RAM keeps varying between 43K and 32K when the program does nothing else. I can live with that. Still wondering what happens though.

Paai

@d-a-v
Collaborator

d-a-v commented Jun 7, 2018

@paai This is a bug in 2.4.1, fixed in the latest git version.

@paai

paai commented Jun 8, 2018

Now I am confused. In the library manager of the Arduino IDE I see the esp8266wifi library by Ivan Grokhotkov, version 1.0.0, and here you are talking about 2.4.1. Is this the same library, and is the Arduino IDE just not up to date? Or is 2.4.1 a different animal? I'd rather not mess up my system, now that things are working nicely.

Paai

@ryanfort

ryanfort commented Jun 12, 2018

I must be confused, but before using ST_Anything, I didn't seem to have this leak, I didn't see it in the console of any of my units, and they've been running for months without issue. I just got into ST_Anything yesterday, and noticed the memory leak happening, and just cloned the repository yesterday, so I know I have all the latest libraries and sketches. So I am not sure what I need to downgrade in order to fix this memory issue, specifically? I checked what libraries I have installed, and I saw nothing pertaining to "2.4" and WiFi... (basically, same thing @paai commented on).

Otherwise, awesome product you have here, really enjoying it!

@ryanfort

Okay, figured it out...it's the board manager version, not a library version.

@paai - in Arduino, Tools -> Board -> Boards Manager, "esp8266 by ESP8266 Community". It'll show you have 2.4.1 installed, select 2.3.0 and install it, then recompile and upload your sketch.

@cyberlink1

Ryanfort, thanks for posting that. I have been tracking a memory leak in my program for a week. It was losing 56 bytes every 30 seconds or so and crashing in 6 hours. I downgraded my "esp8266 by ESP8266 Community" board from 2.4.1 to 2.3.0 and it all went away! It is all stable now!
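
As a rough sanity check on those figures, the total lost over 6 hours can be worked out directly. The function below is a hypothetical helper in plain C++, assuming a constant leak of 56 bytes every 30 seconds.

```cpp
#include <cassert>
#include <cstdint>

// Total bytes leaked after `hours`, assuming `bytesPerLeak` bytes are lost
// once every `secondsPerLeak` seconds at a constant rate.
uint32_t totalLeaked(uint32_t bytesPerLeak, uint32_t secondsPerLeak,
                     uint32_t hours) {
    assert(secondsPerLeak > 0);
    uint32_t leaks = hours * 3600u / secondsPerLeak;  // number of leak events
    return leaks * bytesPerLeak;
}
```

For 56 bytes every 30 seconds over 6 hours this gives 720 × 56 = 40320 bytes, which is in the same range as the 32 to 46 kB of starting free heap reported elsewhere in this thread, so a crash after about 6 hours is plausible.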

@dccharacter

Can somebody please explain what is happening here and why a permanent fix is not possible? If downgrading to 2.3.0 solves the problem, why can't 2.4.1 be updated to incorporate the fix?

@devyte
Collaborator

devyte commented Jul 9, 2018

@dccharacter I invite you to look at the labels for this issue, where you will notice that this issue is already staged for release, which means it is already fixed. I invite you to then look at the milestones, where you will notice that this issue is targeted for release 2.4.2. If you investigate the milestones for 10 seconds, you can find out that release 2.4.2 is targeted for 1/Aug, which means it hasn't happened yet.

@dccharacter

Thanks for the explanation! I just read through the thread and that wasn't clear.
Gosh, it pains me that I wasted a few months swapping modules and troubleshooting my code, then gave up, and just by accident discovered today that I might have a memory leak. Thanks to Google I found this thread. If not for this luck, I would probably have abandoned the project entirely :-(
