Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move C++ vtables into IRAM, out of HEAP #4179

Merged
merged 1 commit into from
Jan 17, 2018

Conversation

earlephilhower
Copy link
Collaborator

This linker change saved my own app about 600 bytes of heap by pushing class vtables into .text and out of .rodata. Only 1 line in the linker was added, really, with the rest of the diff simply moving .rodata and .bss after .text so that it can catch the specific mangled name.

...snip...
GCC places vtables in .rodata, with a mangled name of "_ZTV*." Because
these are simply address jump tables, there is no need to place them in
RAM. Instead, have the linker place them in the TEXT (aka ROM) section.
This will free up a variable amount of heap space, depending on the number
of classes with virtual functions used in any particular project.

@devyte
Copy link
Collaborator

devyte commented Jan 17, 2018

I'm a bit confused here. Does this mean that virtual class methods can no longer be called from an ISR?

@igrr
Copy link
Member

igrr commented Jan 17, 2018

@devyte no, .text is loaded into IRAM, which is accessible from ISRs.

The MR description says "TEXT (aka ROM)", but actually the section is moved into .text, which is in IRAM. It is .irom0.text that is in Flash.

@igrr
Copy link
Member

igrr commented Jan 17, 2018

Am i not reading the diff correctly? Looks like it is in .text:

   .text : ALIGN(4)
   {
     _stext = .;
     _text_start = ABSOLUTE(.);
      *(.UserEnter.text)
     . = ALIGN(16);
     *(.DebugExceptionVector.text)
     . = ALIGN(16);
     *(.NMIExceptionVector.text)
     . = ALIGN(16);
     *(.KernelExceptionVector.text)
     LONG(0)
     LONG(0)
     LONG(0)
     LONG(0)
     . = ALIGN(16);
     *(.UserExceptionVector.text)
     LONG(0)
     LONG(0)
     LONG(0)
     LONG(0)
     . = ALIGN(16);
     *(.DoubleExceptionVector.text)
     LONG(0)
     LONG(0)
     LONG(0)
     LONG(0)
     . = ALIGN (16);
     *(.entry.text)
     *(.init.literal)
     *(.init)
      *(.literal .text .literal.* .text.* .stub .gnu.warning .gnu.linkonce.literal.* .gnu.linkonce.t.*.literal .gnu.linkonce.t.*)
      *.cpp.o(.iram.text)
      *.c.o(.iram.text)
 +    *(.rodata._ZTV*) /* C++ vtables */
      /* PlatformIO */

@earlephilhower
Copy link
Collaborator Author

Mybad on the .irom0.text, please ignore. I confused myself. :(

@devyte
Copy link
Collaborator

devyte commented Jan 17, 2018

That's why I was confused, thanks!

@earlephilhower
Copy link
Collaborator Author

@devyte is there an example I could try with the behaviour you're worried about? I've done several large and small apps w/this change and seen no trouble, but I don't know of any ISR based examples, or beyond that ones which use C++ class virtual functions...

@igrr
Copy link
Member

igrr commented Jan 17, 2018

E.g. call a virtual function from a GPIO ISR, while doing flash write operation (e.g. SPIFFS.format) in the loop. Click the button a few times and it should trigger.

GCC places vtables in .rodata, with a mangled name of "_ZTV*."  Because
these are simply address jump tables, there is no need to place them in
RAM.  Instead, have the linker place them in the .text (aka IRAM) section.
This will free up a variable amount of heap space, depending on the number
of classes with virtual functions used in any particular project.
@earlephilhower earlephilhower changed the title Move C++ vtables into PROGMEM, out of RAM Move C++ vtables into IRAM, out of HEAP Jan 17, 2018
@earlephilhower
Copy link
Collaborator Author

Just ran the suggested test and it seems to work just fine. The last commit just updated the commit message to be more correct.

#include <FS.h>

volatile bool inFlashOP;

class a1
{
  public:
  a1() { cntr = -1; hits = -1;};
  virtual ~a1() {};
  virtual void incr() {  };
  volatile int cntr;
  volatile int hits;
};

class a2 : public a1
{
  public:
  a2() { cntr=0; hits = 0;};
  virtual ~a2() {};
  virtual void incr() { cntr++; if (inFlashOP) hits++; };
};

a1 *obj;

void toggle() {
  obj->incr();
}

void setup() {
  Serial.begin(115200);
  obj = (a1 *)new(a2);
  pinMode(D5, INPUT_PULLUP);
  attachInterrupt(D5, toggle, CHANGE);
  Serial.println("begin");
}

void loop() {
  static int c=0;
  inFlashOP = true;
  SPIFFS.format();
  inFlashOP = false;
  Serial.printf("%d: cnt=%d, hits=%d\n", c++, obj->cntr, obj->hits);
  delay(1000);
}

output:

0: cnt=35, hits=35
1: cnt=190, hits=153
2: cnt=190, hits=153
...

@devyte devyte merged commit bb794f9 into esp8266:master Jan 17, 2018
@slaff
Copy link

slaff commented Feb 2, 2018

@earlephilhower Thanks a lot for that optimization! It works great especially for heavy C++ projects like Sming

@earlephilhower
Copy link
Collaborator Author

@slaff If you've got full toolchain control, rebuild GCC with the option mentioned https://github.com/esp8266/Arduino/projects/5 (JUMP_TABLES_IN_TEXT) and you'll get a free bump as well, assuming you have anywhere with large switch() tables.

@slaff
Copy link

slaff commented Feb 3, 2018

@earlephilhower As if it's Christmas! From the first improvement with the linker script I got circa 2.5 K more free heap. With the second one with the JUMP_TABLES I gained ... (I thought I can get probably 200 bytes max).. but no - I got circa 2K. In total almost 5K extra free heap, which is great! And up to now I haven't noticed any stability issues .

I would like to thank you again for theses improvements!!! Awesome stuff!

@kylefleming
Copy link
Contributor

@igrr @earlephilhower Should this be moved to flash to save space in ram?

@igrr
Copy link
Member

igrr commented Feb 19, 2018

I think we have discussed this on Gitter, but the catch is that Flash is not accessible at times, for example when an interrupt handler is called while cache is disabled. So if you have a VTable access in an ISR, and cache is disabled, then there will be a problem.

@kylefleming
Copy link
Contributor

@igrr When you say the cache is disabled, you're referring to the instruction cache, correct? Do you have a reproducible way of disabling the cache? I wasn't able to make it crash in my testing.

@igrr
Copy link
Member

igrr commented Feb 19, 2018

Yes, esp8266 only has an instruction cache. It is normally disabled when flash read/write/erase operation is performed. You can repeatedly call ESP.flashRead function in the loop and attach an interrupt handler to a GPIO or timer interrupt. If the interrupt happens when flash operation is in progress, the issue will be reproduced.

@earlephilhower earlephilhower deleted the vtable_to_flash branch March 9, 2018 05:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants