Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic stack location selection (SYS or HEAP), enable per library AR-chive in arduino build system #5018

Merged
merged 20 commits into from
Aug 20, 2018

Conversation

d-a-v
Copy link
Collaborator

@d-a-v d-a-v commented Aug 8, 2018

No description provided.

Copy link
Member

@igrr igrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, mostly cosmetic. Rest looks very good!

#include "cont.h"
#include "coredecls.h"

// calls to this function must *always* be inlined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must never be inlined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was meaning: callers to this function must *always* be inlined

/* Call the entry point of the SDK code. */
call_user_start();
}

static void ICACHE_RAM_ATTR app_entry_custom (void) __attribute__((weakref("app_entry_redefinable")));

extern "C" void ICACHE_RAM_ATTR app_entry (void)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop this extra call level?

In this file, you define:

extern "C" void ICACHE_RAM_ATTR app_entry_sysstack(void)
{
    cont_t s_cont __attribute__((aligned(16)));
    g_pcont = &s_cont;
    call_user_start();
}
extern "C" void app_entry(void) __attribute__((weak, alias("app_entry_sysstack")));

In the other file, you define app_entry as a strong symbol:

extern "C" void ICACHE_RAM_ATTR app_entry(void)
{
    g_pcont = &g_cont;
    call_user_start();
}

I think it should work...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, and just re-tried your way (weak+alias instead of weakref) and it still does not work, the strong one is always used. I don't know why for sure.

I had put that comment just above about that.

(note: setting app_entry() itself as "weak" is not sufficient and always ends up with the other "noextra4k" one linked, maybe because it has a default value in linker scripts).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. That's a side effect of specifying something as ENTRY, it also makes the symbol undefined, and the linker picks it up from core_esp8266_app_entry_noextra4k.cpp before it sees core_esp8266_main.cpp (probably due to alphabetical ordering when creating the archive file)...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried again (after dot_a_linkage, just to be sure), still the same. The indirection is still needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, that only cost us 16 bytes of the stack, which we can later remove by re-writing app_entry as a one-line assembly function which call0-s app_entry_redefinable.

@@ -0,0 +1,84 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGPL license missing


extern bool beginWPSConfig(void);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be a freestanding function?

* so far only WPS_TYPE_PBC is supported (SDK 1.2.0)
* @return ok
*/
bool beginWPSConfig(void) {
Copy link
Member

@igrr igrr Aug 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep this a member function of ESP8266WiFiSTAClass, just move implementation to a separate file, as you did.

return WPS_is_unavailable_in_this_configuration__Please_check_FAQ_or_board_generator_tool();
inline bool beginWPSConfig(void) __attribute__((always_inline))
{
disable_extra4k_at_link_time(); // this call must always be inlined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as i can tell this can never be inlined, unless we enable LTO?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the code/comment (yet to push) as follow:

    protected:
    
        static void WPSStatusCB(wps_cb_status status);
        bool beginRealWPSConfig(void);

    public:

        inline bool beginWPSConfig(void) __attribute__((always_inline))
        {
            /*
             Some notes:

             This call to disable_extra4k_at_link_time() must not be
             "compiled-in" when WPS is not in use.  Calling it from
             beginRealWPSConfig() would always link it.

             'inline ...  attribute(always_inline)' is not required since gcc
             will optimize this function, but it is left as a reminder.

             Moving beginRealWPSConfig from ESP8266WiFiSTA.cpp to
             ESP8266WiFiSTA-WPS.cpp only allows to same some flash space.
            */
            disable_extra4k_at_link_time();

            return beginRealWPSConfig();
        }

When it is not inlined, or not optimized by gcc, or if disable_extra4k_at_link_time() is called from beginRealWPSConfig(), heap stack is always linked in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, i finally understood why you were mentioning inlining in other comments and why you needed to make it inline. This is because Arduino IDE does not combine object files from libraries into static archives, it links them directly. This is in contrast with the core object files, which are first placed into an archive.

However, recent Arduino versions support putting library object files into .a archive before linking, which should remove the need for inlining. Please check https://github.com/arduino/Arduino/wiki/Arduino-IDE-1.5:-Library-specification and add dot_a_linkage=true to ESP8266WiFi library.properties file.

Copy link
Collaborator Author

@d-a-v d-a-v Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works with dot_a_linkage=true and the need for an inline function/method in the class is now useless.

Things are now more clear, thanks for this tip. I was not understanding in the first place why I had to use the inline trick and was wrongly thinking that objects files were stored in a library. I admit I don't often try to read the long-like-hell IDE commands.

But I had to cheat and create an empty archive {tmp}/libraries/ESP8266WiFi/ESP8266WiFi.a at the linker command request, while the build process put everything in {tmp}/arduino.ar.
It seems to be the same bug as this esp32 one. This issue is still present in both latest nightly and latest beta.

edit: now trying to find the arduino environment variable for the AR object
edit2: found it, it's {archive_file_path}. Now arduino.ar no longer exists, digging this out :)
edit3: found this, {...}/arduino.ar needs to be replaced by "{archive_file_path}" "-L{build.path}" (all doc is in your above link)

@@ -1260,13 +1260,13 @@ bool WiFiClientSecure::loadPrivateKey(Stream& stream, size_t size) {

extern "C" {
#include <cont.h>
extern cont_t g_cont;
extern cont_t *g_pcont;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put this into coredecls.h as well

@@ -0,0 +1,42 @@

/*
This is a debugging / validation test only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest placing it into tests/device and not into examples

@d-a-v
Copy link
Collaborator Author

d-a-v commented Aug 9, 2018

Cosmetics turned out to changes in build system !
Ready for a second review :)

@d-a-v d-a-v changed the title Automatic stack location selection (SYS or HEAP), remove unnecessary board generator option Automatic stack location selection (SYS or HEAP), enable per library AR-chive in arduino build system Aug 9, 2018

extern "C" void ICACHE_RAM_ATTR app_entry (void)
{
return app_entry_custom();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we call app_entry_redefinable() directly here? or is that what is meant about the extra level of indirection in the comments above?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above: tried that 3 times with different context with no luck. It is because (maybe) app_entry is defined as an ENTRY in ldscripts.

tools/sdk/ld/eagle.app.v6.common.ld.h:ENTRY(app_entry)

@@ -1478,7 +1472,7 @@ def usage (name,ret):
nofloat=True

elif o in ("--noextra4kheap", "--allowWPS"):
noextra4kheap=True
print('option ' + o + ' is now useless')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> ... is now deprecated and has no effect

* WPS callback
* @param status wps_cb_status
*/
void ESP8266WiFiSTAClass::WPSStatusCB(wps_cb_status status) {
Copy link
Collaborator

@devyte devyte Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't use any class members or methods, and there is no user-facing usage (i.e.: is private to this file). It should be a static non-member function, or a non-member function defined in an anonymous namespace.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was indeed an unfinished static non-member function, I can revert that back.

About non user-facing usage, there is this comment
// TODO user function to get status.

Moving this into the class has really no other goal than to keep things tight.


// ----------------------------------------------------------------------------------------------
// ------------------------------------ STA remote configure -----------------------------------
// ----------------------------------------------------------------------------------------------

protected:

static void WPSStatusCB(wps_cb_status status);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method shouldn't be part of the class, because it is private to the internal code (see comment above).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses the same model as what we have in for example WiFiClient about the lwIP's C-style callbacks (there's no context pointer with WPS though).

Copy link
Collaborator

@devyte devyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I lost a bit of sanity with this link-time trick, but if it works, awesome.
One requested change about the WPS status CB being private. The rest looks good.

Copy link
Member

@igrr igrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some very minor comments, but looks very good overall!

* Placed into noinit section because we assign value to this variable
* before .bss is zero-filled, and need to preserve the value.
*/
extern cont_t* g_pcont __attribute__((section(".noinit")));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__attribute__ seems to be needed only on definition, not on declaration. Details about placement into noinit section can live there as well.


`Read more <a05-board-generator.rst>`__.

From release 2.4.2 and ahead, using WPS will reduce available heap space to
user by around 4.5KB. In other words, from release 2.4.2 without using WPS,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest rewording this to make it clear that this is not yet available in 2.4.2. E.g. "starting from 2.5.0, ...".

Copy link
Collaborator

@earlephilhower earlephilhower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like magic. I'm sure it took some serious hair pulling to get right.

My only possible concern is, what happens to 3rd party libs which don't use dot_a_linkage with this? I will see if I can pull and test locally w/my own ESP8266Audio libs, but even if it works I'm not clear on why the internal ones were all changed.

@@ -55,16 +55,19 @@ How can I get some extra KBs in flash ?
About WPS
~~~~~~~~~

In release 2.4.2 only, WPS is disabled by default. To enable WPS, use this
boards generator option:
From release 2.4.2 and ahead, not using WPS will give an exra ~4.5KB in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/exra/extra/

@@ -7,3 +7,4 @@ paragraph=With this library you can instantiate Servers, Clients and send/receiv
category=Communication
url=
architectures=esp8266
dot_a_linkage=true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why these are all being converted to .a. Does this mean anyone providing a library for the 8266 now needs to modify their lib.properties?

Copy link
Collaborator Author

@d-a-v d-a-v Aug 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not mandatory (except for the ESP8266WiFi library). It is not needed by anyone. Such library's object files are stored in an intermediate libraryname.a (the regular way after all) and this one is used instead of all library's object files in the linker command line.
Other libraries not using "dot_a_linkage" are still usable.
I checked with ESP8266Audio (local branch with this PR merged with latest master) and it's OK. I also tried with "dot_a_linkage" in your lib and the final sketch was reduced by 25KB (!).
Not to mention the linker command looking like a real one (with some .a instead of a huge bunch of .o - well, this is minor cosmetic effect, but still :) ).
What I do not understand is why this was not done by default from the beginning in Arduino.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's awesome news! Glad to hear about it, I'll throw it in my own lib as well now...

extern size_t br_esp8266_stack_proxy_usage();

void _BearSSLCheckStack(const char *fcn, const char *file, int line) {
static int cnt = 0;
register uint32_t *sp asm("a1");
int freestack = 4 * (sp - g_cont.stack);
int freestack = 4 * (sp - g_pcont->stack);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for keeping this updated.

@d-a-v
Copy link
Collaborator Author

d-a-v commented Aug 16, 2018

not clear on why the internal ones were all changed.

This was not mandatory for all of them but doing this with ESP8266WiFi much simplified this PR, the core being already compiled into static libcore.a (thanks to @igrr for his sixth sense).

What I still do not understand is that

  • creating a separate .cpp for symbols we don't want to be included works without using static libs
  • using static libs seems to have a larger effect (25KB spared with ESP8266Audio/dot_a_linkage=true, same much lower effect observed with internal libs) - This is a little LTO, further investigation with map file is needed to understand this effect.

@devyte
Copy link
Collaborator

devyte commented Aug 19, 2018

Is there anything pending here?

@d-a-v
Copy link
Collaborator Author

d-a-v commented Aug 19, 2018

I don't think so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants