Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]WifiTask running on core 0 may result in halts during prints #10

Closed
vivian-ng opened this issue Feb 14, 2020 · 14 comments
Closed

[BUG]WifiTask running on core 0 may result in halts during prints #10

vivian-ng opened this issue Feb 14, 2020 · 14 comments
Labels
bug Something isn't working

Comments

@vivian-ng
Copy link

This was an issue raised when this was part of a custom Marlin fork.

The wifi task running on the same core as the rest of Marlin may cause random halts during prints when wifi connection is poor. The solution back then was to run wifi task on another core.

I believe esp3dlib.cpp can be amended to run wifi task on core 1 instead of 0.

void Esp3DLib::init()
{
    xTaskCreatePinnedToCore(
        WiFiTaskfn, /* Task function. */
        "WiFi Task", /* name of task. */
        10000, /* Stack size of task */
        NULL, /* parameter of the task */
        1, /* priority of the task */
        NULL, /* Task handle to keep track of created task */
        0 	 /* Core to run the task */
    );
}
@vivian-ng vivian-ng added the bug Something isn't working label Feb 14, 2020
@luc-github
Copy link
Owner

Marlin run on Core 1 like all arduino
Wifi run on on core 0 that is why ESP3DLib run on core 0 to not bother with Marlin

@vivian-ng
Copy link
Author

@luc-github Thanks for clarifying. I guess the random halts are caused by something else then. Will close this.

@luc-github
Copy link
Owner

luc-github commented Feb 14, 2020

@vivian-ng they are not same random pause that @felixstorm fixed ? they happened wifi off

these new ones happen only when wifi on ?

@vivian-ng
Copy link
Author

@luc-github Could this be the problem?

  xTaskCreate(stepperTask, "StepperTask", 10000, nullptr, 1, nullptr);

in i2s_init() of i2s.cpp.

Since the I2S stepper task is not pinned to any core, will it end up running on core 0 and become interfered by wifi?

@luc-github
Copy link
Owner

luc-github commented Feb 14, 2020

you may add xPortGetCoreID(); to know the core used by https://github.com/MarlinFirmware/Marlin/blob/bugfix-2.0.x/Marlin/src/HAL/HAL_ESP32/i2s.cpp#L142
so you can see when you have random pause which core is used

static inline uint32_t IRAM_ATTR xPortGetCoreID()

but may need to store in variable as I am not sure the output is already available when init task

@felixstorm
Copy link

Generally I would also run all non-Marlin tasks on core 0 (WiFi, ESP3DLib etc.) and run everything Marlin undisturbed on core 1, so also pinning the stepper task to CONFIG_ARDUINO_RUNNING_CORE sounds like a good idea.

@vivian-ng Do you still have any random pauses now with the latest Marlin versions?

@vivian-ng
Copy link
Author

@felixstorm I am using quite a new version of Marlin bugfix-2.0.x (fetched about two days back). Usually, no issues with printing, but yesterday, I was having a lot of connections problems on 2.4G wifi (either my router is giving me problems on 2.4G, or my area has a lot of 2.4G interference) and my ESP32 devices (printer and even ESP32-CAMs). I finally managed to get a connection on my Ender-3 running on my MRR ESPE board (with I2S), but it was quite a bad connection (loading webUI took longer time than usual, etc.) When printing, I noticed a lot of pauses. I ended up halting the prints.

@luc-github I think the I2S stepper stream is probably on core 0, since Marlin is already running on core 1, and the i2s_init() is called in HAL_init() which is before HAL_init_board() (where ESP3DLib is called).
Maybe can change the line in i2s.cpp to:

xTaskCreatePinnedToCore(stepperTask, "StepperTask", 10000, nullptr, 1, nullptr, CONFIG_ARDUINO_RUNNING_CORE);

The thing is, it is difficult for me to recreate the conditions to test if any solution actually solves this. The issue only occurs when I have problems on my 2.4G wifi, and fortunately, that is not often. Whatever the case, I think I will just push this fix as a PR to Marlin next week (after I test it out a bit to make sure it does not create more problems than solve them).

Maybe this fix will even help resolve the issue with babystepping on I2S. (wishful thinking on my part)

@vivian-ng
Copy link
Author

@luc-github Should init() be amended to purposely run WiFiTaskfn on the non-Marlin core?

Something like:

void Esp3DLib::init()
{
    xTaskCreatePinnedToCore(
        WiFiTaskfn, /* Task function. */
        "WiFi Task", /* name of task. */
        10000, /* Stack size of task */
        NULL, /* parameter of the task */
        1, /* priority of the task */
        NULL, /* Task handle to keep track of created task */
        1 - CONFIG_ARDUINO_RUNNING_CORE 	 /* Core to run the task */
    );
}

which will run on core 0 if CONFIG_ARDUINO_RUNNING_CORE is 1 and core 1 if CONFIG_ARDUINO_RUNNING_CORE is 0.

Anyway, I have submitted PR#16874 to Marlin to explicitly run I2S stepper task on CONFIG_ARDUINO_RUNNING_CORE.

@luc-github
Copy link
Owner

luc-github commented Feb 16, 2020

arduino core is build like this, changing core in sdkconfig means new arduino core and this may break backward compatibility so I really doubt it will happen ever on arduino core, it only may be done using IDF and arduino as component, core for ESP3DLib could be an overidable define yes, but I really doubt of need, anyway it is just 3 lines of code to let full flexibility same as core or not

I will add this possibility later

@luc-github
Copy link
Owner

@vivian-ng I have added the defines for ESP3DLIb task priority and targeted core
80609a7

@felixstorm
Copy link

@luc-github I have compiled the latest version with your PR already included against a slightly modified version of the Arduino-ESP32 libraries that lets me dump FreeRTOS task statistics.
This is the result after some idle time:

*** FreeRTOS Task Statistics ***
Name             State  Prio  HighWaM  TaskNum  Core    RunT Abs  RunT Perc
esp_timer            B    22     1360        1     0    11549691       <1 %
ipc0                 B    24      556        2     0       54313       <1 %
ipc1                 B    24      492        3     1      201868       <1 %
IDLE0                R     0      384        6     0  2679724309       39 %
IDLE1                R     0      592        7     1     5437467       <1 %
Tmr Svc              B     1     1624        8     0          34       <1 %
loopTask             R     1     6328       12     1  3244688930       47 %
StepperTask          B     1     9468       14     1   136967380        2 %
WiFi Task            B     1     7852       15     0   615283464        9 %
network_event        B    19     2368       16     1        4973       <1 %
eventTask            B    20     1644       17     0        2647       <1 %
tiT                  B    18     1676       18     0    16269359       <1 %
wifi                 B    23     1648       19     0    63836854       <1 %
mdns                 B     1     2052       20     0      544793       <1 %

To me, this looks ok now. ipc1 and IDLE1 should be fine on core 1, then there's loopTask and StepperTask which are both Marlin and then network_event is also on core 1, but that should hopefully not do too much harm.

Should we still encounter pauses due to WiFi issues in the future, I see 2 more viable options:

  • try to force network_event to core 0 also, either by
  • increase the priority of StepperTask to be above the one from network_event

All options might have side effects, so before implementing anything "just in case" we should IMHO wait if there are really still issues and if so, setup a test environment to duplicate the issues and then try out fixes...

@vivian-ng
Copy link
Author

@vivian-ng I have added the defines for ESP3DLIb task priority and targeted core

Thank you! It was just a suggestion because I am not a fan of magic numbers in code.

As for network_event, it seems to be very low load now, and I think it is really only called when something happens (like wifi connection is dropped, etc.). Anyway, I will wait and see if I get random pauses again (need to wait for a bad wifi day).

@luc-github
Copy link
Owner

luc-github commented Feb 17, 2020

the event task will call this function when event is catched :

ESP3DLib/src/wificonfig.cpp

Lines 178 to 191 in 3cf617b

void WiFiConfig::WiFiEvent(WiFiEvent_t event)
{
switch (event) {
case SYSTEM_EVENT_STA_GOT_IP:
Esp3DCom::echo ("Connected");
Esp3DCom::echo(WiFi.localIP().toString().c_str());
break;
case SYSTEM_EVENT_STA_DISCONNECTED:
Esp3DCom::echo("WiFi not connected");
break;
default:
break;
}
}

So probability it generate pause is very very low I think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants