RMT Buffer Allocation Fix for Issue #375 #392

Open
wants to merge 1 commit into
base: master

@teknynja teknynja commented Jun 9, 2024

NOTE: This pull request precedes pull request #394. I've since added a mutex for thread safety, and I'm leaving this one in place so both versions are available for evaluation.

This pull request addresses issue #375 [pixel.show() crash with more than 73 pixel on ESP32s3]

After reviewing the code and getting some helpful feedback from the Espressif Forums (https://www.esp32.com/viewtopic.php?f=13&t=40270) and @robertlipe, it was determined that the code in esp.c that handles the RMT item buffers under the IDF v5 framework was allocating too much space on the stack when driving more than about 70 pixels.

This pull request addresses that issue by allocating the RMT buffers from the heap instead. It attempts to allocate a single block of memory large enough for the largest configured instance. Sharing the buffer between instances is fine, because the buffer is completely repopulated each time the espShow() method is called.
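The shared-buffer idea can be sketched roughly as follows. This is a desktop illustration, not the code in this pull request: `RmtSymbol`, `ensureSymbolBuffer()` and `symbolCapacity()` are hypothetical names, and the sizes in the usage note assume one RMT symbol per data bit (24 per RGB pixel).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for the 32-bit RMT symbol type used by the driver.
using RmtSymbol = std::uint32_t;

// One buffer shared by every strip instance (file-scope state, as in a C module).
static RmtSymbol *g_symbols = nullptr;
static std::size_t g_capacity = 0;  // capacity in symbols, 0 when unallocated

// Make sure the shared buffer can hold `needed` symbols.
// The buffer only grows; a smaller request reuses the existing block.
// Returns false if the allocation fails (the buffer is then empty).
bool ensureSymbolBuffer(std::size_t needed) {
  if (needed <= g_capacity) {
    return true;  // already big enough for this instance
  }
  // The old contents are never needed because the buffer is refilled on
  // every show, so this sketch uses free + malloc rather than realloc.
  std::free(g_symbols);
  g_symbols = static_cast<RmtSymbol *>(std::malloc(needed * sizeof(RmtSymbol)));
  g_capacity = (g_symbols != nullptr) ? needed : 0;
  return g_symbols != nullptr;
}

// Current capacity of the shared buffer, in symbols.
std::size_t symbolCapacity() { return g_capacity; }
```

With this shape, a 150-pixel strip followed by a 50-pixel strip would leave the buffer sized for 150 pixels, and only a later, larger strip would trigger another allocation.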

I also took the time to improve the channel allocation management. Previously the RMT channels were initialized on each call to espShow(); now they are only de-initialized and re-initialized when the output pin changes.

Finally, I was concerned about problems that could be caused by allocating large buffers on the heap without giving the user any way to free that memory. The code therefore lets a user free that memory (and release the RMT channels) by setting the number of pixels to zero with .updateLength(0) and then calling .show(). This de-allocates the heap memory used for the RMT buffers and releases the RMT channels held by the driver. Both are automatically re-allocated when the number of pixels is set back to a non-zero value and .show() is called again.
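The channel handling and the release-on-zero behavior described above amount to a small state machine. Here is a minimal sketch of that control flow for a single output, with counters standing in for the real driver calls; `ChannelState`, `releaseChannel()` and `prepareShow()` are hypothetical names and do not appear in esp.c.

```cpp
#include <cassert>

// Hypothetical bookkeeping for one output. The counters stand in for the real
// driver init/deinit calls so the control flow can be run on a desktop.
struct ChannelState {
  int activePin = -1;   // -1 means no channel is currently held
  int initCalls = 0;    // number of times a channel was claimed
  int deinitCalls = 0;  // number of times a channel was released
};

// Release the channel if one is held; otherwise do nothing.
void releaseChannel(ChannelState &s) {
  if (s.activePin >= 0) {
    ++s.deinitCalls;
    s.activePin = -1;
  }
}

// Called at the start of every show. Returns true if there is data to send.
bool prepareShow(ChannelState &s, int pin, unsigned pixelCount) {
  if (pixelCount == 0) {
    releaseChannel(s);  // zero length: give the channel back
    return false;
  }
  if (pin != s.activePin) {
    releaseChannel(s);  // pin changed: drop the old channel first
    ++s.initCalls;      // then claim a channel for the new pin
    s.activePin = pin;
  }
  return true;  // same pin as last time: reuse the channel untouched
}
```

Repeated shows on the same pin cost no extra initialization, while a pin change or a zero-length show is the only point at which the channel is released.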

It seems one of the primary goals of this project is to minimize the spread of changes across files. To that end, all the changes here are confined to a single method in the esp.c file. A different approach would be required to address thread-safety and instance-sharing issues, but existing code appears to be problematic in those areas as well, so those goals were not prioritized. For the majority of this library's users, I don't think those will be issues anyway.

@robertlipe

robertlipe commented Jun 9, 2024 via email

@teknynja
Author

teknynja commented Jun 9, 2024

Cool, thanks!

As for the free/malloc vs realloc, I just copied how the pixels buffer is allocated (when in Rome... 😏).

@robertlipe

This may or may not be the root of #380 and #332, too.

Looking more at the overall context, there's hopefully some kind of lock held way above this to make it thread safe anyway. This code is about to honk on the RMT registers, which are clearly globals. If you have two show()s running on two different cores and there's nothing above this to stop each instance from reprogramming the RMT device, this code is already going to have a bad day.

Hopefully some code that's calling this is already responsible for grabbing an mtx_lock()/pthread_mutex_lock() before multiple show()s can be called. My concerns about multithreading are either unfounded (it's handled elsewhere) or a canary in a coal mine (it's not handled and it's just a latent ticking bomb). Perhaps there is locking at the RMT driver level and we really DO need to implement something similar up here.

Arduino doesn't work like other systems I'm familiar with, so I don't know the expected locking hierarchy or the appropriate tools here (C++ threading? FreeRTOS? C11?), or the synchronization model for rmtWrite() (is it guaranteed to be done with the buffers when it returns, or might it still be DMA'ing out of that block?). I'll defer to an actual Arduino dev on that.

So this definitely leaves it better than we found it, but this code still looks 'sus.

@teknynja
Author

I completely agree with you on these safety issues. There are a few different areas of consideration that come to mind regarding this:

  • RMT Channel Management – The Espressif documentation on the new RMT API is currently MIA, and as a placeholder they are using a bit of example code (In fact it is for driving NeoPixels, and looks suspiciously like the code I found in the Adafruit_NeoPixel espShow method 😏). I believe the intent of this API is to hide all the messy channel allocation/management code and to deal with all the buffering of data through those channels. In fact, I’m currently using it with two different “clients” in the same sketch, one being the Adafruit_NeoPixel library, and the other an IR Driver I wrote (because I couldn’t find a library that supports Arduino Core v3 and used the RMT hardware). Both libraries are working great together on an S3 chip (which has 8 read/write RMT channels) and a C3 chip (which has 2 read and 2 write channels with smaller buffers). The RMT API seems to be doing its job and keeping the channels separate and full.

  • Instance Management – Because the ESP32 code in the Adafruit_NeoPixel library is not a “first-class citizen” of the library, it doesn’t have access to instance-level data; it can only access ephemeral data on the stack or global data shared across all instances of the library. The only fix for this is to make the code in the esp.c module part of the Adafruit_NeoPixel class, which seems like something a project lead would need to be involved with. This could also introduce some negatives: for example, each instance might allocate heap for its own RMT symbol buffer, which could eat up a large portion of memory. Another consideration is the functions that should be in IRAM, which makes pulling them into “class” code more challenging.

  • Thread Safety – This is arguably the harder problem to deal with. To date it looks like (other than enabling/disabling interrupts) only the NRF52 code is thinking about thread safety (and mostly just the use of rtos_malloc, which might be a benefit everywhere?). I’m pretty sure we can’t disable interrupts around the RMT code (it’s probably using them to pump data through the channels), but a mutex might help protect some of the critical code here. A quick search seems to point to several different ways (pthreads, multiple-cores, FreeRTOS xTaskX, etc) to implement multi-threading on an ESP32 Arduino project, all of which need to be considered. Again, this feels like project-lead level stuff here, and should probably be looked at by someone with more Arduino multi-threading experience than I have.

All this being said, I think the currently proposed code does the best it can given the current environment, and at least lets users drive larger numbers of pixels when building against Core v3 (with caveats, of course). Properly addressing the above issues will likely require a fair bit of restructuring of the library code, and feels out of scope for fixing the actual problem users are currently experiencing. When the above issues are addressed, all of the ESP32 code (including the old Core v2 code) will need to be reworked to bring it up to the level of safety we are considering here.

@teknynja
Author

teknynja commented Jun 10, 2024

Just for giggles, I created two instances of the Adafruit_NeoPixel class, each with 500 pixels, on the ESP32-C3 (along with my custom IR driver) and am calling .show() on every pass through loop() (with no delay(n) calls). Everything seems to be working: no weird timing or dropped pixels on the strips, and the IR timing seems to be fine as well, so it looks like the RMT functions are doing their job. Calling the Adafruit driver across instances doesn't seem to cause any problems either. I'm sure it would all blow up if I tried calling them from different threads though...

(BTW, I do not have 1000 pixels lying around to play with. I have 150 on one pin and 50 on the other, but neither of them is showing artifacts or glitches.)

@robertlipe

robertlipe commented Jun 10, 2024 via email

@teknynja
Author

teknynja commented Jun 11, 2024

I created a torture-test program that drives 8 instances of the Adafruit_NeoPixel driver with 150 pixels each, every instance running on its own thread:

#include <Adafruit_NeoPixel.h>

#define PIXELS_PER_PIN 150

#if defined(ARDUINO_XIAO_ESP32C3)
  const uint PIXEL_PINS[] { D0, D1, D2, D3, D4, D5, D6, D7 };
#elif defined(ARDUINO_ESP32_DEV)
  const uint PIXEL_PINS[] { 22, 21, 19, 18, 5, 17, 16, 4 };
#else
  #error Please define pins for your board here!
#endif

const uint PIN_COUNT = sizeof(PIXEL_PINS) / sizeof(PIXEL_PINS[0]); // element count, independent of element type

uint task_ids[PIN_COUNT];

void PixelTask(void *pvParameters);

void setup() {
  Serial.begin(115200);
  delay(2000);
  Serial.printf("NeoPixel RMT Threaded Torture Test (%u Pins/Threads)\r\n", PIN_COUNT);

  for (uint i = 0; i < PIN_COUNT; i++) {
    task_ids[i] = i;
    xTaskCreate(PixelTask, "PixelTask", 2048, (void *)&task_ids[i], 1, NULL);
    Serial.printf("Created PixelTask %u for pin %u\r\n", i, PIXEL_PINS[i]);
  }
}

void loop() {
  Serial.println("Hello from the main thread...");
  delay(10000);
}

void PixelTask(void *pvParameters) {
  uint task_id = *((uint *)pvParameters);
  unsigned long period = 500 + random(1000);
  Serial.printf("Started PixelTask %u with period %lums\r\n", task_id, period); // %lu matches unsigned long

  static const uint MAX_COLORS = 3;
  uint color = 0;

  Adafruit_NeoPixel *pixel_driver = new Adafruit_NeoPixel(PIXELS_PER_PIN, PIXEL_PINS[task_id], NEO_GRB + NEO_KHZ800);
  pixel_driver->begin();

  unsigned long update_timer = 0;

  while (true) {
    unsigned long now = millis();

    if (now - update_timer > period) {
      update_timer = now;

      Serial.printf("PixelTask %u updating pixels for pin %u\r\n", task_id, PIXEL_PINS[task_id]);
      for (uint i = 0; i < PIXELS_PER_PIN; i++) {
        switch (color) {
          case 0:
            pixel_driver->setPixelColor(i, 0x40, 0x00, 0x00);
            break;
          case 1:
            pixel_driver->setPixelColor(i, 0x00, 0x40, 0x00);
            break;
          case 2:
            pixel_driver->setPixelColor(i, 0x00, 0x00, 0x40);
            break;
          default:
            pixel_driver->setPixelColor(i, 0x00, 0x00, 0x00);
            break;
        }
      }
      color = (color + 1) % MAX_COLORS;
    }

    pixel_driver->show(); // Note we are calling .show() every scan for extra stress!!!
  }

  Serial.printf("Ending PixelTask %u\r\n", task_id);
  vTaskDelete(NULL);
}

As expected, this caused the devices to crash almost immediately. I then added mutex protection inside the espShow() method to protect the led_data buffer and the rmtXxxx() calls. Running the above code now works reliably with no glitches/artifacts on both my 'C3 and 'S3 chips. I think this solves the remaining safety questions as discussed. (It's not pretty, but it seems to work!)
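The locking idea can be illustrated on a desktop with a minimal sketch. It is not the mutex code from this pull request: it uses `std::mutex` and `std::thread` as stand-ins for whatever primitive the ESP32 code uses, and `guardedShow()`, `runTortureTest()` and `g_sharedBuffer` are hypothetical names. The point is only that the lock is held across both filling the shared buffer and consuming it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Shared state standing in for the single led_data buffer in esp.c.
static std::vector<std::uint8_t> g_sharedBuffer;
static std::mutex g_showMutex;  // stand-in for the lock used on the device

// Fill the shared buffer with this caller's pattern, then "transmit" it.
// Returns true if the buffer held only this caller's pattern when read back.
bool guardedShow(std::uint8_t pattern, std::size_t length) {
  std::lock_guard<std::mutex> lock(g_showMutex);  // held for fill + transmit
  g_sharedBuffer.assign(length, pattern);
  for (std::uint8_t value : g_sharedBuffer) {
    if (value != pattern) return false;  // another caller overwrote the data
  }
  return true;
}

// Run `threadCount` threads that each call guardedShow `iterations` times.
// Returns the number of corrupted frames observed.
int runTortureTest(unsigned threadCount, unsigned iterations) {
  std::vector<int> failures(threadCount, 0);  // one slot per thread, no sharing
  std::vector<std::thread> threads;
  for (unsigned t = 0; t < threadCount; ++t) {
    threads.emplace_back([t, iterations, &failures] {
      for (unsigned i = 0; i < iterations; ++i) {
        if (!guardedShow(static_cast<std::uint8_t>(t + 1), 150 * 3)) {
          ++failures[t];
        }
      }
    });
  }
  for (std::thread &th : threads) th.join();
  int total = 0;
  for (int f : failures) total += f;
  return total;
}
```

Calling `runTortureTest(8, 200)` mirrors the eight-task setup above in miniature. Removing the `lock_guard` line turns the buffer access into a data race, which is the kind of failure the unprotected device code ran into.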

Let me know if I should push the mutex changes here for review...

@robertlipe

robertlipe commented Jun 11, 2024 via email

@teknynja
Author

OIC - I wasn't sure about your relationship with this project, but I guess if you hadn't come along to encourage me (and the maintainers stayed quiet) then I probably wouldn't have tried to tackle this.

This is essentially my first pull request for an open-source project, so I know nothing about the protocol. I kind of thought that simply pushing additional changes to my fork would end up in this pull request. Would it be confusing for the maintainer to have two separate forks/pull requests from the same person for the same issue? TBH the changes to add the mutex weren't all that invasive (but it did change the code flow a bit).

I see your point about the array size. I don't do a lot of C/C++ coding (Internet to the rescue!), but I will keep that in mind for the future!

@robertlipe

robertlipe commented Jun 12, 2024 via email
