Skip to content

Commit 127199a

Browse files
mhightower83devyte
authored andcommitted
proposed umm_malloc improvements (#6274)
* Correct critical section with interrupt level preserving and nest support alternative. Replace ets_intr_lock()/ets_intr_unlock() with uint32_t oldValue=xt_rsil(3)/xt_wrs(oldValue). Added UMM_CRITICAL_DECL macro to define storage for current state. Expanded UMM_CRITICAL_... to use unique identifiers. This helpt facilitate gather function specific timing information. Replace printf with something that is ROM or IRAM based so that a printf that occurs during an ISR malloc/new does not cause a crash. To avoid any reentry issue it should also avoid doing malloc lib calls. Refactor realloc to avoid memcpy/memmove while in critical section. This is only effective when realloc is called with interrupts enabled. The copy process alone can take over 10us (when copying more than ~498 bytes with a 80MHz CPU clock). It would be good practice for an ISR to avoid realloc. Note, while doing this might initially sound scary, this appears to be very stable. It ran on my troublesome sketch for over 3 weeks until I got back from vacation and flashed an update. Troublesome sketch - runs ESPAsyncTCP, with modified fauxmo emulation for 10 devices. It receives lost of Network traffic related to uPnP scans, which includes lots of TCP connects disconnects RSTs related to uPnP discovery. I have clocked umm_info critical lock time taking as much as 180us. A common use for the umm_info call is to get the free heap result. It is common to try and closely monitor free heap as a method to detect memory leaks. This may result in frequent calls to umm_info. There has not been a clear test case that shows an issue yet; however, I and others think they are or have had crashes related to this. I have added code that adjusts the running free heap number from _umm_malloc, _umm_realloc, and _umm_free. Removing the need to do a long interrupts disabled calculation via _umm_info. Build optional, min/max time measurements for locks held while in info, malloc, realloc, and free. Also, maintain a count of how many times each is called with INTLEVEL set. * Fixed. travis build complaint. * Changes for #6274 (review) * Added requested comment and missing comment for UMM_CRITICAL_PERIOD_ANALYZE. * Updated comments and update xt_rsil() * Moved xt_rsil&co (pulled in __STRINGIFY) definitions out of Arduino.h, to cores/esp8266/core_esp8266_features.h Added esp_get_cycle_count() to core_esp8266_features.h. Updated umm_malloc and Esp.h to use new defines and location. * Added "#ifndef CORE_MOCK" around conflicted area. * Moved performance measurment and ESP specific definitions to umm_performance.h/cpp. Removed testing asserts. * Commented out umm analyze. Delay CRITICAL_SECTION_EXIT() in umm_realloc() to avoid exposing a transient OOM condition to ISR. * Missed file change. This commit has: Delay CRITICAL_SECTION_EXIT() in umm_realloc() to avoid exposing a transient OOM condition to ISR. * 2nd Path. Removed early release of critical section around memmove to avoid a possible OOM for an ISR. * improved variable name * Resolved ISR OOM concern with `_umm_realloc()` Updated realloc() to do a preliminary free() of unused space, before performing a critical section exit and memmove. This change was applied to the current _umm_realloc(). This change should reduce the risk of an ISR getting an OOM, during a realloc memmove operation. Added additional stats for verifying correct operation. * Resolved ISR OOM concern in _umm_realloc() Updated realloc() to do a preliminary free() of unused space, before performing a critical section exit and memmove. This change was applied to the current _umm_realloc(). This change should reduce the risk of an ISR getting an OOM when interrupting an active realloc memmove operation. Added additional stats for verifying correct operation. Updated: for clarity and Travis-CI fail. * Update to keep access to alternate printf in one file. * Updated to use ISR safe versions of memmove, memcpy, and memset. The library versions of memmove, memcpy, and memset were in flash. Updated to use ROM functions ets_memmove, ets_memcpy, and ets_memset. Additional note, the library version of memmove does not appear to have been optimized. It took almost 10x longer than the ROM version. Renamed printf macro to DBGLOG_FUNCTION and moved to umm_malloc_cfg.h. Changed printf macro usage to use DBGLOG_FUNCTION. * Update umm_malloc.cpp Fix comment
1 parent 6dd8474 commit 127199a

File tree

9 files changed

+478
-70
lines changed

9 files changed

+478
-70
lines changed

Diff for: cores/esp8266/Arduino.h

-21
Original file line numberDiff line numberDiff line change
@@ -142,30 +142,9 @@ void timer0_detachInterrupt(void);
142142
void ets_intr_lock();
143143
void ets_intr_unlock();
144144

145-
#ifndef __STRINGIFY
146-
#define __STRINGIFY(a) #a
147-
#endif
148-
149-
// these low level routines provide a replacement for SREG interrupt save that AVR uses
150-
// but are esp8266 specific. A normal use pattern is like
151-
//
152-
//{
153-
// uint32_t savedPS = xt_rsil(1); // this routine will allow level 2 and above
154-
// // do work here
155-
// xt_wsr_ps(savedPS); // restore the state
156-
//}
157-
//
158-
// level (0-15), interrupts of the given level and above will be active
159-
// level 15 will disable ALL interrupts,
160-
// level 0 will enable ALL interrupts,
161-
//
162-
#define xt_rsil(level) (__extension__({uint32_t state; __asm__ __volatile__("rsil %0," __STRINGIFY(level) : "=a" (state) :: "memory"); state;}))
163-
#define xt_wsr_ps(state) __asm__ __volatile__("wsr %0,ps; isync" :: "a" (state) : "memory")
164-
165145
#define interrupts() xt_rsil(0)
166146
#define noInterrupts() xt_rsil(15)
167147

168-
169148
#define clockCyclesPerMicrosecond() ( F_CPU / 1000000L )
170149
#define clockCyclesToMicroseconds(a) ( (a) / clockCyclesPerMicrosecond() )
171150
#define microsecondsToClockCycles(a) ( (a) * clockCyclesPerMicrosecond() )

Diff for: cores/esp8266/Esp.h

+1-3
Original file line numberDiff line numberDiff line change
@@ -209,9 +209,7 @@ class EspClass {
209209
#ifndef CORE_MOCK
210210
uint32_t EspClass::getCycleCount()
211211
{
212-
uint32_t ccount;
213-
__asm__ __volatile__("esync; rsr %0,ccount":"=a" (ccount));
214-
return ccount;
212+
return esp_get_cycle_count();
215213
}
216214

217215
#endif // !defined(CORE_MOCK)

Diff for: cores/esp8266/core_esp8266_features.h

+26-1
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,32 @@
3333
#define WIFI_HAS_EVENT_CALLBACK
3434

3535

36+
#ifndef __STRINGIFY
37+
#define __STRINGIFY(a) #a
38+
#endif
3639

40+
// these low level routines provide a replacement for SREG interrupt save that AVR uses
41+
// but are esp8266 specific. A normal use pattern is like
42+
//
43+
//{
44+
// uint32_t savedPS = xt_rsil(1); // this routine will allow level 2 and above
45+
// // do work here
46+
// xt_wsr_ps(savedPS); // restore the state
47+
//}
48+
//
49+
// level (0-15), interrupts of the given level and above will be active
50+
// level 15 will disable ALL interrupts,
51+
// level 0 will enable ALL interrupts,
52+
//
53+
#ifndef CORE_MOCK
54+
#define xt_rsil(level) (__extension__({uint32_t state; __asm__ __volatile__("rsil %0," __STRINGIFY(level) : "=a" (state) :: "memory"); state;}))
55+
#define xt_wsr_ps(state) __asm__ __volatile__("wsr %0,ps; isync" :: "a" (state) : "memory")
3756

38-
#endif
57+
inline uint32_t esp_get_cycle_count() {
58+
uint32_t ccount;
59+
__asm__ __volatile__("rsr %0,ccount":"=a"(ccount));
60+
return ccount;
61+
}
62+
#endif // not CORE_MOCK
3963

64+
#endif

0 commit comments

Comments
 (0)