[sdk 1.5] - Random crashes when using SoftwareSerial #1326

Closed
vlast3k opened this issue Dec 29, 2015 · 12 comments

@vlast3k

vlast3k commented Dec 29, 2015

I am still seeing strange issues with SDK 1.5. I am now using SoftwareSerial from this repository
https://github.com/plerup/espsoftwareserial to communicate with a serial device at 9600 baud.

This is the code sample. I invoke it in a loop with a 200 ms delay:


void sendCmd(uint8_t *cmd, uint8_t *r) {
  Serial << "write" << endl;
  PM1106_swSer.write(cmd+1, cmd[0]);
  Serial << "read" << endl;
  delay(100);
  int i =0;
  for (i=0; i < 24 && PM1106_swSer.available(); i++){
    *(r++) = PM1106_swSer.read();
  }
  Serial << endl << i << endl;
}

After several iterations (usually 20 to 30 seconds after it starts), the ESP crashes with the stack below, right after printing "read".

It happens with the latest Git and with the Git revisions a few changelists after SDK 1.5 was introduced.
With the changelist right before SDK 1.5 was introduced, and with the SDK 1.3 binaries, it works.

Exception (0):
epc1=0x40207fc4 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys 
sp: 3ffffc60 end: 3fffffb0 offset: 01a0

>>>stack>>>
3ffffe00:  00000005 00000000 00000020 4010176e  
3ffffe10:  40004371 00000030 00000016 ffffffff  
3ffffe20:  60000200 00000000 3fffff00 80000000  
3ffffe30:  20000000 00000000 00000000 203fc000  
3ffffe40:  0000ffff 3fffc6fc 00000001 3fff32d0  
3ffffe50:  000002f4 003fc000 60000600 00000030  
3ffffe60:  40208004 40207fc4 3fff0550 40207ffc  
3ffffe70:  402138d5 3fff02f0 3fffc258 4000050c  
3ffffe80:  40000f83 00000030 00000011 ffffffff  
3ffffe90:  4022911e 00000023 ffffffff fffffc1b  
3ffffea0:  3ffefe28 4020efa0 00000000 bfffffff  
3ffffeb0:  ffffffff 3fffc6fc 00000001 3ffeef74  
3ffffec0:  3ffefe50 018e531d 60000600 00000030  
3ffffed0:  007c31b5 00000001 00003a98 00000001  
3ffffee0:  ffffffff 3fffc6fc fc70ffff 3fffdab0  
3ffffef0:  00000000 3fffdcb0 3ffefe60 00000030  
3fffff00:  00000000 400042db 4023e160 401056f1  
3fffff10:  40004b31 3fff32d0 000002f4 003fc000  
3fffff20:  401059c6 4010571e 3ffeeec0 00000000  
3fffff30:  4020ea61 3ffeeec0 3ffefe50 018e531d  
3fffff40:  3fff32d0 00001000 4020ef06 00000008  
3fffff50:  00000000 00000000 4020efb3 3ffeef74  
3fffff60:  3ffefe50 00000001 4020eec8 402016dc  
3fffff70:  40229125 3ffefe50 3ffefe28 40229125  
3fffff80:  4022916a 3fffdab0 00000000 3fffdcb0  
3fffff90:  3ffefe70 3fffdad0 3fff0814 402081b7  
3fffffa0:  40000f49 000187cc 3fffdab0 40000f49  
<<<stack<<<
@Links2004
Collaborator

This may be the same as #1211.
Each char blocks all interrupts for about 1 ms at 9600 baud.
With more chars, the blocked time quickly adds up to a range where the WiFi stack runs into problems.
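
As a rough check on the 1 ms figure, here is a minimal sketch that computes the time one character occupies on the wire. It assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit), which the thread does not state, and the names frameTimeUs and burstTimeUs are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bits on the wire per character for 8N1 framing: 1 start + 8 data + 1 stop.
// The framing is an assumption; the thread only gives the baud rate.
constexpr uint32_t kBitsPerFrame8N1 = 10;

// Time in microseconds to transfer one character at the given baud rate.
// Integer division truncates, so the result is rounded down.
uint32_t frameTimeUs(uint32_t baud, uint32_t bitsPerFrame = kBitsPerFrame8N1) {
    assert(baud > 0);
    return (bitsPerFrame * 1000000UL) / baud;
}

// Total time in microseconds for `count` characters sent back to back.
uint32_t burstTimeUs(uint32_t baud, uint32_t count) {
    return frameTimeUs(baud) * count;
}
```

frameTimeUs(9600) returns 1041, which matches the roughly 1 ms per char quoted above, and burstTimeUs(9600, 24) returns 24984, so a full 24-byte reply takes about 25 ms on the wire.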

You can create an objdump to see the function where the problem is:
"$(XTENSA_TOOLS_ROOT)/xtensa-lx106-elf-objdump" -S $(PROJECT_NAME).elf > $(PROJECT_NAME).dobj

@alltheblinkythings
Contributor

I've seen ctx:sys, Exception 0 (IllegalInstructionCause) with
epc1=0x402xxxxx (address in flash) before and it was an interrupt handler
trying to execute some code that's stored in flash (which is forbidden in
the non-OS SDK which we use).

You can confirm if this is the case for you once you have the objdump
output: search it for "40207fc4" and identify the function that's being
called. From glancing at the SoftwareSerial source, I think at least
SoftwareSerial::handle_interrupt and SoftwareSerial::rxRead need
ICACHE_RAM_ATTR.
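
To make the "0x402xxxxx is an address in flash" observation concrete, here is a small sketch that sorts an address into a memory region. The address ranges are assumptions taken from commonly published ESP8266 memory maps, not from this thread, and they are approximate. The names Region, classifyAddress and isInterruptSafeCode are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Approximate ESP8266 address ranges. These bounds are an assumption based on
// commonly published memory maps; check them against your SDK's linker script.
enum class Region { DataRam, BootRom, Iram, FlashCache, Unknown };

Region classifyAddress(uint32_t addr) {
    if (addr >= 0x3FFE8000u && addr <= 0x3FFFFFFFu) return Region::DataRam;
    if (addr >= 0x40000000u && addr <= 0x4000FFFFu) return Region::BootRom;
    if (addr >= 0x40100000u && addr <= 0x40107FFFu) return Region::Iram;
    if (addr >= 0x40200000u && addr <= 0x402FFFFFu) return Region::FlashCache;
    return Region::Unknown;
}

// Following the rule stated above (interrupt handlers must not execute code
// stored in flash), only IRAM and boot ROM count as interrupt-safe here.
bool isInterruptSafeCode(uint32_t pc) {
    const Region r = classifyAddress(pc);
    return r == Region::Iram || r == Region::BootRom;
}

// Checks against the epc1 values reported in this thread.
void checkThreadAddresses() {
    assert(classifyAddress(0x40207fc4u) == Region::FlashCache);  // first crash
    assert(classifyAddress(0x40208014u) == Region::FlashCache);  // handle_interrupt
    assert(classifyAddress(0x402017c0u) == Region::FlashCache);  // micros
    assert(!isInterruptSafeCode(0x402017c0u));
}
```

Under these assumed ranges, every epc1 value posted in this issue lands in the flash-mapped region, which is consistent with the diagnosis above. The objdump search is still needed to learn which function sits at the address.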

@vlast3k
Author

vlast3k commented Dec 30, 2015

The current exception trace points exactly at those functions:

Exception (0):
epc1=0x40208014 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys 
sp: 3ffffc60 end: 3fffffb0 offset: 01a0

>>>stack>>>
3ffffe00:  40000f83 00000030 00000011 ffffffff  
3ffffe10:  4000437d 00000030 00000016 ffffffff  

40208014 <_ZN17SoftwareSerialESP16handle_interruptEPS_>:

void SoftwareSerialESP::handle_interrupt(SoftwareSerialESP *swSerObj)  {
40208014:   f0c112          addi    a1, a1, -16
40208017:   21c9        s32i.n  a12, a1, 8
40208019:   3109        s32i.n  a0, a1, 12
4020801b:   02cd        mov.n   a12, a2
   if (!swSerObj) return;
4020801d:   32bc        beqz.n  a2, 40208054 <_ZN17SoftwareSerialESP16handle_interruptEPS_+0x40>

   // Check if this interrupt was was coming from the the rx pin of this object
   int pin = swSerObj->m_rxPin;
   uint32_t gpioStatus = GPIO_REG_READ(GPIO_STATUS_ADDRESS);
4020801f:   fffc31          l32r    a3, 40208010 <_ZN17SoftwareSerialESP6rxReadEv+0xa4>

void SoftwareSerialESP::handle_interrupt(SoftwareSerialESP *swSerObj)  {
   if (!swSerObj) return;

   // Check if this interrupt was was coming from the the rx pin of this object
   int pin = swSerObj->m_rxPin;
40208022:   4228        l32i.n  a2, a2, 16

I also tried adding ICACHE_RAM_ATTR to both functions, but I still get an exception. This time, strangely, it is in micros:

Exception (0):
epc1=0x402017c0 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: sys 
sp: 3ffffc20 end: 3fffffb0 offset: 01a0

>>>stack>>>
3ffffdc0:  07430000 3ffee61c 3ffeb5b0 00000030  

402017c0 <micros>:

unsigned long micros() {
402017c0:   f0c112          addi    a1, a1, -16
402017c3:   036102          s32i    a0, a1, 12
    return system_get_time();
402017c6:   ffc901          l32r    a0, 402016ec <delay_end+0x18>
402017c9:   0000c0          callx0  a0

I will create an issue with this information in the other repository and, until it is fixed, stick with HardwareSerial or SDK 1.3 (unfortunately I lack the skills to fix it myself).

@igrr
Member

igrr commented Dec 30, 2015

micros is also in flash and has to be moved into RAM, since it is called
from an interrupt handler.

@vlast3k
Author

vlast3k commented Dec 30, 2015

Great!
This solved the problem. I could not move micros() to RAM using this attribute (compilation failed for some reason), but since it just wraps system_get_time(), I modified the SoftwareSerial code to call that instead of micros(), and now it works :)
Is this the correct way for the library to work?
I will propose it to the author.

@igrr
Member

igrr commented Dec 30, 2015

The library should still use micros, and micros should be moved to RAM.

@alltheblinkythings
Contributor

alltheblinkythings commented Dec 31, 2015 via email

@vlast3k
Author

vlast3k commented Dec 31, 2015

Actually, it was due to my poor C skills :)
I included the file that defines the macro and put it in front of (and not after) the function definition, and now it compiles and works.
I just do not know yet how to create a pull request so it can be reviewed.

@supersjimmie

#1326 (comment)
Can you explain which file you changed, and what you included, to change micros()?

@vlast3k
Author

vlast3k commented Jan 12, 2016

I changed core_esp8266_wiring.c:

unsigned long ICACHE_RAM_ATTR micros() {
    return system_get_time();
}

I just added ICACHE_RAM_ATTR (apparently no extra include is needed; I was wrong before).

@supersjimmie

But micros() just calls system_get_time() and returns its result.
Shouldn't that function be changed too?

Also, doesn't this apply to millis() as well?

@vlast3k
Author

vlast3k commented Jan 12, 2016

This is an SDK function and is apparently already in RAM, so it can be invoked from interrupt handlers.
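
The rule this thread converges on is that an interrupt handler, and every function it calls, has to live in RAM. The sketch below illustrates that pattern only; it is not the SoftwareSerial code. The handler onRxEdge, the helper nowUs and the fake clock are hypothetical. On the device nowUs would wrap the SDK's system_get_time(), and ICACHE_RAM_ATTR comes from the ESP8266 core; the empty fallback is there only so the sketch also builds on a host.

```cpp
#include <cassert>
#include <cstdint>

// The ESP8266 Arduino core defines ICACHE_RAM_ATTR to place a function in RAM.
// This empty fallback exists only so the sketch compiles on a host machine.
#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR
#endif

// Hypothetical time source. On the device this would call the SDK's
// system_get_time(); here it is a plain variable so the sketch runs anywhere.
static volatile uint32_t g_fakeTimeUs = 0;

// Helper called from the handler, so it carries the attribute as well.
uint32_t ICACHE_RAM_ATTR nowUs() {
    return g_fakeTimeUs;
}

// State shared between the handler and the main loop.
static volatile uint32_t g_lastEdgeUs = 0;
static volatile uint32_t g_edgeCount = 0;

// Hypothetical pin-change handler. It touches only RAM data and calls only
// functions that are themselves marked ICACHE_RAM_ATTR.
void ICACHE_RAM_ATTR onRxEdge() {
    g_lastEdgeUs = nowUs();
    g_edgeCount = g_edgeCount + 1;
}

// Host-only check: simulate one edge arriving at t = 1234 us.
void hostCheck() {
    const uint32_t before = g_edgeCount;
    g_fakeTimeUs = 1234;
    onRxEdge();
    assert(g_lastEdgeUs == 1234);
    assert(g_edgeCount == before + 1);
}
```

The attribute sits between the return type and the function name, the same placement used for micros() above. A helper without the attribute would reproduce the kind of crash reported here as soon as the handler called it.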
