Random hang at SERCOM.cpp:526 #222
Comments
Does anyone have an idea of how to prevent this? |
In my experience with multi-master setups, the problem is caused by the two masters colliding on the bus. Sooner or later both masters start a transfer at the same time and one of them loses arbitration. That is a normal scenario, but the library as it stands does not handle loss of arbitration. I fixed this in my local copy a couple of months ago and have used the system since without any hangs. I recently opened a pull request here to share the fix. Check out my very simple modification and try it on your system. Let me know if it fixes the problem. |
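To illustrate the idea of treating a lost arbitration as a recoverable event rather than something to wait on forever, here is a minimal, hardware-independent sketch. It is not the code from the pull request mentioned above. `TxResult` and `sendWithRetry` are hypothetical names, and the `attempt` callback stands in for whatever actually drives the SERCOM peripheral.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result codes for one transfer attempt (not from SERCOM.cpp).
enum class TxResult { Ok, ArbitrationLost, Nack };

// Runs `attempt` until it returns something other than ArbitrationLost,
// or until `maxAttempts` tries have been made. The number of tries is
// reported through `attemptsUsed` when that pointer is not null.
inline TxResult sendWithRetry(const std::function<TxResult()>& attempt,
                              int maxAttempts,
                              int* attemptsUsed = nullptr)
{
  assert(maxAttempts > 0);  // at least one try is required
  TxResult result = TxResult::ArbitrationLost;
  int tries = 0;
  while (tries < maxAttempts) {
    ++tries;
    result = attempt();
    if (result != TxResult::ArbitrationLost) {
      break;  // success, or an error that a retry will not fix
    }
  }
  if (attemptsUsed != nullptr) {
    *attemptsUsed = tries;
  }
  return result;
}
```

The point of the bounded loop is that the caller always gets control back: if the other master keeps winning, the function returns `ArbitrationLost` and the application can decide what to do next.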
Hi, I finally got to try your fix. I'm still getting random hangs unfortunately, but this time at line 491 of SERCOM.cpp (infinite while loop)
|
When it hangs, try to manually retrieve the value of the registers checked by isBusIdleWIRE() and by isBusOwnerWIRE(). |
I've finally been able to do more debugging.
p454 from http://ww1.microchip.com/downloads/en/DeviceDoc/40001882A.pdf
There should be a way to set a time-out
|
Adding this prevents it from hanging at line 491
But now it hangs in
(I keep the bus heavily loaded on purpose, to see whether the code can recover from collisions) |
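The time-out idea from this comment can be sketched without any hardware: replace an unbounded `while` with a poll loop that gives up after a fixed number of checks. This is only a sketch under assumptions. `waitWithTimeout` is a hypothetical helper, and the `ready` callback stands in for the register test that the hanging loop performs. On a real board the bound would more likely be based on elapsed time (for example Arduino's `millis()`) than on a poll count.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Polls `ready` up to `maxPolls` times.
// Returns true as soon as `ready` reports true, false if the budget runs out.
inline bool waitWithTimeout(const std::function<bool()>& ready,
                            uint32_t maxPolls)
{
  assert(maxPolls > 0);  // a zero budget would never check the condition
  for (uint32_t i = 0; i < maxPolls; ++i) {
    if (ready()) {
      return true;
    }
  }
  return false;  // timed out: the caller must handle this as an error
}
```

A caller would check the return value and abort or restart the transfer on `false` instead of spinning forever.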
Do you mind explaining the architecture of your system in more detail? How many masters? How many slaves? What are their addresses? Who is talking to whom? How often? In the second screenshot you sent, you can see that after the end of the first message there is no repeated start or stop condition, but another byte is sent that looks like an address byte. If some device (including the current owner of the bus) is trying to begin a new message, it is doing so in violation of the protocol, because there is no stop condition marking the end of the previous message. SDA is probably low because the slave is sending the ACK to the master, but the master never checks it (only 8 SCL pulses at the end, versus 9 for all the other bytes). The master never releases the bus, so everything gets stuck. |
There are 2 Masters on the bus (Sparkfun SAMD21 Mini). I replaced the shipped Wire and Sercom versions (which were hanging too) with the version in your Git with the fix, plus the line from my last post. I didn't set an address (the communication didn't work with the libraries from a year ago if I set it; I don't know about now).
There is 1 Slave on the bus (Nano Zero stepper, a SAMD21-based board; shipped libraries; address 12). I also tried an Arduino Mega as the Slave (unmodified Arduino libraries). I never noticed hangs on the Slave with either of them.
Both Masters talk to the Slave every 10 ms (I tried longer intervals, but then it just takes longer to hang), at a 100 kHz bus clock. The Slave responds each time it gets a message from one of the Masters. The response is read by both Masters, which check whether they are getting the correct data (they do for around 99% of the transfers).
I uploaded the current version. Its only purpose is testing the multi-master connection (to reuse in another project if I get it to be reliable). Arduino-I2C-2way-anydata 2-masters-1-slave
I guess communication collisions led to the situation in my screenshot. But shouldn't the libraries handle that? The SAMD21 manual describes a way to recover from violations of the protocol, but I don't know how to use it. Setting
|
I'm making a little progress, but I'm not there yet. I made 2 more modifications to the Sercom library:
The first recovers from loss of arbitration during data transfer from one master to the slave. It reduced the hangs from one every 1000 transfers on average to one every 30000.
The second recovers from loss of arbitration while transmitting address packets. It reduced the hangs from one every 30000 transfers on average to one every 170000-210000.
There's still an issue left: both masters hang after 170000 to 210000 transfers. The code hangs in
I tried removing
Maybe the state change of I2CM.INTFLAG.bit.MB is sometimes missed because the Sercom library doesn't check it at the right time? |
I never got this to work. It might be linked to the bug described here: http://misfittech.net/blog/samd20-i2c-slave-bug/ |
I have very similar problems (see issue 9 at platformio, in particular my logic analyzer capture). I think there are 2 problems here. First, the SERCOM code assumes that it doesn't need to check the ERROR flag. Second, the Wire and SERCOM code do not deal with misbehaving I2C devices, so the bus can enter a state that is valid for the MCU but not handled by the library code. I should have some work towards a working patch next week. |
I ended up with code that checks the bus state before any read or write. Like this:
In effect, if the bus is busy but should not be, reset (calls |
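Since the code for this comment did not survive, here is a rough, hardware-independent sketch of the approach it describes: check the bus state before each transfer and reset if the bus stays unusable. The state values match the `BUSSTATE` encodings in the SAMD21 I2C master `STATUS` register (unknown, idle, owner, busy). Everything else is an assumption: `MockBus`, `prepareBus` and the poll budget are hypothetical, and the poster's actual criterion for "busy but should not be" is not known.

```cpp
#include <cassert>
#include <cstdint>

// BUSSTATE encodings of the SAMD21 I2C master STATUS register.
enum BusState : uint8_t { BUS_UNKNOWN = 0, BUS_IDLE = 1, BUS_OWNER = 2, BUS_BUSY = 3 };

// Hypothetical stand-in for the peripheral so the sketch runs on a PC.
struct MockBus {
  BusState current;
  int resets;
  explicit MockBus(BusState s) : current(s), resets(0) {}
  BusState state() const { return current; }
  // Models forcing the bus state machine back to idle.
  void reset() { current = BUS_IDLE; ++resets; }
};

// Called before every read or write. Waits a bounded number of polls for the
// bus to become idle or owned; if it never does, resets and re-checks.
// Returns true when the bus is usable afterwards.
template <typename Bus>
bool prepareBus(Bus& bus, uint32_t maxPolls)
{
  assert(maxPolls > 0);
  for (uint32_t i = 0; i < maxPolls; ++i) {
    const BusState s = bus.state();
    if (s == BUS_IDLE || s == BUS_OWNER) {
      return true;  // safe to start the transfer
    }
  }
  bus.reset();  // bus stayed busy or unknown for the whole budget
  return bus.state() == BUS_IDLE;
}
```

In a multi-master system a busy bus is often legitimate (the other master is talking), so the poll budget has to be long enough to cover a normal transfer before a reset is attempted.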
I'm getting the problem too, but on a much simpler configuration. It's on the IndIO board of the Industruino - https://static.industruino.com/downloads/diagrams/Industruino_INDIO_D21G_pinout_map_April'17_S.pdf |
I'm trying to set up multi-master I2C communication on boards running an ATSAMD21G18 (2 masters and 1 slave so far) with this code: https://github.com/kb-/Arduino-I2C-2way-anydata/tree/dd5597835c9a8a89ff23cc17605bb6f59faa5699
One of the masters randomly hangs after a large number of transfers (around 70 000 on average), always at line 526 of SERCOM.cpp
The same behavior occurred on all 5 boards I tried, across 2 different models.
I'm getting this in the backtrace but can't figure out what's wrong:
It doesn't occur with one master only (4 000 000 successful transfers).