Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I2C Wire library might be blocking in case of communication failures (may need timeout) #134

Open
softhack007 opened this issue Nov 27, 2021 · 23 comments

Comments

@softhack007
Copy link

softhack007 commented Nov 27, 2021

There is an open issue on ArduinoCore-megaAvr, that might also be applicable to MegaCoreX - at least the code in question look very similar: arduino/ArduinoCore-megaavr#83

The origin is an enhancement in the "avr" core, which added a timeout to avoid "freezing" when the bus has a failure - see arduino/Arduino#1476,
arduino/ArduinoCore-avr#42 and arduino/reference-en#895. It seems that many user were suffering from such lockups, so the new timeout feature received kind of "standing ovations" ;-)

From looking at the code, it is very clear that the "avr" fix does not work in MegaCoreX or ArduinoCore-megaavr. However I found two scenarios in our code that might cause "blocking" behaviour. Maybe someone with better technical background can clarify if MegaCoreX would really hang, and a timeout would be needed to allow user scetches to proceed (maybe with some error recovery).

In function TWI_MasterWriteRead() we find:

/* Arduino requires blocking function */
while (master_result == TWIM_RESULT_UNKNOWN)
{
}
// in case of arbitration lost, retry sending
if (master_result == TWIM_RESULT_ARBITRATION_LOST)
{
goto trigger_action;
}

  1. could the "while" loop hang, if the ISR never updates the status? is it possible that the ISR never updates the status due to some error, so that the while loops forever?

  2. could there be another infitite loop with the "goto" instruction, in case of repeated TWIM_RESULT_ARBITRATION_LOST?

@SpenceKonde
Copy link
Contributor

@MCUdude consider jacking the new the version of Wire from DxCore, you need to add a tools submenu to select whether build,extra_flags contains -DTWI_MORS (master or slave ) or -DTWI_MANDS (master and slave) - but other than that, there's not much. that needs to be done to make it work over here, the new features are all covered in the readme. and a bunch more is covered in the readme too.

MX682X put together a killer reimplementation of Wire that lowers flash usage, supports simultaneous master and slave, (with or without dual mode), supports the slave address mask, secondary address, or general call and provides a way to find out what address triggered the interrupt, and I tacked on two unobtrusive API extensions, to see if there's a transaction in progress (so you can check before sleeping) and to count how many bytes the master read, so your I2C slave can use the register interface model like 99% of other I2C devices. There's one last change going in before it's will be believed to be done, and that is that we found a way to reduce the SRAM footprint in some cases.

@softhack007
Copy link
Author

softhack007 commented Nov 28, 2021

@SpenceKonde wow, a complete rewrite with lots of new features - nice XMas present :-)

There is some timout functionality protected with #if defined(TWI_TIMEOUT_ENABLE) - is this on by default, or would we need to create a submenu?

@MCUdude
Copy link
Owner

MCUdude commented Nov 28, 2021

@SpenceKonde Thanks, I'll look into it.

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Mar 17, 2022

Yeah the timeout is enabled by default. We (myself and the main developer who did the rewrite, I'm not going to tag him over here because I've been tagging him into way too many conversations latel) were pretty pissed that Arduino came out with an implementation of a timeout right at the same time we did (of course, naturally, a much less efficient one that would cause compatibility problems with the rest of the core; I think it depended on millis or micros, which is obviously a no-no on my cores, - disabling millis/micros is one of the most important tools submenu options on my cores because otherwise you can't do anything truly timing critical while interrupts are disabled enabled. Too bad if the timing critical thing that requires deterministic timing itself is an interrupt! The version I use also gets ported between DxCore and megaTinyCore, with the only changes being in the docs, so it's gotta be usable on 2k parts while leaving room for at least minimal functionality.

I think we (mostly he) did a pretty good job of it.
The release I'm in the process of wrapping up right now is going to contain the FIFTH FULL REIMPLEMENTATION of the master baud rate calculation (how many times have people gotten you to reimplement tat? twice only I thought?, because people keep whining about it not being perfect and coming up with measured cases where ti's way off (nevermind the fact that on the new hardware, there is simply no way to specify the exact baud rate, because it considers the rise and fall time of the I2C lines, we can just try to make reasonable assumptions and hope it's pretty close. When you tell some people that, they act like you told them the sky was green.

@segadora
Copy link

segadora commented Apr 9, 2022

Im experiencing that my arduino with oled screen freezes/bugs out when a dc motor is added in the circuit. Its only happening when the OLED screen is connected, can this issue be because of it?

@SpenceKonde
Copy link
Contributor

Likely noise on the power rail caused by the inductive load of the DC is resetting the OLED. Use separate supply for the motor, or more aggressive decoupling between motor and sensitive electronics. If it is a single direction motor, and you don't have a snubber diode across the terminals of the motor, add one (just any diode capable of handling full current of motor, connected with it's band toward the positive side of the motor so it won't conduct - except the brief spike when the motor is turned off.

Also add more bulk capacitance and make sure that the power supply voltage isn't drooping too much.

@adelin-mcbsoft
Copy link

I'm also looking forward to having this feature implemented, had a situation where an I2C sensor froze and the entire MCU stopped as well since there was no communication going on. All blocked.

@SpenceKonde , how easy was it for you to replace the MegaCoreX's Wire.h library with the one from DXCore, on Arduino IDE? I read your post here and sounds like there are a lot of improvements to it.

In your post on this topic, you said that a special (sub)menu is required for compiling with the proper build flags, could you share some more step-by-step info on how it should be done?

Thank you!

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Nov 13, 2022 via email

@adelin-mcbsoft
Copy link

Some info might’ve been lost in translation. 😁

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Nov 13, 2022

what the heck happened to that message... I think I was falling asleep at the keyboard.....

@adelin-mcbsoft
Copy link

Well, I'm not using it for servo (as the original poster of the other thread), but to read data from two sensors HTU21D and HMC5883L .
The problem is, the HTU21D sometimes locks itself for some reason, and blocks the entire MCU.

I tried to port both DXCore's and your megaTinyCore version of Wire.h, however it does not compile on Mega4808 using MegaCoreX...

MegaTiny error:

In file included from C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\twi.c:26:0:
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\twi_pins.h:41:93: error: expected ';', ',' or ')' before '.' token
   uint8_t TWI0_setConfig(bool smbuslvl, bool longsetup, uint8_t sda_hold, bool smbuslvl_dual. uint8_t sda_hold_dual);

DxCore error:

C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\DxCoreWire\src\twi_pins.c: In function 'TWI0_usePullups':
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\DxCoreWire\src\twi_pins.c:397:9: error: expected '}' before 'else'
         else {
         ^~~~

I guess the compiler MegaCoreX uses, does not like the C versions of the TWI files (.h and .c).
(it's just a guess, I don't have much experience in C programming).

@SpenceKonde
Copy link
Contributor

no those are outstanding bugs in in the version of wire you grabbed, I'll go see if they're still in the code.

@adelin-mcbsoft
Copy link

I've pulled the master branch, is there another one I should try?

@SpenceKonde
Copy link
Contributor

The master branch as of 20 seconds ago.

@adelin-mcbsoft
Copy link

Missing badCall() ... is there any other library/file I should include?
(please note that I have not installed the entire package, just the Wire.h library).

C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp: In member function 'bool TwoWire::swapModule(TWI_t*)':
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp:237:5: error: 'badCall' was not declared in this scope
     badCall("Only one TWI module available, nothing to switch with");
     ^~~~~~~
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp: In member function 'uint8_t TwoWire::specialConfig(bool, bool, uint8_t, bool, uint8_t)':
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp:365:9: error: 'badCall' was not declared in this scope
         badCall("the smbus level option is not present on these parts. You need a Dx for that.");
         ^~~~~~~
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp:369:9: error: 'badCall' was not declared in this scope
         badCall("the smbus level option is not present on these parts. You need a Dx for that.");
         ^~~~~~~
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp: In member function 'void TwoWire::selectSlaveBuffer()':
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp:879:5: error: 'badCall' was not declared in this scope
     badCall("selectSlaveBuffer() was called, but simultaneous mode is not selected");
     ^~~~~~~
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp: In member function 'void TwoWire::deselectSlaveBuffer()':
C:\Data\Apps\Arduino_v1.8.19_Portable\v1\libraries\megaTinyCoreWire\src\Wire.cpp:901:5: error: 'badCall' was not declared in this scope
     badCall("deselectSlaveBuffer() was called, but simultaneous mode is not selected");
     ^~~~~~~

@SpenceKonde
Copy link
Contributor

Aaaagh.... yeah you're going to want to put... I guess at the top of Wire.h

  void badArg(const char*) __attribute__((error("")));
  void badCall(const char*) __attribute__((error("")));
  #define TWI_MORS
  // #define TWI_MANDS   /* Comment out above and uncomment this to use wire as both master and slave using the same TWI instance on the same pins */

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Nov 13, 2022

The badArg() and badCall() functions are used to provide compiletime errors when bogus values are supplied and known at compile time (badArg()) or if a function is called under circumstances where the call is nonsensical ( like a call to millis() when millis() has been disabled, which is an option on my core, or attempting to set Wire to use SMBus levels on a part like a mega0 that doesn't have that option and so on. Their function is simply to halt compilation and give an error, because at that point we know with certainty that no matter what the person had been hoping to do, if it relied on their doing that, it's not going to work.

The standard Arduino doctrine is that in that case, to fail silently. I don't know why they think that's a good idea. Any time the user does something that is guaranteed not to work, I try to make sure they're aware of it. Since their sketch isn't going to work, might as well make it a compile error, right?

@adelin-mcbsoft
Copy link

"Oh boy, oh boy, it works!" :) THANK YOU A LOT 🥇!

I finally managed to compile it.
In the beginning I tried to create another (badCall.h) header file and include it so I don't have to modify the original Wire.h file, but it didn't want to compile without having the badCall() declaration directly into Wire.h, can't figure out why but I'll investigate that later.

I tested I2C's behavior (by disconnecting the sensor) and no more MCU locking. Wonderful. :)
Thank you for your support once again.
I hope your version of Wire.h library will make it into the core of MegaCoreX.
(I saw that you also made some great improvements to Serial.h, but that's another topic).

Sending all the best thoughts out there! Thanks!

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Nov 14, 2022

I put badCall and badArg into Arduino.h in my cores, because... basically anywhere where an there's code that implements API calls - there are badArg and badCall macros to catch inappropriate uses of them; I'd have expected including the header would have worked. I really like compiletime error checking. It helps to make the absence of exception handling a little less nasty.

The TWI_MORS or TWI_MANDS is the part usually handled by the tools menu.

@adelin-mcbsoft
Copy link

adelin-mcbsoft commented Nov 14, 2022

Just want to share my two cents with whomever might have the same issue as I have.
So, until using this library, due to some hardware glitches (I guess **), transmission would stop, Wire.endTransmission() would become a blocking function.
Fortunately, after having this non-blocking implementation of Wire.h, now Wire.endTransmission() returns uint8_t 4 (which is other error, according to documentation).

After this, nothing would work anymore, the I2C bus seemed locked.
Only after a second restart of the MCU it would start to work again.
(even if I rewrote the MCU, it still wouldn't start until I was doing two resets with the push-button).

Now, I know I'm "treating" the effect, not the cause, but I found this I2C_ClearBus implementation, which resets the I2 bus (apparently making use of the Internal Pull-Up Resistors), and the Wire library starts to work again.

I know it's not a standard Wire implementation, but may be useful to be included in the library too.

I also observed that decreasing the clock speed to 10000 ( Wire.setClock(10000) after Wire.begin() ) makes it lock less often.
-- later edit: nah, the issue still happens.

** Hardware Note Aside:
One would be tempted to say that I may miss pull-up resistors in my configuration.
I have an ATMega 4808 (Thinary Nano Every), SDA (D4) and SCL (D5) connected to:
- One bi-directional logic-level converter which comes with 10k pull-up resistors (on both sides), paired to a HTU21D board which also has 4.7k pull-up resistors, and
- a HMC5883L which also has 4.7k pull-up resistors, that goes directly into the SDA/SCL pins of the MCU.

Can't figure out why what the cause is, why the bus freezes after a while... maybe I'll buy an oscilloscope which would offer some more info of what's going on there.

@adelin-mcbsoft
Copy link

For anyone having the issue I had (I2C bus blocked after a while, due to SDA stuck LOW) when using GY-273 boards with HMC5883L on it: Those boards have 4.7k pull-up resistors on it. However, the datasheet of HMC5883L recommend having 2.2k pull-ups.

Added two 3.3k additional resistor (in parallel) to each line (SDA, SCL), making the final value a total of ~2k .
No more bus lock.

@Flole998
Copy link

@adelin-mcbsoft Did I understand that correctly that you got the library from @SpenceKonde to work? I would need the dual operation mode feature from it. If you got it ported over, could you please open a pull request so others can benefit from it aswell?

@goodchip
Copy link
Contributor

goodchip commented Aug 18, 2023

I've fixed the current version of this reimplementation of Wire library for megaAVR-0 series (some compiling errors).
Tested working fine with many I2C slave on 4808 : https://github.com/goodchip/megaTinyCore/tree/master/megaavr/libraries/Wire

@SpenceKonde :
#134 (comment) : pull-request sent :)

@MCUdude : have you possibility to merge this nice wire library to MegaCoreX in the future?

@softhack007 : nice Cannon-Lemming'Fodder avatar! :3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants