-
I have a program that uses the default Wire as both master and slave on a multi-master bus, and also uses Wire1 as a standard master only on a separate bus. The device is a 32-pin DA. This worked well for a while with multiple devices, and then broke significantly during recent testing. After a lot of checking and trying things, I realized it was a result of updating to a 1.5+ core version, including the latest 1.5.x. Switching back to 1.4.10 has everything working again, and it's repeatable.

Initializing as below (simplified):

#define SELF_ADDR 0x10
void setup(){
  Wire.swap(2);
  Wire.usePullups();
  Wire.begin();
  Wire.begin(SELF_ADDR, 0, 0);
  Wire1.begin();
}

I do have receive/request callbacks:

Wire.onReceive(receiveEvent);
Wire.onRequest(requestEvent);

Then many functions write to or request from devices using beginTransmission(), write(), endTransmission(), or requestFrom() in the usual way, but these are not working because no master operation takes place. I'm not using any special device libraries for any I2C devices; all the functions are within my multi-file project.

I'm thinking there's an issue with how the peripherals or library are being initialized to function as master, unless the master operation functions above are not being called correctly. I started to look through the library a bit and compare the versions, but I know there was a major change to the Wire library, so there's a lot to dig through. So far it appeared that a lot of function overloads were removed, but I could be misunderstanding the diff results I was looking through.

If anyone could point me in a direction to work out what could be going on and how to address it, much appreciated.
-
@MX682X Any thoughts? From the description above, I'm not exactly sure what the system that is causing the problem really looks like.
-
@jl-lewis Could you try swapping the order so that the slave begin() comes before the master begin()? I'd like to know whether the slave begin() is overwriting something.
-
Can we get a schematic of the simplest setup that manifests this bug?
-
OK, I will work on getting set up to read the error codes out, probably over serial. I will have to piece together schematics into more of a system overview showing the buses and devices. The MCU is pretty much a minimalist setup on a castellated module, with just GPIOs and TWI connections, that attaches to a larger board that's part of a larger system. 3.3V, internal oscillator. In the meantime, I'll attach the output of a compile that may or may not give any clues, in case you see anything unusual.
-
This is the basic schematic of the MCU module:

The main system controller is a PC with a USB-I2C interface, and it connects through an I2C bus switch array to multiple MCU boards. The PC can switch boards on/off via an upstream I2C GPIO and can set the interrupt to an MCU to get exclusive control of that module's bus. This is necessary because the I2C bus switches are unintelligent and do not monitor or handle behavior based on downstream bus status. If the bus is busy when it connects, it will cause a crash.

The interrupt is set for the MCU so that it inhibits master functions: the current loop operations will complete but the next loop will block master operations, and then after some delay the PC controller connects and initiates transfers. I'm not doing any end() or messing with interface setup, just not initiating master operations at certain times. So it's not really a normal multi-master situation. Either master can share the same bus, but arbitration shouldn't occur since bus control is exchanged by an interrupt status. In hindsight, there are better ways to build out a system like this, but it's legacy that I have to stick with at this point, and there are mechanical/layout reasons for it to be done this way as well.

Generally, there is a fast task loop that's 20 ms and a slow task loop that's 500 ms, which is just done by gating with timed if() statements using millis() tracking. The PC acquisition happens asynchronously, usually at 5-30 second intervals, and takes about 100 ms or a bit more.

All that said, this has worked well for some time with essentially the same codebase on ArduinoCore-avr with a 328P, and then of course with DxCore 1.4.10 and prior. With only 1 TWI on the 328P, I have the ability to operate with just a single bus for all devices if necessary. I was trying out the capability of going with 2x TWI to free up bandwidth on the main bus and also potentially to separate the bus physically into smaller ones. Another "in hindsight": there are software I2C libraries that would probably handle my use case fine and be more portable code.

I made another gist with a more complete listing of the code. Overall there are around 1500 lines between all the files, so I'm trying to give a condensed version without stripping out something that might be relevant to what's going wrong. This is not expected to compile and I haven't tried yet. It's missing most function definitions/libraries and variable declarations to keep it simple for now.

It does appear that communicating with specific TWI devices in the setup() section is successful, as I'm able to physically check the GPIO outputs and also get valid data from the ADC. So at that point master operation is working, and I don't know what is altering the behavior after that. Is there potentially any state the library or the peripheral itself could be put in at a later time, such that it hangs or stops operating normally as master while slave operation still continues? It seems to still operate correctly as slave, but no longer seems to perform master operations successfully, neither writing bytes to slave devices nor reading data back from them. Perhaps it is not driving the clock output correctly when switching to master, which would result in devices just not responding with any data bit changes? I'm assuming the clock stretching support (including talking to slaves that can stretch) is there as well, as I've always counted on having that or some devices won't work correctly, and from reading the documentation it seems that should still be supported.

I'm sure the unusual use of an interrupt to gate master operations would be suspect, but I can confirm that has worked normally with the other platforms and versions. There's some more testing I can probably do with that, but this is also something I'd like to eliminate from working example code. I have used an LED to see that the interrupt is triggering and reading the pin state correctly, but have not dug deeper into the masterState variable state and function execution yet.

Obviously I have a fairly complex project and I'm trying to reframe my approach to figure out this issue, but I'm starting to see that I may need to start over with basic example code and see if I can replicate the issue. I want to do what I can to make sure there's not something weird I'm doing that happens to be interacting, but either way there is some kind of interaction. I will try to take some time to build a much simpler approach and see if the discrepancy still exists.
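For readers following along, the loop structure described above (20 ms and 500 ms tasks gated on millis(), with master operations skipped while the PC holds the interrupt line) can be sketched on a host as below. This is only a minimal model under assumptions: `TaskScheduler`, `tick()`, `masterInhibited`, and the counters are hypothetical names, not taken from the project or from DxCore, and the timestamp is passed in rather than read from millis().

```cpp
#include <cstdint>

// Hypothetical host-side model of the loop gating described in the post.
// None of these names come from the original project or from DxCore.
constexpr uint32_t kFastPeriodMs = 20;   // fast task interval
constexpr uint32_t kSlowPeriodMs = 500;  // slow task interval

struct TaskScheduler {
    uint32_t lastFastMs = 0;
    uint32_t lastSlowMs = 0;
    uint32_t fastRuns = 0;   // times the fast task body ran
    uint32_t slowRuns = 0;   // times the slow task body ran
    uint32_t masterOps = 0;  // fast-task runs that were allowed to use the bus

    // nowMs stands in for millis(); masterInhibited stands in for the
    // interrupt status that hands the bus to the PC controller.
    void tick(uint32_t nowMs, bool masterInhibited) {
        if (nowMs - lastFastMs >= kFastPeriodMs) {
            lastFastMs = nowMs;
            ++fastRuns;
            if (!masterInhibited) {
                ++masterOps;  // a real sketch would start I2C master traffic here
            }
        }
        if (nowMs - lastSlowMs >= kSlowPeriodMs) {
            lastSlowMs = nowMs;
            ++slowRuns;
        }
    }
};
```

Calling `tick()` once per simulated millisecond with `masterInhibited` true for a window shows the tasks continuing to run while bus activity is skipped, which is the behavior the post describes.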
-
@MX682X - does this give you enough to go on? This is way out of my league. It is also a very high priority bug as it likely affects both cores.
-
@jl-lewis I've got to admit that I have no idea yet what might cause the issue. As far as I remember, the only thing that changed between Wire versions had something to do with sleep and slave operation. Also, do you have an oscilloscope? If so, could you try to monitor the TWI0 bus? Also, feel free to post the entire code; it's always easier for me to have the entire picture, at least as far as company policy allows, if there is any.
-
Well, I have access to a Saleae analyzer, so I did some captures with both core versions and definitely see the different behavior. The first place I see a difference is when I call the function below, via loop() --> updateCurr(), which calls setDACMulti(), and this sends a new array of values to an I2C DAC.

//***************************************************************************************
//set current setpoint DACs
int Load::setDACMulti(uint16_t dacValue[]){
  // return 0 : success
  // return 1 : wrong dac index
  // return 2 : value over range
  // return 3 : DAC 1 transmit error
  // return 4 : DAC 2 transmit error
  uint8_t ch_MSB = 0;
  uint8_t ch_LSB = 0;
  // Serial.print("SET DAC _");
  // Serial.print(dac, DEC);
  // Serial.print("_ to: ");
  // Serial.println(dacValue, DEC);
  Wire.beginTransmission(DAC1_ADDR);
  for(int i=0; i < 3; i++){
    dacValue[i] = (dacValue[i] / 5) << 2;
    if(dacValue[i] > 4095){
      //Serial.println("LD OVRNG,SET 0");
      return 2;
    }
    ch_MSB = dacValue[i] >> 8;
    ch_LSB = (dacValue[i] & 0xFF);
    Wire.write(ch_MSB); //upper 8 bits
    Wire.write(ch_LSB); //lower 8 bits
  }
  if(Wire.endTransmission(1) != 0){
    return 3;
  }
  Wire.beginTransmission(DAC2_ADDR);
  for(int i=3; i < 6; i++){
    dacValue[i] = (dacValue[i] / 5) << 2;
    if(dacValue[i] > 4095){
      //Serial.println("LD OVRNG,SET 0");
      return 2;
    }
    ch_MSB = dacValue[i] >> 8;
    ch_LSB = (dacValue[i] & 0xFF);
    Wire.write(ch_MSB); //upper 8 bits
    Wire.write(ch_LSB); //lower 8 bits
  }
  if(Wire.endTransmission(1) != 0){
    return 4;
  }
  return 0;
}

With 1.4.10, this looks like I'd expect; you can clearly see the frame with 3x 2-byte values (repeated values in this case):

And one with analog capture (sorry, I did this afterwards, but it's also zoomed a bit more):

And with 1.5.4, I see a start, address + write, ACK, then stop. This then repeats for other devices as well, just start, address byte, stop:

The write to 0x41 is in another section of code, and is for another ADC I've shown code for before. In this case it's a write-read using a restart, but the behavior is the same: since the initial write of the register value doesn't happen, the read doesn't return the expected value.

Again with analog capture as well, on 1.5.4:

I certainly wouldn't try to pretend I have a clean and beautiful waveform. Capacitance is certainly high and there is plenty of crosstalk, but it has generally had enough margin to work well, and based on what I know of the timing and when SDA is sampled, etc., it should be pretty well tolerated. It certainly can cause arbitrary transaction failures, particularly when other system noise is added, but I'm operating with most power circuits off for all of this debugging, with just the logic supply running.

As a bigger picture, 1.4.10 from program start:

1.5.4 from program start:

setup() happens and that's the first cluster of transactions; nothing unusual here. D2 is the interrupt from the PC controller, so the MCU is responding to those requests only during the time that it is high, which also looks OK. Then the last cluster is where I highlighted the problems above. D3 is the UART output for debug.
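As a cross-check when reading the captures, the per-channel arithmetic in setDACMulti() above can be reproduced on a host. The sketch below mirrors only the scaling, range check, and byte split from the posted function; `DacBytes` and `encodeDacValue()` are hypothetical names introduced for illustration, and the units of the input value are not stated in the post.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic in the posted setDACMulti():
// scale by 1/5, shift left by 2, reject codes above 4095, then split the
// result into the two bytes that are passed to Wire.write().
struct DacBytes {
    bool inRange;  // false when the scaled code exceeds 4095
    uint8_t msb;   // upper 8 bits of the scaled code
    uint8_t lsb;   // lower 8 bits of the scaled code
};

DacBytes encodeDacValue(uint16_t value) {
    const uint16_t code = static_cast<uint16_t>((value / 5) << 2);
    if (code > 4095) {
        return {false, 0, 0};
    }
    return {true,
            static_cast<uint8_t>(code >> 8),
            static_cast<uint8_t>(code & 0xFF)};
}
```

For example, an input of 1000 gives 1000 / 5 = 200, shifted left by 2 is 800 (0x0320), so the two data bytes expected on the bus for that channel are 0x03 and 0x20. An input of 5120 scales to 4096 and is rejected as over range.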
-
Thanks. The graphs helped me a lot, as I found my typo:
twi.c, line 637. _toggleStreamFn should be set to 0 (copy-paste does it again).
This variable switches the access between the master and slave arrays for writes, reads, and available. When it's 1, all writes go to the slave part, and this results in the "empty" master write after the first slave transaction.
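To illustrate why a stale flag produces exactly the address-only frames seen in the captures, here is a minimal host-side model. It is not the twi.c source: `DualModeWireModel`, its members, and the `resetFlagAfterSlave` switch are hypothetical names, and the only behavior taken from the explanation above is that a single flag routes write() to either the master or the slave buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, NOT the DxCore Wire/twi.c implementation.
// A single flag decides which buffer write() fills.
struct DualModeWireModel {
    std::vector<uint8_t> masterBuf;   // bytes queued for a master write
    std::vector<uint8_t> slaveBuf;    // bytes queued for a slave response
    uint8_t toggleStream = 0;         // 0: master buffer, 1: slave buffer
    bool resetFlagAfterSlave = true;  // false models the described typo

    void write(uint8_t b) {
        (toggleStream ? slaveBuf : masterBuf).push_back(b);
    }

    // Models a slave request callback that writes one response byte.
    void slaveTransaction(uint8_t response) {
        toggleStream = 1;
        write(response);
        slaveBuf.clear();  // response handed off to the bus
        toggleStream = resetFlagAfterSlave ? 0 : 1;
    }

    // Models beginTransmission()/write()/endTransmission() and returns how
    // many data bytes would follow the address byte on the bus.
    std::size_t masterWrite(const std::vector<uint8_t>& data) {
        masterBuf.clear();
        for (uint8_t b : data) {
            write(b);
        }
        const std::size_t sent = masterBuf.size();
        masterBuf.clear();
        return sent;
    }
};
```

With `resetFlagAfterSlave` set to false, master writes issued before any slave transaction still carry their data, which would be consistent with setup() working, while every master write after the first slave transaction carries zero data bytes, matching the start, address, stop pattern in the 1.5.4 captures.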