-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INA237 Sensor Bug Fixes and Enhancements (with unit tests) #60717
INA237 Sensor Bug Fixes and Enhancements (with unit tests) #60717
Conversation
f943526
to
8bc9761
Compare
c9f73c4
to
41ad41e
Compare
drivers/sensor/ina23x/ina237_emul.c
Outdated
* | ||
* SPDX-License-Identifier: Apache-2.0 | ||
* | ||
* Emulator for the TI INA237 I2C power monitor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sad to see emulators being used to mock. why can't we just use FFF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had followed the pattern of some of the other sensors which did emulators. Also, in this case part of the complexity is the calculation done at compile time using the DT configuration for the struct ina237_config::cal
property in drivers/sensor/ina23x/ina237.c - INA237_DRIVER_INIT.
The FFF could potentially be used to replace the ina23x_reg_read*()' and 'ina23x_reg_write()
functions if the rest of the functionality works. Do you have a suggestion on best to replace those functions that are in zephyr/drivers/sensor/ina23x/ina23x_common.c
? I am used to other mocking frameworks and have limited experience with FFF. If we can replace the ina23x_common.c
functions, then that would allow eliminating the I2C message decode/encode portion of the ina237_emul.c
file. The register array would probably still be used since that is the easiest way to capture multiple writes done by ina237_init()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to fix the issue for now and open another PR to discuss testing approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to use FFF to replace the ina23x_reg_read*()
and ina23x_reg_write()
functions and I wasn't able to solve the problem where the INA237 driver did reads and writes before the unit test started and I could set the .custom_fake to the mocked function. Not sure if there is a way to load the driver dynamically and if that is a possibility for testing.
I did do another version where I did link-time replacement of the ina23x_reg_read*()
and ina23x_reg_write()
functions with mocks. It avoids have to emulate individual I2C transactions and doesn't create a specific emulator, but there is still a generic I2C emulator created behind the scenes. It isn't any better than the emulated version and has the disadvantage of testing less. You can take a look here: Version using ina23x_reg_readXX() and ina23x_reg_write() mocks
Taking a look at other options, the zephyr/tests/drivers/sensor/icm42688
test case is interesting. It uses FFF, but just for the interrupt mock and still uses an emulator.
Here are all of the sensor tests that I am aware of and a summary of what they are doing:
tests/drivers/sensor
├── accel I2C and SPI emulators behind the scenes, but doesn't test much
├── akm09918c Uses I2C emulator in ./drivers/sensor/akm09918c/akm09918c_emul.c
├── generic Uses DEVICE_DEFINE() to create a dummy sensor_driver_api driver
├── icm42688 Uses SPI emulator in ./drivers/sensor/icm42688/icm42688_emul.c
├── ina237 The driver we are talking about
└── sbs_gauge I2C driver (but don't see any reference to an emulator). Focuses on making sure driver simply
returns a value vs. actually doing full testing.
So at this point, what would you like to do. Should we:
- merge this patch set as is and open a discussion in regards to the emulator
- just fix the bugs and remove the unit tests and open a discussion
- some other alternative
My preference is to merge this as is given that it follows the existing patterns. If there is an easier way to test it, I am all for that as well since it will be useful in the future, but after spending an entire day trying different options, I'm still convinced the first approach here is the best as the alternatives test less and are just as complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gmarull - please take another look and let me know if you are happy with this as-is or would like to refactor it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the emulator out of the drivers/sensor/ina23x and included it with the test case. Hopefully the cluttering of the drivers folder with the emulators was your concern and you don't mind then in the emulator.
Changes have been included in the latest push to the PR. PTAL when you have a chance.
drivers/sensor/ina23x/ina237.c
Outdated
@@ -165,6 +183,14 @@ static int ina237_read_data(const struct device *dev) | |||
} | |||
} | |||
|
|||
if ((data->chan == SENSOR_CHAN_ALL) || (data->chan == SENSOR_CHAN_VSHUNT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a real use case for reading the shunt voltage?
The current measurement and shunt voltage are the same value. The only difference is the current has been pre-divided by the constant cal value. If you want shunt voltage, one could just use a unity cal. And if you want both, just multiply current by the cal value to convert it to shunt voltage. It'll be much faster to do the one multiplication than to do another I2C transaction and use RAM to cache the result.
An extra I2C read is going to slow down the sensor for anyone using it the normal way to read voltage and current. Plus more RAM for the cache. IMHO, it doesn't seem worth it unless there is a reason beyond debugging the driver that I'm missing.
TBH, until someone ports Zephyr to the 8051, all the conversion and math stuff this chip does is obsolete. The most efficient driver would just read shunt voltage and voltage, ignore the cal stuff totally, and do all the math on the CPU, as any modern CPU can do this math in the fraction of the time it takes to read just one extra byte over I2C.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a real use case for reading the shunt voltage?
Very true, it is just for debug at this point, although I have considered doing the math in the micro instead just as you mentioned later. I'll remove it from the SENSOR_CHAN_ALL group. Based upon the extra cached value, did you want me to actually remove this completely?
TBH, until someone ports Zephyr to the 8051, all the conversion and math stuff this chip does is obsolete. The most efficient driver would just read shunt voltage and voltage, ignore the cal stuff totally, and do all the math on the CPU
Yes, I agree with you on this part after going through everything. I also looked at doing a bulk read of all of the registers so at least there is only one I2C transaction, but that is not supported by the chip :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it from the ALL_CHANNELS list in the next patchset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xyzzy42 Please take another look and resolve this thread if you are happy with removing the shunt voltage from ALL_CHANNELS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I think doesn't really matter, but I'd probably make it optional via a menuconfig entry. I wouldn't want to enable it by default, since it has a continuous runtime cost with zero benefit to virtually all users. But it could be useful for debugging or some unusual use case like non-constant shunt resistance value or using the INA237 as a mV meter. And I do like to expose every feature I can.
Having it there, but not in the ALL_CHANNELS list seems like it would be confusing. I don't think there's a precedent for that and it's contrary to the sensor docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comments do matter. I like the idea of making it a Kconfig item as I agree with the sensor not being in the ALL_CHANNEL group may be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xyzzy42 I have added the KConfig and kept the current shunt in the ALL_CHANNEL group. Please take another look when you have a chance.
drivers/sensor/ina23x/ina237.c
Outdated
@@ -63,7 +64,7 @@ static int ina237_channel_get(const struct device *dev, enum sensor_channel chan | |||
|
|||
case SENSOR_CHAN_POWER: | |||
/* see datasheet "Current and Power calculations" section */ | |||
power_uw = (data->power * INA237_POWER_SCALING * config->current_lsb) / 10000U; | |||
power_uw = (data->power * INA237_POWER_SCALING * config->current_lsb) / 10U; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wondered where the factor of 1000 was coming from in the previous PR for this sensor. I see it was a bug.
There is no reason to make power_uw 64 bit here. It won't overflow until the power reaches 4.29 kW, which is far more than this sensor is designed to measure.
The intermediate product doesn't overflow until 858 W, which is also more than the sensor can measure.
It can be improved by changing the … * 2 / 10
into … / 5
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to make power_uw 64 bit here. It won't overflow until the power reaches 4.29 kW, which is far more than this sensor is designed to measure.
There is no fixed limit that the sensor measures since the INA237 uses an external shunt resistor, so the measurement limit is just based upon the data types involved. Power is 24 bits, INA237_POWER_SCALING is 1 bit, and current_lsb can be up to 16 bits for a total of 41 bits in the numerator.
It can be improved by changing the … * 2 / 10 into … / 5.
Good point, will change INA237_POWER_SCALING to 5, use it as a divisor, and will adjust the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to include the / 5 option. Kept as 64-bit due to number of bits in scaling since it impacts my use case which requires 34 bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xyzzy42 PTAL
@@ -49,13 +49,9 @@ static int ina237_channel_get(const struct device *dev, enum sensor_channel chan | |||
break; | |||
|
|||
case SENSOR_CHAN_CURRENT: | |||
if (data->current & INA23X_CURRENT_SIGN_BIT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't ina230 have this same problem? INA23X_CURRENT_SIGN_BIT
looks to be unused after this change.
The existing code is broken not just on ESP32-S3, but on all processors. The flaw is not understanding the integer promotions concept in the C standard. While data->current
is 16-bit, the expression ~data->current
does NOT mean to invert bits of the 16-bit value! The first step is promote data->current
from unsigned 16-bit to a signed 32-bit integer. Then this 32-bit integer (the upper 16-bits are of course zero) will be inverted. That's not what was desired. What would produce the current result would be to invert the 16-bit value, add 1 to the still 16-bit value, and then take the 16-bit result and promote it to 32-bits at the end. But 16->32 conversion happens first, not last.
Anyway, this code is somewhat inefficient, actually performing the multiply to invert the sign. Better code to take absolute value:
current_ua = data->current < 0 ? -data->current : data->current;
// or, producing the same code
current_ua = abs(data->current);
You can see the results of different methods: original, your method, and the above examples
https://godbolt.org/z/qz3vsG4M6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the Compiler Explorer link. Probably just go with abs() in the negative case given the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't ina230 have this same problem?
Yes, it does have the same issue code issue, but I do not have INA230 hardware to test, so I didn't want to change the code and release it untested. Seems like a relatively safe thing to change and verify by inspection and/or find a volunteer to test it.
Update: I have modified the INA230 code as well and requested testing on Discord, but no takers so far
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I have added a unit test for the INA230 and tested the change, so should be fixed.
Changes have been included in the latest push to the PR. PTAL when you have a chance.
41ad41e
to
d9125b5
Compare
449b05a
to
08f7958
Compare
08f7958
to
02ab56b
Compare
Fix current sign extension logic and consolidate sensor scaling code into a single block. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
The current-shunt calibration requires a factor of 4x if high-precision mode is selected. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
8623c4d
to
0f9a079
Compare
@@ -43,6 +43,7 @@ struct ina237_data { | |||
uint16_t bus_voltage; | |||
uint32_t power; | |||
int16_t die_temp; | |||
int16_t shunt_voltage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be ifdef protected, so it doesn't take space in the struct unless enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done and pushed.
Add ability to retrieve the shunt voltage if the end user wants to do the power calculation manually to handle negative power values. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
Add properties to replace the configuration register value. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
Add emulator unit test of the INA230. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
Fix sign handling for ina230 current calculation. Signed-off-by: Eric Holmberg <eric.holmberg@northriversystems.co.nz>
0f9a079
to
d5201e1
Compare
The `SENSOR_CHAN_VSHUNT` was added in zephyrproject-rtos#60717 but was never added to the `sensor_channel_name[SENSOR_CHAN_COMMON_COUNT]` table. Since the length of `sensor_channel_name` is fixed to `SENSOR_CHAN_COMMON_COUNT`, this means that the index at `SENSOR_CHAN_VSHUNT` points to `NULL`. When we use the `sensor get` command for anything bigger than `SENSOR_CHAN_VSHUNT`, we will deref that `NULL` pointer when we do `strcmp` in the for-loop of `parse_named_int`. Fix this by defining `SENSOR_CHAN_VSHUNT` in the table. Signed-off-by: Yong Cong Sin <ycsin@meta.com>
The `SENSOR_CHAN_VSHUNT` was added in zephyrproject-rtos#60717 but was never added to the `sensor_channel_name[SENSOR_CHAN_COMMON_COUNT]` table. Since the length of `sensor_channel_name` is fixed to `SENSOR_CHAN_COMMON_COUNT`, this means that the index at `SENSOR_CHAN_VSHUNT` points to `NULL`. When we use the `sensor get` command for anything bigger than `SENSOR_CHAN_VSHUNT`, we will deref that `NULL` pointer when we do `strcmp` in the for-loop of `parse_named_int`. Fix this by defining `SENSOR_CHAN_VSHUNT` in the table. Signed-off-by: Yong Cong Sin <ycsin@meta.com>
The `SENSOR_CHAN_VSHUNT` was added in #60717 but was never added to the `sensor_channel_name[SENSOR_CHAN_COMMON_COUNT]` table. Since the length of `sensor_channel_name` is fixed to `SENSOR_CHAN_COMMON_COUNT`, this means that the index at `SENSOR_CHAN_VSHUNT` points to `NULL`. When we use the `sensor get` command for anything bigger than `SENSOR_CHAN_VSHUNT`, we will deref that `NULL` pointer when we do `strcmp` in the for-loop of `parse_named_int`. Fix this by defining `SENSOR_CHAN_VSHUNT` in the table. Signed-off-by: Yong Cong Sin <ycsin@meta.com>
The `SENSOR_CHAN_VSHUNT` was added in zephyrproject-rtos#60717 but was never added to the `sensor_channel_name[SENSOR_CHAN_COMMON_COUNT]` table. Since the length of `sensor_channel_name` is fixed to `SENSOR_CHAN_COMMON_COUNT`, this means that the index at `SENSOR_CHAN_VSHUNT` points to `NULL`. When we use the `sensor get` command for anything bigger than `SENSOR_CHAN_VSHUNT`, we will deref that `NULL` pointer when we do `strcmp` in the for-loop of `parse_named_int`. Fix this by defining `SENSOR_CHAN_VSHUNT` in the table. Signed-off-by: Yong Cong Sin <ycsin@meta.com>
I have fixed several bugs in the INA237 and INA230 sensor drivers and also added unit tests to make sure the driver doesn't get broken in the future and/or make it easier to fix.
Fixes #60722
To run unit tests: