-
Notifications
You must be signed in to change notification settings - Fork 415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better error handling in the HAL #1076
Comments
Could you elaborate a bit more on what this would look like? It's true that on MCUs typically the only recovery method is a reset and/or power cycle. We also don't expect things like SPI initialization or GPIO writes to fail, because in a multithreaded environment (e.g. some RTOS), if the SPI/GPIO is to be considered a shared resource, it is up to the user of the library to guard that resource properly - RadioLib is agnostic about this, that's why HAL exists. However, I don't think it's up to the library itself to be deciding what is and is not recoverrable, that should be delegated to the HAL implementation for the specific platform. For example, some Linux HAL might implement additional error checking when accessing system resources (GPIO and SPI), but if a radio transciever refuses to execute a command (e.g. attempting to set an out-of-range carrier frequency), I don't think the error handling on Linux will be different than on any MCU. |
An example would be Things you might expect to be infallible can fail on Linux too. Even delay functions can fail, for instance, if a signal is received and we woke up too soon (though in that case it may be better to just handle EINTR). Or say an SPI transfer fails. Claiming/reading/writing pins also on Linux can fail. It doesn't make sense of course to put error returns everywhere. For instance, in theory something like Generally speaking, the right thing to do if an operation fails and it's not recoverable is tell the application about it somehow. If that's not possible for whatever reason, maybe call an |
You are correct that the current HAL interface is mostly based on the assumption that the library has exclusive access to the system resources. This assumption obviously does not hold in Linux, but the real problem is that this assumption falls apart in any multi-threaded environment. Even ones common on MCUs, like RTOS - for example, delay can fail very easily there, just by getting blocked by a higher-priority thread. While no exception will be generated, the behavior will not be what you intended. In other example, acquiring SPI access will likely involve obtaining some mutex, which may already be held by some other thread. So this should probably be resolved in the HAL. Having at least |
This is going to be an involved process, but I think it's a good idea to begin doing error checking in the HAL and reporting errors back up the stack and then aborting appropriately (maybe a HAL abort() function when things really go south? also returning errors when relevant).
Although it's safe to assume on Arduinos that functions will either succeed or fail in an irrecoverable way and therefore it doesn't really matter, on Linux and such this may not be the case. For example, we might not be able to open the GPIO device, or we can't acquire a resource.
I think this would be immensely helpful to consumers of the library, especially on non-Arduino platforms.
The text was updated successfully, but these errors were encountered: