-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[powermax] Improved error handling for Powermax binding #9817
Conversation
Catch all exceptions, and take all Powermax things offline in the event of a communication failure. Also adds a 5-minute watchdog for connections in Standard (non-Powerlink) mode. Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've changed a lot of log messages from debug to info or warn. In general if an error occurs the message should be reflected in the status. Also we try to minimize the logging in bindings. See also https://www.openhab.org/docs/developer/guidelines.html#f-logging . There I think you should not change the log levels here.
...owermax/src/main/java/org/openhab/binding/powermax/internal/connector/PowermaxConnector.java
Outdated
Show resolved
Hide resolved
...owermax/src/main/java/org/openhab/binding/powermax/internal/connector/PowermaxConnector.java
Outdated
Show resolved
Hide resolved
I did not yet find the time to look at this PR but please note I absolutely don't want that a CRC error leads to all things going OFFLINE. This will simply break the binding! |
I can confirm, I see that CRC error occasionally too. Currently it only goes offline in the event of an I/O exception where communication with the panel is lost. The 5-minute keepalive in standard mode is needed to ensure that the binding attempts to talk to the panel regularly, because otherwise we might not know that the connection is lost. I'm going to make the changes requested by Hilbrand and I might also add a check that goes offline if there are no messages received in a period of time. That would catch the case where the serial connection is still active but the panel just stops responding. I think this happened to me once. |
So, digging into this a bit more... there were a bunch of error messages being swallowed before. See here for example. If the connection failed for any of those reasons, the exact reason was logged only at debug level, and the binding would give you a generic failure message. Switching them all from debug to warn was the easiest change, but not consistent with the principle @Hilbrand mentioned above. I can work on propagating these messages up to the bridge state but it will be a bit more involved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one important question.
Regarding log level, almost all logs are DEBUG very probably because I was requested to do it like that during the review process of the binding. If every binding tries to use INFO/WARN logging, you have by default a lot of logs when starting openHAB. The idea is I believe to have a clean log when starting OH. And you have to take care to not have by default regular logging just for something which is finally normal.
...owermax/src/main/java/org/openhab/binding/powermax/internal/message/PowermaxCommManager.java
Show resolved
Hide resolved
Incidentally @lolodomo, this is the one CRC error I see consistently, every time the panel exits Downloading mode:
In this client, they skip the CRC check on 0xF1 messages. Maybe it's a panel bug? Nobody seems to know what's in the 0xF1 messages anyway, so maybe I'll put in the same fix. I also noticed that my settings download fails every time and forces unnecessary retries:
This is because I have a Powermax+ which doesn't support partitions. I'm going to add a check which skips the DL_PARTITIONS step if the panel type only has one partition. |
1. Mark the bridge and all things as offline when communication is lost 2. Bubble all I/O exceptions up to the bridge status 3. Set all channels to UnDefType.NULL when bridge goes offline 4. Ignore CRC errors on messages of type 0xF1 5. Don't try and download partition settings for panels that don't support partitions 6. Fix logging levels to avoid unnecessary non-DEBUG messages Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
...rmax/src/main/java/org/openhab/binding/powermax/internal/connector/PowermaxReaderThread.java
Outdated
Show resolved
Hide resolved
...ermax/src/main/java/org/openhab/binding/powermax/internal/handler/PowermaxBridgeHandler.java
Outdated
Show resolved
Hide resolved
...ermax/src/main/java/org/openhab/binding/powermax/internal/handler/PowermaxBridgeHandler.java
Outdated
Show resolved
Hide resolved
...ermax/src/main/java/org/openhab/binding/powermax/internal/handler/PowermaxBridgeHandler.java
Show resolved
Hide resolved
Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@Hilbrand - ready to merge if you're good. |
@Hilbrand - any additional comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Although the change of the log to info could be better. Because these are now on by default the messages itself don't give any context about what thing is given these messages. So for the user it might not directly clear where these messages come from. Might be something to see if people will notice mention this.
* Improved error handling for Powermax binding Catch all exceptions, and take all Powermax things offline in the event of a communication failure. Also adds a 5-minute watchdog for connections in Standard (non-Powerlink) mode. * More Powermax error handling improvements 1. Mark the bridge and all things as offline when communication is lost 2. Bubble all I/O exceptions up to the bridge status 3. Set all channels to UnDefType.NULL when bridge goes offline 4. Ignore CRC errors on messages of type 0xF1 5. Don't try and download partition settings for panels that don't support partitions 6. Fix logging levels to avoid unnecessary non-DEBUG messages Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com> Signed-off-by: John Marshall <john.marshall.au@gmail.com>
* Improved error handling for Powermax binding Catch all exceptions, and take all Powermax things offline in the event of a communication failure. Also adds a 5-minute watchdog for connections in Standard (non-Powerlink) mode. * More Powermax error handling improvements 1. Mark the bridge and all things as offline when communication is lost 2. Bubble all I/O exceptions up to the bridge status 3. Set all channels to UnDefType.NULL when bridge goes offline 4. Ignore CRC errors on messages of type 0xF1 5. Don't try and download partition settings for panels that don't support partitions 6. Fix logging levels to avoid unnecessary non-DEBUG messages Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
* Improved error handling for Powermax binding Catch all exceptions, and take all Powermax things offline in the event of a communication failure. Also adds a 5-minute watchdog for connections in Standard (non-Powerlink) mode. * More Powermax error handling improvements 1. Mark the bridge and all things as offline when communication is lost 2. Bubble all I/O exceptions up to the bridge status 3. Set all channels to UnDefType.NULL when bridge goes offline 4. Ignore CRC errors on messages of type 0xF1 5. Don't try and download partition settings for panels that don't support partitions 6. Fix logging levels to avoid unnecessary non-DEBUG messages Signed-off-by: Ron Isaacson <isaacson.ron@gmail.com>
Catch all exceptions, and take all Powermax things offline in the event
of a communication failure. Also adds a 5-minute watchdog for
connections in Standard (non-Powerlink) mode.
Signed-off-by: Ron Isaacson isaacson.ron@gmail.com
Closes #9688