Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hue] Error handling improvements #15460

Closed
jlaur opened this issue Aug 19, 2023 · 12 comments · Fixed by #15477
Closed

[hue] Error handling improvements #15460

jlaur opened this issue Aug 19, 2023 · 12 comments · Fixed by #15477
Labels
enhancement An enhancement or new feature for an existing add-on

Comments

@jlaur
Copy link
Contributor

jlaur commented Aug 19, 2023

Placeholder for some minor issues found.

Status description missing

This was observed in my logs:

2023-08-19 13:43:23.638 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge-api2:home' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): An unexpected exception null occurred.

When looking in the UI:

  • OFFLINE
  • COMMUNICATION_ERROR
  • An unexpected exception {} occurred.

One issue seems to be a problem related to I18N. This works:

offline.api2.comm-error.exception = An unexpected exception occurred: {0}

This doesn't:

offline.api2.comm-error.exception = An unexpected exception '{}' occurred.

This doesn't work either:

offline.api2.comm-error.exception = An unexpected exception '{0}' occurred.

It seems the placeholder variable needs to be at the end. Perhaps this feature is some kind of pure luck and is not designed to be used like this at all. I would need to investigate further.

Incomplete HTTP status code check

Another issue is here:

This will usually happen when HTML content is returned. Most likely there is a HTTP status code that would be more interesting than the MIME type. However, only a few are explicitly handled, and the code is not returned:

switch (httpStatus) {
case HttpStatus.UNAUTHORIZED_401:
case HttpStatus.FORBIDDEN_403:
handleHttp2Error(Http2Error.UNAUTHORIZED);
default:
}

I'm not sure what would be the best implementation for this, but a straight-forward one would be:

  • Add instance variable int httpStatus to BaseStreamListenerAdapter.
  • Add method int getHttpStatus() for returning it.
  • Set the variable just before the switch statement above.
  • Check it in putResource and make sure to log it and also use the HTTP status rather than MIME type for determining if the call was successful. We can still check MIME type even for code 200 for being safe.
@jlaur jlaur added the enhancement An enhancement or new feature for an existing add-on label Aug 19, 2023
@andrewfg
Copy link
Contributor

also use the HTTP status rather than mimetype for determining if the call was successful. We can still check MIME type even for code 200 for being safe.

Regardless of whether the response is valid JSON or an HTML error message, the payload is returned with HTTP 200 OK, so the ContentType is the only way to tell if something was wrong.

@andrewfg
Copy link
Contributor

@jlaur the Hue API documentation (e.g. see link below) gives all the error codes that might potentially be reported for each type of call i.e. 400, 401, 403, 404, 405, 429, 500, 503, 507 (maybe others too). In reality I only ever saw 401, 403 and of course 200..

https://developers.meethue.com/develop/hue-api-v2/api-reference/#resource_light__id__put

@openhab-bot
Copy link
Collaborator

This issue has been mentioned on openHAB Community. There might be relevant details there:

https://community.openhab.org/t/philips-hue-clip-2-api-v2-discussion-thread/142111/239

@jlaur
Copy link
Contributor Author

jlaur commented Aug 20, 2023

Regardless of whether the response is valid JSON or an HTML error message, the payload is returned with HTTP 200 OK, so the ContentType is the only way to tell if something was wrong.

OK, I didn't know that because the HTTP error code is not logged. 🙂 Anyway, it doesn't contradict what I wrote:

We can still check MIME type even for code 200 for being safe.

So my suggestion would be to check error code first and react appropriately on that. For 200 we would then move ahead and check Content-Type like now, since nginx seems to react in ways not taken into consideration in the documentation. And last, even for application/json we should anticipate that the deserialization can fail if we receive an unexpected payload. This part is probably already handled correctly.

the Hue API documentation (e.g. see link below) gives all the error codes that might potentially be reported for each type of call i.e. 400, 401, 403, 404, 405, 429, 500, 503, 507 (maybe others too). In reality I only ever saw 401, 403 and of course 200..

Thanks for the link. I briefly checked and they claim application/json for all those codes, which you have proven wrong.

@andrewfg
Copy link
Contributor

andrewfg commented Aug 20, 2023

they claim application/json for all those codes

Yes. Although I think that strictly the HTML error is not an API error but rather a default fallback error of the bridge when the API is not running. If you open the HTML in a browser it displays a simple web page saying "sorry no lighting found" (something like that). You will also get the same error if you try (say) to open an API v2 page in your browser using something like this http://192.168.1.123/clip/v2/resource/light


<html>
 <head>
     <title>hue personal wireless lighting</title>
     <link rel="stylesheet" type="text/css" href="/index.css">
 </head>
 <body>
     <div class="philips-header">
       <img src="/philips-blue.png" class="philips-logo" alt="Philips" />
     </div>
     <div class="header">
       <img class="header-logo" src="/hue-logo.png" alt="hue personal wireless lighting" />
       <img src="/hue-color-line.png" class="colorline" />
     </div>
     <div class="error">Oops, there appears to be no lighting here</div>
 </body>
</html>

image

@andrewfg
Copy link
Contributor

my suggestion would be to check error code first and react appropriately on that

@jlaur as you know from #15350 I am doing some work to improve the stability of Clip2Bridge in relation to GO_AWAY and re-starting the session. And I have already added some code to my work in progress to read and check the HTTP status code. So you don't need to write anything for this check yourself.

@andrewfg
Copy link
Contributor

It seems the placeholder variable needs to be at the end. Perhaps this feature is some kind of pure luck and is not designed to be used like this at all. I would need to investigate further.

@jlaur can I leave this one up to you?

@jlaur
Copy link
Contributor Author

jlaur commented Aug 21, 2023

It seems the placeholder variable needs to be at the end. Perhaps this feature is some kind of pure luck and is not designed to be used like this at all. I would need to investigate further.

@jlaur can I leave this one up to you?

Yes. In the meantime I found the documentation:
https://www.openhab.org/javadoc/latest/org/openhab/core/thing/i18n/thingstatusinfoi18nlocalizationservice

I'm currently looking into the implementation. This should work:

offline.api2.comm-error.exception = An unexpected exception '{0}' occurred.

when used like this:

updateStatus(ThingStatus.OFFLINE, ThingStatusDetail.COMMUNICATION_ERROR,
        "@text/offline.api2.comm-error.exception [\"" + e.getMessage() + "\"]");

However, for some reason I'm seeing this in my tests:

image

@jlaur
Copy link
Contributor Author

jlaur commented Aug 21, 2023

I'm currently looking into the implementation. This should work:

Okay, I found the issue. The position is not the problem, the surrounding apostrophes are.

@jlaur
Copy link
Contributor Author

jlaur commented Aug 30, 2023

Small note - log from this morning:

2023-08-30 06:53:12.430 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge-api2:home' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): An unexpected exception 'Error sending request' occurred.
2023-08-30 07:53:10.830 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge-api2:home' changed from OFFLINE (COMMUNICATION_ERROR): An unexpected exception 'Error sending request' occurred. to ONLINE

This raises two questions:

  • What caused it to go offline?
  • Why was it offline almost a full hour?

I will try to add better logging for the "Error sending request" part in my own build to get more information the next time it happens. And also have a look at the code regarding reconnect logic. If any of this justifies a new issue (and/or PR), I will then create that as a result.

Additional notes:

  • I don't have automatic firmware upgrades enabled on my bridge.
  • My system was not affected in general (or having network issues etc.)

@andrewfg
Copy link
Contributor

^
I am pretty sure that this was related to the GO_AWAY issue; after wrongly handling a GO_AWAY the bridge would get into a bad state that would take time to recover from; and as we have fixed that issue then this issue should probably no longer occur..

@jlaur
Copy link
Contributor Author

jlaur commented Aug 30, 2023

and as we have fixed that issue then this issue should probably no longer occur.

It did.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An enhancement or new feature for an existing add-on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants