[hue] Bridge (v1) going OFFLINE and then ONLINE frequently due to connection lost #16723

lolodomo · 2024-05-07T08:03:17Z

I found in my logs that my bridge (v1) thing is going OFFLINE frequently, the reason is apparently a lost connection. When it happens, the bridge is going ONLINE 10s later. It happens even during the night when there is no activity in the house. The frequency is not regular.

2024-05-07 02:12:24.915 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 02:12:35.077 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 02:48:27.886 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 02:48:38.057 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 03:01:56.613 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 03:02:06.767 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 04:28:37.726 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 04:28:48.053 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 04:47:50.996 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 04:48:01.152 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 04:50:28.682 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 04:50:38.839 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 04:58:41.334 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 04:58:51.694 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 05:20:39.967 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 05:20:50.122 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 06:28:13.230 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 06:28:23.387 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE
2024-05-07 07:19:07.552 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from ONLINE to OFFLINE: Connexion au pont Hue perdue.
2024-05-07 07:19:17.707 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'hue:bridge:bridge1' changed from OFFLINE: Connexion au pont Hue perdue. to ONLINE

The text was updated successfully, but these errors were encountered:

lolodomo · 2024-05-07T08:05:45Z

Here is the line triggering the update of thing to OFFLINE:

openhab-addons/bundles/org.openhab.binding.hue/src/main/java/org/openhab/binding/hue/internal/handler/HueBridgeHandler.java

Line 738 in 283ae59

    
           updateStatus(ThingStatus.OFFLINE, ThingStatusDetail.NONE, "@text/offline.bridge-connection-lost");

andrewfg · 2024-05-08T10:28:10Z

I guess it is normal for the bridge to indicate offline if the connection is lost.

The real question is what is causing the connection to be lost?

lolodomo · 2024-05-08T10:43:59Z

The real question is what is causing the connection to be lost?

Yes, that is the question,. I will add logs to the binding to understand in which case it happens.
Maybe due to authentication....

andrewfg · 2024-05-09T13:04:02Z

There was a fairly major firmware update for v2 bridges on Aptil 25th, so maybe they pushed something for v1 at the same time..

https://www.philips-hue.com/en-gb/support/release-notes/bridge

lolodomo · 2024-06-08T10:09:34Z

After adding/updating some logs, looks like a timeout problem when polling the bridge:

2024-06-08 11:49:47.423 [WARN ] [ue.internal.handler.HueBridgeHandler] - Connection to Hue Bridge 192.168.xxx.xxx lost: Total timeout 5000 ms elapsed
org.openhab.core.i18n.CommunicationException: Total timeout 5000 ms elapsed
        at org.openhab.binding.hue.internal.connection.HueBridge.doNetwork(HueBridge.java:1128) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.connection.HueBridge.doNetwork(HueBridge.java:1094) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.connection.HueBridge.get(HueBridge.java:1077) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.connection.HueBridge.getGroups(HueBridge.java:491) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.handler.HueBridgeHandler$2.updateGroups(HueBridgeHandler.java:312) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.handler.HueBridgeHandler$2.doConnectedRun(HueBridgeHandler.java:254) ~[bundleFile:?]
        at org.openhab.binding.hue.internal.handler.HueBridgeHandler$PollingRunnable.run(HueBridgeHandler.java:138) [bundleFile:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
        at java.lang.Thread.run(Unknown Source) [?:?]

Maybe 5s is a little short as timeout ? I am going to try with a bigger timeout.

andrewfg · 2024-06-08T11:27:27Z

Maybe 5s is a little short as timeout ?

Yes this could indeed be the case. For example: if the Hue Bridge is running a Hue Labs automation such as 'Color Loop' then it responds much slower to HTTP requests.

Note however that the above problem may go away rather soon because Philips / Signify will remove Hue Labs from the bridges at the end of this month (if I recall correctly), and will have replaced it with newer built in Effects such as 'Prism' and/or other built in functions which perform faster.

lolodomo · 2024-06-08T11:31:06Z

@andrewfg I am using a bridge v1 so I don't think there will be any change from Philips / Signify in the future.

Trying with 8s as timeout for requests, I have not yet encountered any problem. Lets see if it is reliable after few hours.

lolodomo · 2024-06-08T11:40:30Z

It just happened again with 8s timeout.

andrewfg · 2024-06-08T12:55:06Z

@lolodomo I suppose that being a v1 hub, you are using API v1 rather than v2 .. if so, I am pretty sure there was nothing recently changed on the OH side that could have suddenly started your issue after it working for so long. However it seems also unlikely that Philips / Signify has changed the v1 hub firmware. The only thing that I can think of may be the lamp firmware, but this seems unlikely to cause an HTTP comms issue..

andrewfg · 2024-06-08T13:08:47Z

@lolodomo by the way, the API v2 specification states that the Bridge is limited on number of serial HTTP calls per second (as I recall about 20 per second), and number of parallel overlapping HTTP calls (as I recall about 2 or 3). And I suppose that a similar limitation would apply to API v1 resp. the v1 bridge. When I wrote the API v2 code I introduced a throttler to reduce HTTP calls to within the given serial/parallel limits. I did not back-port it to API v1 since I assumed if that if anyone would ever encounter HTTP limitations they could solve it by upgrading to API v2. Yet for your special case you could consider back-porting the throttler -- just a thought..

lolodomo · 2024-06-08T13:51:41Z

Problem is unchanged with 10s timeput.

As I have the sensor polling interval set to 0, I should not have concurrent requests, except of course if all requests run when polling lights update sometimes takes more than 10s. I mean maybe the next cycle starts while the previous one is not ended.
In case it is the problem, I could try to increase the refresh interval to see if it helps.

lolodomo · 2024-06-08T13:55:49Z

In fact, there is a lock mechanism so concurrent requests when polling data is not possible.

lolodomo · 2024-06-08T14:07:24Z

One solution or rather a workaround could be to not set the bridge thing to OFFLINE when there is a request timeout during the data polling but rather do it if it happens during two consecutive polls.

lolodomo · 2024-06-08T18:16:50Z

I am testing a basic retry mechanism added for GET requests failing in timeout and it seems to work well.
Will see if it is fully reliable after several hours.

Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

…16902) Fix #16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

lolodomo · 2024-06-22T11:48:00Z

In addition to the now implemented retry mechanism, I am currently testing with a timeout set to 7s (instead of 5s) to check if I have less timeouts.

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

lolodomo · 2024-06-29T10:15:06Z

In addition to the now implemented retry mechanism, I am currently testing with a timeout set to 7s (instead of 5s) to check if I have less timeouts.

After one week with timeout set to 7s, I can see that the stats are not better. So let's keep the timeout to 5s.

andrewfg · 2024-06-29T10:32:28Z

with timeout set to 7s, I can see that the stats are not better. So let's keep the timeout to 5s.

If I understand correctly, in your latest PR you do an immediate retry after 5s timeout, and this mostly succeeds. The total timeout for two calls is 2 x 5s = 10s, and increasing to 7s timeout (i.e. 14s) does not reduce the failure rate.

It would be interesting to discover if the timeout is actually caused a) by a socket connect timeout, or b) an HTTP response timeout. The former would indicate that the bridge is in a fundamentally non receptive state, whereas the latter would simply indicate that it is busy..

lolodomo · 2024-06-29T10:46:53Z

It would be interesting to discover if the timeout is actually caused a) by a socket connect timeout, or b) an HTTP response timeout. The former would indicate that the bridge is in a fundamentally non receptive state, whereas the latter would simply indicate that it is busy..

The first exception is always a HTTP response timeout.
When the retry fails, it can happen that it is due to a socket timeout exception but most of the time it is again a HTTP response timeout.

It is already better now. The next step will be to buy a new bridge v2 ;)

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr> Signed-off-by: Patrik Gfeller <patrik.gfeller@proton.me>

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr> Signed-off-by: Ciprian Pascu <contact@ciprianpascu.ro>

lolodomo added the bug An unexpected problem or unintended behavior of an add-on label May 7, 2024

lolodomo added a commit to lolodomo/openhab-addons that referenced this issue Jun 22, 2024

[hue] API v1: retry once on timeout for GET requests to the bridge

67045f2

Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

lolodomo mentioned this issue Jun 22, 2024

[hue] Retry once on timeout for GET requests to the bridge (API v1) #16902

Merged

lsiepel closed this as completed in #16902 Jun 22, 2024

lsiepel pushed a commit that referenced this issue Jun 22, 2024

[hue] API v1: retry once on timeout for GET requests to the bridge (#…

d2466d5

…16902) Fix #16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

psmedley pushed a commit to psmedley/openhab-addons that referenced this issue Jun 29, 2024

[hue] API v1: retry once on timeout for GET requests to the bridge (o…

082fe7a

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

joni1993 pushed a commit to joni1993/openhab-addons that referenced this issue Oct 15, 2024

[hue] API v1: retry once on timeout for GET requests to the bridge (o…

154a621

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

matchews pushed a commit to matchews/openhab-addons that referenced this issue Oct 18, 2024

[hue] API v1: retry once on timeout for GET requests to the bridge (o…

f69c452

…penhab#16902) Fix openhab#16723 Signed-off-by: Laurent Garnier <lg.hc@free.fr>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[hue] Bridge (v1) going OFFLINE and then ONLINE frequently due to connection lost #16723

[hue] Bridge (v1) going OFFLINE and then ONLINE frequently due to connection lost #16723

lolodomo commented May 7, 2024

lolodomo commented May 7, 2024

andrewfg commented May 8, 2024

lolodomo commented May 8, 2024

andrewfg commented May 9, 2024

lolodomo commented Jun 8, 2024 •

edited

Loading

andrewfg commented Jun 8, 2024 •

edited

Loading

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024

andrewfg commented Jun 8, 2024

andrewfg commented Jun 8, 2024

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024 •

edited

Loading

lolodomo commented Jun 8, 2024

lolodomo commented Jun 22, 2024 •

edited

Loading

lolodomo commented Jun 29, 2024

andrewfg commented Jun 29, 2024

lolodomo commented Jun 29, 2024

[hue] Bridge (v1) going OFFLINE and then ONLINE frequently due to connection lost #16723

[hue] Bridge (v1) going OFFLINE and then ONLINE frequently due to connection lost #16723

Comments

lolodomo commented May 7, 2024

lolodomo commented May 7, 2024

andrewfg commented May 8, 2024

lolodomo commented May 8, 2024

andrewfg commented May 9, 2024

lolodomo commented Jun 8, 2024 • edited Loading

andrewfg commented Jun 8, 2024 • edited Loading

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024

andrewfg commented Jun 8, 2024

andrewfg commented Jun 8, 2024

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024

lolodomo commented Jun 8, 2024 • edited Loading

lolodomo commented Jun 8, 2024

lolodomo commented Jun 22, 2024 • edited Loading

lolodomo commented Jun 29, 2024

andrewfg commented Jun 29, 2024

lolodomo commented Jun 29, 2024

lolodomo commented Jun 8, 2024 •

edited

Loading

andrewfg commented Jun 8, 2024 •

edited

Loading

lolodomo commented Jun 8, 2024 •

edited

Loading

lolodomo commented Jun 22, 2024 •

edited

Loading