[pushover] Sending messages randomly fails with java.io.EOFException #10376
Comments
Can you add some DEBUG or TRACE logs? |
I am trying to reproduce the issue, but it seems to happen very randomly. |
If so, best would be both. |
This is all I could find in the logs. Hope it helps...
|
Thanks. It might help. Currently I have two possible solutions in my mind and I prepared a first test version for you: org.openhab.binding.pushover-3.1.0-SNAPSHOT.zip Are you able to test it? |
This issue has been mentioned on openHAB Community. There might be relevant details there: https://community.openhab.org/t/pushover-exception-return-value/116669/15 |
Do I have to uninstall the 3.1.0.M2 Version before copying your version to the addons folder? (I didn't) I do get errors now that I haven't seen before:
In the console:
Still got the error once so far:
|
Yes. Please uninstall any other Pushover binding versions before. Download the “.zip” file, extract the “.jar” and put it into your addons/ folder. |
Looks good so far. |
Sounds good. I will prepare a PR for it and submit my changes. Some more information in them: I added additional exception handling which should result in correct return of |
I just had a missing pushover notification. As soon as I'm home, I'll have a look into the log files... |
Here we go...
|
In DEBUG mode I expected the message shown above. The important thing is, that we do not see the error in Did you still see an error message like this one?
|
I have to check the logs when I'm back home. Busy day ;-) But I had not a single error on openHAB 2.5. This issue started with the openHAB 3 Update. |
That error didn't show up anymore. |
Is this issue really closed? I'm still experiencing some missing pushover messages, which I haven't seen in version 2.5 at all. |
Did you try the change in #10437? |
Isn't that the same change, that @cweitkamp included in the test version above? |
Please reopen this issue if it is still a problem for you. The changes in #10437 are included in the test version I provided to you. They were necessary anyway. As I told you above, we cannot prevent communication failures entirely; we can only handle them correctly. But I have a second idea for improving the situation, and I will prepare another test version for you. Currently the binding uses a shared HTTP client, which may interfere with other bindings. We can test creating a dedicated instance of the client. |
Sounds promising... OT: How do I reopen the issue? (Maybe I'm blind :-D) EDIT: I think I'm not allowed to reopen the issue, because @fwolter closed it. |
This error showed up a couple of minutes ago (no pushover message was sent):
|
It's very strange and I don't know whether this helps, but it seems like the longer the interval in my test rule, the more errors I get. Generating pushover messages every one or two minutes creates no errors. EDIT: Almost every second message seems to fail. The reason differs:
|
Any Updates on this? As a workaround I'm sending an email to my pushover mail address (xxxxxxx@pomail.net). This works without any problems, but it's a bit delayed. |
I double-checked the related PR regarding TTL and, as far as I can tell, it does not touch anything related. You might want to try this pushover binding version, where I replaced the commonhttpclient with a custom one. To test:
|
Thanks for investing time into this. Unfortunately the binding does not start, even after manually starting through the console:
System details:
debug.log
|
Ah, there was a mistake in the consumer name. Please try again, I updated the jar. |
The test binding is working now, but the issue remains. First attempt, after a longer pause, at 17:29, with error: trace.log with error 17:29
Second attempt, successful, at 17:31: trace.log success 17:31
Third attempt, error, at 17:37: trace.log error 17:37
Fourth attempt, successful, at 17:40: trace.log success 17:40
|
Could it be a router or something that actively closes the connection when it is idle for some time? There is not much to configure client side (binding) |
I will further test with your test binding and also with the original binding and post here if I find something. |
Both bindings have sort of the same configuration. It’s just that the test version has an isolated client. Not affected by other bindings. |
Updated the JAR; it now includes some more timeout restrictions, most of them set around the 2 minute mark. Curious if that makes a difference. |
It looks like it does: 16:17 -> success. So even after 20 min of inactivity, the first attempt was successful. I will do some more tests over the next few hours and report back. Great job 👍 |
18:00 -> success |
After a couple of hours of operation, the pushover test binding
is still working fine, no issues anymore. Jabadabado 🥇 |
Nice! I have updated the jar again. Would be nice if you can test once more. |
After around 5 hours of operation and several test messages I can confirm that version
also works without any issues. Big THANKS again for solving this issue and all your work on openHAB! |
@lsiepel - can you help me with an explanation why/how your fix works? 🙂 I had a very quick look the other day, and found only this usage of Lines 148 to 149 in 6f55f3d
So it seems we are doing blocking HTTP calls ad-hoc with 10 s timeout (default value at least) and not keeping any persistent connection. What causes the EOF exception, and how did you come up with 18 s as idle value? It seems like the default value for the shared client is 300 s, unless configured to something else or unless a binding messes with this value for the shared client: I probably don't have all the information, so hence asking, since I don't understand what causes the issue, why no one else seems to have it and why/how your fix works. @sihui62 - do you have anything in |
No, everything is commented out.
No, has always been 10s.
|
It took me a while to get it fixed. The EOF is caused by the httpclient while sending the request. The underlying connection seems to be no longer available, but the httpclient is not aware of it. The connection might be killed by anything between the httpclient and the remote server (including the remote server itself, firewalls, routers, anything). As of HTTP 1.1 the connection is kept alive by default. The Jetty client can be manipulated to control the idle time by setting The thought behind this is that the httpclient does know when the idle timeout occurs and handles it transparently. As long as this timeout triggers before the 'remote connection kill' event happens, it all works. BTW: I ruled out the other bindings by first trying to use Hope this explains it enough to merge the request. |
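The reasoning above can be illustrated with a small model. This is only a sketch of the idea, not the Jetty API or the binding's code: the function name `sendOutcome` and the 120 s "remote kill" interval are hypothetical, while the 18 s and 300 s idle timeouts are the values discussed in this thread.

```javascript
// Hypothetical model of one pooled keep-alive connection; not the Jetty API.
// All arguments are in seconds.
//   idleSeconds:       time since the connection was last used
//   clientIdleTimeout: idle time after which the client closes the connection itself
//   remoteKillAfter:   idle time after which something on the path silently
//                      drops the connection (assumed value, unknown in practice)
function sendOutcome(idleSeconds, clientIdleTimeout, remoteKillAfter) {
  // The client noticed its own idle timeout and opens a fresh connection.
  if (idleSeconds >= clientIdleTimeout) {
    return "reconnect";
  }
  // The connection was dropped remotely, but the client still believes it is
  // open and reuses it, which surfaces as an EOF while sending.
  if (idleSeconds >= remoteKillAfter) {
    return "EOF";
  }
  // The connection is still alive on both ends.
  return "reuse";
}

console.log(sendOutcome(200, 300, 120)); // "EOF"
console.log(sendOutcome(200, 18, 120));  // "reconnect"
console.log(sendOutcome(10, 18, 120));   // "reuse"
```

In this model the failure window is every idle period between `remoteKillAfter` and `clientIdleTimeout`. Choosing a client idle timeout below the remote interval closes that window, which would match the observation that messages sent every minute or two never failed.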
@lsiepel - thanks for your detailed explanation! I'm still a bit confused about what exactly happens in the specific case, and whether using an 18 s idle timeout will effectively resolve it or just somehow reduce it. I'm currently trying to reproduce it in a simple manner:
var timeoutId = 0;
var seconds = 10;
function sendNotification() {
notification.send("Test", "Test");
timeoutId = setTimeout(() => {
console.log("Sending test notification");
sendNotification()
}, seconds*1000);
seconds = seconds * 2;
}
rules.when()
.item("Test").changed().toOn()
.then(event => {
sendNotification()
})
.build("Pushover start test");
rules.when()
.item("Test").changed().toOff()
.then(event => {
if (timeoutId > 0) {
clearTimeout(timeoutId);
}
})
.build("Pushover stop test");
I never experienced the issue myself, and also never saw it reported by anyone else. I'm therefore wondering about a few options:
There are multiple factors that could cause the issue (like connection pool usage), and I'm just not sure if we are fixing the symptom rather than root cause. WDYT? |
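To estimate how long the test rule above has to run before it probes a given idle period, the doubling schedule can be computed ahead of time. This is a minimal sketch; the helper names `idleGaps` and `secondsUntilGapReached` are made up for illustration and are not part of openHAB.

```javascript
// Idle gaps (in seconds) between consecutive sends when the delay starts at
// startSeconds and doubles after every scheduled send, as in the rule above.
function idleGaps(startSeconds, count) {
  const gaps = [];
  let seconds = startSeconds;
  for (let i = 0; i < count; i++) {
    gaps.push(seconds);
    seconds *= 2;
  }
  return gaps;
}

// Seconds elapsed from the first send until the first send that follows an
// idle gap of at least thresholdSeconds.
function secondsUntilGapReached(startSeconds, thresholdSeconds) {
  if (startSeconds <= 0) {
    throw new RangeError("startSeconds must be positive");
  }
  let elapsed = 0;
  let seconds = startSeconds;
  // Add up all gaps that are still shorter than the threshold.
  while (seconds < thresholdSeconds) {
    elapsed += seconds;
    seconds *= 2;
  }
  // Include the first gap that reaches the threshold.
  return elapsed + seconds;
}

console.log(idleGaps(10, 6));                 // [10, 20, 40, 80, 160, 320]
console.log(secondsUntilGapReached(10, 300)); // 630
```

Starting at 10 s, the first send that follows an idle period of five minutes or more happens 630 s after the first notification, so a run of roughly ten minutes covers that interval.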
I have been using the pushover binding since 2016 and also never experienced the issue ... until the end of December 2023. I even installed a new, clean openHAB (around the beginning of this year) and only installed the pushover binding: same issue
For me it seems we have that issue since 2021, when this issue report was opened, or am I wrong? Anyway, thanks for taking your time to look into this. I almost switched to the notifications of the cloud connector, but am now confident we have a solution. |
Did you start using the binding differently at that time, for example did you start using Expiring Messages? Jetty was also upgraded in 4.1: openhab/openhab-core#3814. In any case, the OP had the issue in 2021 already.
Sorry, I mixed you up with the OP, so at least two are having the issue. My main point is that it doesn't seem to be the majority of users having it, unless the three of us are the only ones using the binding, or everyone except me is having it and staying silent in the forum and in this issue. 😄 This is not to say nothing is wrong in the binding, but just to bring up some additional considerations, and also to avoid breaking something for unaffected systems. I hope you will be able to help testing once again if we come up with a tweaked fix. Your test efforts are also appreciated. |
There are many roads to Rome, I guess.
Tests by @sihui62 showed that the issue was reproducible with a 5 minute interval. I certainly don't expect this to lead to regressions or hidden issues, as those should have shown up in the performed tests. I'll sleep a night on option 3 ;-) |
No. Although I tested the new TTL feature, it did not make it into my system as I have no use case for it. All tests with the test version of the binding from @lsiepel were made in the simplest way possible:
Usually pushover is used to send me an animated gif from my door camera:
Sure, just shout. |
No, so perhaps we could default to 18 and let 0 keep the default idle timeout. I can't say if it's worth it, but personally I would probably use If you don't think any of this is worth doing, we can also merge it in its current state. |
There is no relation, idletimeout and timeout control different parts without overlap.
I'll come back to this. Bit busy these days. |
@lsiepel - this PR fell off my radar. Did you have a chance to consider #10376 (comment)? |
Same here, will try to look at it tomorrow. |
... or some days after. Adjusted the PR. It now has a parameter that defaults to 300 seconds (current behaviour). It can be adjusted to specific needs. I think it is all set now. |
Expected Behavior
Pushover messages should be reliable. Every attempt of sending a message from a rule should bring the same result.
Current Behavior
Very randomly and for no good reason the following error message appears in the logs and no pushover message is sent:
Script execution of rule with UID 'hm-sonstige-5' failed: java.io.EOFException: HttpConnectionOverHTTP@d761b68::DecryptedEndPoint@3b78ddb9{api.pushover.net/104.20.125.71:443<->/192.168.178.2:47218,OPEN,fill=-,flush=F,to=82112373/0} in hm-sonstige
Seconds later the exact same rule works like a charm multiple times.
Steps to Reproduce (for Bugs)
Send a pushover message from the same rule multiple times.
Your Environment