[push] reporting down when servcice is up in same second. #1747

Open
2 tasks done
Subnet-Masked opened this issue Jun 10, 2022 · 17 comments
Labels
area:monitor (Everything related to monitors), area:notifications (Everything related to notifications), bug (Something isn't working), type:enhance-existing (feature wants to enhance existing monitor)

Comments

@Subnet-Masked

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

Description

Push monitors are reporting down and up in the same second causing copious amounts of false notifications.

Screenshot 2022-06-10 084119

👟 Reproduction steps

Start a 60 second push monitor and use push on a network that has slightly variable ping.

👀 Expected behavior

Downtime should not be reported because the push was received.

😓 Actual Behavior

Downtime is reported in the same second that the push is received.

🐻 Uptime-Kuma Version

1.16.1

💻 Operating System and Arch

Debian

🌐 Browser

Firefox 101.0

🐋 Docker Version

20.10.14, build a224086

🟩 NodeJS Version

No response

📝 Relevant log output

No response

@Subnet-Masked added the bug label on Jun 10, 2022
@louislam
Owner

I thought it should be fixed in 1.16.1. Could you please double check it is 1.16.1 in Settings > About? Thanks

Related issue: #1422
Related pull request: #1428
cc: @kaysond

@Fluqzy

Fluqzy commented Jun 10, 2022

Hang on, I have a slightly different issue.
I'm on 1.16.1 too.
My Discord bot makes a GET request every 60 seconds and includes its ping to the Discord servers in the request.
Nothing special, so it should work. Uptime Kuma displays the ping I send with every request, yet it shows my monitor as offline. Why offline, when a ping arrives with every request? As I understand it, this is how the push monitor works: the client makes the GET request, not Uptime Kuma.
grafik

@Subnet-Masked
Author

> I thought it should be fixed in 1.16.1. Could you please double check it is 1.16.1 in Settings > About? Thanks
>
> Related issue: #1422 Related pull request: #1428 cc: @kaysond

Yes, I can confirm that I am on 1.16.1
Screenshot 2022-06-10 120038

@Fluqzy

Fluqzy commented Jun 10, 2022

Same for me, and I have more information.
I debugged a bit, and my bot now prints 'PING!' every time a GET request is made.
grafik

The second thing is that Uptime Kuma shows "Inactive" as the status.
grafik

I still get pings (not every request carries a ping, which is why there is sometimes no straight line connecting all of them), but the monitor won't turn online. I had this issue before, I think on version 1.15.1, but I'm not sure it was that exact version.
grafik

Now I'm up to date on 1.16.1 and still have that issue.
grafik

@Subnet-Masked
Author

Subnet-Masked commented Jun 10, 2022

@Fluqzy Please do not hijack other people's issues. However, I had a similar problem because I had not formatted my GET request properly.
If you are not using the optional parameters https://<your Kuma>/api/push/5wGOmMPbJB?status=up&msg=OK&ping= then it will report down instead of up. I simply use https://<your Kuma>/api/push/5wGOmMPbJB without them since they are not needed.
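
For reference, the two URL forms mentioned above differ only in the query string. Below is a minimal sketch of building them. The helper name `build_push_url`, the host `kuma.example`, and the token `abc` are placeholders made up for illustration, and how the server treats an empty `ping=` value is not confirmed in this thread:

```python
from urllib.parse import urlencode


def build_push_url(base_url, token, status=None, msg=None, ping=None):
    """Build a push URL; a query parameter is added only when a value is given.

    Hypothetical helper for illustration, not part of Uptime Kuma.
    """
    url = f"{base_url.rstrip('/')}/api/push/{token}"
    candidates = (("status", status), ("msg", msg), ("ping", ping))
    # Skip unset parameters so no empty "ping=" ends up in the URL.
    params = {key: value for key, value in candidates if value is not None}
    return f"{url}?{urlencode(params)}" if params else url


bare = build_push_url("https://kuma.example", "abc")
full = build_push_url("https://kuma.example/", "abc", status="up", msg="OK", ping=21)
```

Here `bare` has no query string at all, while `full` carries `status`, `msg`, and `ping`, each with a value.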

@Fluqzy

Fluqzy commented Jun 10, 2022

@Subnet-Masked First of all, I'm sorry for disturbing your issue. It looked like the same problem to me, so I commented here; it would have been silly to open another issue only for a project leader to tell me, "Look, duplicate of issue abc." I didn't hijack your issue, and it was never my intention to do so. Sorry!

@kaysond
Contributor

kaysond commented Jun 10, 2022

@Subnet-Masked What are you using to call the push URL? A cron job?

As of 1.16.1, there should be a 1s buffer window before it gets marked down. It also seems like it's not happening every time, so maybe your latency is occasionally >1s?
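
The buffer described here can be pictured as a simple deadline check. This is only a sketch of the idea, not Uptime Kuma's actual code; the function name `is_overdue` and its parameter names are assumptions:

```python
def is_overdue(last_push_ts, now_ts, interval_s, buffer_s=1.0):
    """Return True when no push arrived within interval + buffer seconds.

    Illustrative only: timestamps are plain seconds, and the 1s default
    mirrors the buffer mentioned in this thread.
    """
    return (now_ts - last_push_ts) > (interval_s + buffer_s)


# A push that is 0.5s late stays inside the window; one 1.2s late does not.
slightly_late = is_overdue(last_push_ts=0.0, now_ts=60.5, interval_s=60)
too_late = is_overdue(last_push_ts=0.0, now_ts=61.2, interval_s=60)
```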

@Subnet-Masked
Author

Subnet-Masked commented Jun 10, 2022

@kaysond I am using the following as a cron job:

*/1 * * * * wget "https://<kuma>/api/push/WrZMizR40l" >/dev/null 2>&1

I don't believe that my ping is >1 second, because I have a second monitor set up for that server (port) to watch an IMAP service. That one reports ~84 ms at the highest and an average of 21 ms.

Screenshot 2022-06-10 124916

I should note that this happens to all of my push monitors. Not just that one.

@Subnet-Masked
Author

@kaysond
Okay - Some more testing and research later. This might be primarily a Linode Firewall issue. For bits and giggles I turned off the Linode Firewall and the problem vanished. It seems like it is adding some latency on inbound requests but not outbound.

So I guess my question now is this: Is there a way to adjust that buffer / grace period? If not, would setting the interval in Kuma to 61-62 seconds, but leaving the actual period in the cronjob as 60 seconds work as a grace, and @louislam should I submit a feature request for an editable grace?
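
Whether that workaround helps can be checked against a list of push arrival times. This is a rough sketch, assuming the down timer restarts on every received push and that the 1s buffer is added on top of the configured interval. The arrival times are made-up illustration values, not measurements from this issue, and `late_pushes` is a hypothetical name:

```python
def late_pushes(arrivals, interval_s, buffer_s=1.0):
    """Return indexes of pushes that arrived after the allowed gap had passed.

    arrivals: push arrival times in seconds, in ascending order.
    """
    allowed = interval_s + buffer_s
    return [
        i
        for i in range(1, len(arrivals))
        if arrivals[i] - arrivals[i - 1] > allowed
    ]


# Cron fires every 60s; the third push is delayed by roughly 1.4s.
arrivals = [0.02, 60.03, 121.40, 180.02]

with_60s_interval = late_pushes(arrivals, interval_s=60)  # allowed gap: 61s
with_62s_interval = late_pushes(arrivals, interval_s=62)  # allowed gap: 63s
```

With these sample values, the 60-second setting flags the delayed push, while the 62-second setting tolerates it.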

@kaysond
Contributor

kaysond commented Jun 10, 2022

> So I guess my question now is this: Is there a way to adjust that buffer / grace period? If not, would setting the interval in Kuma to 61-62 seconds, but leaving the actual period in the cronjob as 60 seconds work as a grace, and @louislam should I submit a feature request for an editable grace?

Not at the moment. My original PR to add the buffer was just a bug fix, but I did note that it would be useful for cases like these to be able to adjust it. I think there's a large backlog of feature-add PRs, so it may not get merged any time soon. Your workaround is probably the right choice for now.

@Subnet-Masked
Author

Sounds good, thank you very much @kaysond

I am going to close this issue as I consider it resolved.

@louislam
Owner

Reopening, as a similar issue was reported on Reddit:
https://www.reddit.com/r/UptimeKuma/comments/va0zzv/tolerance_between_push_events_before_notifying/

@louislam louislam reopened this Jun 12, 2022
@kaysond
Contributor

kaysond commented Jun 12, 2022

@louislam probably need to make the buffer period a user setting. Healthchecks does it this way and calls it "grace time" - https://healthchecks.io/docs/configuring_checks/

@louislam
Owner

https://serverfault.com/questions/721088/how-precise-is-a-cron-daemon

> What cron can guarantee is that your job will start no sooner than the specified time

I am wondering whether a 1-second buffer is enough. By design, cron does not always execute the command at the exact time, and there is network latency on top of that.

Since the upcoming version of Uptime Kuma already has 4 fields related to interval or time, I am afraid that adding one more grace-time field would make the page look confusing.

I am thinking maybe:

  • Just add a description to tell users that they have to add some buffer to the interval.
  • Or increase the buffer from 1s to 10s.

@chakflying
Collaborator

Maybe we can change it to be some fraction of the heartbeat interval like we did for axios timeout?
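
A minimal sketch of what a fraction-based buffer could look like. The 10% fraction and the 1s floor are assumed values picked for illustration, not figures proposed in this thread or taken from the axios timeout logic, and `grace_seconds` is a hypothetical name:

```python
def grace_seconds(interval_s, fraction=0.1, minimum_s=1.0):
    """Scale the grace window with the heartbeat interval, never below a floor."""
    return max(minimum_s, interval_s * fraction)


# Longer intervals get a proportionally longer window;
# short intervals fall back to the floor.
grace_for_60s = grace_seconds(60)
grace_for_5s = grace_seconds(5)
```

The floor keeps very short intervals from ending up with a window smaller than today's 1s buffer.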

@Apashh

Apashh commented Jun 13, 2022

Hello! I have the same problem (#1732): the Heartbeat Interval is set to check every 70 seconds, and my device requests the URL every 60 seconds.
Uptime Kuma switches to DOWN when it should have remained UP.

I hope you will find the solution! Your project is great!! ;)

@kaysond
Contributor

kaysond commented Jun 13, 2022

> I am thinking maybe:
>
> * Just add a description to tell users that they have to add some buffer to the interval.
> * Or increase the buffer from 1s to 10s.

Probably the first one. It should work fine now because the timeout is synchronized to api calls. If you do the second I think people might wonder why it takes so long for a monitor to go down. But if they choose the timeout they will know.

Theoretically, cron can run your job anywhere from 10:10:00 to 10:10:59 the first time, and anywhere from 10:11:00 to 10:11:59 the second time. In the extreme, you have an interval of 119s when it should be 60s!

When I was doing the testing for #1428, though, cron generally seemed to run the job every 60-61s, hence the 1s buffer.
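
The arithmetic behind the 119s figure can be written out as follows. This sketch assumes a job scheduled once per minute that may start anywhere from 0 to 59 seconds into its minute; the function name `gap_bounds` is made up for illustration:

```python
def gap_bounds(period_s=60, max_delay_s=59):
    """Return the (shortest, longest) possible gap between two consecutive runs.

    Each run may start anywhere from 0 to max_delay_s seconds after its slot.
    """
    shortest = period_s - max_delay_s  # first run as late as possible, second on time
    longest = period_s + max_delay_s   # first run on time, second as late as possible
    return shortest, longest


shortest_gap, longest_gap = gap_bounds()
```

With the defaults, the longest gap is 119 seconds (10:10:00 to 10:11:59) and the shortest is 1 second (10:10:59 to 10:11:00).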

@CommanderStorm added the area:monitor, area:notifications, and type:enhance-existing labels on Dec 6, 2023
@CommanderStorm changed the title from "Push Monitor reporting down when servcice is up in same second." to "[push] reporting down when servcice is up in same second." on Dec 14, 2023

7 participants