No down notifications for push monitor #922

Closed
2 tasks done
kaysond opened this issue Nov 15, 2021 · 25 comments
Labels
bug Something isn't working

Comments

@kaysond
Contributor

kaysond commented Nov 15, 2021

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

Description

I have a push monitor configured for a 1 min interval. It appears that since upgrading, it no longer sends notifications for missed heartbeats:
image

You'll notice that between 8:14 and 8:21 there were no pings, but no down notifications were sent either.

I wonder if it has something to do with the failed SMTP notifications shown in the logs below, but I have the same issue on all my monitors and those send the gotify notifications just fine

👟 Reproduction steps

Configure a push monitor as below:
image

Ping the URL every minute for a while
Stop pinging the URL

👀 Expected behavior

Get a down notification

😓 Actual Behavior

Only get up notifications

🐻 Uptime-Kuma Version

1.10.2

💻 Operating System and Arch

Docker on Ubuntu 20.04

🌐 Browser

Firefox

🐋 Docker Version

20.10.10, build b485636

🟩 NodeJS Version

No response

📝 Relevant log output

Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
Monitor #21 'Backups': Failing: No heartbeat in the time window | Interval: 60 seconds | Type: push
Cannot send notification to SMTP
@kaysond kaysond added the bug Something isn't working label Nov 15, 2021
@nbvcxz

nbvcxz commented Nov 15, 2021

I probably have a different issue, but with the same kind of monitor, so it may be related: my push monitor is kept in the Pending state (I expect the Failed state after one pending beat), although no heartbeats arrive within the set interval. No notification is sent.

Monitor #27 'Home - AP2': Pending: No heartbeat in the time window | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: push
…
…
Monitor #27 'Home - AP2': Pending: No heartbeat in the time window | Max retries: 1 | Retry: 1 | Retry Interval: 60 seconds | Type: push

In my case, it seems it's still Retry: 1 - I'm not sure, but it looks like it isn't counting retries.

@louislam louislam added help and removed bug Something isn't working labels Nov 16, 2021
@louislam
Owner

louislam commented Nov 16, 2021

Usually this means your SMTP server is blocking the down mail for some unknown reason.

Search this repo for more info. I remember someone had the same problem as you.

@kaysond
Contributor Author

kaysond commented Nov 16, 2021

@louislam the problem is that I don't get a down notification to Gotify either.
SMTP doesn't work for any other monitor either, but I get the Gotify notifications for those.
Only the push monitor doesn't work.

@louislam
Owner

Just tested it again, and I still cannot reproduce your problem. Not quite sure what is missing here.

@kaysond
Contributor Author

kaysond commented Nov 18, 2021

Not sure what the issue was before, but restarting the container seems to have fixed it. Closing for now...
image

@kaysond kaysond closed this as completed Nov 18, 2021
@kaysond
Contributor Author

kaysond commented Nov 25, 2021

@louislam there definitely seems to be some kind of issue with the push monitor logic that allows it to get into a bad state. It went down, then came back up, and now it's spamming me with Up notifications every time it gets a ping...

I'm calling the url: api/push/7oxXeSwthj?msg=OK&ping=$PING
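
For illustration, a minimal sketch of assembling a push URL of the shape quoted above, with the query string properly encoded. The `build_push_url` helper, the base URL, and the token are hypothetical; only the `api/push/<token>` path and the `msg` and `ping` parameters come from this thread.

```python
from urllib.parse import urlencode


def build_push_url(base_url: str, token: str, msg: str, ping: float) -> str:
    """Build a push URL like the one quoted in this thread.

    base_url and token are placeholders supplied by the caller (assumptions);
    msg and ping are the query parameters used in the thread.
    """
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"msg": msg, "ping": ping})
    return f"{base_url.rstrip('/')}/api/push/{token}?{query}"


if __name__ == "__main__":
    # Hypothetical host and token, for demonstration only.
    print(build_push_url("https://kuma.example.com", "TOKEN", "OK", 12.5))
```

Encoding the query this way matters mostly when `msg` carries spaces or punctuation, which a hand-built URL in a cron job can silently mangle.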

image
image

@kaysond kaysond reopened this Nov 25, 2021
@kaysond
Contributor Author

kaysond commented Nov 25, 2021

Also not sure why there are gaps in the graph...

@chakflying
Collaborator

Gaps are either a pending beat or a single up beat that is not connected as a line. I think this suggests that somehow more than 1 beat is produced every minute, which would be consistent with your observation that notification is constantly getting triggered, which would happen on every down-to-up transition.
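
To make the transition argument concrete, here is a minimal sketch of a notifier that fires only when the status changes. It is an illustration under assumptions, not Uptime Kuma's actual code: the `notifications_for` function and the plain `"up"`/`"down"` strings are hypothetical.

```python
from typing import List, Optional


def notifications_for(beats: List[str]) -> List[str]:
    """Return the notifications a transition-based notifier would send.

    beats is a sequence of "up"/"down" statuses in arrival order.
    A notification is emitted only when the status differs from the
    previous beat; the very first beat is treated as the baseline.
    """
    sent: List[str] = []
    previous: Optional[str] = None
    for status in beats:
        if previous is not None and status != previous:
            sent.append(status.upper())
        previous = status
    return sent


if __name__ == "__main__":
    # A steady stream of up beats produces no notifications.
    print(notifications_for(["up", "up", "up"]))
    # If an extra down beat is interleaved with every ping, each ping
    # becomes a down-to-up transition and triggers an Up notification.
    print(notifications_for(["up", "down", "up", "down", "up"]))
```

Under this model, repeated Up notifications imply that something is recording a down beat between pings, which matches the suspicion that more than one beat is produced per minute.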

@louislam louislam added bug Something isn't working and removed help labels Nov 26, 2021
@kaysond
Contributor Author

kaysond commented Nov 26, 2021

Is there a way to log the raw http requests so I can see exactly what's coming through? The cron job is set to ping every minute...

@hansenc0705

hansenc0705 commented Nov 27, 2021

I seem to be having this issue as well. I set up a new PUSH monitor, and before I triggered my first GET to the URL I did get a down alert. I triggered my GET to the URL and got an up alert. Then I didn't trigger another GET to the URL and haven't received an alert, although I can see in the WebUI that it's down; the history doesn't call it out. If I do a GET to the URL, I get an Up alert.

Steps to duplicate
Create a new PUSH monitor using the settings below
Don't do anything and get your first down alert
Then load the URL (I'm doing a PowerShell web request with the GET method); you will get an UP alert
Wait the configured time to miss a heartbeat; you will see the WebUI show it as down, but no alerts
Then trigger a GET to the URL

image

image

image

@louislam
Owner

louislam commented Dec 8, 2021

Finally addressed the issue, it should be fixed in the next release

@hansenc0705

hansenc0705 commented Dec 16, 2021

This might still be broken: I am no longer receiving down alerts if I intentionally stop the job hitting the URL. The monitor still says up.

@louislam

@robjuffermans

I'm using matrix for the notifications and only receive the UP notifications, no DOWN. Using version 1.11.1 in docker.

@hansenc0705

I'm not sure if we should create a new issue or if @louislam can reopen this one?

@robjuffermans

I'm using matrix for the notifications and only receive the UP notifications, no DOWN. Using version 1.11.1 in docker.

this is fixed in 1.11.3! Thanks!

@louislam
Owner

louislam commented Jan 7, 2022

I'm using matrix for the notifications and only receive the UP notifications, no DOWN. Using version 1.11.1 in docker.

this is fixed in 1.11.3! Thanks!

It should not be related, the logic is the same as 1.11.1.

I'm not sure if we should create a new issue or if @louislam can reopen this one?

I remember that I tested it a month ago, it should be working, but it could possibly be another hidden issue.

If you find a way to reproduce the issue, feel free to open a new issue.

@hansenc0705

I've found it works at first after a restart and on first failure but doesn't work after that. I will try to get some steps to reproduce.

@kaysond
Contributor Author

kaysond commented Jan 26, 2022

Something really bad is still happening. Note that it's going down and up within the same second...
I just updated to the latest version again; I'll report back if I still see the problem.
image

@kaysond
Contributor Author

kaysond commented Mar 8, 2022

@louislam fyi - still a problem as of 1.11.3. Going to try 1.12.1
image

@louislam
Owner

louislam commented Mar 8, 2022

I think maybe there is a hidden issue. Let me know if anyone knows how to reproduce it.

@hansenc0705

Try this @louislam: configure a monitor, restart Uptime Kuma (might not be required), cause a failure and it will alert, then bring the service back up and it will go green. After that it will not detect future failures. It seems it always works on the first failure after a restart of Uptime Kuma, but after that first failure there are no more alerts.

@scr4tchy

scr4tchy commented Feb 21, 2023

This is still relevant. My PUSH sensors are only sending DOWN notifications to Discord, and not the UP ones. While the status goes back to UP in the chart and the bars, the history does not record the UP state and no notification goes through. This persists after rebooting the application, deleting all the events/heartbeats of the probe, pausing/unpausing, and a quick edit to upside down and back. What's more, I don't feel like the retries/interval work as expected: with 3 retries, 60 in the interval values, and pushes sent every 60 sec, I get notified as soon as a single push is sent with DOWN. Shouldn't it be after 3? Nothing in the logs besides the monitor down warning. I am actually getting multiple notifications as the probe goes back and forth between PENDING/DOWN during a single restart of an application (which takes ~30 sec to come back up).

@kaysond
Contributor Author

kaysond commented Feb 21, 2023

@scr4tchy this particular issue was definitely fixed. Can you try running v1.17.1 and see if things work as expected? If they do, it's a regression.

@scr4tchy

scr4tchy commented Feb 24, 2023

image

import asyncio
import time
from datetime import datetime, timedelta

import aiohttp
import yaml

async def run_check(check, cfg, session):
    # Check
    # Probe the target URL and measure how long the request takes.
    start = time.monotonic()
    try:
        async with session.get(url=check["url"]) as resp:
            status = "up" if resp.status == 200 else "down"
            msg = str(resp.status)
    except Exception as e:
        status = "down"
        msg = "EXCEPT"
        print("Unable to get url {} due to {}.".format(check["url"], e.__class__))
    dt = time.monotonic() - start

    # Push
    # Report the result; passing params lets aiohttp encode the query string.
    try:
        push_url = "{}/{}".format(cfg["push_url"], check["id"])
        params = {"status": status, "ping": str(dt), "msg": msg}
        async with session.get(url=push_url, params=params):
            pass  # The response body is not needed; just release the connection.
    except Exception as e:
        print("Unable to push status due to {}".format(e.__class__))

async def main():
    # Load config (safe_load avoids constructing arbitrary Python objects).
    with open("./config.yaml", "r") as f:
        cfg = yaml.safe_load(f)

    # Loop
    while True:
        t_next = datetime.now() + timedelta(seconds=30)

        # Run all checks concurrently with a 10-second total timeout per request.
        async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=10)) as session:
            await asyncio.gather(*[run_check(c, cfg, session) for c in cfg["checks"]])

        # Wait until next interval.
        print("Sleeping until %s" % t_next)
        await asyncio.sleep(max(0.0, (t_next - datetime.now()).total_seconds()))

asyncio.run(main())

It might've been due to pushing heartbeats continuously: status=up, then status=down during downtime, and then status=up again as soon as the service went back up. It would send a notification every single time a status=down was received, ignoring the retry configuration, and would never send a notification once up again. Skipping pushes during downtime in the code above may be fixing it, as I tested it once and didn't receive duplicate notifications but did receive an up notification.

    # Push
    if status != "up":
        return

@kaysond
Contributor Author

kaysond commented Feb 25, 2023

Yeah this is unrelated to this issue then. I believe the retry only applies if the heartbeat is missed completely. Similarly, it probably sends a notification on every down heartbeat on purpose. But if you can send multiple down heartbeats, I'm not sure why you need uptime-kuma at all...
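
As a rough model of the behavior described in this comment, the sketch below assumes that retries are consumed only by missed heartbeat windows, while an explicit `status=down` push is treated as down immediately. This is an assumption taken from the comment, not verified against Uptime Kuma's source; the `classify` function and the event names are hypothetical.

```python
from typing import List


def classify(events: List[str], max_retries: int) -> List[str]:
    """Map a sequence of events to monitor statuses.

    events contains "up" or "down" (explicit pushes) or "missed"
    (no heartbeat arrived in the time window).
    """
    statuses: List[str] = []
    retries = 0
    for event in events:
        if event == "up":
            retries = 0  # A successful push resets the retry counter.
            statuses.append("UP")
        elif event == "down":
            # Assumed: an explicit down push bypasses the retry logic.
            statuses.append("DOWN")
        else:
            # A missed window is PENDING until the retries are used up.
            if retries < max_retries:
                retries += 1
                statuses.append("PENDING")
            else:
                statuses.append("DOWN")
    return statuses


if __name__ == "__main__":
    print(classify(["up", "missed", "missed", "missed", "missed"], 3))
    print(classify(["up", "down"], 3))
```

If this model holds, a client that pushes `status=down` on every failed check would see a down status on the first failure regardless of the retry setting, which is consistent with the report above.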


7 participants