Perf: Fully batch sendImportantHeartbeatList #3463
Conversation
if (monitorID == null) {
    // Send important beats for all monitors
    list = await R.find("heartbeat", `
        important = 1
I don't think these are functionally identical:
This one finds the top 5k important heartbeats overall, while the other one finds the top 500 important heartbeats for each monitor.
=> Let's assume a monitor runs every 20s and every heartbeat is important => 4320 beats per day
=> Have 2 monitors which produce beats like this and one which pings once per day, and the results will not be identical
=> I think this is missing a group by
I see your point that if the important beat list grew too long, it would lead to different behavior. I'd say for the initial dashboard load, 5000 is more than enough. That's 250 pages, and no one will bother going through the unsearchable list. I think I have two options here:
- Add a function to send the important heartbeat list when the user clicks on a monitor in the dashboard, in case any beats are missing, or
- Modify the SQL to replicate the original behavior. From a quick search, the only way to do it seems to be a nested SELECT, which looks a bit ugly (see the sketch below).
Why would GROUP BY be needed here when we are not using an aggregate function?
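One way to keep the query fully batched while preserving the original per-monitor limit is a window function inside a nested SELECT. This is a minimal sketch, not code from the PR: it assumes SQLite >= 3.25 (window functions), the `heartbeat(monitor_id, time, important)` columns used in the snippet above, and the 500-per-monitor cap from the original query.

```js
// One query that keeps at most the 500 most recent important beats per
// monitor, instead of issuing a separate query for every monitor.
// Sketch only: assumes SQLite >= 3.25 and the heartbeat columns above.
const list = await R.getAll(`
    SELECT * FROM (
        SELECT hb.*,
               ROW_NUMBER() OVER (
                   PARTITION BY monitor_id
                   ORDER BY time DESC
               ) AS rn
        FROM heartbeat hb
        WHERE important = 1
    )
    WHERE rn <= 500
    ORDER BY monitor_id, time DESC
`);
```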
Yes, that's right. It is one of the poor optimizations in Uptime Kuma, because I was lazy and didn't build real pagination. As far as I remember, this is a temporary fix. Eventually, sending a whole list is not a good move. It should be like this:
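A minimal sketch of what such an on-demand, paginated flow could look like; the event name, page size, and callback shape are assumptions for illustration, not names from the codebase:

```js
// Hypothetical socket handler: the client requests one page of important
// beats for a single monitor only when it needs them, instead of the server
// pushing every monitor's full list on connect.
socket.on("getImportantHeartbeatPage", async (monitorID, page, callback) => {
    const pageSize = 25;    // assumed page size
    const list = await R.getAll(`
        SELECT * FROM heartbeat
        WHERE monitor_id = ? AND important = 1
        ORDER BY time DESC
        LIMIT ? OFFSET ?
    `, [ monitorID, pageSize, page * pageSize ]);
    callback(list);
});
```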
Could you add this to the description? (There are likely also other issues which this PR would close, but I have not found them.) The following are also performance issues which were fixed by previous PRs in this PR train; please add them too, so they can be closed if you think they are resolved:
I plan to implement the suggested fix in a new PR and will try to include the issues you found when it is completed.
Superseded by #3515.
https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md#can-i-create-a-pull-request-for-uptime-kuma
Tick the checkbox if you understand [x]:
Description
Currently, on initial page load, the server loops over the list of monitors, retrieves each monitor's list of important heartbeats via SQL, and sends it over the socket. Even with indexing, this results in an O(n^2) operation. In practice, when there is a large number of monitors, the process takes so long that the server disconnects the client for some reason, causing a reconnect and the whole process to start over again, so the dashboard never finishes loading. Fully batching the sendImportantHeartbeatList process seems to resolve this issue. Ideally we could fully batch sendHeartbeatList as well, but we need some SQL magic such that we get up to 100 beats from each monitor.
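Roughly, the batched flow looks like this. This is a sketch based on the snippet quoted in the review above, not a verbatim copy of the PR's code; the 5000 cap comes from the discussion, while the emit signature is an assumption.

```js
// Fetch important beats for all monitors in one query, group them by
// monitor in JS, then emit one list per monitor over the socket.
const beats = await R.getAll(`
    SELECT * FROM heartbeat
    WHERE important = 1
    ORDER BY time DESC
    LIMIT 5000
`);

const listsByMonitor = {};
for (const beat of beats) {
    if (!listsByMonitor[beat.monitor_id]) {
        listsByMonitor[beat.monitor_id] = [];
    }
    listsByMonitor[beat.monitor_id].push(beat);
}

for (const monitorID of Object.keys(listsByMonitor)) {
    // Emit signature assumed for illustration.
    socket.emit("importantHeartbeatList", Number(monitorID), listsByMonitor[monitorID], true);
}
```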
Type of change
Checklist
(including JSDoc for methods)
Screenshots (if any)