Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: Fully batch sendImportantHeartbeatList #3463

Closed

Conversation

chakflying
Copy link
Collaborator

@chakflying chakflying commented Jul 21, 2023

⚠️⚠️⚠️ Since we do not accept all types of pull requests and do not want to waste your time. Please be sure that you have read pull request rules:
https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md#can-i-create-a-pull-request-for-uptime-kuma

Tick the checkbox if you understand [x]:

  • I have read and understand the pull request rules.

Description

Currently on initial page load, server loops over the list of monitors, retrieves the list of important heartbeats via SQL, and sends it over the socket. Even with indexing, this results in an O(n^2) operation. In practice, when there is a large number of monitors, the process takes so long that the server disconnects the client for some reason, causing a reconnect and the whole process to start over again, hence the dashboard never finishes loading. Fully batching the sendImportantHeartbeatList process seems to resolve this issue.

Idealy we can fully batch sendHeartbeatList as well, but we need some SQL magic such that we get up to 100 beats from each monitor.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Other

Checklist

  • My code follows the style guidelines of this project
  • I ran ESLint and other linters for modified files
  • I have performed a self-review of my own code and tested it
  • I have commented my code, particularly in hard-to-understand areas
    (including JSDoc for methods)
  • My changes generate no new warnings
  • My code needed automated testing. I have added them (this is optional task)

Screenshots (if any)

if (monitorID == null) {
// Send important beats for all monitors
list = await R.find("heartbeat", `
important = 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these are functionally identical:
This is finding the top 5k important heartbeats while the other one is finding the top 500 important heartbeats for each monitor.

=> Lets assume a monitor runs every 20s and every heartbeat is important => 4320 beats
=> have 2 monitors which produce beats like this and one which pings once per day and the results will not be identical
=> I think this is missing a group by

Copy link
Collaborator Author

@chakflying chakflying Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point that if the important beat list grew too long, it would lead to different behavior. I'd say for the initial dashboard load, 5000 is more than enough. That's 250 pages and no one will bother going through the unsearchable list. I think I have two options here:

  • Add a function to send the important heartbeat list when the user clicks on a monitor in the dashboard, in case there are any missing, or
  • Modify the SQL to replicate the original behavior. A quick search and the only way to do it seems to be a nested SELECT, looks a bit ugly.

Why would GROUP BY be needed here when we are not using an aggregate function?

@louislam
Copy link
Owner

Yes, that's right. It is one of poor optimizations of Uptime Kuma, because I was lazy and didn't make a real pagination.

As far as I remember, sendImportantHeartbeatList is used for listing the important events on the dashboard home and the monitor page.

I think this is a temporary fix. Eventually, sending a whole list is not a good move. It should be like this:

  • Dashboard home: Query the latest 30 items only. When a user clicks another page, load another 30 items.
  • For each monitor's event list should be loaded when a user clicks a monitor page.

@chakflying chakflying marked this pull request as draft July 21, 2023 07:10
@CommanderStorm
Copy link
Collaborator

CommanderStorm commented Jul 27, 2023

Could you add to the description?
Resolves #2066
Resolves #1716
Resolves #2889
Resolves #3127
Resolves #3494
Resolves #3497
Resolves #736
Resolves #726

(there are likely also other issues which this PR would close, but I have not found them.)

The foloowing are also performance issues which were fixed by previous PRs in this PR-Train (please add them too, to close them if you think they are closed):
Resolves #2781
Resolves #2889
Resolves #1397
Resolves #2346
Resolves #2470
Resolves #3408

@chakflying
Copy link
Collaborator Author

I plan to implement the suggested fix in a new PR. Will try to include the issues you found when completed.

@chakflying
Copy link
Collaborator Author

Superseded by #3515.

@chakflying chakflying closed this Aug 2, 2023
@chakflying chakflying deleted the perf/batch-important-beats branch August 4, 2023 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants