Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MM-58085] Improve calls load balancing logic #721

Merged
merged 5 commits into from
May 23, 2024
Merged

[MM-58085] Improve calls load balancing logic #721

merged 5 commits into from
May 23, 2024

Conversation

streamer45
Copy link
Collaborator

Summary

PR implements a (hopefully) more efficient load-balancing logic for calls. Up until now, we'd be using a simple round-robin approach which can work well for lots of smaller calls but it can be quite inefficient in case of larger calls.

The proposed changes will fetch actual system load (CPU) info from the rtcd instances and select the host with the lower load.
The rationale here is that we know CPU to be the main performance bottleneck. Exposing this information avoids having to calculate the load in more complex ways such as figuring out how many connections and tracks (and their type) we are sending at any given time.

Ticket Link

https://mattermost.atlassian.net/browse/MM-58085

@streamer45 streamer45 added 2: Dev Review Requires review by a core committer Do Not Merge/Awaiting PR Awaiting another pull request before merging (e.g. server changes) labels May 2, 2024
@streamer45 streamer45 requested a review from cpoile May 2, 2024 21:44
@streamer45 streamer45 self-assigned this May 2, 2024
Comment on lines +270 to +273
// Fallback to random choice if we couldn't get system info.
if hostWithMinLoad == nil {
hostWithMinLoad = hostsAvailable[rand.Intn(len(hostsAvailable))]
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the continue in case of error above are to make the change backward compatible in which case we'd be using a randomized approach.

Copy link
Member

@cpoile cpoile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, nicely done!

@streamer45 streamer45 added this to the v0.28.0 / MM 9.10 milestone May 6, 2024
@streamer45 streamer45 added 3: Reviews Complete All reviewers have approved the pull request and removed 2: Dev Review Requires review by a core committer labels May 8, 2024
@streamer45
Copy link
Collaborator Author

@cpoile Asking for re-review since I slightly changed the logic after I noticed that the 1 minute average wasn't as reactive as I would have liked. We are now using a 2-second instant load (see https://github.com/mattermost/rtcd/tree/MM-54335-improvements).

@streamer45 streamer45 requested a review from cpoile May 23, 2024 00:15
@streamer45 streamer45 added 2: Dev Review Requires review by a core committer and removed 3: Reviews Complete All reviewers have approved the pull request labels May 23, 2024
Copy link
Member

@cpoile cpoile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, looks great.

@cpoile cpoile added 3: Reviews Complete All reviewers have approved the pull request and removed 2: Dev Review Requires review by a core committer labels May 23, 2024
@streamer45 streamer45 removed the Do Not Merge/Awaiting PR Awaiting another pull request before merging (e.g. server changes) label May 23, 2024
@streamer45 streamer45 merged commit c609126 into main May 23, 2024
19 checks passed
@streamer45 streamer45 deleted the MM-58085 branch May 23, 2024 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3: Reviews Complete All reviewers have approved the pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants