[MM-58085] Improve calls load balancing logic #721

streamer45 · 2024-05-02T21:44:25Z

Summary

This PR implements a (hopefully) more efficient load-balancing logic for calls. Until now we used a simple round-robin approach, which works well for many smaller calls but can be quite inefficient for larger ones.

The proposed changes fetch actual system load (CPU) information from the rtcd instances and select the host with the lowest load.
The rationale is that we know CPU to be the main performance bottleneck. Exposing this information avoids having to calculate load in more complex ways, such as working out how many connections and tracks (and of which type) we are sending at any given time.

Ticket Link

https://mattermost.atlassian.net/browse/MM-58085

streamer45 · 2024-05-02T21:45:27Z

server/rtcd.go

+	// Fallback to random choice if we couldn't get system info.
+	if hostWithMinLoad == nil {
+		hostWithMinLoad = hostsAvailable[rand.Intn(len(hostsAvailable))]
+	}


This fallback, together with the continue on error above, keeps the change backward compatible: when the rtcd instances can't provide system info, we end up using a randomized approach.

cpoile

Looks great, nicely done!

streamer45 · 2024-05-23T00:15:55Z

@cpoile Asking for re-review since I slightly changed the logic after noticing that the 1-minute average wasn't as reactive as I would have liked. We are now using a 2-second instant load (see https://github.com/mattermost/rtcd/tree/MM-54335-improvements).

cpoile

Cool, looks great.

Use actual system load for balancing calls to rtcd hosts

4fa9ee0

streamer45 added the 2: Dev Review and Do Not Merge/Awaiting PR labels May 2, 2024

streamer45 requested a review from cpoile May 2, 2024 21:44

streamer45 self-assigned this May 2, 2024

streamer45 commented May 2, 2024

cpoile approved these changes May 3, 2024

streamer45 added this to the v0.28.0 / MM 9.10 milestone May 6, 2024

streamer45 added the 3: Reviews Complete label and removed the 2: Dev Review label May 8, 2024

streamer45 added 2 commits May 22, 2024 18:12

Update load balancing strategy to use instant load

920b103

Update rtcd

14589e0

streamer45 requested a review from cpoile May 23, 2024 00:15

streamer45 added the 2: Dev Review label and removed the 3: Reviews Complete label May 23, 2024

cpoile approved these changes May 23, 2024

cpoile added the 3: Reviews Complete label and removed the 2: Dev Review label May 23, 2024

streamer45 removed the Do Not Merge/Awaiting PR label May 23, 2024

streamer45 added 2 commits May 23, 2024 15:40

Merge remote-tracking branch 'origin/main' into MM-58085

4247982

Update rtcd

81f912a

streamer45 merged commit c609126 into main May 23, 2024
19 checks passed

streamer45 deleted the MM-58085 branch May 23, 2024 22:13
