[P0] Memory allocation failures in API #173

Closed
willroberts opened this issue Oct 19, 2022 · 3 comments · Fixed by #174
Assignees: willroberts
Labels: backend (Related to server / worker code), bug (Something isn't working)

Comments

@willroberts
Collaborator

willroberts commented Oct 19, 2022

Summary

Sometime between 1.97.2 and 1.97.3, a change was introduced that appears to cause memory allocation failures in the API service. The Worker service also appears to be affected, while the Game/SP services are not.

Increasing the available memory from 350 MB to 500 MB does not appear to have helped. The failure output looks like this:

<--- Last few GCs --->
[27:0xffff9436c3c0] 13975962 ms: Mark-sweep 246.6 (258.9) -> 245.5 (258.9) MB, 222.2 / 0.0 ms  (average mu = 0.984, current mu = 0.129) allocation failure scavenge might not succeed
[27:0xffff9436c3c0] 13976247 ms: Mark-sweep 246.0 (258.9) -> 245.5 (259.1) MB, 258.3 / 0.0 ms  (average mu = 0.965, current mu = 0.093) allocation failure GC in old space requested
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
error Command failed with signal "SIGABRT".
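
The GC lines above report roughly 246 MB in use against about 259 MB committed at the point of failure, which is below both the 350 MB and 500 MB container sizes. One possible reading (an assumption, not something this issue confirms) is that the process is hitting the V8 heap limit rather than the container limit. A minimal sketch for checking that limit from inside the process, using Node's built-in v8 module; the helper name heapSummary is hypothetical:

```javascript
const v8 = require('v8');

const MB = 1024 * 1024;

// Returns the V8 heap limit and current heap usage, rounded to whole MB.
// heapSummary is an illustrative helper, not part of the project.
function heapSummary() {
  const stats = v8.getHeapStatistics();
  return {
    limitMB: Math.round(stats.heap_size_limit / MB),
    totalMB: Math.round(stats.total_heap_size / MB),
    usedMB: Math.round(stats.used_heap_size / MB),
  };
}

console.log(heapSummary());
```

If limitMB comes back close to the ~259 MB seen in the log, raising the container size alone would not be expected to help, since the heap limit is governed largely by Node's --max-old-space-size option.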

Running git diff -w --stat 1.97.2..1.97.3 server shows only about 100 lines added and removed. However, because backend services also pull in app/sdk (and others), the culprit change may be outside the server directory.

Memory usage for the last three days looks like this:
[Image: Screen Shot 2022-10-19 at 12 31 17 AM]

The reduction towards the end was caused by increasing the available memory (thereby reducing the utilization).

This lines up with the work on replays here:
#163
#164

Before this, the last deployment was on 10/16 at ~10AM UTC, so these PRs could also be involved:
#157
#158
#160
#161
#162

We can revert these one at a time locally, rebuilding a hotfix container and testing after each revert, to find which change is responsible.
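
The revert-and-test loop described above can be sketched as a simple linear search. This is only an illustration: findCulprit and the isHealthyWithout callback are hypothetical names, and the callback stands in for the manual step of rebuilding the container with one PR reverted and checking memory usage.

```javascript
// Candidate PR numbers taken from the lists above, newest first.
const candidates = [164, 163, 162, 161, 160, 158, 157];

// Tries each candidate in turn and returns the first PR whose revert
// makes the service healthy again, or null if no single revert does.
// `isHealthyWithout(pr)` is a hypothetical callback supplied by the caller.
function findCulprit(prs, isHealthyWithout) {
  for (const pr of prs) {
    if (isHealthyWithout(pr)) {
      return pr;
    }
  }
  return null;
}

// Stubbed usage: the callback here is a placeholder that pretends reverting
// one particular PR fixes the problem. It is not a real measurement.
console.log(findCulprit(candidates, (pr) => pr === 160)); // 160
```

A linear search is used rather than a bisection because each PR can be reverted independently, and the candidate list is short.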

willroberts added the enhancement (New feature or request) label Oct 19, 2022
willroberts self-assigned this Oct 19, 2022
willroberts added the bug (Something isn't working) and backend (Related to server / worker code) labels and removed the enhancement (New feature or request) label Oct 19, 2022
willroberts changed the title from "[P0] Memory leak in API" to "[P0] Memory allocation failures in API" Oct 19, 2022
@willroberts
Collaborator Author

This may be helpful: https://github.com/airbnb/node-memwatch
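
As a lighter-weight alternative that needs no extra dependency, memory can also be sampled with Node's built-in process.memoryUsage(). The sketch below does not use node-memwatch; the function names are hypothetical, and the "steadily growing" check is a crude heuristic rather than a real leak detector.

```javascript
const MB = 1024 * 1024;

// Takes one sample of process memory, in MB with one decimal place.
function sampleMemory() {
  const m = process.memoryUsage();
  return {
    rssMB: +(m.rss / MB).toFixed(1),
    heapUsedMB: +(m.heapUsed / MB).toFixed(1),
    heapTotalMB: +(m.heapTotal / MB).toFixed(1),
  };
}

// Returns true when heapUsedMB rose between every consecutive pair of
// samples. Needs at least two samples to say anything.
function isSteadilyGrowing(samples) {
  if (samples.length < 2) {
    return false;
  }
  return samples.every(
    (s, i) => i === 0 || s.heapUsedMB > samples[i - 1].heapUsedMB
  );
}

// Logs a sample every `intervalMs` milliseconds. The timer is unref'd so
// it does not keep the process alive on its own.
function startSampling(intervalMs, log = console.log) {
  const timer = setInterval(() => log(sampleMemory()), intervalMs);
  timer.unref();
  return timer;
}
```

Calling startSampling(60000) early in service startup would emit one line per minute, which can then be compared across builds.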

@willroberts
Collaborator Author

This appears to be the result of upgrading CoffeeScript in #162.

@willroberts
Collaborator Author

API memory usage by CoffeeScript version:

1.12.7: 446 MB
1.12.6: 410 MB
1.12.5: 433 MB
1.12.4: 448 MB
1.12.3: 459 MB
1.12.2: 284 MB
1.12.1: 292 MB
1.12.0: 295 MB
1.11.1: 293 MB
1.11.0: 288 MB
1.10.0: 246 MB
1.9.3: 240 MB (requires coffeeify upgrade)
1.9.0: 239 MB
1.8.0: 237 MB

Each value was measured with 'docker stats' just after the 'REDIS client onReady' log event.
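
To make the step change in the list above easier to see, the version-to-version differences can be computed directly. This is a small sketch over the numbers copied from that list; largestJump is a hypothetical helper name.

```javascript
// Measurements copied from the list above, ordered oldest to newest.
const usageMB = [
  ['1.8.0', 237], ['1.9.0', 239], ['1.9.3', 240], ['1.10.0', 246],
  ['1.11.0', 288], ['1.11.1', 293], ['1.12.0', 295], ['1.12.1', 292],
  ['1.12.2', 284], ['1.12.3', 459], ['1.12.4', 448], ['1.12.5', 433],
  ['1.12.6', 410], ['1.12.7', 446],
];

// Returns the adjacent pair of versions with the largest increase in MB.
function largestJump(rows) {
  let best = null;
  for (let i = 1; i < rows.length; i++) {
    const deltaMB = rows[i][1] - rows[i - 1][1];
    if (best === null || deltaMB > best.deltaMB) {
      best = { from: rows[i - 1][0], to: rows[i][0], deltaMB };
    }
  }
  return best;
}

console.log(largestJump(usageMB));
// { from: '1.12.2', to: '1.12.3', deltaMB: 175 }
```

By these figures the largest single increase is between 1.12.2 and 1.12.3 (284 MB to 459 MB), with a smaller step of 42 MB between 1.10.0 and 1.11.0.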
