ActionLimitsTests memory test failing #3246

Closed
eweiter opened this issue Feb 2, 2018 · 17 comments

@eweiter
Contributor

eweiter commented Feb 2, 2018

I've seen this test failing a couple of times.
The failing case was
"limits should be aborted when exceeding its memory limits"
The test verifies that an action is aborted when it hits its configured memory limit. The abort does not happen, so the subsequent check that the response contains the expected error fails.

This was discussed on Slack. Here is a transcript of the proposed solution.

Markus Thömmes [1:40 PM]
I’d think we need to delay execution of the action a bit so the OOM killer can do its job

Markus Thömmes [1:42 PM]
that action should always fail
but I don’t know how the memory limit is enforced. Are cgroups “synchronous”? As in: Should it be impossible to even allocate the memory? (edited)

Sven Lange-Last [1:48 PM]
cgroups check for physical memory. once the physical memory sum of all processes in a container exceeds the limit, the oom killer selects the largest memory consuming process in the container and kills it.

https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
2.5 Reclaim

Each cgroup maintains a per cgroup LRU which has the same structure as
global VM. When a cgroup goes over its limit, we first try
to reclaim memory from the cgroup so as to make space for the new
pages that the cgroup has touched. If the reclaim is unsuccessful,
an OOM routine is invoked to select and kill the bulkiest task in the
cgroup.

Markus Thömmes [1:51 PM]
that is: it is async? (edited)

Sven Lange-Last [1:51 PM]
it is async.

Markus Thömmes [1:52 PM]
@rabbah ^^ our nodejs code might exit (and nuke memory) too fast, so we could delay the action's execution by 2-3 seconds to give the killer time to operate. That was my idea

Sven Lange-Last [1:53 PM]
in particular, we need to make sure that the node.js code keeps pages in physical memory. otherwise, the memory handler may evict pages.

=> writing some random values to each of the pages and re-reading these values from all pages every 100 ms will keep all pages active.

otherwise: if the test is run on a system that has memory pressure, allocated pages may be paged out and won’t get accounted for physical memory any more.
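
A minimal sketch of that suggestion in Node.js (for illustration only; eatAndHold is a hypothetical helper, not the current test action, and the 4 KiB page size is an assumption):

// Sketch only: allocate the buffer, touch every page so it gets physical
// backing, then keep re-reading the pages so they stay resident until the
// OOM killer can act.
var PAGE_SIZE = 4096; // assumed typical Linux page size

function eatAndHold(memoryMB) {
    var bytes = memoryMB * 1024 * 1024;
    var buffer = Buffer.alloc(bytes);
    for (var i = 0; i < bytes; i += PAGE_SIZE) {
        buffer[i] = Math.floor(Math.random() * 256); // force a physical page per virtual page
    }
    // Re-read every page every 100 ms so the pages stay accounted to the container.
    setInterval(function () {
        var sum = 0;
        for (var j = 0; j < bytes; j += PAGE_SIZE) {
            sum += buffer[j];
        }
        return sum;
    }, 100);
}
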
@chetanmeh
Member

Failure seen in https://scans.gradle.com/s/xq6t6op7ybifc/tests/xaqugzlgs2zw6-wlc37iylaskqa

org.scalatest.exceptions.TestFailedException: "An error has occurred: RangeError: Array buffer allocation failed" was not equal to "The action exhausted its memory and was aborted."
check failed for activation 0726d41ac78e49bca6d41ac78e59bcc1: {"activationId":"0726d41ac78e49bca6d41ac78e59bcc1","annotations":[{"key":"path","value":"guest/TestNodeJsMemoryExceeding"},{"key":"waitTime","value":458},{"key":"kind","value":"nodejs:6"},{"key":"limits","value":{"logs":10,"memory":128,"timeout":60000}},{"key":"initTime","value":251}],"duration":312,"end":1542004205394,"logs":["2018-11-12T06:30:05.344422656Z stdout: helloEatMemory memory 256MB","2018-11-12T06:30:05.391611101Z stderr: RangeError: Array buffer allocation failed","2018-11-12T06:30:05.39164719Z  stderr: at Buffer.Uint8Array (native)","2018-11-12T06:30:05.391664772Z stderr: at FastBuffer (buffer.js:11:5)","2018-11-12T06:30:05.391668493Z stderr: at createUnsafeBuffer (buffer.js:38:12)","2018-11-12T06:30:05.391671628Z stderr: at new Buffer.alloc (buffer.js:130:12)","2018-11-12T06:30:05.391674896Z stderr: at eat (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:6:18)","2018-11-12T06:30:05.392922655Z stderr: at NodeActionRunner.main [as userScriptMain] (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:14:5)","2018-11-12T06:30:05.392954497Z stderr: at /nodejsAction/runner.js:98:45","2018-11-12T06:30:05.392958552Z stderr: at NodeActionRunner.run (/nodejsAction/runner.js:92:16)","2018-11-12T06:30:05.392962088Z stderr: at doRun (/nodejsAction/src/service.js:168:31)","2018-11-12T06:30:05.392965596Z stderr: at runCode (/nodejsAction/src/service.js:120:20)"],"name":"TestNodeJsMemoryExceeding","namespace":"guest","publish":false,"response":{"result":{"error":"An error has occurred: RangeError: Array buffer allocation failed"},"status":"action developer error","success":false},"start":1542004205082,"subject":"guest","version":"0.0.1"}
Exception occurred during test execution: org.scalatest.exceptions.TestFailedException: "An error has occurred: RangeError: Array buffer allocation failed" was not equal to "The action exhausted its memory and was aborted."
org.scalatest.exceptions.TestFailedException: "An error has occurred: RangeError: Array buffer allocation failed" was not equal to "The action exhausted its memory and was aborted."
	at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
	at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)
	at org.apache.openwhisk.core.limits.ActionLimitsTests.$anonfun$new$62(ActionLimitsTests.scala:442)
	at org.apache.openwhisk.core.limits.ActionLimitsTests.$anonfun$new$62$adapted(ActionLimitsTests.scala:442)
	at common.WskTestHelpers.$anonfun$withActivation$2(WskTestHelpers.scala:272)
	at common.WskTestHelpers.withRethrowingPrint(WskTestHelpers.scala:304)

@chetanmeh
Member

Error logs from activation record

"entityType": "activation",
                "logs": [
                    "2018-11-12T08:32:03.61510596Z  stdout: helloEatMemory memory 256MB",
                    "2018-11-12T08:32:03.648659155Z stderr: RangeError: Array buffer allocation failed",
                    "2018-11-12T08:32:03.648689686Z stderr: at Buffer.Uint8Array (native)",
                    "2018-11-12T08:32:03.64869488Z  stderr: at FastBuffer (buffer.js:11:5)",
                    "2018-11-12T08:32:03.648698642Z stderr: at createUnsafeBuffer (buffer.js:38:12)",
                    "2018-11-12T08:32:03.648701912Z stderr: at new Buffer.alloc (buffer.js:130:12)",
                    "2018-11-12T08:32:03.648705242Z stderr: at eat (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:6:18)",
                    "2018-11-12T08:32:03.648711671Z stderr: at NodeActionRunner.main [as userScriptMain] (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:14:5)",
                    "2018-11-12T08:32:03.648716081Z stderr: at /nodejsAction/runner.js:98:45",
                    "2018-11-12T08:32:03.648719542Z stderr: at NodeActionRunner.run (/nodejsAction/runner.js:92:16)",
                    "2018-11-12T08:32:03.648723251Z stderr: at doRun (/nodejsAction/src/service.js:168:31)",
                    "2018-11-12T08:32:03.648726681Z stderr: at runCode (/nodejsAction/src/service.js:120:20)"
                ]

@tysonnorris
Contributor

On my local system, I see the controller container restarting when running these tests, leading to 502 responses to the test CLI and test failures.

I noticed Picked up JAVA_TOOL_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap in logs, which apparently comes from: https://github.com/AdoptOpenJDK/openjdk-docker/blob/master/8/jdk/alpine/Dockerfile.hotspot.nightly.full#L86

Looking at whether this is causing problems...

@tysonnorris
Contributor

In my case (macOS + Docker for Mac) I had the Docker preferences set to use 2 GB; increasing to 4 GB helped.

I guess this is unrelated to the RangeError nodejs case, but unexpected heap settings are also bad.

@chetanmeh
Member

Another failure seen at https://scans.gradle.com/s/qedg2uhf76u3s/tests/xaqugzlgs2zw6-wlc37iylaskqa

check failed for activation ef8442c192bc4f138442c192bc3f1375: {"activationId":"ef8442c192bc4f138442c192bc3f1375","annotations":[{"key":"path","value":"guest/TestNodeJsMemoryExceeding"},{"key":"waitTime","value":723},{"key":"kind","value":"nodejs:6"},{"key":"limits","value":{"concurrency":1,"logs":10,"memory":128,"timeout":60000}},{"key":"initTime","value":423}],"duration":470,"end":1543405972179,"logs":["2018-11-28T11:52:52.139641526Z stdout: helloEatMemory memory 256MB","2018-11-28T11:52:52.175917611Z stderr: RangeError: Array buffer allocation failed","2018-11-28T11:52:52.175950915Z stderr: at Buffer.Uint8Array (native)","2018-11-28T11:52:52.175955595Z stderr: at FastBuffer (buffer.js:11:5)","2018-11-28T11:52:52.175959189Z stderr: at createUnsafeBuffer (buffer.js:38:12)","2018-11-28T11:52:52.175962636Z stderr: at new Buffer.alloc (buffer.js:130:12)","2018-11-28T11:52:52.175966299Z stderr: at eat (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:6:18)","2018-11-28T11:52:52.175974889Z stderr: at NodeActionRunner.main [as userScriptMain] (eval at <anonymous> (/nodejsAction/runner.js:79:109), <anonymous>:14:5)","2018-11-28T11:52:52.175992221Z stderr: at /nodejsAction/runner.js:98:45","2018-11-28T11:52:52.175995659Z stderr: at NodeActionRunner.run (/nodejsAction/runner.js:92:16)","2018-11-28T11:52:52.175999597Z stderr: at doRun (/nodejsAction/src/service.js:168:31)","2018-11-28T11:52:52.176002715Z stderr: at runCode (/nodejsAction/src/service.js:120:20)"],"name":"TestNodeJsMemoryExceeding","namespace":"guest","publish":false,"response":{"result":{"error":"An error has occurred: RangeError: Array buffer allocation failed"},"status":"action developer error","success":false},"start":1543405971709,"subject":"guest","version":"0.0.1"}
Exception occurred during test execution: org.scalatest.exceptions.TestFailedException: "An error has occurred: RangeError: Array buffer allocation failed" was not equal to "The action exhausted its memory and was aborted."
org.scalatest.exceptions.TestFailedException: "An error has occurred: RangeError: Array buffer allocation failed" was not equal to "The action exhausted its memory and was aborted."
	at org.scalatest.MatchersHelper$.indicateFailure(MatchersHelper.scala:340)
	at org.scalatest.Matchers$AnyShouldWrapper.shouldBe(Matchers.scala:6864)

@chetanmeh
Member

@markusthoemmes This test is failing multiple times for me in my Travis builds. I see it was added for #2827. From the error it appears that the large array allocation failure is now handled gracefully by the Node.js runtime (maybe newer versions are more resilient). Should we adapt the test case to cause a true OOM in some other way?

@sven-lange-last
Member

sven-lange-last commented Nov 29, 2018

I guess the problem lies somewhere in the details of physical / virtual memory, container (cgroup) rss / swap memory limits and overall system memory.

The exception stack starting with RangeError: Array buffer allocation failed says that Node.js could not allocate the requested amount of virtual memory. See https://github.com/nodejs/node/blob/v6.15.0/lib/buffer.js for the underlying Node.js version 6 code.

To explain why this error happens, let's run the free -hw command on the global system and within a memory limited standard action container (256 MB):

# free -hw
              total        used        free      shared     buffers       cache   available
Mem:            15G        2.3G        1.5G        101M        832M         11G         12G

# docker exec -it 05b670eafb46 free -hw
              total        used        free      shared     buffers       cache   available
Mem:            15G        2.6G        1.5G        101M        832M         10G         12G

There is barely any difference between the global system and the container. Let's ignore the small differences in output for now. Even though there is a memory limit defined on the container, it seems to have 15G of RAM. Containers (i.e. namespaces) do not influence the results of memory management kernel calls because there is no memory namespace [1]. The container "thinks" it has lots of memory even if there is a limit on its physical memory.

If you request an amount of (virtual) memory - regardless of whether you are in a container or not - you will receive a positive result if the kernel thinks it will be able to satisfy the request. It won't reserve any physical memory pages for your request. [2]

When you receive a RangeError: Array buffer allocation failed, the kernel returned a failure because the overall system (or VM) must have been short on memory: the kernel decided that it could not provide a contiguous block of 256 MiB of virtual memory.

As already mentioned, when you receive the desired amount of virtual memory, it's just page table entries, not backed by physical memory.

If you start filling the memory, the memory management system will receive page faults whenever you touch a virtual memory page for the first time and will provide a physical page backing the virtual page. This is where the action container limits apply: the memory subsystem keeps an accounting of how much physical memory is owned by the container. Once the actual physical memory reaches the defined limit, the memory subsystem tries to reduce the physical memory pressure of the container (cgroup). The final measure is to invoke the out-of-memory (OOM) killer, which stops the process with the highest score. The process consuming the most physical memory usually has the highest score.

BTW: if your action code starts a child process which consumes more physical memory than your main process, the OOM killer will stop the child process. At the same time, the main action process holding the communication socket to the invoker will continue to run normally. You won't see an OOM error for the action.

Another interesting detail: some operating systems and VMs perform memory de-duplication [3]. If you allocate a large buffer and fill all virtual memory pages with the same values, you may end up with a single physical page representing all required virtual memory pages. That's why our current action [4] may fail to provoke the desired OOM kill in some environments.

Based on these details, I suggest the following strategy:

  • Set the action's memory limit to the lowest possible value. At the moment, we set it to 128.megabytes which is actually the limit. We should use MemoryLimit.minMemory.
  • The action should allocate exactly the limit. I guess this should already be sufficient.
  • The action should determine the system's page size and fill one randomly chosen byte on each page of the buffer with a random value. On the one hand, that's probably faster than writing each byte. On the other hand, this prevents memory deduplication or compression.

Nevertheless, if the system has too little memory, this strategy won't help either.

[1] http://man7.org/linux/man-pages/man7/namespaces.7.html
[2] https://linux.die.net/man/3/malloc
[3] https://en.wikipedia.org/wiki/Kernel_same-page_merging
[4] https://github.com/apache/incubator-openwhisk/blob/master/tests/dat/actions/memoryWithGC.js
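
A rough sketch of what such an action could look like, reusing the eat() shape from memoryWithGC.js (assumptions: getconf is available inside the action container, and memoryMB is set to the action's own limit):

var execSync = require('child_process').execSync;

function eat(memoryMB) {
    var pageSize;
    try {
        // Determine the system page size; assumes getconf is present in the container.
        pageSize = parseInt(execSync('getconf PAGESIZE').toString().trim(), 10);
    } catch (e) {
        pageSize = 4096; // fall back to the typical Linux page size
    }
    var bytes = memoryMB * 1024 * 1024;
    var buffer = Buffer.alloc(bytes);
    // Write a random value to one randomly chosen byte per page: every virtual
    // page gets physical backing and same-page merging/compression is defeated.
    for (var page = 0; page < bytes; page += pageSize) {
        var offset = page + Math.floor(Math.random() * pageSize);
        buffer[Math.min(offset, bytes - 1)] = Math.floor(Math.random() * 256);
    }
    console.log('filled ' + memoryMB + ' MB');
}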

@chetanmeh
Member

Great analysis @sven-lange-last! I'll try to come up with a PR matching the proposed approach (unless you are working on the same ... in that case let me know).

@sven-lange-last
Member

@chetanmeh feel free to prepare a PR. I can review. At the moment, I'm busy looking into other areas.

@chetanmeh
Member

On my local Ubuntu box

Kernel Version: 4.15.0-39-generic
Operating System: Ubuntu 16.04.5 LTS
Total Memory: 62.84GiB
Server Version: 18.09.0

The test case does not pass, i.e. no memory error is seen at all no matter what memory I use. Changing memoryWithGC.js as below to roughly follow the suggested approach also did not help:

function eat(memoryMB) {
    var bytes = 1*1024*1024*memoryMB;
    var pageSize = 1024;
    var buffer = new Buffer.alloc(bytes);
    for (var i = 0; i < bytes - 4; i += pageSize) {
        buffer.writeFloatBE(Math.random(), i)
    }
    console.log('done.' + buffer.length);
}

Below is the Docker launch command, which shows the memory options being passed: --memory 128m --memory-swap 128m

[2018-11-30T10:34:09.194Z] [INFO] [#tid_a4b92326abc919a358937dde59e22b2f] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 64 --memory 128m --memory-swap 128m --network bridge -e __OW_API_HOST=https://172.17.0.1:443 --name wsk00_18_guest_TestNodeJsMemoryExceeding --pids-limit 1024 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --log-driver json-file --env __OW_ALLOW_CONCURRENT=True openwhisk/nodejs6action:latest (timeout: 1 minute) [marker:invoker_docker.run_start:37]

I will now check the cgroup stats to see why the limits are not being enforced.

@chetanmeh
Member

Below is the activation record

{
  "activationId": "e75b69aea9fe4dc09b69aea9fe5dc0ae",
  "annotations": [
    {
      "key": "limits",
      "value": {
        "concurrency": 1,
        "logs": 10,
        "memory": 128,
        "timeout": 60000
      }
    },
    {
      "key": "path",
      "value": "guest/TestNodeJsMemoryExceeding"
    },
    {
      "key": "kind",
      "value": "nodejs:6"
    },
    {
      "key": "waitTime",
      "value": 242
    }
  ],
  "duration": 2909,
  "end": 1543655616203,
  "logs": [
    "2018-12-01T09:13:33.59781017Z  stdout: helloEatMemory memory 256MB",
    "2018-12-01T09:13:35.993134294Z stdout: <Buffer 3f 5e f5 3c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >"
  ],
  "name": "TestNodeJsMemoryExceeding",
  "namespace": "guest",
  "publish": false,
  "response": {
    "result": {
      "msg": "OK, buffer of size 256 MB has been filled."
    },
    "status": "success",
    "success": true
  },
  "start": 1543655613294,
  "subject": "guest",
  "version": "0.0.1"
}

As can be seen, the memory limit is set to 128 while a payload of 256 is passed, i.e. allocating a buffer of 256 MiB still works: OK, buffer of size 256 MB has been filled.

Below are the cgroup stats

$ docker ps | grep Memory
3662a758926e        openwhisk/nodejs6action:latest   "/bin/sh -c 'node --…"   13 minutes ago      Up 13 minutes (Paused)                                                                                                         wsk00_20_guest_TestNodeJsMemoryExceeding

And here are the various stats

/sys/fs/cgroup/memory/docker/3662a758926ec4013648222c865996af2e0146140e7ee3b75999b534bc46a4d4$ tail -n +1 *
==> cgroup.clone_children <==
0
tail: cannot open 'cgroup.event_control' for reading: Permission denied

==> cgroup.procs <==
698
751

==> memory.failcnt <==
28917
tail: cannot open 'memory.force_empty' for reading: Permission denied

==> memory.kmem.failcnt <==
0

==> memory.kmem.limit_in_bytes <==
9223372036854771712

==> memory.kmem.max_usage_in_bytes <==
2498560

==> memory.kmem.slabinfo <==
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kmalloc-1024          32     32   1024   32    8 : tunables    0    0    0 : slabdata      1      1      0
kmalloc-192           21     21    192   21    1 : tunables    0    0    0 : slabdata      1      1      0
kmalloc-8              0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
radix_tree_node       84     84    584   28    4 : tunables    0    0    0 : slabdata      3      3      0
inode_cache           78     78    608   26    4 : tunables    0    0    0 : slabdata      3      3      0
shmem_inode_cache     69     69    712   23    4 : tunables    0    0    0 : slabdata      3      3      0
eventpoll_pwq        168    168     72   56    1 : tunables    0    0    0 : slabdata      3      3      0
eventpoll_epi        128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0
kmalloc-512          128    128    512   32    4 : tunables    0    0    0 : slabdata      4      4      0
kmalloc-256          128    128    256   32    2 : tunables    0    0    0 : slabdata      4      4      0
mm_struct             60     60   2112   15    8 : tunables    0    0    0 : slabdata      4      4      0
signal_cache          64     64   1024   32    8 : tunables    0    0    0 : slabdata      2      2      0
sighand_cache         30     30   2112   15    8 : tunables    0    0    0 : slabdata      2      2      0
files_cache           69     69    704   23    4 : tunables    0    0    0 : slabdata      3      3      0
mqueue_inode_cache      0      0    960   34    8 : tunables    0    0    0 : slabdata      0      0      0
cred_jar             210    210    192   21    1 : tunables    0    0    0 : slabdata     10     10      0
task_struct           30     30   5952    5    8 : tunables    0    0    0 : slabdata      6      6      0
sock_inode_cache     115    115    704   23    4 : tunables    0    0    0 : slabdata      5      5      0
proc_inode_cache     168    168    680   24    4 : tunables    0    0    0 : slabdata      7      7      0
dentry               349    420    192   21    1 : tunables    0    0    0 : slabdata     20     20      0
filp                 352    352    256   32    2 : tunables    0    0    0 : slabdata     11     11      0
anon_vma             414    414     88   46    1 : tunables    0    0    0 : slabdata      9      9      0
pid                  576    576     64   64    1 : tunables    0    0    0 : slabdata      9      9      0
vm_area_struct       390    390    208   39    2 : tunables    0    0    0 : slabdata     10     10      0

==> memory.kmem.tcp.failcnt <==
0

==> memory.kmem.tcp.limit_in_bytes <==
9223372036854771712

==> memory.kmem.tcp.max_usage_in_bytes <==
0

==> memory.kmem.tcp.usage_in_bytes <==
0

==> memory.kmem.usage_in_bytes <==
2400256

==> memory.limit_in_bytes <==
134217728

==> memory.max_usage_in_bytes <==
134217728

==> memory.move_charge_at_immigrate <==
0

==> memory.numa_stat <==
total=32126 N0=32126
file=0 N0=0
anon=32126 N0=32126
unevictable=0 N0=0
hierarchical_total=32126 N0=32126
hierarchical_file=0 N0=0
hierarchical_anon=32126 N0=32126
hierarchical_unevictable=0 N0=0

==> memory.oom_control <==
oom_kill_disable 0
under_oom 0
oom_kill 0
tail: cannot open 'memory.pressure_level' for reading: Permission denied

==> memory.soft_limit_in_bytes <==
9223372036854771712

==> memory.stat <==
cache 0
rss 131588096
rss_huge 0
shmem 0
mapped_file 0
dirty 0
writeback 0
pgpgin 140710
pgpgout 108584
pgfault 145356
pgmajfault 2051
inactive_anon 65839104
active_anon 65748992
inactive_file 0
active_file 0
unevictable 0
hierarchical_memory_limit 134217728
total_cache 0
total_rss 131588096
total_rss_huge 0
total_shmem 0
total_mapped_file 0
total_dirty 0
total_writeback 0
total_pgpgin 140710
total_pgpgout 108584
total_pgfault 145356
total_pgmajfault 2051
total_inactive_anon 65839104
total_active_anon 65748992
total_inactive_file 0
total_active_file 0
total_unevictable 0

==> memory.swappiness <==
60

==> memory.usage_in_bytes <==
133988352

==> memory.use_hierarchy <==
1

==> notify_on_release <==
0

==> tasks <==
698
751
753
754
755
756
757

Per the stats, memory.max_usage_in_bytes is 134217728, i.e. exactly the 128 MiB (128 × 1024 × 1024 bytes) limit.

@chetanmeh
Member

I ran a simple Travis test based on this test repo, which runs the Node.js Docker image with the same script we use in the test, and it fails as expected with exit code 137 and "OOMKilled": true

language: python

group: deprecated-2017Q3

services:
  - docker

script:
  - grep CGROUP  /boot/config-$(uname -r)
  - docker run --name oom-check-node  --memory 128m --memory-swap 128m -v "$PWD":/usr/src/app -w /usr/src/app openwhisk/nodejs6action node memoryWithGC.js
  - docker inspect oom-check-node
  - docker logs oom-check-node
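
To check just the OOM status, something like docker inspect -f '{{.State.OOMKilled}} {{.State.ExitCode}}' oom-check-node should print only those two fields instead of the full inspect output.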

So I'm not sure why it fails in some cases, as this Travis run was using the same Docker version.

@chetanmeh
Member

chetanmeh commented Dec 4, 2018

Attempting a possible fix in #4155

@sven-lange-last
Member

sven-lange-last commented Dec 4, 2018

@chetanmeh in your code snippet above, you used 1 KiB as page size. On Linux, the page size typically is 4 KiB:

# getconf PAGESIZE
4096

The code should work regardless of this problem because it will just fill more memory locations than absolutely required.

I'm surprised that your test code stays exactly within the rss memory limit... that's why you don't see the OOM kill. It seems that all Node.js binary code is accounted for as shared memory.

Limit:

  • File memory.limit_in_bytes: 134217728
  • File memory.stat: hierarchical_memory_limit 134217728

Max usage:

  • File memory.max_usage_in_bytes: 134217728

Current usage:

  • File memory.usage_in_bytes : 133988352

@chetanmeh
Member

@sven-lange-last I tried with 4 KiB as well and went beyond the limit, but the usage still stays at 128 MB. On my desktop I do see this warning:

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

So maybe it does not support the cgroup limit, though it's 16.04, which should have that support by default. The Travis build does enforce the memory limit, and that works for simple runs (as seen in the linked Travis run). However, for some reason it fails in test runs. So it may be due to some form of memory crunch which causes the RangeError before hitting the OOM kill.
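
(Side note on that warning, assuming a stock Ubuntu kernel: it usually only means swap accounting is off, so the --memory-swap limit is ignored while the plain --memory rss limit is still enforced; swap limit support can typically be enabled by adding cgroup_enable=memory swapaccount=1 to GRUB_CMDLINE_LINUX, running update-grub and rebooting.)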

For now, multiple runs of the new approach in #4155 have passed, so that may fix this issue.

@sven-lange-last
Member

#4155 was merged and should address this heisenbug. Please re-open if the test fails again occasionally.
