
Conversation

@droberts195

This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux. This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.
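For context, the values in question live in cgroup (v1) control files under /sys/fs/cgroup/memory on Linux. The sketch below shows roughly how those files can be read; it is not the PR's implementation, and the hard-coded mount point and the class/method names are assumptions for illustration (the real code resolves the node's own control group path rather than assuming the mount point).

```java
// Minimal, illustrative sketch of reading the cgroup (v1) memory limit and
// usage on Linux. Hard-coding /sys/fs/cgroup/memory is an assumption made
// here for brevity; it is not how the PR locates the control group.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CgroupMemoryReader {

    private static final Path MEMORY_CGROUP = Paths.get("/sys/fs/cgroup/memory");

    /** Raw contents of memory.limit_in_bytes, exactly as Linux reports it. */
    static String readLimitInBytes() throws IOException {
        return readSingleLine(MEMORY_CGROUP.resolve("memory.limit_in_bytes"));
    }

    /** Raw contents of memory.usage_in_bytes, exactly as Linux reports it. */
    static String readUsageInBytes() throws IOException {
        return readSingleLine(MEMORY_CGROUP.resolve("memory.usage_in_bytes"));
    }

    private static String readSingleLine(Path path) throws IOException {
        // Each control file holds a single line containing a decimal number.
        return Files.readAllLines(path).get(0).trim();
    }
}
```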

@droberts195 requested a review from jasontedor August 11, 2017 15:12
@droberts195 force-pushed the add_memory_cgroup_to_stats branch from c468f38 to 79ba58f August 11, 2017 15:40
@droberts195 (Author)

The mixed cluster tests are failing in the CI build of this PR. The reason is that I am planning to backport this to 6.1 and have set up the version checks in the serialisation code accordingly, but since this is not yet merged to 6.1 it makes the BWC cluster crash.

David Roberts added 6 commits August 14, 2017 10:07
This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux.  This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.
This matches what is done for cpu and cpuacct
@droberts195 force-pushed the add_memory_cgroup_to_stats branch from 79ba58f to e0cf08a August 14, 2017 09:07
@jasontedor (Member)

One way to deal with this is to set the serialization version and skips in master to 7.0.0, then backport to 6.x with these set to 6.1.0, then push one more commit to master moving to 6.1.0. With this method, it's possible to have green builds every step of the way. This is preferred.
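A rough sketch of the version-gated serialization pattern being described, in the style of Elasticsearch's StreamInput/StreamOutput. The version constant used here (Version.V_7_0_0_alpha1) stands in for whichever constant the change actually gates on and is not taken from the PR:

```java
// Sketch only: write the new fields to peers that understand them, and skip
// them entirely for older nodes so mixed-version clusters stay compatible.
import java.io.IOException;

import org.elasticsearch.Version;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;

final class CgroupMemorySerialization {

    static void writeTo(StreamOutput out, String limitInBytes, String usageInBytes) throws IOException {
        if (out.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) {
            out.writeOptionalString(limitInBytes);
            out.writeOptionalString(usageInBytes);
        }
    }

    static String[] readFrom(StreamInput in) throws IOException {
        if (in.getVersion().onOrAfter(Version.V_7_0_0_alpha1)) {
            return new String[] { in.readOptionalString(), in.readOptionalString() };
        }
        // Older peers never send the fields, so report them as absent.
        return new String[] { null, null };
    }
}
```

Once the backport to 6.x is in place, the gate is lowered to the 6.x constant in a follow-up commit to master, which is what keeps the builds green at every step.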

@droberts195 (Author)

Thanks for the tip @jasontedor. I have changed the skips to 7.0.0-alpha1 and now the CI is passing.

Merge branch 'master' into add_memory_cgroup_to_stats

* master: (278 commits)
  Move pre-6.0 node checkpoint to SequenceNumbers
  Invalid JSON request body caused endless loop (elastic#26680)
  added comment
  fix line length violation
  Moved the check to fetch phase. This basically means that we throw a better error message instead of an AOBE and not adding more restrictions.
  inner hits: Do not allow inner hits that use _source and have a non nested object field as parent
  Separate Painless Whitelist Loading from the Painless Definition (elastic#26540)
  convert more admin requests to writeable (elastic#26566)
  Handle release of 5.6.1
  Allow `InputStreamStreamInput` array size validation where applicable (elastic#26692)
  Update global checkpoint with permit after recovery
  Filter pre-6.0 nodes for checkpoint invariants
  Skip bad request REST test on pre-6.0
  Reenable BWC tests after disabling for backport
  Add global checkpoint tracking on the primary
  [Test] Fix reference/cat/allocation/line_8 test failure
  [Docs] improved description for fs.total.available_in_bytes (elastic#26657)
  Fix discovery-file plugin to use custom config path
  fix testSniffNodes to use the new error message
  Add check for invalid index in WildcardExpressionResolver (elastic#26409)
  ...
@jasontedor (Member) left a comment

As we discussed in another channel, I think we should avoid losing the ability to compare what a user sees in /sys/fs/cgroup/memory versus the output from the stats APIs. Otherwise, I'm good with this PR.

David Roberts added 3 commits September 22, 2017 14:32
The original idea was to store these values as Long, truncating any values
outside the range of long.  However, this meant that in the relatively common
case of no limit being applied, users would not see the same value in the OS
stats as they see by querying Linux directly.  This change places a burden on
consumers of the strings to convert the strings to numbers and decide what to
do about extremely large values, but there will be very few consumers and they
would need to have a policy for dealing with "no limit" in any case.
@droberts195 (Author)

@jasontedor as discussed I changed the type of the cgroup memory stats to String because the value used for "no limit" doesn't fit in long and BigInteger isn't supported by XContent.

Please could you take another look?
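For illustration, a consumer of the new string-typed values might handle them roughly as follows; the helper name and the use of BigInteger with a signed-long cutoff are assumptions made for this sketch, not part of the change itself. Keeping the raw strings also preserves the ability to compare the API output directly with /sys/fs/cgroup/memory, as noted in the review above.

```java
// Sketch of one policy a consumer could adopt: parse the raw string and treat
// anything too large for a signed long as "no limit", which is how an
// unlimited cgroup shows up per the discussion in this PR.
import java.math.BigInteger;

final class CgroupMemoryValues {

    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /**
     * Returns the number of bytes, or -1 when the value does not fit in a
     * signed long and is therefore treated as "no limit".
     */
    static long parseBytesOrNoLimit(String rawValue) {
        BigInteger value = new BigInteger(rawValue.trim());
        if (value.compareTo(LONG_MAX) > 0) {
            return -1L; // treat as "no limit"
        }
        return value.longValueExact();
    }
}
```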

@jasontedor (Member) left a comment

LGTM.

@droberts195 merged commit a292740 into elastic:master Oct 3, 2017
@droberts195 deleted the add_memory_cgroup_to_stats branch October 3, 2017 11:08
@droberts195 (Author)

Thanks for reviewing @jasontedor.

droberts195 pushed a commit that referenced this pull request Oct 3, 2017
This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux.  This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.

The original idea was to store these values as Long, truncating any values
outside the range of long.  However, this meant that in the relatively common
case of no limit being applied, users would not see the same value in the OS
stats as they see by querying Linux directly.  So instead the values are stored
as String.  This change places a burden on consumers of the strings to
convert the strings to numbers and decide what to do about extremely large
values, but there will be very few consumers and they would need to have a
policy for dealing with "no limit" in any case.
droberts195 pushed a commit that referenced this pull request Oct 3, 2017
droberts195 pushed a commit that referenced this pull request Oct 3, 2017