Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix insufficient memory issues #11030

Merged
merged 2 commits into from
Apr 20, 2021

Conversation

prezha
Copy link
Contributor

@prezha prezha commented Apr 8, 2021

fixes #11029

examples and oom issue details are given in the description of the original issue #11029

proposed solution: increase (recommended) min memory to 2048 MiB

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 8, 2021
@@ -96,7 +96,7 @@ const (
waitTimeout = "wait-timeout"
nativeSSH = "native-ssh"
minUsableMem = 1800 // Kubernetes (kubeadm) will not start with less
minRecommendedMem = 1900 // Warn at no lower than existing configurations
minRecommendedMem = 2048 // Warn at no lower than existing configurations
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to fix this, because the value is also used for looking at the available memory inside the VM...

So we should be using 2048 MiB on the "outside", but look for at least 1700 MiB on the "inside"

        // ControlPlaneNumCPU is the number of CPUs required on control-plane
        ControlPlaneNumCPU = 2

        // ControlPlaneMem is the number of megabytes of memory required on the control-plane
        // Below that amount of RAM running a stable control plane would be difficult.
        ControlPlaneMem = 1700

There is a force flag to run it it anyway, but we should do the complaint (the same as kubeadm)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with anders, I think we should not bump this, but check fo the outter VM memory

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus on docker driver on linux with cgroup v2 the memory limit is being ignored, so this PR doesnt change much

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the focus of this pr was to eliminate tests flake due to insufficient memory errors - shall i just remove this change then and leave the changes in how much memory tests allocate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've reverted this specific change as it's irrelevant for the specific issue it tries to solve for flake tests and should be instead addressed under a separate issue/pr

@afbjorklund
Copy link
Collaborator

afbjorklund commented Apr 9, 2021

We should remove the current constants, and provide two new ones. Also need to clear up the MB vs MiB confusion...

When I run the regular VM (with the minimum 2048 MiB), I get 1985 MiB total - as reported by the free command.

🔥 Creating virtualbox VM (CPUs=2, Memory=2048MB, Disk=20000MB) ...

                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ free -m
              total        used        free      shared  buff/cache   available
Mem:           1985         645          85         611        1254         582
Swap:             0           0           0
$ df -m /mnt/sda1
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda1          17368  1743     14610  11% /mnt/sda1

Checking for 1700 or 1800 doesn't matter, but 1900 can have some false positives* (and there is much less available)

* I think some of the cloud images ended up with like 1892 (for 2 GiB VM), but that shouldn't need any user concern

VBoxManage showvminfo minikube
Memory size 2048MB

qemu-img info ~/.minikube/machines/minikube/disk.vmdk
virtual size: 19.5 GiB (20971520000 bytes)

@afbjorklund
Copy link
Collaborator

afbjorklund commented Apr 9, 2021

I opened #10808 to provide some better documentation on the requirements

Like #10808 (comment)

The traditional interpretation has used 2 vCPU, 2048 MiB mem and 20,000 MiB disk.

Most people would like to double these, to 4 vCPU and 4 GiB mem and 32-40 GiB disk.

@afbjorklund
Copy link
Collaborator

proposed solution: increase (recommended) min memory to 2048 MiB

We should definitely use --memory 2048, my comment was about the code

@prezha
Copy link
Contributor Author

prezha commented Apr 19, 2021

@afbjorklund thanks for your comments!

would you agree then with this pr - to fix the flaking tests (focusing on --memory 2048) due to the lack of memory and clear that one out, and we can have a separate pr to better handle customer user mem params, proposed min memory (for different envs) in line with your #10808?

@afbjorklund
Copy link
Collaborator

@afbjorklund thanks for your comments!

would you agree then with this pr - to fix the flaking tests (focusing on --memory 2048) due to the lack of memory and clear that one out, and we can have a separate pr to better handle customer user mem params, proposed min memory (for different envs) in line with your #10808?

Yes, changing the VM to use --memory=2048 can be done any time. The other is more of a backlog item.

@prezha
Copy link
Contributor Author

prezha commented Apr 19, 2021

great, thanks!
i see still a lot of tests flake bc of this, so it will be much clearer when we rule this one out :)

@medyagh medyagh merged commit 6ad8cd7 into kubernetes:master Apr 20, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, prezha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tests Flake due to insufficient memory
4 participants