[Bug] Firecracker uses up 100% CPU after an upgrade from v1.1.2 to v1.3.1 #3542

Closed
3 tasks done
gorbak25 opened this issue Mar 20, 2023 · 2 comments · Fixed by #3595
Labels: Priority: High, Type: Bug

Comments

gorbak25 (Contributor) commented Mar 20, 2023

Describe the bug

After upgrading Firecracker from v1.1.2 to v1.3.1, the VMM immediately uses 100% of the CPU and does not create the Unix API socket. Firecracker eventually creates the socket, but only after a few minutes. Attaching strace to the faulty process shows that Firecracker is stuck in a loop trying to close every possible file descriptor:
[Image: strace output]

To Reproduce

Run Firecracker inside a privileged Docker container based on node:16-bullseye with /dev/kvm mounted inside. v1.1.2 works fine; v1.3.1 does not work at all.

Expected behaviour

The Firecracker process starts without using 100% of the CPU and creates a Unix socket.

Environment

  • Firecracker version: v1.3.1
  • Host kernel: 6.1.12-arch1-1 (Arch Linux)
  • Guest kernel: 5.6
  • Rootfs used: ext4, Ubuntu 22.04
  • Architecture: x86_64, AMD Ryzen 9 5950X

Additional context

I was working on my startup, a project for creating and managing dev environments inside Firecracker microVMs. The project works fine with Firecracker v1.1.2, but it broke after I upgraded Firecracker.

Checks

  • Have you searched the Firecracker Issues database for similar problems?
  • Have you read the existing relevant Firecracker documentation?
  • Are you certain the bug being reported is a Firecracker issue?
gorbak25 added the Type: Bug label Mar 20, 2023
gorbak25 (Contributor, Author) commented

This is the offending commit: f79c94d
It turns out that the limit for open FDs is quite high in the container I have been running Firecracker in.
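
For readers who want to check their own environment, the following is a minimal Python sketch, not Firecracker code. It compares the number of `close()` calls that a loop bounded by `SC_OPEN_MAX` would issue with the number of descriptors that are actually open. The helper names (`fd_limit`, `open_fds`, `naive_close_calls`) are hypothetical, and the sketch assumes a Linux host with `/proc` mounted.

```python
import os
import resource


def fd_limit() -> int:
    """Limit on open file descriptors as reported by sysconf(SC_OPEN_MAX)."""
    return os.sysconf("SC_OPEN_MAX")


def open_fds() -> list:
    """Descriptors listed in /proc/self/fd for this process.

    The list may include the short-lived descriptor that os.listdir
    itself used to read the directory.
    """
    return sorted(int(name) for name in os.listdir("/proc/self/fd"))


def naive_close_calls(first_fd: int = 3) -> int:
    """close() calls issued by a loop running from first_fd up to the limit."""
    return max(fd_limit() - first_fd, 0)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE: soft={soft} hard={hard}")
    print(f"close() calls in a naive loop: {naive_close_calls()}")
    print(f"descriptors actually open:     {len(open_fds())}")
```

When the limit is very large and only a handful of descriptors are open, almost every call in such a loop targets a descriptor that was never opened.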

JonathanWoollett-Light added the Priority: Medium label Mar 24, 2023
Rohan-Jagannathan commented

Hello, I represent a group of UT Austin undergraduates who aim to contribute to open-source virtualization projects as part of a virtualization course. Could we have this issue assigned to us?

dianpopa added the Priority: High label and removed the Priority: Medium label Apr 4, 2023
dianpopa added a commit to dianpopa/firecracker that referenced this issue Apr 6, 2023
Fixes the mechanism for closing open FDs in jailer which was
based on the SC_OPEN_MAX system constant. Such an approach can lead
to bad performance when this value is very high.

The method for closing file descriptors is now chosen based on the environment:
1. we try to call into the close_range syscall (available on kernels >=5.9)
2. we fallback to reading from /proc/self/fd (for kernels <5.9)

Fixes firecracker-microvm#3542.

Signed-off-by: Grzegorz Uriasz <gorbak25@gmail.com>
Co-authored-by: Diana Popa <dpopa@amazon.com>
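
The two-step strategy described in this commit message can be sketched in Python as follows. This is an illustration under stated assumptions, not the jailer's actual implementation (which is not shown in this thread): the function name `close_fd_range` is hypothetical, `close_range` is reached through the libc wrapper via `ctypes` (the wrapper may be missing on older C libraries), and the fallback assumes `/proc` is mounted.

```python
import ctypes
import os


def close_fd_range(first: int, last: int) -> str:
    """Close every descriptor in [first, last]; return the method used."""
    # Step 1: try close_range (kernels >= 5.9) through the libc wrapper.
    # getattr returns None when the C library does not export the symbol.
    libc = ctypes.CDLL(None, use_errno=True)
    wrapper = getattr(libc, "close_range", None)
    if wrapper is not None:
        wrapper.argtypes = [ctypes.c_uint, ctypes.c_uint, ctypes.c_int]
        wrapper.restype = ctypes.c_int
        if wrapper(first, last, 0) == 0:
            return "close_range"

    # Step 2: fall back to closing only what /proc/self/fd lists,
    # instead of trying every number up to the descriptor limit.
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        if first <= fd <= last:
            try:
                os.close(fd)
            except OSError:
                # For example, the descriptor os.listdir used for the
                # directory itself, which is already closed by now.
                pass
    return "procfs"
```

Either path does work proportional to the number of descriptors that are actually open rather than to the configured limit. Closing descriptors that back live Python objects would break them, so a real caller would restrict the range accordingly.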
dianpopa added a commit to dianpopa/firecracker that referenced this issue Apr 11, 2023
dianpopa added a commit to dianpopa/firecracker that referenced this issue Apr 13, 2023
Signed-off-by: Diana Popa <dpopa@amazon.com>
dianpopa self-assigned this Apr 13, 2023
dianpopa added a commit that referenced this issue Apr 14, 2023
Fixes the mechanism for closing open FDs in jailer which was
based on the SC_OPEN_MAX system constant. Such an approach can lead
to bad performance when this value is very high.

The method for closing file descriptors is now chosen based
on the environment:
1. we try to call into the close_range syscall
   (available on kernels >=5.9)
2. we fallback to reading from /proc/self/fd (for kernels <5.9)

Fixes #3542.

Signed-off-by: Grzegorz Uriasz <gorbak25@gmail.com>
Co-authored-by: Diana Popa <dpopa@amazon.com>
dianpopa added a commit that referenced this issue Apr 14, 2023
Signed-off-by: Diana Popa <dpopa@amazon.com>
ShadowCurse pushed a commit to ShadowCurse/firecracker that referenced this issue Apr 18, 2023
ShadowCurse pushed a commit to ShadowCurse/firecracker that referenced this issue Apr 18, 2023
Signed-off-by: Diana Popa <dpopa@amazon.com>
andreitraistaru pushed a commit to andreitraistaru/firecracker that referenced this issue Apr 20, 2023
(cherry picked from commit f472eda)
sladyn98 pushed a commit to sladyn98/firecracker that referenced this issue Jun 19, 2023
sladyn98 pushed a commit to sladyn98/firecracker that referenced this issue Jun 19, 2023
Signed-off-by: Diana Popa <dpopa@amazon.com>
ShadowCurse pushed a commit to ShadowCurse/firecracker that referenced this issue Jul 26, 2023
ShadowCurse pushed a commit to ShadowCurse/firecracker that referenced this issue Jul 26, 2023
Signed-off-by: Diana Popa <dpopa@amazon.com>