
etcdserver: runtime.FDUsage() function high CPU usage, create large numbers of objects #11969

Closed
cfc4n opened this issue Jun 3, 2020 · 7 comments · Fixed by #11986

@cfc4n (Contributor) commented Jun 3, 2020

version

etcd Version: 3.3.17
Git SHA: 6d80523
Go Version: go1.12.9
Go OS/Arch: linux/amd64

issue

  • When the number of etcdserver clients reaches 100K, there are more than 100K fds under /proc/self/fd in the etcd process.
  • The monitorFileDescriptor function executes runtime.FDUsage() every 5 seconds.
  • runtime.FDUsage() scans every entry under /proc/self/fd and returns a []os.FileInfo (the relevant code path is sketched below).
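
For context, the code path in question looks roughly like this (an abridged sketch of pkg/runtime/fds_linux.go and etcdserver/metrics.go, assuming the usual imports; the exact code appears in the diffs quoted later in this thread):

// pkg/runtime/fds_linux.go
func FDUsage() (uint64, error) {
	// ioutil.ReadDir opens /proc/self/fd, reads every name, lstats each
	// entry to build an os.FileInfo, and sorts the result, only to
	// return the count.
	fds, err := ioutil.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return uint64(len(fds)), nil
}

// etcdserver/metrics.go
func monitorFileDescriptor(lg *zap.Logger, done <-chan struct{}) {
	ticker := time.NewTicker(5 * time.Second) // a full /proc/self/fd scan every 5 seconds
	defer ticker.Stop()
	for {
		used, err := runtime.FDUsage()
		if err != nil {
			lg.Warn("failed to get file descriptor usage", zap.Error(err))
			return
		}
		limit, err := runtime.FDLimit()
		if err != nil {
			lg.Warn("failed to get file descriptor limit", zap.Error(err))
			return
		}
		if used >= limit/5*4 {
			lg.Warn("80% of file descriptors are used", zap.Uint64("used", used), zap.Uint64("limit", limit))
		}
		select {
		case <-ticker.C:
		case <-done:
			return
		}
	}
}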

result:

  • Scanning that many fds every 5 seconds causes high CPU usage.
  • Reading that many fd entries allocates a large number of objects, which increases memory usage and GC pressure.

(pprof screenshots: cpu_fdusage, fdusage_all_objects)

solution:

  • Remove the FDUsage call and the comparison against syscall.Rlimit entirely, and rely instead on the error messages produced when fd operations fail; or
  • Extend the interval of the monitorFileDescriptor ticker from every 5 seconds to every 24 hours.

@gyuho @xiang90 @jpbetz @jingyih @spzala

@cfc4n cfc4n changed the title etcdserver: runtime.FDUsage() function uses a very high CPU under high load scenarios, causing a large number of objects etcdserver: runtime.FDUsage() function high CPU usage, create large numbers of objects Jun 4, 2020
@cfc4n (Contributor, Author) commented Jun 7, 2020

I think setting the FDUsage ticker to 1 hour is enough, and I have sent a PR for it: #11986

@agargi (Contributor) commented Jun 7, 2020

@cfc4n Just for my understanding, can you please clarify why 1 hour is enough?

@tedyu (Contributor) commented Jun 7, 2020

A closer look at FDUsage() reveals that it sorts the os.FileInfos before returning.
I propose bypassing the sorting:

diff --git a/pkg/runtime/fds_linux.go b/pkg/runtime/fds_linux.go
index 8e9359db2..ebc53f8d9 100644
--- a/pkg/runtime/fds_linux.go
+++ b/pkg/runtime/fds_linux.go
@@ -16,7 +16,7 @@
 package runtime

 import (
-       "io/ioutil"
+       "os"
        "syscall"
 )

@@ -29,7 +29,12 @@ func FDLimit() (uint64, error) {
 }

 func FDUsage() (uint64, error) {
-       fds, err := ioutil.ReadDir("/proc/self/fd")
+       f, err := os.Open("/proc/self/fd")
+       if err != nil {
+               return 0, err
+       }
+       fds, err := f.Readdir(-1)
+       f.Close()
        if err != nil {
                return 0, err
        }

Note: -1 is passed to f.Readdir, so all entries are returned in a single call.
monitorFileDescriptor() only uses the file count in relation to FDLimit.
We can refactor the code:

diff --git a/etcdserver/metrics.go b/etcdserver/metrics.go
index a77ebfddd..5f1ce14e9 100644
--- a/etcdserver/metrics.go
+++ b/etcdserver/metrics.go
@@ -187,14 +187,14 @@ func monitorFileDescriptor(lg *zap.Logger, done <-chan struct{}) {
        ticker := time.NewTicker(5 * time.Second)
        defer ticker.Stop()
        for {
-               used, err := runtime.FDUsage()
+               limit, err := runtime.FDLimit()
                if err != nil {
-                       lg.Warn("failed to get file descriptor usage", zap.Error(err))
+                       lg.Warn("failed to get file descriptor limit", zap.Error(err))
                        return
                }
-               limit, err := runtime.FDLimit()
+               used, err := runtime.FDUsage(limit/5*4)
                if err != nil {
-                       lg.Warn("failed to get file descriptor limit", zap.Error(err))
+                       lg.Warn("failed to get file descriptor usage", zap.Error(err))
                        return
                }
                if used >= limit/5*4 {

so that runtime.FDUsage() can take an upper bound on the number of entries it retrieves.
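
To illustrate, such a bounded FDUsage might look roughly like this (a hypothetical sketch assuming "io" and "os" are imported, not the actual patch above):

func FDUsage(upperBound uint64) (uint64, error) {
	f, err := os.Open("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	defer f.Close()
	var count uint64
	for {
		// Readdirnames avoids the per-entry lstat and os.FileInfo allocation.
		names, err := f.Readdirnames(512) // read names in small batches
		count += uint64(len(names))
		if err == io.EOF {
			return count, nil
		}
		if err != nil {
			return 0, err
		}
		if count > upperBound {
			// The caller only needs to know the threshold was crossed,
			// so we can stop scanning early.
			return count, nil
		}
	}
}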

@tedyu (Contributor) commented Jun 7, 2020

	ticker := time.NewTicker(5 * time.Second)

If used >= limit/5*4, we are already spending considerable time in runtime.FDUsage(), so the ticker interval can be made longer.
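
For example, the loop inside monitorFileDescriptor could stretch its own interval once the scan gets expensive (a rough sketch, not a patch):

	interval := 5 * time.Second
	for {
		used, err := runtime.FDUsage()
		if err != nil {
			lg.Warn("failed to get file descriptor usage", zap.Error(err))
			return
		}
		limit, err := runtime.FDLimit()
		if err != nil {
			lg.Warn("failed to get file descriptor limit", zap.Error(err))
			return
		}
		if used >= limit/5*4 {
			lg.Warn("80% of file descriptors are used", zap.Uint64("used", used), zap.Uint64("limit", limit))
			// This close to the limit the scan is already expensive,
			// so back off and check less often.
			interval = 10 * time.Minute
		}
		select {
		case <-done:
			return
		case <-time.After(interval):
		}
	}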

@cfc4n (Contributor, Author) commented Jun 8, 2020

@cfc4n Just for my understanding, can you please clarify why 1 hour is enough?

etcd/etcdserver/metrics.go

Lines 200 to 202 in 49f91d6

if used >= limit/5*4 {
	lg.Warn("80% of file descriptors are used", zap.Uint64("used", used), zap.Uint64("limit", limit))
}

It is just a lg.Warn notice; it does not affect the execution of the program.
A ticker of 1 hour is enough to give that notice.

@cfc4n (Contributor, Author) commented Jun 8, 2020

A closer look at FDUsage() reveals that it sorts the os.FileInfos before returning. I propose bypassing the sorting: [diff quoted above]
Note: -1 is passed to f.Readdir. monitorFileDescriptor() only uses the file count in relation to FDLimit. We can refactor the code: [diff quoted above]

I don't think replacing fds, err := ioutil.ReadDir("/proc/self/fd") with fds, err := f.Readdir(-1) solves the problem.
(*os.File).Readdir still iterates over every fd and allocates an os.FileInfo for each one, and that is the point: more object allocations and more memory used, which in turn means more Go GC work and high CPU usage.

So I still think the FDUsage ticker interval needs to be extended.
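
For comparison, a names-only count (a hypothetical sketch assuming "os" is imported, not part of any PR here) avoids the per-entry lstat and the os.FileInfo allocations, though it still allocates one string per fd, so the scan frequency still matters:

func FDUsage() (uint64, error) {
	f, err := os.Open("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	defer f.Close()
	// Readdirnames returns only the entry names: no lstat syscall and no
	// os.FileInfo allocation per fd, but still one string per entry.
	names, err := f.Readdirnames(-1)
	if err != nil {
		return 0, err
	}
	return uint64(len(names)), nil
}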

@dims (Contributor) commented Sep 3, 2021

The Tailscale folks had the same issue; please see how they fixed it here: tailscale/tailscale#2785
