grpcfd produces memory leaks #25
Comments
Profile: goroutinelimit1.zip |
Some things to keep in mind wrt resource management in grpcfd:
Go Routine in RecvFile
The go routine in RecvFile loops until the fdCh that RecvFD creates and returns is closed. In connWrap.recvExecutor.AsyncExec, either we already have an fd for the requested (dev,ino), in which case fdCh is closed, or we don't, in which case we store fdCh in connWrap.recvFDChans. This leaves us with two questions:
When is the fdCh stored in connWrap.recvFDChans closed?
In connWrap.Read, if a read results in us receiving an fd, we iterate through connWrap.recvFDChans to see if there's an fdCh waiting to receive it. If so, we send it and close the fdCh. So if we receive an fd for the (dev,ino) after it's been asked for, the fdCh should be closed, and the corresponding go routine(s) should exit and not leak.
If the connWrap as a net.Conn is Closed or garbage collected, we call connWrap.close and loop through connWrap.recvFDChans, closing all the fdChs, which causes all corresponding go routines to exit.
Net-net: all go routines originating in RecvFile should exit after the connWrap as a net.Conn is Closed or garbage collected, unless AsyncExec is deadlocking or something really strange is happening. The expectation should therefore be that no go routine is leaked past the Close or garbage collection point. |
New incident reported by Szilard: |
I've found a suspect place, please have a look
|
Good catch. I think upping the buffers to 50 is likely overkill... but I am fine with it if that's how you'd like to address things :) |
Totally agree. Currently, it's needed for testing. We can quickly use this solution only for patch release (v1.13.1), but we first need to get results. |
Hmm, I could also be wrong, because in that case we would see a panic from sending on a closed channel. So, most likely, the cause is deadlocks in the executors or cached connections that are not closed, as you mentioned above. |
Motivation
While using grpcfd, we found goroutine leaks that could cause memory leaks in the applications involved.
Solution
https://github.com/edwarnicke/grpcfd/blob/main/connwrap.go#L44-L56
The linked code could cause memory leaks because it does not use a context to cancel operations.
Suggested to change it to
Using a context can prevent these edge cases in grpcfd.
Related to networkservicemesh/cmd-nsmgr#675