
Updating server to use ConnState to track idle connections #1010

Closed
wants to merge 1 commit

Conversation

jesselucas commented Aug 7, 2016

I was working on graceful shutdown for another project and saw this issue #870.

This PR tries to keep track of idle connections and forcefully close them, as suggested in golang/go#4674 (comment), using httptest/server.go's implementation of graceful shutdown as inspiration: https://github.com/golang/go/blob/master/src/net/http/httptest/server.go
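
For context, here is a minimal sketch of the general technique (the package, type layout, and names are illustrative and do not necessarily match the code in this PR):

package httpserver // illustrative package name

import (
    "net"
    "net/http"
    "sync"
)

// Server wraps the standard *http.Server and records the last known state of
// every connection, so idle/new connections can be force-closed on shutdown.
// s.conns must be initialized (make(map[net.Conn]http.ConnState)) before serving.
type Server struct {
    *http.Server
    mu    sync.Mutex
    conns map[net.Conn]http.ConnState
}

// trackConnState is installed as s.Server.ConnState before serving starts.
func (s *Server) trackConnState(c net.Conn, st http.ConnState) {
    s.mu.Lock()
    defer s.mu.Unlock()
    switch st {
    case http.StateNew, http.StateActive, http.StateIdle:
        s.conns[c] = st
    case http.StateHijacked, http.StateClosed:
        delete(s.conns, c)
    }
}

A graceful Stop() can then iterate s.conns under the same mutex and close anything still in StateIdle or StateNew, which is the part discussed in the review below.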

CLAassistant commented Aug 7, 2016

CLA assistant check
All committers have signed the CLA.

mholt (Member) commented Aug 8, 2016

Thanks Jesse, I'll look at this soon!

My initial concern is the memory usage on busy servers (one of the reasons I had not implemented this before).

jwkohnen commented Aug 8, 2016

Great! Haven't looked closely, but just wanted to leave this here as another potential inspiration: https://github.com/facebookgo/httpdown

mholt added the under review 🧐 label Aug 8, 2016
go func() {
defer close(wait)
s.mu.Lock()
for c, st := range s.conns {

weingart commented

How often is this called? Is it called during normal operation? I.e., this keeps the connections map locked for the full duration of the scan; with a lot of connections, that could prevent new connections from being made, or delay them until the full scan is done.

Again, I'm not sure whether that is the case here.

jesselucas (Author) commented

@weingart It should only be called once, when the graceful Stop() is initiated. New connections would be prevented after we close the listener at the top of the method.

tpng commented

Are there any race conditions with net/http?
Consider the following timeline:

  1. Stop is called.
  2. We acquire the lock for closing idle/new conns.
  3. Just before we close an idle/new conn, a new request arrives on that conn in net/http.
  4. s.Server.ConnState is blocked by the lock.
  5. We close all idle/new conns and release the lock.
  6. ConnState carries on and passes control to the handler.
  7. The handler processes the request but fails to send a response because the conn is closed.

jesselucas (Author) commented Aug 10, 2016

@tpng Would the ConnState of the now-active conn transition to StateActive in this scenario? If it does, then it should cause a GracefulTimeout, since the wg.Wait() would time out, which would mean we never forcefully close the conn.

Are you thinking we should forcefully Close() all conns if the s.connTimeout happens?

tpng commented Aug 10, 2016

The connection will be closed and removed from s.conns.
The only issue in this case is that we are no longer stopping gracefully, since we closed the connection while a request was being handled.

jesselucas (Author) commented Aug 10, 2016

@tpng Thinking something like this?

// Stop stops s gracefully: it closes the listener, waits s.connTimeout for
// in-flight requests to finish, then force-closes any idle or new connections.
func (s *Server) Stop() error {
    if s.closed {
        return errors.New("Server has been closed")
    }

    // Make sure a listener was set
    if s.listener != nil {
        // Close the listener to stop all new connections
        if err := s.listener.Close(); err != nil {
            return err
        }
    }
    s.mu.Lock()
    s.closed = true
    s.mu.Unlock()
    s.Server.SetKeepAlivesEnabled(false)

    // Give all connections time to finish. We wait the full timeout before
    // closing any connections, to avoid races with ConnState transitions.
    time.Sleep(s.connTimeout)
    s.mu.Lock()
    for c, st := range s.conns {
        // Force close any idle and new connections.
        if st == http.StateIdle || st == http.StateNew {
            c.Close()
        }
    }
    s.mu.Unlock()

    // Closing this signals any TLS governor goroutines to exit
    if s.tlsGovChan != nil {
        close(s.tlsGovChan)
    }

    return nil
}

If we're always waiting for the s.connTimeout then I don't think we'd need the WaitGroup.

tpng commented Aug 10, 2016

I am thinking of moving this

    s.mu.Lock()
    for c, st := range s.conns {
        // Force close any idle and new connections.
        if st == http.StateIdle || st == http.StateNew {
            c.Close()
        }
    }
    s.mu.Unlock()

to run only inside case <-time.After(s.connTimeout): in the select below (see the sketch after this comment): https://github.com/jesselucas/caddy/blob/d78c94fc9eb27ebb593a4e6226d0e45a6e5564a2/caddyhttp/httpserver/server.go#L328

And keeping the s.closed = true in the go func().

    s.mu.Lock()
    s.closed = true
    s.mu.Unlock()

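A rough sketch of that shape, for illustration (this assumes a wait channel that is closed once all tracked connections have finished, like the defer close(wait) in the diff excerpt above; it may not match the branch exactly):

    select {
    case <-wait:
        // Every connection finished on its own; nothing left to force-close.
    case <-time.After(s.connTimeout):
        // Timeout reached: force-close anything still idle or new.
        s.mu.Lock()
        for c, st := range s.conns {
            if st == http.StateIdle || st == http.StateNew {
                c.Close()
            }
        }
        s.mu.Unlock()
    }
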
jesselucas (Author) commented

@tpng I updated the code above, removing the go func and the select in favor of a sleep. Do you see any issues with this?

tpng commented

Using a sleep will force the server to wait for the full timeout even when there are no open connections.
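
For what it's worth, a small sketch of how the wait channel used in the earlier select sketch could be derived from the wg.Wait() mentioned above, so Stop can return as soon as everything drains instead of always sleeping (s.wg as a counter of in-flight connections is an assumption about the branch, not verified):

    // Close `wait` once all in-flight connections have drained, so the select
    // in the sketch above can return before the full s.connTimeout elapses.
    wait := make(chan struct{})
    go func() {
        s.wg.Wait() // assumed: s.wg is decremented as connections finish
        close(wait)
    }()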

jesselucas (Author) commented

@tpng synced the update. Let me know if you see any issues.

mholt (Member) commented Aug 12, 2016

Jesse, thanks so much for working on this. This is a big change, so I don't expect it to make it in before the 0.9.1 release (if all goes well next week). We'll want to measure the impact on memory use and performance, and verify correctness, since this meddles in some very critical sections of the HTTP server.

I do like that you have some tests. The next steps, if we think the code is correct, are to load test this branch with HTTP requests and see how the memory profile looks, along with any performance difference. Also try it while proxying long-lived connections like websockets, and make sure the graceful timeout is hit and the connection is force-closed anyway. We also need to ensure we're not leaking any connections (leaving them open when they should be closed).

I might have some time for that soon but I would be appreciative if anyone wanted to help move this along by performing some of these tests and posting the results! Remember to keep it as simple as possible to reduce the surface area for confusion as much as we can.
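
For anyone who wants to help, here is a simple load-generator sketch along those lines (the target address, worker count, and request counts are placeholders, not taken from this PR; run it against a Caddy instance built from this branch while watching the server's memory, e.g. with pprof):

package main

import (
    "fmt"
    "io"
    "io/ioutil"
    "net/http"
    "sync"
)

func main() {
    // Placeholder values; tune for the machine under test.
    const workers = 50
    const requestsPerWorker = 200

    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < requestsPerWorker; j++ {
                resp, err := http.Get("http://localhost:2015/") // placeholder address
                if err != nil {
                    fmt.Println("request error:", err)
                    continue
                }
                // Drain and close the body so keep-alive connections are reused.
                io.Copy(ioutil.Discard, resp.Body)
                resp.Body.Close()
            }
        }()
    }
    wg.Wait()
    fmt.Println("load run complete")
}

Triggering a graceful stop or reload while this is running would also exercise the force-close path for idle keep-alive connections.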

mholt (Member) commented Aug 22, 2016

Okay. I probably won't have time to measure memory impact in the near future. Would someone else be able to do that?

jesselucas (Author) commented

@mholt I've been working to improve/test this and will do my best to look into measuring memory impact as well. I'll keep you posted.

mholt (Member) commented Aug 23, 2016

@jesselucas Cool. And to clarify, I'm mostly just curious at this point; not saying there is a problem. Just wondering what the impact is. Thanks for your efforts!

mholt removed the under review 🧐 label Sep 19, 2016
mholt (Member) commented Sep 21, 2016

@jesselucas Would it be alright if I close this until you're ready to continue working on it? We can reopen later.

mholt closed this Sep 21, 2016