Skip to content

Commit

Permalink
http/tcp checks: fix long timeout behavior to default to user-configu…
Browse files Browse the repository at this point in the history
…red value (#6094)

Fixes #5834
  • Loading branch information
s-mang authored Jul 16, 2019
1 parent 648e20d commit ea2bd5b
Show file tree
Hide file tree
Showing 3 changed files with 101 additions and 29 deletions.
22 changes: 7 additions & 15 deletions agent/checks/check.go
Original file line number Diff line number Diff line change
Expand Up @@ -343,15 +343,10 @@ func (c *CheckHTTP) Start() {
Timeout: 10 * time.Second,
Transport: trans,
}

// For long (>10s) interval checks the http timeout is 10s, otherwise the
// timeout is the interval. This means that a check *should* return
// before the next check begins.
if c.Timeout > 0 && c.Timeout < c.Interval {
if c.Timeout > 0 {
c.httpClient.Timeout = c.Timeout
} else if c.Interval < 10*time.Second {
c.httpClient.Timeout = c.Interval
}

if c.OutputMaxSize < 1 {
c.OutputMaxSize = DefaultBufSize
}
Expand Down Expand Up @@ -482,15 +477,12 @@ func (c *CheckTCP) Start() {

if c.dialer == nil {
// Create the socket dialer
c.dialer = &net.Dialer{DualStack: true}

// For long (>10s) interval checks the socket timeout is 10s, otherwise
// the timeout is the interval. This means that a check *should* return
// before the next check begins.
if c.Timeout > 0 && c.Timeout < c.Interval {
c.dialer = &net.Dialer{
Timeout: 10 * time.Second,
DualStack: true,
}
if c.Timeout > 0 {
c.dialer.Timeout = c.Timeout
} else if c.Interval < 10*time.Second {
c.dialer.Timeout = c.Interval
}
}

Expand Down
78 changes: 78 additions & 0 deletions agent/checks/check_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -328,6 +328,84 @@ func TestCheckHTTP(t *testing.T) {
}
}

func TestCheckHTTPTCP_BigTimeout(t *testing.T) {
testCases := []struct {
timeoutIn, intervalIn, timeoutWant time.Duration
}{
{
timeoutIn: 31 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 31 * time.Second,
},
{
timeoutIn: 30 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 30 * time.Second,
},
{
timeoutIn: 29 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 29 * time.Second,
},
{
timeoutIn: 0 * time.Second,
intervalIn: 10 * time.Second,
timeoutWant: 10 * time.Second,
},
{
timeoutIn: 0 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 10 * time.Second,
},
{
timeoutIn: 10 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 10 * time.Second,
},
{
timeoutIn: 9 * time.Second,
intervalIn: 30 * time.Second,
timeoutWant: 9 * time.Second,
},
{
timeoutIn: -1 * time.Second,
intervalIn: 10 * time.Second,
timeoutWant: 10 * time.Second,
},
{
timeoutIn: 0 * time.Second,
intervalIn: 5 * time.Second,
timeoutWant: 10 * time.Second,
},
}

for _, tc := range testCases {
desc := fmt.Sprintf("timeoutIn: %v, intervalIn: %v", tc.timeoutIn, tc.intervalIn)
t.Run(desc, func(t *testing.T) {
checkHTTP := &CheckHTTP{
Timeout: tc.timeoutIn,
Interval: tc.intervalIn,
}
checkHTTP.Start()
defer checkHTTP.Stop()
if checkHTTP.httpClient.Timeout != tc.timeoutWant {
t.Fatalf("expected HTTP timeout to be %v, got %v", tc.timeoutWant, checkHTTP.httpClient.Timeout)
}

checkTCP := &CheckTCP{
Timeout: tc.timeoutIn,
Interval: tc.intervalIn,
}
checkTCP.Start()
defer checkTCP.Stop()
if checkTCP.dialer.Timeout != tc.timeoutWant {
t.Fatalf("expected TCP timeout to be %v, got %v", tc.timeoutWant, checkTCP.dialer.Timeout)
}
})

}
}

func TestCheckMaxOutputSize(t *testing.T) {
t.Parallel()
timeout := 5 * time.Millisecond
Expand Down
30 changes: 16 additions & 14 deletions website/source/docs/agent/checks.html.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,25 +42,27 @@ There are several different kinds of checks:
blog post](https://www.hashicorp.com/blog/protecting-consul-from-rce-risk-in-specific-configurations)
for more details.

* HTTP + Interval - These checks make an HTTP `GET` request every Interval (e.g.
every 30 seconds) to the specified URL. The status of the service depends on
the HTTP response code: any `2xx` code is considered passing, a `429 Too Many
Requests` is a warning, and anything else is a failure. This type of check
* HTTP + Interval - These checks make an HTTP `GET` request to the specified URL,
waiting the specified `interval` amount of time between requests (eg. 30 seconds).
The status of the service depends on the HTTP response code: any `2xx` code is
considered passing, a `429 Too ManyRequests` is a warning, and anything else is
a failure. This type of check
should be preferred over a script that uses `curl` or another external process
to check a simple HTTP operation. By default, HTTP checks are `GET` requests
unless the `method` field specifies a different method. Additional header
fields can be set through the `header` field which is a map of lists of
strings, e.g. `{"x-foo": ["bar", "baz"]}`. By default, HTTP checks will be
configured with a request timeout equal to the check interval, with a max of
10 seconds. It is possible to configure a custom HTTP check timeout value by
configured with a request timeout equal to 10 seconds.
It is possible to configure a custom HTTP check timeout value by
specifying the `timeout` field in the check definition. The output of the
check is limited to roughly 4KB. Responses larger than this will be truncated.
HTTP checks also support TLS. By default, a valid TLS certificate is expected.
Certificate verification can be turned off by setting the `tls_skip_verify`
field to `true` in the check definition.

* TCP + Interval - These checks make a TCP connection attempt every Interval
(e.g. every 30 seconds) to the specified IP/hostname and port. If no hostname
* TCP + Interval - These checks make a TCP connection attempt to the specified
IP/hostname and port, waiting `interval` amount of time between attempts
(e.g. 30 seconds). If no hostname
is specified, it defaults to "localhost". The status of the service depends on
whether the connection attempt is successful (ie - the port is currently
accepting connections). If the connection is accepted, the status is
Expand All @@ -69,10 +71,9 @@ There are several different kinds of checks:
addresses, and the first successful connection attempt will result in a
successful check. This type of check should be preferred over a script that
uses `netcat` or another external process to check a simple socket operation.
By default, TCP checks will be configured with a request timeout equal to the
check interval, with a max of 10 seconds. It is possible to configure a custom
TCP check timeout value by specifying the `timeout` field in the check
definition.
By default, TCP checks will be configured with a request timeout of 10 seconds.
It is possible to configure a custom TCP check timeout value by specifying the
`timeout` field in the check definition.

* <a name="TTL"></a>Time to Live (TTL) - These checks retain their last known
state for a given TTL. The state of the check must be updated periodically
Expand Down Expand Up @@ -107,8 +108,9 @@ There are several different kinds of checks:

* gRPC + Interval - These checks are intended for applications that support the standard
[gRPC health checking protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md).
The state of the check will be updated at the given interval by probing the configured
endpoint. By default, gRPC checks will be configured with a default timeout of 10 seconds.
The state of the check will be updated by probing the configured endpoint, waiting `interval`
amount of time between probes (eg. 30 seconds). By default, gRPC checks will be configured
with a default timeout of 10 seconds.
It is possible to configure a custom timeout value by specifying the `timeout` field in
the check definition. gRPC checks will default to not using TLS, but TLS can be enabled by
setting `grpc_use_tls` in the check definition. If TLS is enabled, then by default, a valid
Expand Down

0 comments on commit ea2bd5b

Please sign in to comment.