Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http.Response.Body of api response is sometimes not read completely and connections are not reused. #3316

Open
oliverbestmann opened this issue Jul 24, 2017 · 1 comment
Labels
theme/api Relating to the HTTP API interface type/bug Feature does not function as expected

Comments

@oliverbestmann
Copy link

oliverbestmann commented Jul 24, 2017

consul version for both Client and Server

Client: 0.8.5
Server: 0.8.5

Operating system and Environment details

Go 1.8.3

Description of the Issue (and unexpected/desired result)

The consul api client in golang does not consume the complete body of the response received from the consul server. Because of that, the http (keep alive) connection can not be re-used and will be closed.

Reproduction steps

Use a connection that prints a message, each time a new connection is created, then call some endpoints, like client.Health().Service("my-service", "", true, nil):

api.NewClient(&api.Config{
    Address: consulAddress,
    HttpClient: &http.Client{
        Transport: &http.Transport{
            Proxy: nil,
            DialContext: func(ctx context.Context, network, address string) (net.Conn, error) {
                log.Infof("Dialing %s %s", network, address)
                return dial(ctx, network, address)
            },
            ResponseHeaderTimeout: 90 * time.Second,
        },
    },
})
@slackpad slackpad added the type/bug Feature does not function as expected label Sep 2, 2017
@dnephin dnephin added the theme/api Relating to the HTTP API interface label Aug 18, 2020
@findkim
Copy link
Contributor

findkim commented Dec 8, 2020

Hi @oliverbestmann! I happened to be looking into this while debugging a seemingly related bug on another project using the Consul API client (hashicorp/consul-terraform-sync#146). I wanted to add my findings from following the reproduction steps you provided hopefully to help out anyone else.

I wrote up the test below to capture concurrent and serial requests using the Consul API client. It reveals that a new connection is established for concurrent requests. Once a pool of connections are established, subsequent serial requests reuse those existing connections.

client_test.go
package client_test

import (
	"context"
	"log"
	"net"
	"net/http"
	"sync"
	"testing"
	"time"

	"github.com/hashicorp/consul/api"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestAPIClient_reusableConns(t *testing.T) {
	var dialCount int
	dialer := net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}
	transport := http.Transport{
		Proxy: nil,
		Dial:  dialer.Dial,
		DialContext: func(ctx context.Context, network, address string) (net.Conn, error) {
			dialCount++
			log.Printf("[INFO] Dialing %s %s", network, address)
			return dialer.DialContext(ctx, network, address)
		},
		MaxIdleConns:        5,
		MaxIdleConnsPerHost: 5,
		IdleConnTimeout:     90 * time.Second,
		TLSHandshakeTimeout: 10 * time.Second,
	}

	config := api.DefaultConfig()
	config.HttpClient = &http.Client{Transport: &transport}
	client, err := api.NewClient(config)
	require.NoError(t, err)

	// New connections: concurrent requests
	var wg sync.WaitGroup
	wg.Add(5)
	for i := 0; i < 5; i++ {
		go func() {
			services, qm, err := client.Health().Service("app", "", true, nil)
			require.NoError(t, err)
			t.Log("Nubmer of services:", len(services))
			t.Log("Request time:", qm.RequestTime)
			wg.Done()
		}()
	}
	wg.Wait()

	// Reuses connections: serial requests waits for complete
	// Implies that the API client is reading all of the body
	for i := 0; i < 5; i++ {
		services, qm, err := client.Health().Service("app", "", true, nil)
		require.NoError(t, err)
		t.Log("Number of services:", len(services))
		t.Log("Request time:", qm.RequestTime)
		require.NoError(t, err)
	}

	assert.Equal(t, dialCount, 5, "unexpected # of connections reused")
}

Output consul/api client v1.4.0 (go1.14.6) with server 1.9.0

$ go test ./client_test/ -v
=== RUN   TestAPIClient_reusableConns
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:22:24 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:49: Number of services: 100
    client_test.go:50: Request time: 9.65514ms
    client_test.go:49: Number of services: 100
    client_test.go:50: Request time: 9.994027ms
    client_test.go:49: Number of services: 100
    client_test.go:50: Request time: 9.494824ms
    client_test.go:49: Number of services: 100
    client_test.go:50: Request time: 9.994659ms
    client_test.go:49: Number of services: 100
    client_test.go:50: Request time: 10.192487ms
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 4.494578ms
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 3.266552ms
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 3.105789ms
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 3.33691ms
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 3.260026ms
--- PASS: TestAPIClient_reusableConns (0.05s)
PASS
ok  	github.com/hashicorp/consul/client_test	0.384s

Output consul/api from v0.8.5 with server v0.8.5

$ go test ./client_test/ -v -count 1
=== RUN   TestAPIClient_reusableConns
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:49: Nubmer of services: 100
    client_test.go:50: Request time: 4.261913ms
    client_test.go:49: Nubmer of services: 100
    client_test.go:50: Request time: 4.370877ms
    client_test.go:49: Nubmer of services: 100
    client_test.go:50: Request time: 5.019223ms
    client_test.go:49: Nubmer of services: 100
    client_test.go:50: Request time: 4.301784ms
    client_test.go:49: Nubmer of services: 100
    client_test.go:50: Request time: 4.88288ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 2.380345ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 2.111134ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 2.150876ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 2.486228ms
2020/12/08 11:35:11 [INFO] Dialing tcp 127.0.0.1:8500
    client_test.go:61: Number of services: 100
    client_test.go:62: Request time: 1.968497ms
    client_test.go:66: 
        	Error Trace:	client_test.go:66
        	Error:      	Not equal: 
        	            	expected: 10
        	            	actual  : 5
        	Test:       	TestAPIClient_reusableConns
        	Messages:   	unexpected # of connections reused
--- FAIL: TestAPIClient_reusableConns (0.03s)
FAIL
FAIL	github.com/hashicorp/consul/client_test	0.159s
FAIL

I tested this against consul/api v1.4.0 running go v1.14.6, and compared the relevant changes to Consul version v0.8.5. There were no major changes between v0.8.5 and the module v1.4.0 API client that were related to the reading of the body.

A difference however are the versions of Go to build from which includes the standard encoding/json library. A trailing \n may be a culprit where the JSON decoding of the response stops processing the reader and leaves some bytes causing the TLS connection to close and not be reused. (sourced from bradfitz golang/go#20528 (comment) and Consul source code)

The JSON decoding library may have changed since then, I didn't dive deep into this, but it might account for the \n now and reading the whole response body.

Please do share if you're observing that TLS connections are not being reused with newer versions of Consul.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/api Relating to the HTTP API interface type/bug Feature does not function as expected
Projects
None yet
Development

No branches or pull requests

4 participants