conn: fix Close() on rows when a timeout during processing #548

Merged May 3, 2022 (1 commit)

vincentbernat
Contributor

There is a race condition when calling rows.Close() after a query has
been stopped by an error (notably a timeout). The goroutine processing
the query result tries to send the error to the error channel before
closing both the error and the stream channels, while the Close() method
tries to drain the stream channel before processing errors. Neither side
can make progress, which leads to a deadlock.

This could be fixed by closing the stream channel before sending the
error, but it seems more future-proof to drain both the stream and error
channels in parallel in the Close() method.
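
A parallel drain could look roughly like the sketch below. This is not the actual patch: drainBoth and produce are hypothetical names, and the int stream stands in for whatever block type the driver really uses.

```go
package main

import (
	"errors"
	"fmt"
)

// drainBoth consumes stream and errs concurrently until both are closed
// and returns the first error received, if any. Hypothetical helper.
func drainBoth(stream <-chan int, errs <-chan error) error {
	var first error
	for stream != nil || errs != nil {
		select {
		case _, ok := <-stream:
			if !ok {
				stream = nil // a nil channel never becomes ready, disabling this case
			}
		case err, ok := <-errs:
			if !ok {
				errs = nil
				continue
			}
			if first == nil {
				first = err
			}
		}
	}
	return first
}

// produce imitates a query goroutine that fails midway: it sends one item,
// reports an error, then closes both channels (errors first).
func produce(stream chan<- int, errs chan<- error) {
	stream <- 1
	errs <- errors.New("timeout")
	close(errs)
	close(stream)
}

func main() {
	stream := make(chan int)
	errs := make(chan error)
	go produce(stream, errs)
	fmt.Println(drainBoth(stream, errs)) // prints "timeout"
}
```

Because the select waits on both channels at once, the result does not depend on the order in which the producer sends the error and closes the channels.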

@CLAassistant

CLAassistant commented Apr 22, 2022

CLA assistant check
All committers have signed the CLA.

@gingerwizard gingerwizard self-requested a review May 2, 2022 16:50
@gingerwizard
Collaborator

Thanks @vincentbernat for raising this, and nicely spotted. This can indeed be reproduced with the following test. Swapping close(stream) and close(error) is not even foolproof in this case, although that is admittedly an extreme case.

package issues

import (
	"context"
	"testing"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2"
	"github.com/stretchr/testify/assert"
)

func Test548(t *testing.T) {
	var (
		ctx, cancel = context.WithTimeout(context.Background(), time.Second)
		conn, err   = clickhouse.Open(&clickhouse.Options{
			Addr: []string{"127.0.0.1:9000"},
			Auth: clickhouse.Auth{
				Database: "default",
				Username: "default",
				Password: "",
			},
			DialTimeout: time.Second,
			Compression: &clickhouse.Compression{
				Method: clickhouse.CompressionLZ4,
			},
			//Debug: true,
		})
	)
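	defer cancel()
	// Stop early if the connection could not be opened; conn is unusable otherwise.
	if !assert.NoError(t, err) {
		return
	}
	// Fail the test if Close() has not returned well after the 1s query timeout.
	timeout := time.After(3 * time.Second)
	done := make(chan bool)
	go func() {
		// sleepEachRow slows the query so the context timeout can fire while results are still streaming.
		rows, err := conn.Query(ctx, "SELECT sleepEachRow(0.001) as Col1 FROM system.numbers LIMIT 1000 SETTINGS max_block_size=10;")
		if assert.NoError(t, err) {
			// Close() must return even if the query was interrupted by the timeout.
			rows.Close()
		}
		done <- true
	}()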

	select {
	case <-timeout:
		t.Fatal("Close() deadlocked")
	case <-done:
	}
}

@gingerwizard
Collaborator

Happy that this solves the issue; I will follow up with the test. Thanks again, @vincentbernat.
