Data race of reading row value of timestamp data type #179

Open
7phs opened this issue Nov 20, 2023 · 5 comments
Labels
bug Something isn't working

Comments

7phs commented Nov 20, 2023

Hello,

I found a data race when reading data from Databricks concurrently, for example when querying different tables or reading catalog metadata at the same time.

The following code reproduces the data race:

import (
	"context"
	"database/sql"
	"fmt"
	"sync"
	"testing"

	dbsql "github.com/databricks/databricks-sql-go"
)

func TestDatabricksDataRace(t *testing.T) {
	var (
		wg         sync.WaitGroup
		ctx        = context.Background()
		w          = make(chan bool)
		routineNum = 10
	)

	wg.Add(routineNum)

	for i := 0; i < routineNum; i++ {
		go func() {
			defer wg.Done()

			<-w

			if err := listCatalogs(ctx); err != nil {
				fmt.Println("ERROR:", err)
			}
		}()
	}

	close(w)

	wg.Wait()
}

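// listCatalogs opens its own connection and runs SHOW CATALOGS ten times.
// cfg is assumed to hold the workspace connection settings; it is not defined in this report.
func listCatalogs(ctx context.Context) error {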
	connector, err := dbsql.NewConnector(
		dbsql.WithServerHostname(cfg.Host),
		dbsql.WithPort(int(cfg.Port)),
		dbsql.WithAccessToken(cfg.AccessToken),
		dbsql.WithHTTPPath(cfg.HTTPPath),
	)
	if err != nil {
		return err
	}

	db := sql.OpenDB(connector)
	defer db.Close()

	for i := 0; i < 10; i++ {
		err := func() error {
			r, err := db.QueryContext(ctx, "SHOW CATALOGS;")
			if err != nil {
				return err
			}
			defer r.Close()

			for r.Next() {
				var s string
				if err := r.Scan(&s); err != nil {
					return err
				}
			}

			return nil
		}()

		if err != nil {
			return err
		}
	}

	return nil
}

Run the test with the race detector enabled:

go test -race -run TestDatabricksDataRace .

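The root cause of the data race is the uninitialised loc field of arrow.TimestampType. The field is initialised lazily on the first call to arrow.TimestampType.GetZone(), so concurrent first calls on the same type value can read and write it at the same time.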

A workaround for the data race is:

func init() {
	// Initialise arrow.TimestampType before its first concurrent use.
	_, _ = arrow.FixedWidthTypes.Timestamp_us.(*arrow.TimestampType).GetToTimeFunc()
}

Environment:

  • go v1.21
  • databricks-sql-go v1.5.2

7phs commented Nov 20, 2023

Related Apache Arrow issue: apache/arrow#38795

@kravets-levko kravets-levko added the bug Something isn't working label Apr 17, 2024

7phs commented May 29, 2024

databricks-sql-go has been updated to use Apache Arrow Go v16, which fixes the data race.

This bug is fixed now.

@7phs 7phs closed this as completed May 29, 2024

7phs commented Jun 5, 2024

The data race is back now that the Apache Arrow version has been reverted to v12.

@7phs 7phs reopened this Jun 5, 2024
nverkhachoyan commented

Why was the Apache Arrow version reverted back to v12? It uses golang.org/x/net v0.7.0 which has several vulnerabilities.


7phs commented Nov 18, 2024

@nverkhachoyan The reason for reverting was to maintain backward compatibility with some end-user solutions.

The databricks-sql-go library has been updated to use golang.org/x/net v0.17.0, which addresses all known vulnerabilities. All dependencies adhere to the same versioning level (minor, in this case) as per Go module rules.

However, the data race bug in Apache Arrow v12 still persists.

You can use the following workaround:

func init() {
	_, _ = arrow.FixedWidthTypes.Timestamp_s.(*arrow.TimestampType).GetToTimeFunc()
	_, _ = arrow.FixedWidthTypes.Timestamp_ms.(*arrow.TimestampType).GetToTimeFunc()
	_, _ = arrow.FixedWidthTypes.Timestamp_us.(*arrow.TimestampType).GetToTimeFunc()
	_, _ = arrow.FixedWidthTypes.Timestamp_ns.(*arrow.TimestampType).GetToTimeFunc()
}

Insert this snippet into any appropriate file that uses Apache Arrow.
