Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bigquery/storage/managedwriter: contexts are shared across regions in the pool causing future writes to be unusable #8232

Closed
jameshartig opened this issue Jul 7, 2023 · 4 comments · Fixed by #8275
Assignees
Labels
api: bigquery Issues related to the BigQuery API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@jameshartig
Copy link
Contributor

Client

bigquery/storage/managedwriter

Environment

N/A

Go Environment

N/A

Code

Here's a contrived example. Essentially whenever a managed stream's context is cancelled all future managed streams are unusable because the connectionPool's context was cancelled.
e.g.

package main

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery/storage/managedwriter"
)

var projectID = "test"
var dataset = "test"
var table = "test"

func main() {
	ctx := context.Background()
	client, err := managedwriter.NewClient(ctx, projectID)
	if err != nil {
		panic(err)
	}
	func() {
		innerCtx, cancel := context.WithCancel(ctx)
		defer cancel()
		ms, err := client.NewManagedStream(innerCtx,
			managedwriter.WithDestinationTable(fmt.Sprintf("projects/%s/datasets/%s/tables/%s", projectID, dataset, table)),
			// there isn't really a way to have duplicates within the same stream given
			// how this is architected so there's no reason not to use the default stream
			managedwriter.WithType(managedwriter.DefaultStream),
			managedwriter.EnableWriteRetries(true),
		)
		if err != nil {
			panic(err)
		}
		defer ms.Close()
		// do stuff with the stream like append, etc
	}()

	// later...

	ms, err := client.NewManagedStream(ctx,
		managedwriter.WithDestinationTable(fmt.Sprintf("projects/%s/datasets/%s/tables/%s", projectID, dataset, table)),
		// there isn't really a way to have duplicates within the same stream given
		// how this is architected so there's no reason not to use the default stream
		managedwriter.WithType(managedwriter.DefaultStream),
		managedwriter.EnableWriteRetries(true),
	)
	if err != nil {
		panic(err)
	}
	defer ms.Close()

	// anything you do with the stream will fail with context.Cancelled
}

Expected behavior

NewManagedStream should create a new stream with its own context separate from other contexts. Notably I do not have multiplexing enabled but even if I did, I don't think that should cause the contexts to be shared.

Actual behavior

After closing the context for a managed stream, all future managed streams are unusable.

Screenshots

e.g. A chart showing how messages are slow. Delete if not necessary.

Additional context

Commenting out this line fixes the issue for us:

@jameshartig jameshartig added the triage me I really want to be triaged. label Jul 7, 2023
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the BigQuery API. label Jul 7, 2023
@shollyman
Copy link
Contributor

Thanks for the report, I'll take a look.

@shollyman shollyman added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed triage me I really want to be triaged. labels Jul 7, 2023
@shollyman
Copy link
Contributor

This is a case of deriving child contexts from the wrong parents(s) when building the connection pool(s), which manage both multiplex and non-multiplex connections. We should build the pool(s) from the client's context, whereas we're deriving from the first writer we create in a given location/region at present.

We can potentially derive a specific connection's context from the writer context for the explicit connection case, or scope the context from NewManagedStream solely to the writer (ManagedStream) and not it's connection at the potential cost of more context checks. I'll need to explore this a bit.

@jameshartig
Copy link
Contributor Author

@shollyman I appreciate the quick reply. I agree with your assessment. I think using the client's context would be a quick fix for this issue.

Longer-term it would be nice if the pool could close the connection when all writes are gone, which I think was the goal of using the writer's context. Switching to the client's context would cause it to potentially outlast the writers, right?

@patrickmeiring
Copy link

Our team also ran into this, rolling the package back to v1.49.0 fixed it for us [1].

Our setup is similar: a singleton BigQuery client created on a long-lived server context, with managed stream writers being created on short-lived per-request contexts, to take advantage of connection pooling.

[1] https://chromium-review.googlesource.com/c/infra/luci/luci-go/+/4676773

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the BigQuery API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants