Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): introduce gRPC client-side metrics #10639

Merged
merged 45 commits into from
Aug 29, 2024
Merged

Conversation

frankyn
Copy link
Member

@frankyn frankyn commented Aug 6, 2024

Introduce client-side metrics for gRPC client only.

Pending

@product-auto-label product-auto-label bot added the api: storage Issues related to the Cloud Storage API. label Aug 6, 2024
@frankyn frankyn force-pushed the grpc-storage-metrics branch 2 times, most recently from 4863273 to 892e55c Compare August 7, 2024 17:55
Copy link
Contributor

@tritone tritone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks pretty good. We should talk about default behavior and user facing options.

@@ -136,6 +152,9 @@ func newGRPCStorageClient(ctx context.Context, opts ...storageOption) (storageCl
}

func (c *grpcStorageClient) Close() error {
if c.settings.meterProvider != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it problematic if this never gets called? e.g. will some data not be flushed?

I just ask because calling client.Close() is basically optional; the program could just terminate first. But maybe that's fine.

@@ -116,6 +117,21 @@ type grpcStorageClient struct {
func newGRPCStorageClient(ctx context.Context, opts ...storageOption) (storageClient, error) {
s := initSettings(opts...)
s.clientOption = append(defaultGRPCOptions(), s.clientOption...)
// TODO: detect project id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming we'd have to add an option to enable/disable metrics as well?

return provider, nil
}

func togRPCDialOption(provider *sdkmetric.MeterProvider) option.ClientOption {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name seems misleading, maybe clientOptionFromMeterProvider. Or just do this inline since the function is only called once.

{Key: "host_id", Value: attribute.StringValue(mr_host_id_label_default)}}
}

func getPreparedResourceUsingGCPDetector(ctx context.Context, project string) (*resource.Resource, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This func also seems superfluous and might be clearer in-line

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote it this way to write tests; should I be doing something different to make it testable?

metric_lb_locality_label = "grpc.lb.locality"
)

func getMetricsToEnable() []estats.Metric {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and getMetricsEnabledByDefault could probably be local vars rather than functions given that they are just static data.


func TestMetrics(t *testing.T) {
ctx := context.Background()
grpcClient, err := NewGRPCClient(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work without auth? If not we should probably make it an integration test. Or try option.WithoutAuthentication

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires auth; I need to update to not run with short tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's an integration test, move to the file with the other integration tests.

@frankyn frankyn force-pushed the grpc-storage-metrics branch 2 times, most recently from 55f8e46 to e7abe02 Compare August 15, 2024 23:31
@frankyn frankyn marked this pull request as ready for review August 16, 2024 16:01
@frankyn frankyn requested review from a team as code owners August 16, 2024 16:01
@frankyn frankyn added the status: blocked Resolving the issue is dependent on other work. label Aug 16, 2024
@frankyn frankyn requested a review from tritone August 19, 2024 15:53
storage/doc.go Outdated
@@ -359,13 +359,50 @@ the following side-effect imports to your application:
_ "google.golang.org/grpc/xds/googledirectpath"
)

The gRPC client emits metrics by default and will export the
gRPC telemetry discussed in [gRFC/66] and [gRFC/78] to
[Google Cloud Monitoring]. The metrics are accessible to through Cloud Monitoring API and you incur no additional cost for publishing the metrics.Google Cloud Support can use this information to more quickly diagnose
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[Google Cloud Monitoring]. The metrics are accessible to through Cloud Monitoring API and you incur no additional cost for publishing the metrics.Google Cloud Support can use this information to more quickly diagnose
[Google Cloud Monitoring]. The metrics are accessible through Cloud Monitoring API and you incur no additional cost for publishing the metrics. Google Cloud Support can use this information to more quickly diagnose

}

func enableClientMetrics(ctx context.Context, s *settings) (*internalMetricsContext, error) {
_, ep, err := htransport.NewClient(ctx, s.clientOption...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have this endpoint captured under the client as xml host/scheme.

// lack of credentials after initial failure.
// https://pkg.go.dev/go.opentelemetry.io/otel/sdk/metric@v1.28.0#Exporter
func (e *exporterLogSuppressor) Export(ctx context.Context, rm *metricdata.ResourceMetrics) error {
if err := e.exporter.Export(ctx, rm); err != nil && !e.emittedFailure {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should do the check for emittedFailure earlier? Or is the intent that the user can fix the permission issue while the code is still running without having to restart?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is that once it's deployed it can recover once appropriate permissions are provided otherwise a redeployment would be required.

@frankyn frankyn removed status: blocked Resolving the issue is dependent on other work. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Aug 28, 2024
@frankyn frankyn enabled auto-merge (squash) August 29, 2024 16:40
@frankyn frankyn merged commit 437bcb1 into main Aug 29, 2024
11 checks passed
@frankyn frankyn deleted the grpc-storage-metrics branch August 29, 2024 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants