Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify compliant metric SDK specification implementation: MeterProvider/Shutdown #3643

Closed
2 tasks done
Tracked by #3674
MrAlias opened this issue Feb 3, 2023 · 9 comments
Closed
2 tasks done
Tracked by #3674
Assignees
Labels
area:metrics Part of OpenTelemetry Metrics pkg:SDK Related to an SDK package

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Feb 3, 2023

  • Identify all the normative requirements, recommendations, and options the specification defines as comments to this issue
  • Ensure the current metric SDK implementation is compliant with these normative requirements, recommendations, and options in those comments.
@MrAlias MrAlias converted this from a draft issue Feb 3, 2023
@MrAlias MrAlias added pkg:SDK Related to an SDK package area:metrics Part of OpenTelemetry Metrics labels Feb 3, 2023
@MrAlias MrAlias self-assigned this May 30, 2023
@MrAlias MrAlias moved this from Todo to In Progress in Go: Metric SDK (GA) May 30, 2023
@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

Shutdown MUST be called only once for each MeterProvider instance.

This is a normative statement for the user of a MeterProvider it is not something we are in the scope of compliance for.

Our implementation does handle the case where a user calls shutdown multiple times by using a sync.Once.

var once sync.Once

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

After the call to Shutdown, subsequent attempts to get a Meter are not allowed. SDKs SHOULD return a valid no-op Meter for these calls, if possible.

The MeterProvider is documented to perform this behavior:

// Calls to the Meter method after Shutdown has been called will return Meters
// that perform no operations.

However, our implementation is not compliant:

func TestMeterAfterShutdownIsNoOp(t *testing.T) {
	mp := NewMeterProvider()
	require.NoError(t, mp.Shutdown(context.Background()))

	m := mp.Meter("TestMeterAfterShutdownIsNoOp")
	_, ok := m.(noop.Meter)
	assert.Truef(t, ok, "Meter from shutdown MeterProvider is not NoOp: %T", m)
}
$ go test ./...
    provider_test.go:86:
        	Error Trace:	/home/tyler/go/src/go.opentelemetry.io/otel/sdk/metric/provider_test.go:86
        	Error:      	Should be true
        	Test:       	TestMeterAfterShutdownIsNoOp
        	Mejssages:   	Meter from shutdown MeterProvider is not NoOp: *metric.meter

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

Shutdown SHOULD provide a way to let the caller know whether it succeeded, failed or timed out.

An error is returned from the method:

func (mp *MeterProvider) Shutdown(ctx context.Context) error {

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

Shutdown SHOULD complete or abort within some timeout.

The method is documented to comply with this recommendation:

// This method honors the deadline or cancellation of ctx. An appropriate
// error will be returned in these situations. There is no guaranteed that all
// telemetry be flushed or all resources have been released in these
// situations.

The behavior though is going to depend on the Readers and exporters as this context is passed on to them and the cancel logic is expected to be applied there.

The manual reader seems to comply as it doesn't have any blocking operations (it has nowhere to flush a final collection so it just shuts down):

// Shutdown closes any connections and frees any resources used by the reader.
func (mr *manualReader) Shutdown(context.Context) error {
err := ErrReaderShutdown
mr.shutdownOnce.Do(func() {
// Any future call to Collect will now return ErrReaderShutdown.
mr.sdkProducer.Store(produceHolder{
produce: shutdownProducer{}.produce,
})
mr.mu.Lock()
defer mr.mu.Unlock()
mr.isShutdown = true
// release references to Producer(s)
mr.externalProducers.Store([]Producer{})
err = nil
})
return err
}

The periodic reader does a final collection and then shuts down the exporter. Each of these operations pass the context:

err = r.collect(ctx, ph, m)
if err == nil {
err = r.export(ctx, m)

sErr := r.exporter.Shutdown(ctx)

The exporter behavior is out of scope for the MeterProvider specification. It will be evaluated it its own issue (#3666).

The collection is going to call the pipeline produce. That handles the context timeouts according to the specification:

https://github.com/open-telemetry/opentelemetry-go/blob/b2246d58650bc6db04459353843250ee3bba77fc/sdk/metric/pipeline.go#L125C30-L153

It will also call the external metric producers (bridges):

externalMetrics, err := producer.Produce(ctx)

These are out of scope of the MeterProvider specification.

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

Shutdown MAY be implemented as a blocking API or an asynchronous API which notifies the caller via a callback or an event.

Shutdown is a synchronous method call:

// Shutdown shuts down the MeterProvider flushing all pending telemetry and
// releasing any held computational resources.
//
// This call is idempotent. The first call will perform all flush and
// releasing operations. Subsequent calls will perform no action and will
// return an error stating this.
//
// Measurements made by instruments from meters this MeterProvider created
// will not be exported after Shutdown is called.
//
// This method honors the deadline or cancellation of ctx. An appropriate
// error will be returned in these situations. There is no guaranteed that all
// telemetry be flushed or all resources have been released in these
// situations.
//
// This method is safe to call concurrently.
func (mp *MeterProvider) Shutdown(ctx context.Context) error {
if mp.shutdown != nil {
return mp.shutdown(ctx)
}
return nil
}

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

OpenTelemetry SDK authors MAY decide if they want to make the shutdown timeout configurable.

Timeouts are handled via context timeouts.

// This method honors the deadline or cancellation of ctx. An appropriate
// error will be returned in these situations. There is no guaranteed that all
// telemetry be flushed or all resources have been released in these
// situations.

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

Shutdown MUST be implemented at least by invoking Shutdown on all registered MetricReader and MetricExporter instances.

Indeed it is:

// config contains configuration options for a MeterProvider.
type config struct {
res *resource.Resource
readers []Reader
views []View
}
// readerSignals returns a force-flush and shutdown function for a
// MeterProvider to call in their respective options. All Readers c contains
// will have their force-flush and shutdown methods unified into returned
// single functions.
func (c config) readerSignals() (forceFlush, shutdown func(context.Context) error) {
var fFuncs, sFuncs []func(context.Context) error
for _, r := range c.readers {
sFuncs = append(sFuncs, r.Shutdown)
fFuncs = append(fFuncs, r.ForceFlush)
}
return unify(fFuncs), unifyShutdown(sFuncs)
}
// unify unifies calling all of funcs into a single function call. All errors
// returned from calls to funcs will be unify into a single error return
// value.
func unify(funcs []func(context.Context) error) func(context.Context) error {
return func(ctx context.Context) error {
var errs []error
for _, f := range funcs {
if err := f(ctx); err != nil {
errs = append(errs, err)
}
}
return unifyErrors(errs)
}
}
// unifyErrors combines multiple errors into a single error.
func unifyErrors(errs []error) error {
switch len(errs) {
case 0:
return nil
case 1:
return errs[0]
default:
return fmt.Errorf("%v", errs)
}
}
// unifyShutdown unifies calling all of funcs once for a shutdown. If called
// more than once, an ErrReaderShutdown error is returned.
func unifyShutdown(funcs []func(context.Context) error) func(context.Context) error {
f := unify(funcs)
var once sync.Once
return func(ctx context.Context) error {
err := ErrReaderShutdown
once.Do(func() { err = f(ctx) })
return err
}
}

@MrAlias
Copy link
Contributor Author

MrAlias commented May 30, 2023

@MrAlias MrAlias moved this from In Progress to Blocked in Go: Metric SDK (GA) May 30, 2023
@pellared
Copy link
Member

pellared commented Jun 5, 2023

@MrAlias I think that this issue can be resolved.

@pellared pellared moved this from Blocked to In Progress in Go: Metric SDK (GA) Jun 5, 2023
@MrAlias MrAlias closed this as completed Jun 5, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Go: Metric SDK (GA) Jun 5, 2023
@MrAlias MrAlias added this to the v1.17.0/v0.40.0 milestone Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:metrics Part of OpenTelemetry Metrics pkg:SDK Related to an SDK package
Projects
No open projects
Development

No branches or pull requests

2 participants