Define FaaS Metric Semantics #1052
Changes from all commits
0bd0ef3
44d9225
d666a30
864e40a
7e8c294
1219aab
b6519a3
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
groups: | ||
- id: faas-metrics | ||
prefix: faas | ||
brief: > | ||
This document defines the attributes used in | ||
faas (function as a service) metrics. | ||
attributes: | ||
- ref: faas.trigger | ||
required: always | ||
- ref: faas.invoked_name | ||
required: always | ||
- ref: faas.invoked_provider | ||
required: always | ||
- ref: faas.invoked_region | ||
required: always | ||
- ref: faas.coldstart | ||
required: always | ||
Comment on lines +10 to +17
Agree about coldstarts. Can add a condition there. For invoked_*, doesn't this apply for incoming as well? Maybe I misunderstand the trace spec here.
To my understanding, the
||
- id: error | ||
type: boolean | ||
brief: 'Whether or not the function resulted in an error.' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# General | ||
|
||
The conventions described in this section are FaaS (Function as a Service) specific. When FaaS operations occur, | ||
metric events about those operations will be generated and reported to provide insight into the | ||
operations. Adding FaaS labels to metric events allows for finely tuned filtering. | ||
|
||
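As an illustration of the filtering that labels make possible, here is a minimal Python sketch. The event layout (a dict with `name`, `value`, and `labels` keys) and the `select` helper are assumptions made for this example only; they are not part of the proposed conventions or of any OpenTelemetry API.

```python
def select(events, name, wanted_labels):
    """Return events for one instrument whose labels match every wanted pair."""
    return [
        e for e in events
        if e["name"] == name
        and all(e["labels"].get(k) == v for k, v in wanted_labels.items())
    ]


# Hypothetical metric events, labelled as proposed in this document.
events = [
    {"name": "faas.execution_duration", "value": 840.0,
     "labels": {"faas.trigger": "http", "faas.coldstart": True}},
    {"name": "faas.execution_duration", "value": 35.0,
     "labels": {"faas.trigger": "http", "faas.coldstart": False}},
    {"name": "faas.execution_duration", "value": 910.0,
     "labels": {"faas.trigger": "timer", "faas.coldstart": True}},
]

# Keep only cold-start invocations that were triggered over HTTP.
cold_http = select(events, "faas.execution_duration",
                   {"faas.trigger": "http", "faas.coldstart": True})
print([e["value"] for e in cold_http])  # prints [840.0]
```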
**Disclaimer:** These are initial FaaS metric instruments and labels but more may be added in the future. | ||
|
||
## Metric Instruments | ||
|
||
The following metric instruments MUST be used to describe FaaS operations. They MUST be of the specified | ||
type and units. | ||
|
||
Naming conventions follow [FaaS Trace Semantics](/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/faas.md) wherever possible. | ||
|
||
|
||
### FaaS Invocations | ||
|
||
Below is a table of FaaS invocation metric instruments. | ||
|
||
| Name | Instrument | Units | Description | | ||
|------|------------|-------|-------------| | ||
| `faas.execution_duration` | ValueRecorder | milliseconds | Measures the duration of the invocation, the time the function spent processing an event. | | ||
| `faas.init_duration` | ValueRecorder | milliseconds | Measures the duration of the function's initialization, such as a cold start. | | ||
I'm not really an expert in FAAS, so maybe this is obvious to some of you, but I don't know the relationship between invoke_duration and init_duration. Is init duration included in invoke duration?
@justinfoote Good question. AWS Lambda considers an invocation's duration inclusive of any initialization (cold starts). So a
This is a tough question, and I'd love for someone with more experience with serverless architecture in general to weigh in. I'm curious about how this is represented in tracing. I know that there's a top-level
||
| `faas.timeouts` | Counter | number of timeouts | Number of invocation timeouts. A timeout is an execution that reaches or exceeds the configured execution time limit. | | ||
| `faas.throttles` | Counter | number of throttles | Number of invocation throttles. A throttle is an invocation rejected when concurrency limits are reached or exceeded. | | ||
| `faas.concurrent_executions` | UpDownCounter | number of concurrent executions | The current number of function instances that are processing events. | | ||
Comment on lines +25 to +26
I am not sure about these two. I don't think an instrumented function can report these values. I see these as metrics that a backend can compute aggregating the data it receives.
Are these metrics limited to only values that can be collected from within a function? Every FaaS platform that I researched has a way to extract these metrics, however in most cases it is via an API external to the function itself.
That makes sense to me, but if the metrics are collected using an API, I think they'll need to use an asynchronous instrument. I think maybe this means it should be a
I think you are right. We should not specify only data that can be collected within a function! :)
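To make the instrument semantics in the table concrete, here is a minimal, self-contained Python sketch. The `Counter`, `UpDownCounter`, and `ValueRecorder` classes below are hypothetical in-memory stand-ins written for this example, not the OpenTelemetry API, and `instrumented_invoke` is an assumed wrapper name. It only covers values observable from inside the function; as the thread above notes, timeouts and throttles may have to come from a platform API instead.

```python
import time
from collections import defaultdict


def label_key(labels):
    """Turn a label dict into a hashable, order-independent key."""
    return tuple(sorted(labels.items()))


class Counter:
    """Hypothetical monotonic sum: rejects negative increments."""

    def __init__(self, name):
        self.name = name
        self.values = defaultdict(int)

    def add(self, amount, labels):
        if amount < 0:
            raise ValueError("Counter increments must be non-negative")
        self.values[label_key(labels)] += amount


class UpDownCounter:
    """Hypothetical sum that may go up or down, e.g. in-flight executions."""

    def __init__(self, name):
        self.name = name
        self.values = defaultdict(int)

    def add(self, amount, labels):
        self.values[label_key(labels)] += amount


class ValueRecorder:
    """Hypothetical recorder that keeps every measurement for later aggregation."""

    def __init__(self, name):
        self.name = name
        self.values = defaultdict(list)

    def record(self, value, labels):
        self.values[label_key(labels)].append(value)


execution_duration = ValueRecorder("faas.execution_duration")
init_duration = ValueRecorder("faas.init_duration")
timeouts = Counter("faas.timeouts")
concurrent_executions = UpDownCounter("faas.concurrent_executions")


def instrumented_invoke(handler, event, labels, init_ms=None, timeout_ms=None):
    """Run handler(event) and record invocation metrics around it."""
    if init_ms is not None:  # only cold starts have an init phase to report
        init_duration.record(init_ms, labels)
    concurrent_executions.add(1, labels)
    start = time.monotonic()
    try:
        return handler(event)
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000.0
        execution_duration.record(elapsed_ms, labels)
        concurrent_executions.add(-1, labels)
        # Assumption: the limit is checked after the fact; a real platform
        # would typically terminate the function at the limit instead.
        if timeout_ms is not None and elapsed_ms >= timeout_ms:
            timeouts.add(1, labels)


labels = {"faas.trigger": "http", "faas.invoked_name": "my-function"}
result = instrumented_invoke(lambda event: event.upper(), "ping", labels, init_ms=120.0)
```

The durations go through a recorder because a backend needs the individual measurements to build distributions, while the concurrency gauge is an up-down sum that returns to zero once the invocation finishes.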
||
|
||
## Labels | ||
|
||
Below is a table of the labels that SHOULD be included on FaaS metric events. | ||
|
||
Naming conventions follow [FaaS Trace Semantics](/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/faas.md) wherever possible. | ||
|
||
|
||
| Name | Recommended | Notes and examples | | ||
You can generate this table with:
and then you can use the semantic convention generator :)
...not just yet. I'll have a PR to add metric semantic convention generation soon, and then we can update all the metric semantic conventions with generated tables in a single PR later.
This table looks identical to a semantic convention table. How do you plan to change the render of metric? But I guess this discussion does not belong to this PR, I will wait for your PR to update the tool :)
||
|------|-------------|--------------------| | ||
| `faas.trigger` | Yes | Type of the trigger on which the function is invoked. SHOULD be one of: `datasource`, `http`, `pubsub`, `timer`, `other`. See: [Function Trigger Types](/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/faas.md) | | ||
| `faas.invoked_name` | Yes | Name of the invoked function. Example: `my-function` | | ||
| `faas.invoked_provider` | Yes | Cloud provider of the invoked function. Corresponds to the resource `cloud.provider`. Example: `aws` | | ||
| `faas.invoked_region` | Yes | Cloud provider region of the invoked function. Corresponds to the resource `cloud.region`. Example: `us-east-1` | | ||
| `faas.coldstart` | Yes | Whether or not the invocation was a cold start. | | ||
| `faas.error` | Yes | Whether or not the invocation resulted in an error. | | ||
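A small Python sketch of how a label set could be checked against the table above. The `check_labels` helper is hypothetical, and the value types (strings for the first four labels, booleans for `faas.coldstart` and `faas.error`) are an assumption drawn from the "Whether or not" wording and the `type: boolean` entry in the YAML; the table itself does not state types.

```python
# Trigger values listed in the table for faas.trigger.
ALLOWED_TRIGGERS = {"datasource", "http", "pubsub", "timer", "other"}

# Recommended labels; the types are an assumption for this sketch.
RECOMMENDED_LABELS = {
    "faas.trigger": str,
    "faas.invoked_name": str,
    "faas.invoked_provider": str,
    "faas.invoked_region": str,
    "faas.coldstart": bool,
    "faas.error": bool,
}


def check_labels(labels):
    """Return a list of problems; an empty list means the label set conforms."""
    problems = []
    for name, expected_type in RECOMMENDED_LABELS.items():
        if name not in labels:
            problems.append(f"missing label: {name}")
        elif not isinstance(labels[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    trigger = labels.get("faas.trigger")
    if isinstance(trigger, str) and trigger not in ALLOWED_TRIGGERS:
        problems.append(f"unexpected faas.trigger value: {trigger}")
    return problems


example = {
    "faas.trigger": "http",
    "faas.invoked_name": "my-function",
    "faas.invoked_provider": "aws",
    "faas.invoked_region": "us-east-1",
    "faas.coldstart": False,
    "faas.error": False,
}
print(check_labels(example))  # prints []
```

Because the labels are only recommended (SHOULD), a checker like this is better suited to warnings than to hard failures.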
|
||
## References | ||
|
||
### Metric Reference | ||
|
||
Below are links to documentation regarding metrics that are available with different | ||
FaaS providers. This list is not exhaustive. | ||
|
||
* [AWS Lambda Metrics](https://docs.aws.amazon.com/lambda/latest/dg/monitoring-metrics.html) | ||
* [Azure Functions Metrics](https://docs.microsoft.com/en-us/azure/azure-monitor/platform/metrics-supported) | ||
* [Google CloudFunctions Metrics](https://cloud.google.com/monitoring/api/metrics_gcp#gcp-cloudfunctions) | ||
* [OpenFaaS Metrics](https://docs.openfaas.com/architecture/metrics/) |
Is it mandatory to have a region? What is the target for that convention: public cloud 'faas' only, or also 'faas' running on-premise or somewhere else (like OpenFaaS or Knative, which would probably also fall into that realm, even when it is not 'faas' but just 'serverless')? Tbh, 'faas' for these metrics feels too narrow, as you can also cover serverless deployments with them that are not 'faas' (e.g. not centered around functions as a programming model but, for example, containers that are operated in a serverless manner).