
[transform] Implementation of transform processing #8252

Closed · 3 of 4 tasks
anuraaga opened this issue Mar 3, 2022 · 18 comments · Fixed by #10367
Labels: processor/metricstransform (Metrics Transform processor), processor/spanmetrics (Span Metrics processor)

@anuraaga (Contributor) commented Mar 3, 2022

The transform processor currently supports transformation of traces. This issue tracks the next steps:

  • Make function invocation logic generic for use across signals
  • Add metrics data model
  • Add logs data model
  • Add an initial set of transformation functions on top of set and keep_keys
@pureklkl (Contributor) commented Mar 8, 2022

Just tried this processor. Is there a way to check whether an attribute exists? For example, where attr["name"] != nil.

@aunshc (Contributor) commented Mar 10, 2022

@anuraaga does the transform processor support using wildcards for operations with attributes through queries?
For example, set(attributes["http.*"], "/foo")

@anuraaga (Contributor, Author)

@pureklkl Thanks for the suggestion! nil being a supported literal did come up; we didn't add it in the first version, but we will try to get it in soon, as it does seem important.

@aunshc Wildcards in path expressions are not currently supported. Could you describe your use case for wildcards? Is it to remove attributes for entire namespaces?

@aunshc (Contributor) commented Mar 18, 2022

@anuraaga The use case is to perform actions like the insert, update, and delete listed for the attributesprocessor (https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/attributesprocessor) for all attributes within a namespace, using wildcards (e.g. db.*, http.*). Would that be in the scope of this processor?

@pureklkl (Contributor)

I need the following features on the metrics side, and I think they are already supported for spans (feel free to correct me):

  • Promote a metric label to a resource
  • Rename a promoted metric label
  • Replace a value for a resource attribute based on a condition
  • Keep selected metric labels only (i.e. only preserve 'foo' and 'bar' labels)

I am wondering what the timeline is for adding support on the metrics side. If possible, I would also like to contribute to accelerate the progress.

@anuraaga (Contributor, Author)

Hi @pureklkl - indeed I think they are supported or would only require some small tweaks.

I am currently working on a change to make the core function handling logic generic, instead of only working with spans, which will enable adding the metrics data model. I should be able to send it out this week, and it will hopefully be relatively mechanical to get metrics in after that.
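To illustrate the idea of signal-agnostic function handling, here is a minimal sketch using hand-rolled stand-in types (TransformContext, pathGetSetter, span, and spanContext are illustrative names, not the processor's actual API): the getter/setter pair is written against a generic context interface, and each signal supplies its own context implementation.

```go
package main

import "fmt"

// TransformContext is a hypothetical signal-agnostic context; each signal
// (spans, metrics, logs) would provide its own implementation.
type TransformContext interface {
	GetItem() interface{}
}

// pathGetSetter pairs a getter and setter written against the generic context.
type pathGetSetter struct {
	getter func(ctx TransformContext) interface{}
	setter func(ctx TransformContext, val interface{})
}

// span and spanContext stand in for the trace-specific types.
type span struct{ name string }

type spanContext struct{ item *span }

func (c spanContext) GetItem() interface{} { return c.item }

// accessName resolves the "name" path for spans; other signals would
// register their own resolvers against the same generic interface.
func accessName() pathGetSetter {
	return pathGetSetter{
		getter: func(ctx TransformContext) interface{} {
			if s, ok := ctx.GetItem().(*span); ok {
				return s.name
			}
			return nil
		},
		setter: func(ctx TransformContext, val interface{}) {
			if s, ok := ctx.GetItem().(*span); ok {
				if name, ok := val.(string); ok {
					s.name = name
				}
			}
		},
	}
}

func main() {
	ctx := spanContext{item: &span{name: "old"}}
	gs := accessName()
	gs.setter(ctx, "new")
	fmt.Println(gs.getter(ctx)) // prints "new"
}
```

The function invocation machinery only ever sees TransformContext, so adding a new signal means adding a context implementation and its path resolvers, not touching the core logic.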

@TylerHelmuth (Member)

I can start working on adding the metrics data model.

@anuraaga (Contributor, Author) commented May 2, 2022

Thanks @TylerHelmuth. Just one point about metrics, which you may have already realized: we would scope the transformations to a data point, so we also need to expose the metric descriptor in the field expressions. This is briefly mentioned in the design doc:

https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/processing.md#telemetry-query-language

Let me know if anything's not clear about that
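As a rough sketch of that scoping idea (using hand-rolled stand-in types, not the real pmetric API; metricTransformContext, GetDescriptor, and the field names are all hypothetical), the metrics context could expose both the data point being transformed and its enclosing metric's descriptor, so paths like metric.name and attributes[...] can both resolve:

```go
package main

import "fmt"

// metricDescriptor stands in for the metric-level fields (name, unit, etc.)
// that every data point of a metric shares.
type metricDescriptor struct {
	name, description, unit string
}

// numberDataPoint stands in for a single point with its own attributes.
type numberDataPoint struct {
	attributes map[string]string
	value      float64
}

// metricTransformContext scopes a transformation to one data point while
// still exposing the descriptor of the metric that owns it.
type metricTransformContext struct {
	dataPoint  *numberDataPoint
	descriptor *metricDescriptor
}

// GetItem returns the data point, keeping parity with the span context.
func (c metricTransformContext) GetItem() interface{} { return c.dataPoint }

// GetDescriptor exposes metric-level fields for paths like metric.name.
func (c metricTransformContext) GetDescriptor() *metricDescriptor { return c.descriptor }

func main() {
	ctx := metricTransformContext{
		dataPoint:  &numberDataPoint{attributes: map[string]string{"host": "a"}, value: 0.5},
		descriptor: &metricDescriptor{name: "pod.cpu.usage", unit: "1"},
	}
	fmt.Println(ctx.GetDescriptor().name, ctx.GetItem().(*numberDataPoint).attributes["host"])
}
```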

@TylerHelmuth (Member)

That helps a lot. Where did descriptor originate? I anticipated using metric.name, metric.description, metric.unit, and metric.data for the fields on Metric. Also, how are you thinking we should get access to some of the fields specific to a data type? For example, how should we access Sum.aggregation_temporality or Sum.is_monotonic? It feels like we might need another virtual, or the ability to access the data type via descriptor.

@TylerHelmuth (Member) commented May 3, 2022

@anuraaga what do you think will be the best way to do the access functions for the data points themselves since there are multiple? Unlike traces and logs, attributes would apply to all data points on a Metric. Should the getter return a slice of all the attribute maps, with the position in the slice matching the position in the DataPoints slice, and the setter take a slice of attribute maps and set the attributes in the DataPoints slice based on the position?

There is cleanup that should be done, but it could look like this:

// accessAttributes returns a getter/setter pair that reads and writes the
// attribute maps of all data points on a metric; the position in the
// returned slice matches the position in the DataPoints slice.
func accessAttributes() pathGetSetter {
	return pathGetSetter{
		getter: func(ctx common.TransformContext) interface{} {
			metric := ctx.GetItem().(pmetric.Metric)
			switch metric.DataType() {
			case pmetric.MetricDataTypeGauge:
				dps := metric.Gauge().DataPoints()
				dataPointAttrs := make([]pcommon.Map, dps.Len())
				for i := 0; i < dps.Len(); i++ {
					dataPointAttrs[i] = dps.At(i).Attributes()
				}
				return dataPointAttrs
			case pmetric.MetricDataTypeSum:
				dps := metric.Sum().DataPoints()
				dataPointAttrs := make([]pcommon.Map, dps.Len())
				for i := 0; i < dps.Len(); i++ {
					dataPointAttrs[i] = dps.At(i).Attributes()
				}
				return dataPointAttrs
			case pmetric.MetricDataTypeHistogram:
				dps := metric.Histogram().DataPoints()
				dataPointAttrs := make([]pcommon.Map, dps.Len())
				for i := 0; i < dps.Len(); i++ {
					dataPointAttrs[i] = dps.At(i).Attributes()
				}
				return dataPointAttrs
			case pmetric.MetricDataTypeExponentialHistogram:
				dps := metric.ExponentialHistogram().DataPoints()
				dataPointAttrs := make([]pcommon.Map, dps.Len())
				for i := 0; i < dps.Len(); i++ {
					dataPointAttrs[i] = dps.At(i).Attributes()
				}
				return dataPointAttrs
			case pmetric.MetricDataTypeSummary:
				dps := metric.Summary().DataPoints()
				dataPointAttrs := make([]pcommon.Map, dps.Len())
				for i := 0; i < dps.Len(); i++ {
					dataPointAttrs[i] = dps.At(i).Attributes()
				}
				return dataPointAttrs
			}
			return nil
		},
		setter: func(ctx common.TransformContext, val interface{}) {
			metric := ctx.GetItem().(pmetric.Metric)
			switch metric.DataType() {
			case pmetric.MetricDataTypeGauge:
				if attrs, ok := val.([]pcommon.Map); ok {
					dps := metric.Gauge().DataPoints()
					for i := 0; i < len(attrs); i++ {
						dps.At(i).Attributes().Clear()
						attrs[i].CopyTo(dps.At(i).Attributes()) 
					}
				}
			case pmetric.MetricDataTypeSum:
				if attrs, ok := val.([]pcommon.Map); ok {
					dps := metric.Sum().DataPoints()
					for i := 0; i < len(attrs); i++ {
						dps.At(i).Attributes().Clear()
						attrs[i].CopyTo(dps.At(i).Attributes())
					}
				}
			case pmetric.MetricDataTypeHistogram:
				if attrs, ok := val.([]pcommon.Map); ok {
					dps := metric.Histogram().DataPoints()
					for i := 0; i < len(attrs); i++ {
						dps.At(i).Attributes().Clear()
						attrs[i].CopyTo(dps.At(i).Attributes())
					}
				}
			case pmetric.MetricDataTypeExponentialHistogram:
				if attrs, ok := val.([]pcommon.Map); ok {
					dps := metric.ExponentialHistogram().DataPoints()
					for i := 0; i < len(attrs); i++ {
						dps.At(i).Attributes().Clear()
						attrs[i].CopyTo(dps.At(i).Attributes())
					}
				}
			case pmetric.MetricDataTypeSummary:
				if attrs, ok := val.([]pcommon.Map); ok {
					dps := metric.Summary().DataPoints()
					for i := 0; i < len(attrs); i++ {
						dps.At(i).Attributes().Clear()
						attrs[i].CopyTo(dps.At(i).Attributes())
					}
				}
			}
		},
	}
}

Or would the metricsTransformContext GetItem() return an individual data point instead of a whole Metric, so that the access function only deals with one data point?

// accessAttributes in this variant assumes GetItem() returns a single data
// point, so the getter and setter switch on the concrete data point type.
func accessAttributes() pathGetSetter {
	return pathGetSetter{
		getter: func(ctx common.TransformContext) interface{} {
			switch ctx.GetItem().(type) {
			case pmetric.NumberDataPoint:
				return ctx.GetItem().(pmetric.NumberDataPoint).Attributes()
			case pmetric.HistogramDataPoint:
				return ctx.GetItem().(pmetric.HistogramDataPoint).Attributes()
			case pmetric.ExponentialHistogramDataPoint:
				return ctx.GetItem().(pmetric.ExponentialHistogramDataPoint).Attributes()
			case pmetric.SummaryDataPoint:
				return ctx.GetItem().(pmetric.SummaryDataPoint).Attributes()
			}
			return nil
		},
		setter: func(ctx common.TransformContext, val interface{}) {
			switch ctx.GetItem().(type) {
			case pmetric.NumberDataPoint:
				if attrs, ok := val.(pcommon.Map); ok {
					ctx.GetItem().(pmetric.NumberDataPoint).Attributes().Clear()
					attrs.CopyTo(ctx.GetItem().(pmetric.NumberDataPoint).Attributes())
				}
			case pmetric.HistogramDataPoint:
				if attrs, ok := val.(pcommon.Map); ok {
					ctx.GetItem().(pmetric.HistogramDataPoint).Attributes().Clear()
					attrs.CopyTo(ctx.GetItem().(pmetric.HistogramDataPoint).Attributes())
				}
			case pmetric.ExponentialHistogramDataPoint:
				if attrs, ok := val.(pcommon.Map); ok {
					ctx.GetItem().(pmetric.ExponentialHistogramDataPoint).Attributes().Clear()
					attrs.CopyTo(ctx.GetItem().(pmetric.ExponentialHistogramDataPoint).Attributes())
				}
			case pmetric.SummaryDataPoint:
				if attrs, ok := val.(pcommon.Map); ok {
					ctx.GetItem().(pmetric.SummaryDataPoint).Attributes().Clear()
					attrs.CopyTo(ctx.GetItem().(pmetric.SummaryDataPoint).Attributes())
				}
			}
		},
	}
}

@TylerHelmuth (Member)

Also, in the design doc the metrics are accessed by name:

create_gauge("pod.cpu.utilized", read_gauge("pod.cpu.usage") / read_gauge("node.cpu.limit"))

In a situation like this, it looks like the read_gauge function needs access to the Gauge() of a metric with a name of "pod.cpu.usage". How would our switch statement handle a situation like this?
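One way such a lookup might work, sketched with hypothetical stand-in types (readGauge, metric, and gauge are illustrative names, not the processor's API), is a name-keyed scan over the metrics currently in scope rather than a type switch on a single item:

```go
package main

import "fmt"

// gauge and metric stand in for the real metric data types.
type gauge struct{ value float64 }

type metric struct {
	name  string
	gauge *gauge
}

// readGauge scans the metrics in scope and returns the value of the first
// gauge whose name matches, along with whether a match was found.
func readGauge(metrics []metric, name string) (float64, bool) {
	for _, m := range metrics {
		if m.name == name && m.gauge != nil {
			return m.gauge.value, true
		}
	}
	return 0, false
}

func main() {
	scope := []metric{
		{name: "pod.cpu.usage", gauge: &gauge{value: 2}},
		{name: "node.cpu.limit", gauge: &gauge{value: 8}},
	}
	usage, _ := readGauge(scope, "pod.cpu.usage")
	limit, _ := readGauge(scope, "node.cpu.limit")
	fmt.Println(usage / limit) // prints 0.25
}
```

The point of the sketch is that read_gauge-style functions need access to the whole collection of metrics in scope, not just the single item the data point getter/setter sees.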

@TylerHelmuth (Member)

I went forward with the DataPoint approach. I also see now why "descriptor" was chosen as the virtual. The draft PR can be found here. Would love feedback.

@TylerHelmuth (Member)

The metrics data model has been merged, I will get started on wiring up the processor to the metrics pipeline.

@TylerHelmuth (Member) commented May 16, 2022

@anuraaga for the alpha release of the processor, what functions would you like to see added in addition to set, keep_keys, truncate_all, and limit?

@anuraaga (Contributor, Author)

I think replace_wildcards, as described in the design doc, would be nice too; cutting down on the cardinality of data from highly generic instrumentation is an important use case, and one that the other processors don't currently cover, IIRC. That would probably be enough for an initial release.
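To make the cardinality-reduction use case concrete, here is a hedged sketch of what wildcard replacement can look like (replaceWildcards and the plain string map are illustrative, not the processor's implementation; it uses Go's glob-style path.Match, where "*" matches any run of characters except "/"):

```go
package main

import (
	"fmt"
	"path"
)

// replaceWildcards overwrites the value of key in attrs with replacement
// when the current value matches the glob-style pattern. Non-matching
// values and missing keys are left untouched.
func replaceWildcards(pattern, replacement string, attrs map[string]string, key string) {
	v, ok := attrs[key]
	if !ok {
		return
	}
	// path.Match gives shell-like glob semantics; "*" stops at "/",
	// which fits URL-path segments like an embedded user ID.
	if matched, err := path.Match(pattern, v); err == nil && matched {
		attrs[key] = replacement
	}
}

func main() {
	attrs := map[string]string{"http.target": "/user/12345/profile"}
	// Collapse the high-cardinality user ID segment into a placeholder.
	replaceWildcards("/user/*/profile", "/user/{id}/profile", attrs, "http.target")
	fmt.Println(attrs["http.target"]) // prints "/user/{id}/profile"
}
```

Collapsing unbounded values (user IDs, UUIDs) into a single placeholder like this is what keeps generic instrumentation from exploding the number of distinct attribute values downstream.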

@TylerHelmuth (Member)

I'll get a PR out for replace_wildcards this week.

@TylerHelmuth (Member)

All required PRs have been merged. Once 0.52.0 is released I'll update the processor's status table.
