Add an optional argument to converters to support hashing (#27235)
**Description:** Functions to modify matched text during replacement can
now be passed as optional arguments to the following Editors:
- replace_pattern
- replace_all_patterns
- replace_match
- replace_all_matches

**Documentation:**

https://github.com/rnishtala-sumo/opentelemetry-collector-contrib/blob/ottl-replace-pattern/pkg/ottl/ottlfuncs/README.md#replace_pattern

**Issue:** Resolves #22787
rnishtala-sumo authored Oct 11, 2023
1 parent e5e0aa0 commit 14ea97d
Showing 13 changed files with 565 additions and 70 deletions.
32 changes: 32 additions & 0 deletions .chloggen/ottl-replace-pattern.yaml
@@ -0,0 +1,32 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg/ottl

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add optional Converter parameters to replacement Editors

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [27235]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  Functions to modify matched text during replacement can now be passed as optional arguments to the following Editors:
  - `replace_pattern`
  - `replace_all_patterns`
  - `replace_match`
  - `replace_all_matches`
# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
12 changes: 6 additions & 6 deletions pkg/ottl/expression.go
@@ -254,21 +254,21 @@ type FunctionGetter[K any] interface {

 // StandardFunctionGetter is a basic implementation of FunctionGetter.
 type StandardFunctionGetter[K any] struct {
-	fCtx FunctionContext
-	fact Factory[K]
+	FCtx FunctionContext
+	Fact Factory[K]
 }

 // Get takes an Arguments struct containing arguments the caller wants passed to the
 // function and instantiates the function with those arguments.
 // If there is a mismatch between the function's signature and the arguments the caller
 // wants to pass to the function, an error is returned.
 func (g StandardFunctionGetter[K]) Get(args Arguments) (Expr[K], error) {
-	if g.fact == nil {
+	if g.Fact == nil {
 		return Expr[K]{}, fmt.Errorf("undefined function")
 	}
-	fArgs := g.fact.CreateDefaultArguments()
+	fArgs := g.Fact.CreateDefaultArguments()
 	if reflect.TypeOf(fArgs).Kind() != reflect.Pointer {
-		return Expr[K]{}, fmt.Errorf("factory for %q must return a pointer to an Arguments value in its CreateDefaultArguments method", g.fact.Name())
+		return Expr[K]{}, fmt.Errorf("factory for %q must return a pointer to an Arguments value in its CreateDefaultArguments method", g.Fact.Name())
 	}
 	if reflect.TypeOf(args).Kind() != reflect.Pointer {
 		return Expr[K]{}, fmt.Errorf("%q must be pointer to an Arguments value", reflect.TypeOf(args).Kind())
@@ -282,7 +282,7 @@ func (g StandardFunctionGetter[K]) Get(args Arguments) (Expr[K], error) {
 		field := argsVal.Field(i)
 		fArgsVal.Field(i).Set(field)
 	}
-	fn, err := g.fact.CreateFunction(g.fCtx, fArgs)
+	fn, err := g.Fact.CreateFunction(g.FCtx, fArgs)
 	if err != nil {
 		return Expr[K]{}, fmt.Errorf("couldn't create function: %w", err)
 	}
8 changes: 4 additions & 4 deletions pkg/ottl/expression_test.go
@@ -720,7 +720,7 @@ func Test_FunctionGetter(t *testing.T) {
 					return "str", nil
 				},
 			},
-			function: StandardFunctionGetter[any]{fCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, fact: functions["SHA256"]},
+			function: StandardFunctionGetter[any]{FCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, Fact: functions["SHA256"]},
 			want: "anything",
 			valid: true,
 		},
@@ -731,7 +731,7 @@ func Test_FunctionGetter(t *testing.T) {
 					return nil, nil
 				},
 			},
-			function: StandardFunctionGetter[any]{fCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, fact: functions["SHA250"]},
+			function: StandardFunctionGetter[any]{FCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, Fact: functions["SHA250"]},
 			want: "anything",
 			valid: false,
 			expectedErrorMsg: "undefined function",
@@ -743,7 +743,7 @@ func Test_FunctionGetter(t *testing.T) {
 					return nil, nil
 				},
 			},
-			function: StandardFunctionGetter[any]{fCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, fact: functions["test_arg_mismatch"]},
+			function: StandardFunctionGetter[any]{FCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, Fact: functions["test_arg_mismatch"]},
 			want: "anything",
 			valid: false,
 			expectedErrorMsg: "incorrect number of arguments. Expected: 4 Received: 1",
@@ -755,7 +755,7 @@ func Test_FunctionGetter(t *testing.T) {
 					return nil, nil
 				},
 			},
-			function: StandardFunctionGetter[any]{fCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, fact: functions["cannot_create_function"]},
+			function: StandardFunctionGetter[any]{FCtx: FunctionContext{Set: componenttest.NewNopTelemetrySettings()}, Fact: functions["cannot_create_function"]},
 			want: "anything",
 			valid: false,
 			expectedErrorMsg: "couldn't create function: error",
2 changes: 1 addition & 1 deletion pkg/ottl/functions.go
@@ -123,7 +123,7 @@ func (p *Parser[K]) buildArgs(ed editor, argsVal reflect.Value) error {
 			if !ok {
 				return fmt.Errorf("undefined function %s", name)
 			}
-			val = StandardFunctionGetter[K]{fCtx: FunctionContext{Set: p.telemetrySettings}, fact: f}
+			val = StandardFunctionGetter[K]{FCtx: FunctionContext{Set: p.telemetrySettings}, Fact: f}
 		case fieldType.Kind() == reflect.Slice:
 			val, err = p.buildSliceArg(arg.Value, fieldType)
 		default:
62 changes: 39 additions & 23 deletions pkg/ottl/ottlfuncs/README.md
@@ -16,6 +16,7 @@ Some functions are able to handle different types and will generally convert tho
In these situations the function will error if it does not know how to do the conversion.
Use `ErrorMode` to determine how the `Statement` handles these errors.
See the component-specific guides for how each uses error mode:

- [filterprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor#ottl)
- [routingprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/routingprocessor#tech-preview-opentelemetry-transformation-language-statements-as-routing-conditions)
- [transformprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor#config)
@@ -26,11 +27,12 @@ Editors are what OTTL uses to transform telemetry.

Editors:

- Are allowed to transform telemetry. When a Function is invoked the expectation is that the underlying telemetry is modified in some way.
- May have side effects. Some Functions may generate telemetry and add it to the telemetry payload to be processed in this batch.
- May return values. Although not common and not required, Functions may return values.
- Are allowed to transform telemetry. When a Function is invoked the expectation is that the underlying telemetry is modified in some way.
- May have side effects. Some Functions may generate telemetry and add it to the telemetry payload to be processed in this batch.
- May return values. Although not common and not required, Functions may return values.

Available Editors:

- [delete_key](#delete_key)
- [delete_matching_keys](#delete_matching_keys)
- [keep_keys](#keep_keys)
@@ -55,8 +57,8 @@ The key will be deleted from the map.

Examples:

- `delete_key(attributes, "http.request.header.authorization")`

- `delete_key(attributes, "http.request.header.authorization")`

- `delete_key(resource.attributes, "http.request.header.authorization")`

@@ -72,8 +74,8 @@ All keys that match the pattern will be deleted from the map.

Examples:

- `delete_key(attributes, "http.request.header.authorization")`

- `delete_key(attributes, "http.request.header.authorization")`

- `delete_key(resource.attributes, "http.request.header.authorization")`

@@ -126,6 +128,7 @@ The `merge_maps` function merges the source map into the target map using the su
`target` is a `pdata.Map` type field. `source` is a `pdata.Map` type field. `strategy` is a string that must be one of `insert`, `update`, or `upsert`.

If strategy is:

- `insert`: Insert the value from `source` into `target` where the key does not already exist.
- `update`: Update the entry in `target` with the value from `source` where the key does exist.
- `upsert`: Performs insert or update. Insert the value from `source` into `target` where the key does not already exist and update the entry in `target` with the value from `source` where the key does exist.
@@ -144,11 +147,11 @@ Examples:

### replace_all_matches

`replace_all_matches(target, pattern, replacement)`
`replace_all_matches(target, pattern, replacement, function)`

The `replace_all_matches` function replaces any matching string value with the replacement string.

`target` is a path expression to a `pdata.Map` type field. `pattern` is a string following [filepath.Match syntax](https://pkg.go.dev/path/filepath#Match). `replacement` is either a path expression to a string telemetry field or a literal string.
`target` is a path expression to a `pdata.Map` type field. `pattern` is a string following [filepath.Match syntax](https://pkg.go.dev/path/filepath#Match). `replacement` is either a path expression to a string telemetry field or a literal string. `function` is an optional argument that can take in any Converter that accepts a (`replacement`) string and returns a string. An example is a hash function that replaces any matching string with the hash value of `replacement`.

Each string value in `target` that matches `pattern` will get replaced with `replacement`. Non-string values are ignored.

@@ -158,10 +161,11 @@ There is currently a bug with OTTL that does not allow the pattern to end with `
Examples:

- `replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")`
- `replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}", SHA256)`
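
As a rough illustration of the behavior described above, the following Go sketch walks a plain map, replaces every string value that matches a glob, and optionally passes the replacement through a function first. It is not the OTTL implementation: `replaceAllMatches` and `sha256Hex` are hypothetical names, a `map[string]any` stands in for `pdata.Map`, and the standard library's `filepath.Match` is assumed as the glob matcher.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// sha256Hex is a hypothetical stand-in for a Converter such as SHA256:
// it accepts a string and returns a string.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// replaceAllMatches overwrites every string value in target that matches the
// glob pattern. When fn is non-nil, the replacement is passed through fn first.
// Non-string values are left untouched.
func replaceAllMatches(target map[string]any, pattern, replacement string, fn func(string) string) error {
	newVal := replacement
	if fn != nil {
		newVal = fn(replacement)
	}
	for key, val := range target {
		s, ok := val.(string)
		if !ok {
			continue // non-string values are ignored
		}
		matched, err := filepath.Match(pattern, s)
		if err != nil {
			return err
		}
		if matched {
			target[key] = newVal
		}
	}
	return nil
}

func main() {
	attrs := map[string]any{
		"http.target": "/user/42/list/7",
		"http.status": int64(200),
	}
	err := replaceAllMatches(attrs, "/user/*/list/*", "/user/{userId}/list/{listId}", sha256Hex)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(attrs["http.target"], attrs["http.status"])
}
```

Passing `nil` as the last argument reproduces the three-argument form, which is how the sketch models the argument being optional.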

### replace_all_patterns

`replace_all_patterns(target, mode, regex, replacement)`
`replace_all_patterns(target, mode, regex, replacement, function)`

The `replace_all_patterns` function replaces any segments in a string value or key that match the regex pattern with the replacement string.

@@ -173,6 +177,8 @@ If one or more sections of `target` match `regex` they will get replaced with `r

The `replacement` string can refer to matched groups using [regexp.Expand syntax](https://pkg.go.dev/regexp#Regexp.Expand).

The `function` is an optional argument that can take in any Converter that accepts a (`replacement`) string and returns a string. An example is a hash function that replaces any matching regex pattern with the hash value of `replacement`.

There is currently a bug with OTTL that does not allow the pattern to end with `\\"`.
If your pattern needs to end with backslashes, add something inconsequential to the end of the pattern such as `{1}`, `$`, or `.*`.
[See Issue 23238 for details](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/23238).
@@ -182,31 +188,35 @@ Examples:
- `replace_all_patterns(attributes, "value", "/account/\\d{4}", "/account/{accountId}")`
- `replace_all_patterns(attributes, "key", "/account/\\d{4}", "/account/{accountId}")`
- `replace_all_patterns(attributes, "key", "^kube_([0-9A-Za-z]+_)", "k8s.$$1.")`
- `replace_all_patterns(attributes, "key", "^kube_([0-9A-Za-z]+_)", "k8s.$$1.", SHA256)`

Note that when using OTTL within the collector's configuration file, `$` must be escaped to `$$` to bypass
environment variable substitution logic. To input a literal `$` from the configuration file, use `$$$`.
If using OTTL outside of collector configuration, `$` should not be escaped and a literal `$` can be entered using `$$`.

### replace_match

`replace_match(target, pattern, replacement)`
`replace_match(target, pattern, replacement, function)`

The `replace_match` function allows replacing entire strings if they match a glob pattern.

`target` is a path expression to a telemetry field. `pattern` is a string following [filepath.Match syntax](https://pkg.go.dev/path/filepath#Match). `replacement` is either a path expression to a string telemetry field or a literal string.

If `target` matches `pattern` it will get replaced with `replacement`.

The `function` is an optional argument that can take in any Converter that accepts a (`replacement`) string and returns a string. An example is a hash function that replaces any matching glob pattern with the hash value of `replacement`.

There is currently a bug with OTTL that does not allow the pattern to end with `\\"`.
[See Issue 23238 for details](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/23238).

Examples:

- `replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}")`
- `replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}", SHA256)`
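
The single-value case can be sketched in a few lines of Go. This is only an illustration of the documented behavior, not the OTTL code: `replaceMatch` and `sha256Hex` are hypothetical helpers, and `filepath.Match` is assumed as the glob matcher.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// sha256Hex is a hypothetical stand-in for a Converter such as SHA256.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// replaceMatch returns the replacement when the whole target matches the glob
// pattern, and the unchanged target otherwise. When fn is non-nil, the
// replacement is passed through fn before being returned.
func replaceMatch(target, pattern, replacement string, fn func(string) string) (string, error) {
	matched, err := filepath.Match(pattern, target)
	if err != nil {
		return "", err
	}
	if !matched {
		return target, nil
	}
	if fn != nil {
		return fn(replacement), nil
	}
	return replacement, nil
}

func main() {
	plain, _ := replaceMatch("/user/42/list/7", "/user/*/list/*", "/user/{userId}/list/{listId}", nil)
	hashed, _ := replaceMatch("/user/42/list/7", "/user/*/list/*", "/user/{userId}/list/{listId}", sha256Hex)
	fmt.Println(plain)
	fmt.Println(hashed)
}
```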

### replace_pattern

`replace_pattern(target, regex, replacement)`
`replace_pattern(target, regex, replacement, function)`

The `replace_pattern` function allows replacing all string sections that match a regex pattern with a new value.

@@ -216,6 +226,8 @@ If one or more sections of `target` match `regex` they will get replaced with `r

The `replacement` string can refer to matched groups using [regexp.Expand syntax](https://pkg.go.dev/regexp#Regexp.Expand).

The `function` is an optional argument that can take in any Converter that accepts a (`replacement`) string and returns a string. An example is a hash function that replaces a matching regex pattern with the hash value of `replacement`.

There is currently a bug with OTTL that does not allow the pattern to end with `\\"`.
If your pattern needs to end with backslashes, add something inconsequential to the end of the pattern such as `{1}`, `$`, or `.*`.
[See Issue 23238 for details](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/23238).
@@ -224,6 +236,7 @@ Examples:

- `replace_pattern(resource.attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")`
- `replace_pattern(name, "^kube_([0-9A-Za-z]+_)", "k8s.$$1.")`
- `replace_pattern(name, "^kube_([0-9A-Za-z]+_)", "k8s.$$1.", SHA256)`

Note that when using OTTL within the collector's configuration file, `$` must be escaped to `$$` to bypass
environment variable substitution logic. To input a literal `$` from the configuration file, use `$$$`.
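
To make the interaction between capture groups and the optional function more concrete, here is a minimal Go sketch that uses only the standard `regexp` package. It assumes one plausible reading of the description above: the replacement template is expanded for each match, and the expanded text is then passed through the function. `replacePattern` and `sha256Hex` are hypothetical names, and the sketch is not the OTTL implementation. Because it runs outside a collector configuration file, `$1` is written without the extra `$` escape.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
)

// sha256Hex is a hypothetical stand-in for a Converter such as SHA256.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// replacePattern replaces every section of target that matches pattern.
// With a nil fn it behaves like Regexp.ReplaceAllString. With a non-nil fn,
// the replacement template is expanded per match and then passed through fn.
func replacePattern(target, pattern, replacement string, fn func(string) string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	if fn == nil {
		return re.ReplaceAllString(target, replacement), nil
	}
	var out []byte
	last := 0
	for _, m := range re.FindAllStringSubmatchIndex(target, -1) {
		// Copy the unmatched text that precedes this match.
		out = append(out, target[last:m[0]]...)
		// Expand $1, ${name}, ... against this match, then transform the result.
		expanded := re.ExpandString(nil, replacement, target, m)
		out = append(out, fn(string(expanded))...)
		last = m[1]
	}
	out = append(out, target[last:]...)
	return string(out), nil
}

func main() {
	plain, _ := replacePattern("kube_pod_name", `^kube_([0-9A-Za-z]+_)`, "k8s.$1.", nil)
	hashed, _ := replacePattern("kube_pod_name", `^kube_([0-9A-Za-z]+_)`, "k8s.$1.", sha256Hex)
	fmt.Println(plain)
	fmt.Println(hashed)
}
```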
@@ -275,6 +288,7 @@ Converters are pure functions that take OTTL values as input and output a single
Unlike functions, they do not modify any input telemetry and always return a value.

Available Converters:

- [Concat](#concat)
- [ConvertCase](#convertcase)
- [ExtractPatterns](#extractpatterns)
@@ -373,9 +387,9 @@ Examples:

The `ExtractPatterns` Converter returns a `pcommon.Map` struct that is a result of extracting named capture groups from the target string. If no matches are found then an empty `pcommon.Map` is returned.

`target` is a Getter that returns a string. `pattern` is a regex string.
`target` is a Getter that returns a string. `pattern` is a regex string.

If `target` is not a string or nil `ExtractPatterns` will return an error. If `pattern` does not contain at least 1 named capture group then `ExtractPatterns` will error on startup.
If `target` is not a string or nil `ExtractPatterns` will return an error. If `pattern` does not contain at least 1 named capture group then `ExtractPatterns` will error on startup.

Examples:

@@ -425,10 +439,11 @@ The `Int` Converter converts the `value` to int type.
The returned type is int64.

The input `value` types:
* float64. Fraction is discharged (truncation towards zero).
* string. Trying to parse an integer from string if it fails then nil will be returned.
* bool. If `value` is true, then the function will return 1 otherwise 0.
* int64. The function returns the `value` without changes.

- float64. Fraction is discharged (truncation towards zero).
- string. Trying to parse an integer from string if it fails then nil will be returned.
- bool. If `value` is true, then the function will return 1 otherwise 0.
- int64. The function returns the `value` without changes.

If `value` is another type or parsing failed nil is always returned.
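
The conversion rules listed above can be summarized with a small Go type switch. This is a sketch of the documented rules only; `toInt` is a hypothetical helper and base-10 parsing of strings is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// toInt follows the rules listed for the Int Converter. It returns an int64
// wrapped in any, or nil when the value cannot be converted.
func toInt(value any) any {
	switch v := value.(type) {
	case int64:
		return v // returned without changes
	case float64:
		return int64(v) // the fraction is dropped (truncation toward zero)
	case string:
		n, err := strconv.ParseInt(v, 10, 64) // base 10 is an assumption
		if err != nil {
			return nil
		}
		return n
	case bool:
		if v {
			return int64(1)
		}
		return int64(0)
	default:
		return nil // any other type
	}
}

func main() {
	for _, in := range []any{3.9, -3.9, "42", "abc", true, int64(7), []string{"x"}} {
		fmt.Printf("%v -> %v\n", in, toInt(in))
	}
}
```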

@@ -531,8 +546,8 @@ If target is not a float64, it will be converted to one:

- int64s are converted to float64s
- strings are converted using `strconv`
- booleans are converted using `1` for `true` and `0` for `false`. This means passing `false` to the function will cause an error.
- int, float, string, and bool OTLP Values are converted following the above rules depending on their type. Other types cause an error.
- booleans are converted using `1` for `true` and `0` for `false`. This means passing `false` to the function will cause an error.
- int, float, string, and bool OTLP Values are converted following the above rules depending on their type. Other types cause an error.

If target is nil an error is returned.

@@ -713,7 +728,7 @@ Examples:

### Split

`Split(target, delimiter)`
```Split(target, delimiter)```

The `Split` Converter separates a string by the delimiter, and returns an array of substrings.

@@ -726,7 +741,7 @@ There is currently a bug with OTTL that does not allow the target string to end

Examples:

- ```Split("A|B|C", "|")```
- `Split("A|B|C", "|")`

### Substring

@@ -773,7 +788,7 @@ Examples:

The `TruncateTime` Converter returns the given time rounded down to a multiple of the given duration. The Converter [uses the `time.Truncate` function](https://pkg.go.dev/time#Time.Truncate).

`time` is a `time.Time`. `duration` is a `time.Duration`. If `time` is not a `time.Time` or if `duration` is not a `time.Duration`, an error will be returned.
`time` is a `time.Time`. `duration` is a `time.Duration`. If `time` is not a `time.Time` or if `duration` is not a `time.Duration`, an error will be returned.

While some common paths can return a `time.Time` object, you will most likely need to use the [Duration Converter](#duration) to create a `time.Duration`.
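
For reference, the underlying `time.Time.Truncate` call behaves as in the Go sketch below. `truncateTimestamp` is a hypothetical helper; it parses both inputs from strings only to keep the example self-contained, with `time.ParseDuration` standing in for the step that produces a `time.Duration`.

```go
package main

import (
	"fmt"
	"time"
)

// truncateTimestamp parses an RFC 3339 timestamp and a Go duration string,
// then rounds the time down to a multiple of the duration with Time.Truncate.
func truncateTimestamp(ts, dur string) (string, error) {
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return "", err
	}
	d, err := time.ParseDuration(dur)
	if err != nil {
		return "", err
	}
	return t.Truncate(d).Format(time.RFC3339), nil
}

func main() {
	out, err := truncateTimestamp("2023-10-11T14:37:52Z", "1h")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```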

@@ -846,9 +861,10 @@ The `UUID` function generates a v4 uuid string.
## Function syntax

Functions should be named and formatted according to the following standards.

- Function names MUST start with a verb unless it is a Factory that creates a new type.
- Converters MUST be UpperCamelCase.
- Function names that contain multiple words MUST separate those words with `_`.
- Functions that interact with multiple items MUST have plurality in the name. Ex: `truncate_all`, `keep_keys`, `replace_all_matches`.
- Functions that interact with a single item MUST NOT have plurality in the name. If a function would interact with multiple items due to a condition, like `where`, it is still considered singular. Ex: `set`, `delete`, `replace_match`.
- Functions that interact with multiple items MUST have plurality in the name. Ex: `truncate_all`, `keep_keys`, `replace_all_matches`.
- Functions that interact with a single item MUST NOT have plurality in the name. If a function would interact with multiple items due to a condition, like `where`, it is still considered singular. Ex: `set`, `delete`, `replace_match`.
- Functions that change a specific target MUST set the target as the first parameter.