Replace pattern Issues with Transform processor #34123

Closed
kumachop2 opened this issue Jul 16, 2024 · 7 comments


kumachop2 commented Jul 16, 2024

Component(s)

processor/transform

Describe the issue you're reporting

We have a requirement to mask the fullName value in application logs. Since we are migrating from O11y to Splunk Cloud, we thought of implementing the masking with the transform processor on the client side (Splunk agent). I created a regex that works fine in regex101, but for some reason it does not work in the processor. I have not noticed any processor errors in the agent logs. Here are the Splunk agent config and a sample raw message.

Helm: 0.103.0
Azure AKS Clusters

    agent:
      enabled: true
      securityContext:
        runAsUser: 20000
        runAsGroup: 20000
      config:
        processors:
          transform:
            log_statements:
              - context: log
                statements:
                  - replace_pattern(attributes["body.message"], "(.*"fullName\\\\\\\":\\\\\\\")(.*?)(\\\\\\\".*?)", "$1xxx$3")

I also tried the following regex:

    - replace_pattern(attributes["body.message"], "(.*"fullName\\\":\\\")(.*?)(\\\".*?)", "$1xxx$3")

Raw Message:

{"timestamp":"2024-07-16T17:51:27.133135994Z","level":"DEBUG","trace_id":"c06882917a367265fea5042cbce4b632","span_id":"0d31813def1db136","message":"our JSON body: "{\"query\":\"mutation InitiateDeposit__money_movement_microservice__0($input:InitiateDepositInput!){initiateDeposit(input:$input){depositId}}\",\"operationName\":\"InitiateDeposit__money_movement_microservice__0\",\"variables\":{\"input\":{\"accountId\":\"lHmY2QJKoESJyaRfvzVV4DPG6mp18yt3fPsBsoJHoYn7BzSo39yoGw==\",\"fullName\":\"John Moore\"}}}"","target":"apollo_router::services::subgraph_service","spans":[{"http.method":"POST","http.request.method":"POST","http.route":"/gateway","http.flavor":"HTTP/1.1","name":"request"},{"http.method":"POST","http.request.method":"POST","http.route":"/gateway","http.flavor":"HTTP/1.1","trace_id":"c06882917a367265fea5042cbce4b632","url.path":"/gateway","client.name":"","client.version":"","name":"router"},{"graphql.document":"mutation InitiateDeposit($input: InitiateDepositInput!) {\n initiateDeposit(input: $input) {\n depositId\n }\n}","graphql.operation.name":"InitiateDeposit","graphql.operation.name":"InitiateDeposit","name":"supergraph"},{"graphql.operation.type":"mutation","name":"execution"},{"apollo.subgraph.name":"money-movement-microservice","name":"fetch"},{"apollo.subgraph.name":"money-movement-microservice","graphql.document":"mutation InitiateDeposit__money_movement_microservice__0($input:InitiateDepositInput!){initiateDeposit(input:$input){depositId}}","graphql.operation.name":"InitiateDeposit__money_movement_microservice__0","subgraph.name":"money-movement-microservice","name":"subgraph"}],"resource":{"deployment.environment":"dev-qa","service.name":"***-federated-gateway","service.version":"1.45.1","process.executable.name":"router"}}

Expected message in Splunk, with the fullName value masked:

{"timestamp":"2024-07-16T17:51:27.133135994Z","level":"DEBUG","trace_id":"c06882917a367265fea5042cbce4b632","span_id":"0d31813def1db136","message":"our JSON body: "{\"query\":\"mutation InitiateDeposit__money_movement_microservice__0($input:InitiateDepositInput!){initiateDeposit(input:$input){depositId}}\",\"operationName\":\"InitiateDeposit__money_movement_microservice__0\",\"variables\":{\"input\":{\"accountId\":\"lHmY2QJKoESJyaRfvzVV4DPG6mp18yt3fPsBsoJHoYn7BzSo39yoGw==\",\"fullName\":\"xxx\"}}}","target":"apollo_router::services::subgraph_service","spans":[{"http.method":"POST","http.request.method":"POST","http.route":"/gateway","http.flavor":"HTTP/1.1","name":"request"},{"http.method":"POST","http.request.method":"POST","http.route":"/gateway","http.flavor":"HTTP/1.1","trace_id":"c06882917a367265fea5042cbce4b632","url.path":"/gateway","client.name":"","client.version":"","name":"router"},{"graphql.document":"mutation InitiateDeposit($input: InitiateDepositInput!) {\n initiateDeposit(input: $input) {\n depositId\n }\n}","graphql.operation.name":"InitiateDeposit","graphql.operation.name":"InitiateDeposit","name":"supergraph"},{"graphql.operation.type":"mutation","name":"execution"},{"apollo.subgraph.name":"money-movement-microservice","name":"fetch"},{"apollo.subgraph.name":"money-movement-microservice","graphql.document":"mutation InitiateDeposit__money_movement_microservice__0($input:InitiateDepositInput!){initiateDeposit(input:$input){depositId}}","graphql.operation.name":"InitiateDeposit__money_movement_microservice__0","subgraph.name":"money-movement-microservice","name":"subgraph"}],"resource":{"deployment.environment":"dev-qa","service.name":"***-federated-gateway","service.version":"1.45.1","process.executable.name":"router"}}
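
One thing worth checking, independent of the path: OTTL is implemented in Go, and Go's regexp package reads `$1xxx` in a replacement template as a reference to a group named `1xxx`, not as group 1 followed by `xxx`. The sketch below is a standalone Go illustration of that behavior, assuming replace_pattern hands its replacement string to Go's regexp expansion. The shortened sample string, the simplified pattern, and the function names are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullNameRe matches an escaped-JSON field such as \"fullName\":\"John Moore\".
// In the regex, \\ matches one literal backslash.
var fullNameRe = regexp.MustCompile(`(\\"fullName\\":\\")(.*?)(\\")`)

// maskBraced uses ${1} and ${3}, so the group numbers are unambiguous.
func maskBraced(s string) string {
	return fullNameRe.ReplaceAllString(s, "${1}xxx${3}")
}

// maskBare uses $1xxx, which Go parses as the (nonexistent) group "1xxx"
// and expands to an empty string.
func maskBare(s string) string {
	return fullNameRe.ReplaceAllString(s, "$1xxx$3")
}

func main() {
	// Hypothetical, shortened stand-in for the raw log body.
	sample := `{"message":"body: {\"accountId\":\"abc\",\"fullName\":\"John Moore\"}"}`
	fmt.Println(maskBraced(sample)) // value replaced by xxx, field name kept
	fmt.Println(maskBare(sample))   // field name and value both dropped
}
```

With the braced form only the value is masked; with the bare form the whole `fullName` field disappears from the output, which is probably not what is wanted here.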

Thank you in advance.

@kumachop2 kumachop2 added the needs triage New item requiring triage label Jul 16, 2024
@github-actions github-actions bot added the processor/transform Transform processor label Jul 16, 2024
evan-bradley (Contributor) commented

Could you try enabling debug logging to look at where your data is located inside the payload? Instructions to enable debug logging can be found here: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl#troubleshooting. I suspect the body may not exist at the OTTL path you are specifying, and the regex isn't working because it isn't running on any data.

In particular, I'm not sure that attributes["body.message"] is the right path. If you haven't parsed the logs inside the Collector, the right path is likely just body. If the log has been parsed and the body is in the attributes, the right path would be attributes["body"]["message"].

@evan-bradley evan-bradley removed the needs triage New item requiring triage label Jul 17, 2024

kumachop2 commented Jul 19, 2024

@evan-bradley Thank you very much for your prompt response. I learned that the path is body. I turned on debug-level logging and, based on the collector logs, the condition matched, but the value is not being redacted. I also noticed issues with conditions. Please review and advise.

Agent Config:

    log_statements:
      - context: log
        conditions:
          - attributes["k8s.container.name"] == "*-federated-gateway"	  
        statements:
          - replace_pattern(attributes["body"], "(.*)function(.*)", "$1--$2")
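
A side note on the condition: as far as I know, `==` in OTTL is a plain equality check, so the `*` in "*-federated-gateway" is not treated as a wildcard, and OTTL provides an IsMatch converter for regular-expression matching. Also, in the collector log in this comment, k8s.container.name appears under the resource attributes rather than the log record attributes. The Go sketch below only illustrates the difference between the two kinds of comparison; the container name and function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// gatewayRe is the regex equivalent of the glob "*-federated-gateway".
var gatewayRe = regexp.MustCompile(`^.*-federated-gateway$`)

// equalsLiteral mimics an equality check: "*" is just a character.
func equalsLiteral(name string) bool {
	return name == "*-federated-gateway"
}

// matchesPattern mimics a regex match against the container name.
func matchesPattern(name string) bool {
	return gatewayRe.MatchString(name)
}

func main() {
	name := "team-federated-gateway" // hypothetical container name
	fmt.Println(equalsLiteral(name))  // false: the strings differ
	fmt.Println(matchesPattern(name)) // true: the suffix matches
}
```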

Collector Log:

2024-07-18T23:53:37.153Z	debug	ottl@v0.103.0/parser.go:268	TransformContext after statement execution	{"kind": "processor", "name": "transform", "pipeline": "logs", "statement": "replace_pattern(attributes[\"body\"], \"(.*)function(.*)\", \"--\")", "condition matched": true, "TransformContext": {"resource": {"attributes": {"com.splunk.sourcetype": "kube:container:********-federated-gateway", "com.splunk.source": "/var/log/pods/********_********-federated-gateway-route1-76f4d844d6-qcz5x_243efb14-eb0f-41ca-b903-e218692394c1/********-federated-gateway/0.log", "k8s.pod.uid": "243efb14-eb0f-41ca-b903-e218692394c1", "k8s.container.restart_count": "0", "k8s.container.name": "********-federated-gateway", "k8s.namespace.name": "********", "k8s.pod.name": "********-federated-gateway-route1-76f4d844d6-qcz5x", "cloud.provider": "azure", "cloud.platform": "azure_aks", "host.name": "aks-userpool-16002329-vmss_3", "cloud.region": "eastus", "host.id": "5827d12f-bcc2-49bb-955f-3b082c96037a", "cloud.account.id": "15d5381e-b094-40e4-8679-17ccfbb26d94", "azure.vm.name": "aks-userpool-16002329-vmss_3", "azure.vm.size": "Standard_D8s_v4", "azure.vm.scaleset.name": "aks-userpool-16002329-vmss", "azure.resourcegroup.name": "rg-nodepool-dev-usea", "os.type": "linux", "k8s.node.name": "aks-userpool-16002329-vmss000003", "k8s.cluster.name": "aks-********-middleware_primary-dev-usea", "deployment.environment": "dev"}, "dropped_attribute_count": 0}, "scope": {"attributes": {}, "dropped_attribute_count": 0, "name": "", "version": ""}, "log_record": {"attributes": {"log.iostream": "stdout", "logtag": "F"}, "body": "{\"timestamp\":\"2024-07-18T23:53:36.993027592Z\",\"level\":\"DEBUG\",\"trace_id\":\"1bed38e0eda6d848e23418eef7bfd9bc\",\"span_id\":\"3103a3108bdcaca0\",\"message\":\"subgraph_service function 
found\",\"target\":\"apollo_router::plugins::rhai\",\"spans\":[{\"http.method\":\"POST\",\"http.request.method\":\"POST\",\"http.route\":\"/gateway\",\"http.flavor\":\"HTTP/1.1\",\"name\":\"request\"},{\"http.method\":\"POST\",\"http.request.method\":\"POST\",\"http.route\":\"/gateway\",\"http.flavor\":\"HTTP/1.1\",\"trace_id\":\"1bed38e0eda6d848e23418eef7bfd9bc\",\"url.path\":\"/gateway\",\"client.name\":\"\",\"client.version\":\"\",\"name\":\"router\"},{\"graphql.document\":\"query Widgets($widgetParameters: [WidgetParameter!]!) {\\n  widgets(widgetParameters: $widgetParameters) {\\n    type\\n    url\\n    __typename\\n  }\\n}\",\"graphql.operation.name\":\"Widgets\",\"graphql.operation.name\":\"Widgets\",\"name\":\"supergraph\"},{\"graphql.operation.type\":\"query\",\"name\":\"execution\"},{\"apollo.subgraph.name\":\"evolved-microservice\",\"name\":\"fetch\"}],\"resource\":{\"service.version\":\"1.45.1\",\"deployment.environment\":\"dev\",\"service.name\":\"********-federated-gateway\",\"process.executable.name\":\"router\"}}", "dropped_attribute_count": 0, "flags": 0, "observed_time_unix_nano": 1721346817052828063, "severity_number": 0, "severity_text": "", "span_id": "0000000000000000", "time_unix_nano": 
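
Note that the statement echoed in this debug line has "--" as its replacement, not "$1--$2". That suggests the `$1` and `$2` references were stripped before OTTL parsed the statement, possibly by the Collector's `$` expansion of configuration values. The OTTL function docs write group references as `$$1` in their YAML examples for this reason, as far as I can tell. The Go sketch below only illustrates what each template would do to a matching string, using a shortened, made-up body.

```go
package main

import (
	"fmt"
	"regexp"
)

// fnRe is the pattern from the config above.
var fnRe = regexp.MustCompile(`(.*)function(.*)`)

// replaceWith applies the given replacement template to s.
func replaceWith(template, s string) string {
	return fnRe.ReplaceAllString(s, template)
}

func main() {
	// Hypothetical, shortened stand-in for the log body.
	body := `{"message":"subgraph_service function found"}`

	// Intended template: keeps the text around "function".
	fmt.Println(replaceWith("$1--$2", body))

	// Template as it appears in the debug log: the whole match,
	// here the entire line, collapses to "--".
	fmt.Println(replaceWith("--", body))
}
```

So even once the statement targets body, a replacement that reaches OTTL as "--" would wipe out the whole line instead of editing it.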

kumachop2 (Author) commented

@evan-bradley Please review the findings above and share your input. A timely response would be much appreciated.

evan-bradley (Contributor) commented

The path you want is just body. I would do something like this:

    log_statements:
      - context: log
        conditions:
          - attributes["k8s.container.name"] == "*-federated-gateway"	  
        statements:
          - replace_pattern(body, "(.*)function(.*)", "$1--$2")

This will only remove the word function from your body, though, so you may need to revise your regular expression a bit. I would consider using the ParseJSON function to parse the body and directly edit the message field on the body.
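
The parse-then-edit idea can be sketched outside the Collector as follows. This is a plain Go illustration of the approach, not OTTL: it assumes the body is a JSON object with a string message field, and the sample body, pattern, and function name are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// nameRe runs on the decoded message, so the quotes are no longer escaped.
var nameRe = regexp.MustCompile(`("fullName":")(.*?)(")`)

// maskMessage parses the body, masks fullName inside the message field,
// and serializes the record again.
func maskMessage(body string) (string, error) {
	var rec map[string]interface{}
	if err := json.Unmarshal([]byte(body), &rec); err != nil {
		return "", err
	}
	if msg, ok := rec["message"].(string); ok {
		rec["message"] = nameRe.ReplaceAllString(msg, "${1}xxx${3}")
	}
	out, err := json.Marshal(rec)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Hypothetical, shortened stand-in for the log body.
	body := `{"level":"DEBUG","message":"body: {\"fullName\":\"John Moore\"}"}`
	masked, err := maskMessage(body)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(masked)
}
```

Working on the decoded field keeps the pattern free of backslash escaping, which is most of what makes the original regex hard to get right.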


This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions github-actions bot added the Stale label Sep 25, 2024

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Nov 24, 2024