
FluentD throwing json_parse exception for Nginx Ingress logs #4202

Closed
max-blue opened this issue Jun 16, 2023 · 1 comment


max-blue commented Jun 16, 2023

Describe the bug

I'm experiencing an issue when Fluentd attempts to send logs to Elasticsearch; the problem appears to be in parsing the log data. The specific errors are "Failed to parse content to map" and "Invalid UTF-8 middle byte 0x5c", suggesting an encoding issue with the data being sent to Elasticsearch. This only happens with Nginx Ingress controller logs: Elasticsearch receives only 4 or 5 nginx hits when it should be around 15K.

To Reproduce

1. Deploy Fluentd on a K8s cluster using the DaemonSet
2. Install the Ingress Nginx Controller using Helm
3. Add the config below to Fluentd
4. Head over to Kibana and search for nginx-ingress

Expected behavior

Fluentd should correctly send logs to Elasticsearch without encountering parsing errors.

Your Environment

- Fluentd version: v4.3.9
- Operating system: Amazon Linux 2 Kubernetes 
- Kernel version: 5.10.173-154.642.amzn2.x86_64
- Fluent Helm Chart version: 13.10.0
- Nginx Ingress Controller Version: 1.21

Your Configuration

Fluentd Helm values

fluentdElasticsearch:
  enabled: true
  syncPolicy: {}
  version: "13.10.0"
  values:
    elasticsearch:
      auth:
        enabled: true
        user: "kubernetes"
        existingSecret:
          name: elasticsearch
          key: password
      buffer:
        chunkLimitSize: 4M
      hosts:
        - "mycluster.us-east-1.aws.found.io:9243"
      log400Reason: true
      logLevel: warn
      logstash:
        prefix: logs-default
      requestTimeout: 15s
      scheme: https
      suppressTypeName: true
    env:
      FLUENT_UID: '0'
      LIVENESS_THRESHOLD_SECONDS: 900
      STUCK_THRESHOLD_SECONDS: 1800
      TZ: US/Eastern
    priorityClassName: system-node-critical
    resources:
      limits:
        cpu: 500m
        memory: 2Gi
      requests:
        cpu: 100m
        memory: 2Gi
    serviceMonitor:
      enabled: true
    tolerations:
      - operator: Exists
    image:
      repository: ecr.aws/fluentd
      tag: v4.3.9
      pullPolicy: Always
    livenessProbe:
      enabled: false
    configMaps:
      useDefaults:
        containersInputConf: false
        systemInputConf: false
    extraConfigMaps:
      containers.input.conf: |-
        <source>
          @id fluentd-containers.log
          @type tail
          path /var/log/containers/*.log
          pos_file /var/log/containers.log.pos
          tag raw.kubernetes.*
          read_from_head true
          <parse>
            @type multi_format
            <pattern>
              format json
              time_key time
              time_format %Y-%m-%dT%H:%M:%S.%NZ
            </pattern>
            <pattern>
              format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
              time_format %Y-%m-%dT%H:%M:%S.%N%:z
            </pattern>
          </parse>
        </source>
        # Detect exceptions in the log output and forward them as one log entry.
        <match raw.kubernetes.**>
          @id raw.kubernetes
          @type detect_exceptions
          remove_tag_prefix raw
          message log
          stream stream
          multiline_flush_interval 5
          max_bytes 500000
          max_lines 1000
        </match>
        <filter kubernetes.**>
          @id filter_kubernetes_metadata
          @type kubernetes_metadata
        </filter>
        # Concatenate multi-line logs
        <filter kubernetes.**>
          @type concat
          key "message"
          stream_identity_key "$.docker.container_id"
          multiline_end_regexp "/\n$/"
          separator ""
          timeout_label @NORMAL
          flush_interval 5
        </filter>
        <filter kubernetes.**>
          @type record_transformer
          <record>
            cluster "#{ENV['CLUSTER_NAME']}"
          </record>
        </filter>
        # Fixes json fields in Elasticsearch
        <filter kubernetes.**>
          @id filter_parser
          @type parser
          key_name log
          reserve_data true
          remove_key_name_field false
          <parse>
            @type multi_format
            <pattern>
              format json
              time_key ""
            </pattern>
            <pattern>
              format none
            </pattern>
          </parse>
        </filter>
        <match /.+fluentd-elasticsearch.+/>
          @type rewrite_tag_filter
          <rule>
            key message
            pattern /Rejected by Elasticsearch/
            tag elasticsearch-rejected
          </rule>
        </match>
        <match /+.(elastalert|fluentd|kubecost|metricbeat|my-collector).+/>
          @type null
        </match>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /^[\n\r\s]*$/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /\/manage\/(health|info)/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key $.kubernetes.container_name
            pattern /^(my-collector|logstash|liveness-probe)$/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /^Read timed out$/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /Successfully connected to (.+):6161[67]/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /Adding headers AuthenticatedUser, Authorization and x-company-userinfo/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /^\[UserRolesAndPermissionsClient#fetchUserRolesAndPermissions\]/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /^\[Consumer clientId=gist-service-.+ groupId=myGroup\]/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /Inside method fetchUserRolesAndPermissions to fetch user permissions and child role/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /(START|END) request GET:/v1/live-view/data/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /^x-company-userinfo token null$/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /DefaultDnsRawRecord/
          </exclude>
        </filter>
        <filter kubernetes.**>
          @type grep
          <exclude>
            key message
            pattern /DefaultDnsQuestion/
          </exclude>
        </filter>
        <filter kubernetes.var.log.containers.gist-**>
          @type grep
          <exclude>
            key message
            pattern /Get scheduled entry/
          </exclude>
        </filter>
        <filter kubernetes.**.database-*-0_*_database-**>
          @type concat
          key message
          multiline_start_regexp /^\d{4}(-\d{1,2}){2} \d{2}(:\d{1,2}){2}\.\d{3} [A-Z]{3} \[\d*\]/
          flush_interval 5
          timeout_label @NORMAL
        </filter>
        <filter kubernetes.**.database-*-0_*_database-**>
          @type grep
            <exclude>
              key message
              pattern /unsupported\ frontend\ protocol/
            </exclude>
        </filter>
        <filter kubernetes.var.log.containers.**>
          @type concat
          key message
          stream_identity_key "$.docker.container_id"
          multiline_start_regexp /^(\[(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\] )?((\[)?(\d{4}(-\d{1,2}){2})|(\d{1,2}-\D{3}-\d{4}))[T\ ]\d{1,2}(:\d{1,2}){2}([\.\,]\d{3,6})?(\])?/
          flush_interval 5
          timeout_label @NORMAL
        </filter>
        <filter kubernetes.var.log.containers.licensing-**>
          @type concat
          key message
          multiline_start_regexp /^\[?20\d{2}(-\d{1,2}){2}/
          flush_interval 5
          timeout_label @NORMAL
        </filter>
        <filter kubernetes.var.log.containers.licensing-**>
          @type parser
          key_name message
          reserve_data true
          reserve_time true
          <parse>
            @type multiline
            format_firstline /^\[?20\d{2}(-\d{1,2}){2}/
            format1 /^(?<message_time>\[?20\d{2}(-\d{1,2}){2}( |T){1}\d{1,2}:\d{1,2}:\d{2}\.\d{3}Z?\]?) (?<message>.*)/
          </parse>
        </filter>
        <filter kubernetes.var.log.containers.**-media-service-**>
          @type grep
          <exclude>
            key message
            pattern /^Successfully connected to ssl/
          </exclude>
        </filter>
        <filter kubernetes.var.log.containers.**-gateway-**>
          @type grep
          <exclude>
            key message
            pattern /^{/
          </exclude>
        </filter>
        <filter kubernetes.var.log.containers.**-gateway-**>
          @type grep
          <exclude>
            key message
            pattern /\[39mDEBUG/
          </exclude>
        </filter>
        <filter kubernetes.var.log.containers.**-gateway-**>
          @type concat
          key message
          multiline_start_regexp /^.?(\[(?!entity|binary)|\d{1,3}\.\d{1,3}\.)/
          flush_interval 5
          timeout_label @NORMAL
        </filter>
        <filter kubernetes.var.log.containers.**-gateway-**>
          @type parser
          key_name message
          reserve_data true
          reserve_time true
          <parse>
            @type multiline
            format_firstline /^.?(\[(?!entity|binary)|\d{1,3}\.\d{1,3}\.)/
            format1 /^.?(\[(?!entity|binary)|\d{1,3}\.\d{1,3}\.)(?<message>.*)/
          </parse>
        </filter>

Your Error Log

{"retry_time":2,"next_retry_seconds":"2023-06-15 13:49:05 3634480173375825512677/4398046511104000000000 -0400","chunk":"5fc7957e2e81bbff46b3b84856ab18e9","error":"#<Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure: could not push logs to Elasticsearch cluster ({:host=>\"my-cluster.us-east-1.aws.found.io\", :port=>9243, :scheme=>\"https\", :user=>\"kubernetes\", :password=>\"obfuscated\", :path=>\"\"}): [400] {\"error\":{\"root_cause\":[{\"type\":\"parse_exception\",\"reason\":\"Failed to parse content to map\"}],\"type\":\"parse_exception\",\"reason\":\"Failed to parse content to map\",\"caused_by\":{\"type\":\"json_parse_exception\",\"reason\":\"Invalid UTF-8 middle byte 0x5c\\n at [Source: (byte[])\\\"{\\\"stream\\\":\\\"stderr\\\",\\\"log\\\":\\\"2023/05/25 00:03:00 [error] 66#66: *704 broken header: \\\\\\\"\\\\u0016\\\\u0003\\\\u0001\\\\u0000�\\\\u0001\\\\u0000\\\\u0000�\\\\u0003\\\\u0003�\\\\u0010�,���hK�S-\\\\u007F�\\\\u000e�m\\\\\\\\���Mlۿ��pT�� 4��\\\\u0018��]d��8^%�AJH�u+]��6Vd\\\\u0007�t�%\\\\u0000&�+�/�,�0̨̩�\\\\t�\\\\u0013�\\\",\\\"docker\\\":{\\\"container_id\\\":\\\"bb7888033a32580cae2256fb6f081fd64f28e3aa7a97da38e16cfd6dd65a2305\\\"},\\\"kubernetes\\\":{\\\"container_name\\\":\\\"controller\\\",\\\"namespace_name\\\":\\\"nginx-internal\\\",\\\"pod_name\\\":\\\"ingress-nginx-internal-controller-5cb994bc6b-z29gr\\\",\\\"contain\\\"[truncated 1295 bytes]; line: 1, column: 110]\"}},\"status\":400}>","message":"[elasticsearch] failed to flush the buffer. retry_time=2 next_retry_seconds=2023-06-15 13:49:05 3634480173375825512677/4398046511104000000000 -0400 chunk=\"5fc7957e2e81bbff46b3b84856ab18e9\" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error=\"could not push logs to Elasticsearch cluster ({:host=>\\\"mycluster.us-east-1.aws.found.io\\\", :port=>9243, :scheme=>\\\"https\\\", :user=>\\\"kubernetes\\\", :password=>\\\"obfuscated\\\", :path=>\\\"\\\"}): [400] {\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"parse_exception\\\",\\\"reason\\\":\\\"Failed to parse content to map\\\"}],\\\"type\\\":\\\"parse_exception\\\",\\\"reason\\\":\\\"Failed to parse content to map\\\",\\\"caused_by\\\":{\\\"type\\\":\\\"json_parse_exception\\\",\\\"reason\\\":\\\"Invalid UTF-8 middle byte 0x5c\\\\n at [Source: (byte[])\\\\\\\"{\\\\\\\"stream\\\\\\\":\\\\\\\"stderr\\\\\\\",\\\\\\\"log\\\\\\\":\\\\\\\"2023/05/25 00:03:00 [error] 66#66: *704 broken header: \\\\\\\\\\\\\\\"\\\\\\\\u0016\\\\\\\\u0003\\\\\\\\u0001\\\\\\\\u0000\\uFFFD\\\\\\\\u0001\\\\\\\\u0000\\\\\\\\u0000\\uFFFD\\\\\\\\u0003\\\\\\\\u0003\\uFFFD\\\\\\\\u0010\\uFFFD,\\uFFFD\\uFFFD\\uFFFDhK\\uFFFDS-\\\\\\\\u007F\\uFFFD\\\\\\\\u000e\\uFFFDm\\\\\\\\\\\\\\\\\\uFFFD\\uFFFD\\uFFFDMl\\u06FF\\uFFFD\\uFFFDpT\\uFFFD\\uFFFD 4\\uFFFD\\uFFFD\\\\\\\\u0018\\uFFFD\\uFFFD]d\\uFFFD\\uFFFD8^%\\uFFFDAJH\\uFFFDu+]\\uFFFD\\uFFFD6Vd\\\\\\\\u0007\\uFFFDt\\uFFFD%\\\\\\\\u0000&\\uFFFD+\\uFFFD/\\uFFFD,\\uFFFD0\\u0329\\u0328\\uFFFD\\\\\\\\t\\uFFFD\\\\\\\\u0013\\uFFFD\\\\\\",\\\\\\\"docker\\\\\\\":{\\\\\\\"container_id\\\\\\\":\\\\\\\"bb7888033a32580cae2256fb6f081fd64f28e3aa7a97da38e16cfd6dd65a2305\\\\\\\"},\\\\\\\"kubernetes\\\\\\\":{\\\\\\\"container_name\\\\\\\":\\\\\\\"controller\\\\\\\",\\\\\\\"namespace_name\\\\\\\":\\\\\\\"nginx-internal\\\\\\\",\\\\\\\"pod_name\\\\\\\":\\\\\\\"ingress-nginx-internal-controller-5cb994bc6b-z29gr\\\\\\\",\\\\\\\"contain\\\\\\\"[truncated 1295 bytes]; line: 1, column: 110]\\\"}},\\\"status\\\":400}\""}

Additional context

I added the following config changes to the Nginx Ingress controller, with no luck:

log-format-escape-json: true
log-format-upstream: {"timestamp":"$time_iso8601","network":{"forwarded_ip":"$http_x_forwarded_for"},"user":{"name":"$remote_user"},"user_agent":{"original":"$http_user_agent"},"http":{"version":"$server_protocol","request":{"body":{"bytes":"$body_bytes_sent"},"bytes":"$request_length","method":"$request_method","referrer":"$http_referer"},"response":{"body":{"bytes":"$body_bytes_sent"},"bytes":"$bytes_sent","status_code":"$status","time":"$request_time"},"upstream":{"bytes":"$upstream_response_length","status_code":"$upstream_status","time":"$upstream_response_time","address":"$upstream_addr","name":"$proxy_upstream_name"}},"url":{"domain":"$host","path":"$uri","query":"$args","original":"$request_uri"}}

Nginx outputs valid JSON.
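
In case it helps others hitting the same 400s: one possible mitigation is to force the log line to valid UTF-8 before it reaches the Elasticsearch output, so the raw bytes that nginx emits in "broken header" error lines cannot break the bulk request. This is only a sketch, untested against this setup: it assumes the raw line is still in the log key (as in the filter_parser above), the tag pattern is guessed from the pod name in the error log, and it relies on Ruby's String#scrub replacing invalid byte sequences with U+FFFD.

        <filter kubernetes.var.log.containers.ingress-nginx-**>
          @type record_transformer
          enable_ruby true
          <record>
            # Replace invalid UTF-8 byte sequences with U+FFFD so the
            # Elasticsearch bulk request stays parseable JSON
            log ${record["log"].to_s.force_encoding("UTF-8").scrub("\uFFFD")}
          </record>
        </filter>

The filter would need to sit anywhere before the elasticsearch match; record content is altered, so it trades fidelity of the binary garbage for deliverability of the event.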


ashie commented Dec 8, 2023

Please forward your issue to uken/fluent-plugin-elasticsearch#1018
