
ECONNREFUSED due to connecting to the wrong port after a while #1001

Open
PScharrenberg opened this issue Feb 21, 2023 · 2 comments

Comments


PScharrenberg commented Feb 21, 2023

Problem

fluent-plugin-elasticsearch successfully pushes logs to our Elasticsearch server, which sits behind an SSL-offloading nginx proxy listening on port 443.
After a while (a few hours), no logs are transferred anymore, and we find this warning message in the fluentd logs (where X.X.X.X is the correct IP address of our Elasticsearch server):

2023-02-21 11:07:15 +0000 [warn]: #0 [clusterflow:flow] failed to flush the buffer. retry_times=12 next_retry_time=2023-02-21 12:10:46 +0000 chunk="5f52bba4e6c17284274d9814840cea63" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-fqdn\", :port=>443, :scheme=>\"https\", :user=>\"logging\", :password=>\"obfuscated\"}): Connection refused - connect(2) for X.X.X.X:9200 (Errno::ECONNREFUSED)"

So after a while it tries to connect to the Elasticsearch server directly on port 9200, bypassing the proxy, which obviously does not work.

After restarting fluentd inside the k8s pod (fluent-ctl restart), the logs are shipped again.

Steps to replicate

The relevant config part in fluentd.conf:

  <match **>
    @type elasticsearch
    @id clusterflow:flow
    exception_backup true
    fail_on_putting_template_retry_exceed true
    host elasticsearch-fqdn
    logstash_dateformat %Y-%m-%d
    logstash_format true
    logstash_prefix logging
    password xxxxxxxxxx
    port 443
    reload_connections true
    scheme https
    ssl_verify true
    user logging
    utc_index true
    verify_es_version_at_startup true
    <buffer tag,time>
      @type file
      chunk_limit_size 8MB
      path /buffers/clusterflow:flow.*.buffer
      retry_forever true
      timekey 10m
      timekey_wait 1m
    </buffer>
  </match>

Expected Behavior or What you need to ask

We expect it to continue connecting to the configured port.

Using Fluentd and ES plugin versions

We're using the rancher-logging "app" provided by Rancher (rancher-logging:100.1.3+up3.17.7).
We started seeing this issue after upgrading from an older version.

  • Debian Buster
  • Kubernetes
  • Fluentd 1.14.6
  • ES plugin 5.2.2
  • ES version 6.8.12
cosmo0920 (Collaborator) commented

This could be caused by the Elasticsearch sniffing feature.

For how to enable and configure this feature, see: https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name
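
If sniffing is the culprit but you want to keep connection reloading, the README linked above describes an ElasticsearchSimpleSniffer class that returns only the hosts given in the configuration instead of the node addresses it discovers. A minimal sketch against the reporter's config (unrelated options omitted for brevity):

  <match **>
    @type elasticsearch
    host elasticsearch-fqdn
    port 443
    scheme https
    # Make node "reloading" return only the configured host above,
    # so discovered node addresses like X.X.X.X:9200 are never used.
    sniffer_class_name Fluent::Plugin::ElasticsearchSimpleSniffer
    reload_connections true
  </match>

Per the README, the sniffer class file also has to be loaded when fluentd starts, e.g. fluentd -r /path/to/lib/fluent/plugin/elasticsearch_simple_sniffer.rb (illustrative path; the actual location depends on where the gem is installed).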


GiZZoR commented Apr 11, 2023

You probably hit this time bomb someone left for you: https://github.com/uken/fluent-plugin-elasticsearch#reload-after
That setting activates the sniffer. Yes, a sniffer that hunts out the nodes in your ES cluster and then bypasses the configuration you explicitly set, voiding any load balancing you may have configured. Bonus feature: it uses the scheme from the config you supplied to hit the host and port it finds in the nodes catalog.

I'd recommend reload_connections false (see the sketch below), as the sniffer just shouldn't be needed in any properly configured environment:
you'd either configure the hosts correctly, or use a load balancer.
This "feature" should only be enabled if explicitly needed, which should be never.

IMHO the sniffer should exist only as an optional plugin, and should be promptly removed from (or disabled in) the default behavior.
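
A minimal sketch of that recommendation applied to the reporter's match block (reload_connections and reload_on_failure are documented plugin parameters; unrelated options omitted):

  <match **>
    @type elasticsearch
    host elasticsearch-fqdn
    port 443
    scheme https
    # Never re-discover cluster nodes; always use the host/port
    # configured above, i.e. keep going through the nginx proxy.
    reload_connections false
    # Keep node re-discovery off after transient failures as well.
    reload_on_failure false
  </match>

With reload_connections false, the transport should keep using the proxy address for every request, so the ECONNREFUSED against X.X.X.X:9200 should no longer occur.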
