Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When invoked from the fluent-plugin-cloudwatch-logs, the AWS SDK logs every AWS API call it makes, no matter what global logging level is set for fluentd #237

Open
syedriko opened this issue Aug 11, 2021 · 1 comment

Comments

@syedriko
Copy link
Contributor

syedriko commented Aug 11, 2021

Problem

When invoked from the fluent-plugin-cloudwatch-logs plugin, the AWS SDK logs every AWS API call it makes, no matter what global logging level is set for fluentd. The end result is that fluentd log grows proportionately to the "payload" log traffic that is forwarded to CloudWatch.

I believe the root cause is here: https://github.com/aws/aws-sdk-ruby/blob/version-3/gems/aws-sdk-core/lib/aws-sdk-core/plugins/logging.rb#L53, because this line

config.logger.send(config.log_level, format(config, response))

always logs at the current log level.

However, an issue in this repo provides the necessary application-level context for what is arguably a defect in the AWS SDK.

Steps to replicate

Example fluentd config

fluentd.conf:

<source>
  @type forward
  @id forward_input
</source>

<match test.**>
  @type stdout
  @id stdout_output
</match>

<match cw.**>
  @type cloudwatch_logs
  auto_create_stream true
  region <...>
  log_group_name cw_group_name
  log_stream_name cw_stream_name
  concurrency 2
  aws_key_id <...>
  aws_sec_key <...>
  include_time_key true
  log_rejected_request true
</match>

Fluentd logs

In response to

$ echo '{"foo":"bar"}' | fluent-cat cw

the following is seen in fluentd logs. Note that the log messages from Aws::CloudWatchLogs::Client are issued at every log level:

$ fluentd -c ./fluentd.conf --no-supervisor // info log level

2021-08-04 10:16:55 -0400 [info]: [Aws::CloudWatchLogs::Client 200 0.2607 0 retries] describe_log_groups()  
2021-08-04 10:16:55 -0400 [info]: [Aws::CloudWatchLogs::Client 200 0.068632 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 10:16:55 -0400 [info]: [Aws::CloudWatchLogs::Client 200 0.054434 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 10:16:56 -0400 [info]: [Aws::CloudWatchLogs::Client 200 0.072863 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 10:16:56 -0400 [info]: [Aws::CloudWatchLogs::Client 200 0.090928 0 retries] put_log_events(log_events:[{timestamp:1628086554740,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T10:15:54-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

===============

$ fluentd -c ./fluentd.conf --no-supervisor -v // debug log level

2021-08-04 10:22:02 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.243379 0 retries] describe_log_groups()  
2021-08-04 10:22:02 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.055964 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 10:22:02 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.048201 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 10:22:02 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.055562 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 10:22:02 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.083676 0 retries] put_log_events(log_events:[{timestamp:1628086859696,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T10:20:59-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

==============

$ fluentd -c ./fluentd.conf --no-supervisor -vv // trace log level

021-08-04 10:30:38 -0400 [trace]: fluent/log.rb:281:trace: [Aws::CloudWatchLogs::Client 200 0.285799 0 retries] describe_log_groups()  
2021-08-04 10:30:38 -0400 [trace]: fluent/log.rb:281:trace: [Aws::CloudWatchLogs::Client 200 0.06596 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 10:30:38 -0400 [trace]: fluent/log.rb:281:trace: [Aws::CloudWatchLogs::Client 200 0.056249 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 10:30:39 -0400 [trace]: fluent/log.rb:281:trace: [Aws::CloudWatchLogs::Client 200 0.08038 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 10:30:39 -0400 [trace]: fluent/log.rb:281:trace: [Aws::CloudWatchLogs::Client 200 0.097014 0 retries] put_log_events(log_events:[{timestamp:1628087373685,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T10:29:33-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

==============

$ fluentd -c ./fluentd.conf --no-supervisor -q // warn log level

2021-08-04 10:38:07 -0400 [warn]: [Aws::CloudWatchLogs::Client 200 0.244758 0 retries] describe_log_groups()  
2021-08-04 10:38:07 -0400 [warn]: [Aws::CloudWatchLogs::Client 200 0.058111 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 10:38:07 -0400 [warn]: [Aws::CloudWatchLogs::Client 200 0.047988 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 10:38:07 -0400 [warn]: [Aws::CloudWatchLogs::Client 200 0.060144 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 10:38:07 -0400 [warn]: [Aws::CloudWatchLogs::Client 200 0.073899 0 retries] put_log_events(log_events:[{timestamp:1628087823673,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T10:37:03-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

==============

$ fluentd -c ./fluentd.conf --no-supervisor -qq // error log level

2021-08-04 10:43:16 -0400 [error]: [Aws::CloudWatchLogs::Client 200 0.241862 0 retries] describe_log_groups()  
2021-08-04 10:43:16 -0400 [error]: [Aws::CloudWatchLogs::Client 200 0.064994 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 10:43:16 -0400 [error]: [Aws::CloudWatchLogs::Client 200 0.047917 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 10:43:17 -0400 [error]: [Aws::CloudWatchLogs::Client 200 0.05636 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 10:43:17 -0400 [error]: [Aws::CloudWatchLogs::Client 200 0.081287 0 retries] put_log_events(log_events:[{timestamp:1628088133366,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T10:42:13-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

Expected Behavior

I would expect that AWS API calls are logged at a certain log level (debug) and above, but not below. For example, with the debug log level hardcoded in the AWS SDK code line mentioned above:

$ fluentd -c ./fluentd.conf --no-supervisor -vv // trace

2021-08-04 11:56:57 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.320894 0 retries] describe_log_groups()  
2021-08-04 11:56:57 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.063318 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 11:56:57 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.047943 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 11:56:57 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.060117 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 11:56:57 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.088652 0 retries] put_log_events(log_events:[{timestamp:1628092555600,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T11:55:55-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  

========

$ fluentd -c ./fluentd.conf --no-supervisor -v // debug
# hardcoded :debug

2021-08-04 11:41:22 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.237807 0 retries] describe_log_groups()  
2021-08-04 11:41:22 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.05453 0 retries] create_log_group(log_group_name:"cw_group_name",tags:nil)  
2021-08-04 11:41:23 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.04345 0 retries] describe_log_streams(log_group_name:"cw_group_name",log_stream_name_prefix:"cw_stream_name",next_token:nil)  
2021-08-04 11:41:23 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.059298 0 retries] create_log_stream(log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 11:41:23 -0400 [debug]: fluent/log.rb:302:debug: [Aws::CloudWatchLogs::Client 200 0.06734 0 retries] put_log_events(log_events:[{timestamp:1628091621300,message:"{\"foo\":\"bar\",\"time\":\"2021-08-04T11:40:21-04:00\"}"}],log_group_name:"cw_group_name",log_stream_name:"cw_stream_name")  
2021-08-04 11:41:23 -0400 [debug]: fluent/log.rb:302:debug: Called PutLogEvents API group="cw_group_name" stream="cw_stream_name" events_count=1 events_bytesize=74 sequence_token=nil thread=11609020 request_sec=0.067749457

========

$ fluentd -c ./fluentd.conf --no-supervisor // info
no output

==========

$ fluentd -c ./fluentd.conf --no-supervisor -q // warn
no output

==========

Fluentd and CloudWatchLogs plugin versions

  • OS version: Fedora 34
  • Bare Metal or within Docker or Kubernetes or others? : Bare metal
  • Fluentd v0.12 or v0.14/v1.0
    • paste result of fluentd --version or td-agent --version
$ fluentd --version
fluentd 1.7.4
$ 
  • Dependent gem versions
    • paste boot log of fluentd or td-agent
$ fluentd -c ./fluentd.conf 
2021-08-11 16:46:39 -0400 [info]: parsing config file is succeeded path="./fluentd.conf"
2021-08-11 16:46:40 -0400 [info]: using configuration file: <ROOT>
  <source>
    @type forward
    @id forward_input
  </source>
  <match test.**>
    @type stdout
    @id stdout_output
  </match>
  <match cw.**>
    @type cloudwatch_logs
    auto_create_stream true
    region "us-east-2"
    log_group_name "cw_group_name"
    log_stream_name "cw_stream_name"
    concurrency 2
    aws_key_id xxxxxx
    aws_sec_key xxxxxx
    include_time_key true
    log_rejected_request true
    <inject>
      time_key time
      time_type string
    </inject>
  </match>
</ROOT>
2021-08-11 16:46:40 -0400 [info]: starting fluentd-1.7.4 pid=18992 ruby="2.5.5"
2021-08-11 16:46:40 -0400 [info]: spawn command to main:  cmdline=["/home/syedriko/.rbenv/versions/2.5.5/bin/ruby", "-Eascii-8bit:ascii-8bit", "/home/syedriko/.rbenv/versions/2.5.5/bin/fluentd", "-c", "./fluentd.conf", "--under-supervisor"]
2021-08-11 16:46:40 -0400 [info]: gem 'fluent-plugin-cloudwatch-logs' version '0.7.6'
2021-08-11 16:46:40 -0400 [info]: gem 'fluentd' version '1.7.4'
2021-08-11 16:46:40 -0400 [info]: adding match pattern="test.**" type="stdout"
2021-08-11 16:46:40 -0400 [info]: adding match pattern="cw.**" type="cloudwatch_logs"
2021-08-11 16:46:40 -0400 [info]: adding source type="forward"
2021-08-11 16:46:40 -0400 [info]: #0 starting fluentd worker pid=19024 ppid=18992 worker=0
2021-08-11 16:46:40 -0400 [info]: #0 [forward_input] listening port port=24224 bind="0.0.0.0"
2021-08-11 16:46:40 -0400 [info]: #0 fluentd worker is now running worker=0
  • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
$ fluent-gem list

*** LOCAL GEMS ***

aws-eventstream (1.1.0)
aws-partitions (1.396.0)
aws-sdk-cloudwatchlogs (1.38.0)
aws-sdk-core (3.109.3)
aws-sigv4 (1.2.2)
bigdecimal (default: 1.3.4)
cmath (default: 1.0.0)
concurrent-ruby (1.1.9)
cool.io (1.7.1)
csv (default: 1.0.0)
date (default: 1.0.0)
did_you_mean (1.2.0)
dig_rb (1.0.1)
etc (default: 1.0.0)
fcntl (default: 1.0.0)
fiddle (default: 1.0.0)
fileutils (default: 1.0.2)
fluent-plugin-cloudwatch-logs (0.7.6)
fluentd (1.7.4)
http_parser.rb (0.6.0)
io-console (default: 0.4.6)
ipaddr (default: 1.2.0)
jmespath (1.4.0)
json (default: 2.1.0)
minitest (5.10.3)
msgpack (1.4.2)
net-telnet (0.1.1)
openssl (default: 2.1.2)
power_assert (1.1.1)
psych (default: 3.0.2)
rake (12.3.0)
rdoc (default: 6.0.1)
scanf (default: 1.0.0)
sdbm (default: 1.0.0)
serverengine (2.2.4)
sigdump (0.2.4)
stringio (default: 0.0.1)
strptime (0.2.5)
strscan (default: 1.0.0)
test-unit (3.2.7)
tzinfo (2.0.4)
tzinfo-data (1.2021.1)
webrick (default: 1.4.2)
xmlrpc (0.3.0)
yajl-ruby (1.4.1)
zlib (default: 1.0.0)
$ 
@syedriko
Copy link
Contributor Author

On a second thought, I'm not sure this is an AWS SDK issue necessarily. Consider this test snippet, which is supposed to fail a create_log_group() call at different log levels:

require 'aws-sdk-cloudwatchlogs'
require 'time'
require 'logger'

logger = Logger.new(STDOUT)
logger.level = 'UNKNOWN'

cwl_client = Aws::CloudWatchLogs::Client.new(
  region: 'us-east-2',
  access_key_id: <...>,
  secret_access_key: <...>,
  logger: logger
)

resp = cwl_client.delete_log_group(
  log_group_name: 'foo'
)

resp = cwl_client.create_log_group(
  log_group_name: 'foo'
)
if !resp.successful?
  puts 'create_log_group failed' + resp
  exit(1)
end

['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL', 'UNKNOWN'].each do |level|
  begin
    puts "logger.level: " + level
    logger.level = level
    resp = cwl_client.create_log_group(
      log_group_name: 'foo'
    )
  rescue    
  end
end

The output this produces is

$ ruby ./test.rb 
logger.level: DEBUG
D, [2021-08-11T23:21:09.354722 #28770] DEBUG -- : [Aws::CloudWatchLogs::Client 400 0.054844 0 retries] create_log_group(log_group_name:"foo") Aws::CloudWatchLogs::Errors::ResourceAlreadyExistsException The specified log group already exists

logger.level: INFO
logger.level: WARN
logger.level: ERROR
logger.level: FATAL
logger.level: UNKNOWN
$ 

that is, AWS API calls are only logged at DEBUG level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant