This example shows you how to monitor Fluent Bit by having it upload its own metrics to CloudWatch.
Note: there is a known issue in the exec input in AWS for Fluent Bit <= 2.31.11 that can occasionally cause Fluent Bit to crash, generally immediately after startup. This issue is resolved in 2.31.12.
Fluent Bit scrapes its own Prometheus endpoint and then parses the output into a JSON event that looks like this after FireLens adds ECS metadata:
```
{
    "metric": "fluentbit_output_retries_total",
    "plugin": "cloudwatch_logs.0",
    "value": "0",
    "time": "1649288960583",
    "ecs_cluster": "firelens-testing",
    "ecs_task_arn": "arn:aws:ecs:ap-south-1:144718711470:task/firelens-testing/f2ad7dba413f45ddb4d92f7853b78469",
    "ecs_task_definition": "fluentbit-metrics-to-cw-firelens-example:4",
    "hostname": "ip-10-192-21-104.ap-south-1.compute.internal"
}
```
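For context, each line scraped from the Prometheus endpoint looks roughly like this before parsing:

```
fluentbit_output_retries_total{name="cloudwatch_logs.0"} 0 1649288960583
```

A regex parser along these lines produces the structured event above (a sketch; the parser name and exact regex are illustrative, and the real definition lives in `fb_metrics_parser.conf`):

```
[PARSER]
    # Split each line into metric name, plugin instance, value, and timestamp
    Name   fluentbit_metrics_parser
    Format regex
    Regex  ^(?<metric>[^{]+){name="(?<plugin>[^"]+)"} (?<value>[\d.]+) (?<time>\d+)
```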
Please note: in order to have ECS metadata added to your logs, `enable-ecs-log-metadata` must be enabled (set to `true`) in the `firelensConfiguration` of your Task Definition. This is the default value, so as long as you have not explicitly set it to `false`, it will be enabled.
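For reference, the option lives here in the Task Definition (shown with its default value):

```
"firelensConfiguration": {
    "type": "fluentbit",
    "options": {
        "enable-ecs-log-metadata": "true"
    }
}
```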
This example shows a `hostname` field added, which is the ENI IP of the Fargate task. The config to add `hostname` is optional and is commented out in the example `extra.conf` in this repo.
In the CW Log Group for these logs, you can then create a CloudWatch Metric Filter to create metrics from the logs. Metric dimensions are customizable and can be any value in the log JSON.
Each Fluent Bit instance will upload these logs to a log stream named after its hostname, which ensures that a unique log stream is used for each Fluent Bit instance.
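A minimal sketch of the output side, assuming the environment variables defined later in this guide and that the container's hostname is available in the `HOSTNAME` environment variable (the real output definition is in `extra.conf`):

```
[OUTPUT]
    Name              cloudwatch_logs
    Match             fb_metrics*
    region            ${FLUENT_BIT_METRICS_LOG_REGION}
    log_group_name    ${FLUENT_BIT_METRICS_LOG_GROUP}
    # One log stream per Fluent Bit instance, named after its hostname
    log_stream_name   ${HOSTNAME}
    auto_create_group On
```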
(Screenshot: the resulting metrics as they appear in the CloudWatch console.)
There are two tutorials in this guide. The first covers Fluent Bit's internal plugin metrics: errors, retries, and data processed by each plugin. The second shows you how to also enable Fluent Bit's storage metrics.
This example contains the following:
- Custom Fluent Bit configuration that:
  - Enables the Fluent Bit monitoring endpoint
  - Uses the exec input to scrape that endpoint and output the results as logs
  - Filters out everything except output metrics. This can be customized/altered: un-comment the relevant lines in the provided configuration file and remove the filter that excludes metrics not for outputs.
  - Parses the data out of the logs to create a JSON event
  - Sends the JSON to CloudWatch as logs
- A custom parser file that can parse the Prometheus text into a structured JSON log. (A condensed sketch of these pieces follows below.)
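Condensed, those pieces look roughly like this (a sketch, not the exact file: the paths, parser name, and scrape interval are illustrative; see `extra.conf` and `fb_metrics_parser.conf` in this directory for the real content):

```
[SERVICE]
    # Enable the Fluent Bit monitoring endpoint
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Parsers_File /fb_metrics_parser.conf

[INPUT]
    # Scrape the Prometheus metrics endpoint and emit the text as log records
    Name         exec
    Command      curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus
    Interval_Sec 5
    Tag          fb_metrics

[FILTER]
    # Parse each Prometheus line into a structured JSON event
    Name         parser
    Match        fb_metrics
    Key_Name     exec
    Parser       fluentbit_metrics_parser

[FILTER]
    # Keep only output plugin metrics; remove this filter to keep all metrics
    Name         grep
    Match        fb_metrics
    Regex        metric ^fluentbit_output
```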
FAQ: Why use the exec input to scrape the Fluent Bit Prometheus metrics instead of the prometheus input or the Fluent Bit metrics input?
This is necessary because, currently, the Fluent Bit metrics and Prometheus metrics inputs do not emit their data as logs; they use a separate metrics pipeline that most Fluent Bit plugins do not support. However, if we use the exec input to curl the metrics endpoint, the Prometheus text output is ingested as logs, which we can parse and process easily. In the future, this experience may be improved.
For a quick setup, use the built-in plugin metrics configuration file available in AWS for Fluent Bit 2.29.1+. This built-in configuration means that you do not have to build a custom Fluent Bit image. However, please note that this built-in configuration enables all plugin metrics (not just output plugin metrics).
"firelensConfiguration": {
"type": "fluentbit",
"options": {
"config-file-type": "file",
"config-file-value": "/fluent-bit/configs/plugin-metrics-to-cloudwatch.conf"
}
},
If you have your own custom configuration already, you can import/include the built-in config:
```
@INCLUDE /fluent-bit/configs/plugin-metrics-to-cloudwatch.conf
```
Please note that the built-in config includes a `[SERVICE]` section. This section can only be set once, so importing the built-in config means that you cannot have your own custom `[SERVICE]` section.
Alternatively, the `extra.conf` and `fb_metrics_parser.conf` in this directory show the necessary config content.
Please note, you must set the environment variables described in step 2 and create the metric filter described in step 3.
Please follow the FireLens example for `config-file-type` and use the `Dockerfile` and `extra.conf` from this example.
Then, customize the included task definition with your custom Fluent Bit image and set these environment variables to configure where the metric logs are sent:
{ "name": "FLUENT_BIT_METRICS_LOG_GROUP", "value": "fluent-bit-metrics-firelens-example-parsed" },
{ "name": "FLUENT_BIT_METRICS_LOG_REGION", "value": "us-west-2" }
Create a CloudWatch Metric Filter on the log group to convert the JSON logs to metrics. This can be customized as you desire; you can add additional dimensions.
- Filter pattern: `{ $.value = * }`
- Metric Name: Choose a name
- Metric Namespace: Choose a namespace
- Metric Value: `$.value`
- Unit: Count
- Dimensions: `fbmetric_name:$.metric`
You can customize the dimensions as desired; any key in the logs can be a dimension. Here we show the Fluent Bit metric name as the only dimension.
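The same filter can also be created with the AWS CLI; here is a sketch (the log group, filter name, metric name, and namespace are illustrative):

```
aws logs put-metric-filter \
    --log-group-name fluent-bit-metrics-firelens-example-parsed \
    --filter-name fluent-bit-output-metrics \
    --filter-pattern '{ $.value = * }' \
    --metric-transformations \
        'metricName=FluentBitOutputMetric,metricNamespace=FluentBitMetrics,metricValue=$.value,unit=Count,dimensions={fbmetric_name=$.metric}'
```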
To also send storage metrics, the technique is the same. The difference is that Fluent Bit publishes storage metrics on a different HTTP path that vends the metrics in JSON instead of Prometheus format. Therefore, a different input and filter are needed.
For a quick setup, use the built-in plugin + storage metrics configuration file available in AWS for Fluent Bit 2.29.1+. This built-in configuration means that you do not have to build a custom Fluent Bit image. However, please note that this built-in configuration enables all plugin metrics (not just output plugin metrics).
"firelensConfiguration": {
"type": "fluentbit",
"options": {
"config-file-type": "file",
"config-file-value": "/fluent-bit/configs/plugin-and-storage-metrics-to-cloudwatch.conf"
}
},
Then set these environment variables on your FireLens container to configure where the metric logs are sent:
{ "name": "FLUENT_BIT_METRICS_LOG_GROUP", "value": "fluent-bit-metrics-firelens-example-parsed" },
{ "name": "FLUENT_BIT_METRICS_LOG_REGION", "value": "us-west-2" }
Please note that you are not done yet: you must still create the metric filters on your log group, as described in this guide. If you are instead building a custom configuration, the storage metrics pieces of the config look like this:
```
# Scrape the storage metrics endpoint every 5 seconds; the echo appends a
# newline so each scrape is emitted as a complete log line
[INPUT]
    Name         exec
    Command      curl -s http://127.0.0.1:2020/api/v1/storage && echo ""
    Interval_Sec 5
    Tag          fb_metrics-storage

# The endpoint already returns JSON, so the built-in json parser structures it
[FILTER]
    Name     parser
    Match    fb_metrics-storage
    Key_Name exec
    Parser   json
```
The `extra.conf` file in this directory has the configuration to send storage metrics commented out on lines 42 to 63. Un-comment these lines (remove the '#' signs) to enable sending this data to CW. The data can be sent by the same output as the Prometheus metrics.
Alternatively, you can import/include the built-in config:
```
@INCLUDE /fluent-bit/configs/plugin-and-storage-metrics-to-cloudwatch.conf
```
Please note that the built-in config includes a `[SERVICE]` section. This section can only be set once, so importing the built-in config means that you cannot have your own custom `[SERVICE]` section.
The storage metric JSON data in your CW log stream will look like this:
```
{
    "date": 1656218290.714451,
    "storage_layer": {
        "chunks": {
            "total_chunks": 2,
            "mem_chunks": 2,
            "fs_chunks": 0,
            "fs_chunks_up": 0,
            "fs_chunks_down": 0
        }
    },
    "input_chunks": {
        "my_input_alias": {
            "status": {
                "overlimit": false,
                "mem_size": "104b",
                "mem_limit": "0b"
            },
            "chunks": {
                "total": 1,
                "up": 1,
                "down": 0,
                "busy": 0,
                "busy_size": "0b"
            }
        }
    }
}
```
You must determine which data inside this structure is important to your use case. The `input_chunks` structure allows you to track the storage used by specific input definitions; notice that the name of the key is the name of the input definition. This can be customized with an alias, which you can see how to configure here: Fluent Bit monitoring: configuring aliases.
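For example, an input definition can be given an alias like this (a minimal sketch; the input type and alias name are illustrative):

```
[INPUT]
    Name   tcp
    Listen 0.0.0.0
    Port   5170
    # The alias becomes the key name under input_chunks in the storage metrics
    Alias  my_input_alias
```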
In this example, we show how to expose the `mem_chunks` metric in CloudWatch. This tracks the count of chunks of data stored in memory by Fluent Bit.
A. Filter pattern: `{ $.storage_layer.chunks.mem_chunks = * }`
B. Metric Name: Choose a name
C. Metric Namespace: Choose a namespace
D. Metric Value: `$.storage_layer.chunks.mem_chunks`
E. Unit: Count
F. Dimensions: As in the first tutorial, you must choose this yourself. We recommend un-commenting the lines in `extra.conf` that add the hostname to the metric data and choosing it as a dimension, by setting the dimension to `hostname:$.hostname`.