Now that we've covered Fluentd, it's time to look at Fluent Bit. Fluent Bit is similar to Fluentd, but it is significantly more lightweight and consumes far fewer resources. Fluent Bit also has zero dependencies, meaning you can get it up and running relatively quickly. Both Fluentd and Fluent Bit follow the same architecture for log and metrics collection and have many similarities, and you can use the two together to build a custom logging architecture.
Functionally, Fluent Bit does much the same job as Fluentd. It acts as a logging layer that collects logs from any input source you specify, including infrastructure resources. It then parses, filters, and outputs the data in the same way Fluentd does, and it relies on plugins to handle data just as Fluentd does. The advantage shows up when you need a logging layer in a containerized environment such as a Kubernetes cluster. Resources there are limited, and you need a logging layer that can run on each node, collect that node's logs, and forward them to a centralized logging layer such as Fluentd.
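To make the collect, parse, filter, and output flow concrete, here is a minimal Python sketch of the same stages. It is only an illustration of the idea, not Fluent Bit's implementation or API: the function names (parse, keep_levels, run_pipeline), the "level" field, and the sample log lines are all made up for this example, and a plain list stands in for the output destination.

```python
import json
from typing import Callable, Iterable, Iterator, Set

# NOTE: every name below is hypothetical and exists only for this sketch.


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Parser stage: turn raw JSON lines into records, skipping bad lines."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop lines that are not valid JSON
        if isinstance(record, dict):
            yield record


def keep_levels(records: Iterable[dict], levels: Set[str]) -> Iterator[dict]:
    """Filter stage: pass through only records with a wanted log level."""
    for record in records:
        if record.get("level") in levels:
            yield record


def run_pipeline(
    lines: Iterable[str],
    levels: Set[str],
    output: Callable[[dict], None],
) -> int:
    """Chain the stages and hand each surviving record to the output."""
    sent = 0
    for record in keep_levels(parse(lines), levels):
        output(record)
        sent += 1
    return sent


# Input stage: a few in-memory lines stand in for a real log source.
raw_lines = [
    '{"level": "info", "msg": "started"}',
    "not json",
    '{"level": "error", "msg": "disk full"}',
]

forwarded = []  # stands in for a central aggregator or a storage backend
sent = run_pipeline(raw_lines, {"error"}, forwarded.append)
print(sent, forwarded)
```

In the real tools, each of these stages is a plugin that you select in configuration rather than a function you write yourself.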
Fluent Bit also provides a unique feature: SQL Stream Processing. Earlier, we compared event-based logs with object-based databases and concluded that event-based logs are a much better way to handle data when using microservices. However, databases have one advantage over log data: the ability to use SQL queries to extract information. The SQL Stream Processor within Fluent Bit queries the processed log data and lets you perform aggregations, grouping, calculations, data analysis, and time-series predictions. So while there is no actual database, you get many of the benefits of SQL through stream processing.
Note that this step happens before you send the data to storage. You are not querying data that is already stored. You can therefore think of it as an additional level of filtering based on the queries you specify. For instance, you might want to compute the minimum or the average of a set of values before sending the result to storage. Once the processing step is complete, you can write the data to whatever storage you need, in a format that the storage understands. If you were storing the data in Elasticsearch, for example, you would convert the logs to the format Elasticsearch expects and then store them. Alternatively, you could send the data to a centralized stream processor, such as Fluentd, for further processing.
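The idea of computing a minimum or an average before anything reaches storage can be sketched in a few lines of Python. This is a simplified stand-in, not Fluent Bit's stream processor: the tumbling_window_stats function and the sample (timestamp, value) pairs are hypothetical, and the sketch works on a finished batch, whereas a real stream processor updates its windows as records arrive.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def tumbling_window_stats(
    events: Iterable[Tuple[float, float]],
    window_seconds: float,
) -> List[Dict[str, float]]:
    """Group (timestamp, value) pairs into fixed, non-overlapping windows
    and reduce each window to its min, average, and count."""
    buckets = defaultdict(list)
    for timestamp, value in events:
        # Every timestamp belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)

    return [
        {
            "window_start": start,
            "min": min(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    ]


# Hypothetical readings: (seconds since start, measured value).
events = [(0.5, 10.0), (2.0, 30.0), (4.9, 20.0), (5.1, 40.0), (9.0, 60.0)]

# Only these per-window summaries would be written to storage,
# not the five raw readings.
summary = tumbling_window_stats(events, window_seconds=5)
for row in summary:
    print(row)
```

Here five raw readings collapse into two summary rows, which is the kind of reduction that makes querying before storage worthwhile.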
When it comes to running Fluent Bit, you deploy it as a DaemonSet, which runs one Fluent Bit pod on every Kubernetes node. From that point on, it reads the logs of the pods and of the general infrastructure on each node. If this sounds familiar, that is likely because Fluentd works the same way. The pluggable nature of the application and its security considerations are also the same as with Fluentd.
To sum up, Fluent Bit is essentially Fluentd with a much smaller footprint and file size. It is more lightweight and consumes fewer resources, which makes it an ideal log processor for systems where resources are scarce. Each tool also has a few unique features that the other lacks.