Fluent-bit

Fluent-bit is a fast log processor and forwarder for Linux, Windows, embedded Linux, macOS, and BSD family operating systems. It's part of the graduated Fluentd ecosystem and a CNCF sub-project.

Fluent-bit allows you to collect log events or metrics from different sources, process them, and deliver them to different backends such as Fluentd, Elasticsearch, Splunk, DataDog, Kafka, New Relic, Azure services, AWS services, Google services, NATS, InfluxDB, or any custom HTTP end-point.

Fluent-bit comes with full SQL stream processing capabilities: data manipulation and analytics using SQL queries.

Fluent-bit runs on x86_64, x86, arm32v7, and arm64v8 architectures.

Key concepts

Before diving into Fluent-bit, it's best to get acquainted with the key concepts that explain how the service operates. This document provides a gentle introduction to those concepts as well as common terminology.

We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of this log and stream processor.

  • Event or Record
  • Filtering
  • Tag
  • Timestamp
  • Match
  • Structured Messages

Event or Record

Every incoming piece of data that belongs to a log or a metric that is retrieved by Fluent-bit is considered an event or a record.

As an example, consider the following content of a Syslog file:

Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server
Jan 18 12:52:16 flb dbus-daemon[2243]: [session uid=1000 pid=2243] Successfully activated service 'org.gnome.Terminal'
Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.
Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)

It contains four lines, and each line represents an independent event.

Internally, an event always has two components (in an array form): [TIMESTAMP, MESSAGE]
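
The two-component shape can be sketched in Python as follows. This is only an illustration of the [TIMESTAMP, MESSAGE] layout, not Fluent-bit's internal code: the helper to_event, the "log" key, and the collection time are assumptions made for the example.

```python
def to_event(timestamp, line):
    """Wrap one raw log line as a two-element [TIMESTAMP, MESSAGE] event."""
    # The "log" key is a hypothetical name chosen for this sketch.
    return [timestamp, {"log": line}]


syslog_lines = [
    "Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server",
    "Jan 18 12:52:16 flb dbus-daemon[2243]: [session uid=1000 pid=2243] "
    "Successfully activated service 'org.gnome.Terminal'",
    "Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.",
    'Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: '
    '"/org/gnome/terminal/legacy/" (establishing: 0, active: 0)',
]

# Hypothetical collection time in seconds since the Unix epoch.
collected_at = 1705582336.0

# Each line becomes its own independent event.
events = [to_event(collected_at, line) for line in syslog_lines]

for timestamp, message in events:
    print(timestamp, message["log"][:50])
```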

Filtering

In some cases, the content of an event needs to be modified. The process of altering, enriching, or dropping events is called filtering.

There are many use cases when filtering is required, such as to:

  • Append specific information to the Event like an IP address or metadata.
  • Select a specific piece of the Event content.
  • Drop Events that match a certain pattern.
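
The three use cases above can be sketched with plain Python dictionaries standing in for event content. The function names, the "log" and "host_ip" keys, and the sample values are hypothetical; in Fluent-bit this work is done by filter plugins configured in [FILTER] sections.

```python
import re


def append_field(record, key, value):
    """Enrich: return a copy of the record with one extra key."""
    enriched = dict(record)
    enriched[key] = value
    return enriched


def select_keys(record, keys):
    """Select: keep only the listed keys that are present in the record."""
    return {key: record[key] for key in keys if key in record}


def drop_matching(records, key, pattern):
    """Drop: discard records whose value under `key` matches the pattern."""
    regex = re.compile(pattern)
    return [r for r in records if not regex.search(str(r.get(key, "")))]


records = [
    {"log": "Started GNOME Terminal Server.", "pid": 2222},
    {"log": "debug: cache warm-up finished", "pid": 2243},
]

kept = drop_matching(records, "log", r"^debug")
kept = [append_field(r, "host_ip", "192.0.2.10") for r in kept]
print(kept)
print(select_keys(kept[0], ["log"]))
```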

Tag

Every event that gets into Fluent-bit gets assigned a tag. This tag is an internal string that is used in a later stage by the router to decide which filter or output phase it must go through.

Most of the tags are assigned manually in the configuration. If a tag is not specified, Fluent-bit will assign the name of the input plugin instance from where that Event was generated.

The only input plugin that does NOT assign tags is the Forward input. This plugin speaks the Fluentd wire protocol called Forward, where every event already comes with an associated tag. Fluent-bit will always use the incoming tag set by the client.

A tagged record must always have a matching rule.

To learn more about tags and matches, check out the Router section in the Fluent-bit Documentation.

Timestamp

The timestamp represents the time when an event was created. Every event contains an associated timestamp. The timestamp is a numeric fractional value in the format:

SECONDS.NANOSECONDS

Seconds

The number of seconds that have elapsed since the Unix epoch.

Nanoseconds

Fractional second or one thousand-millionth of a second.

A timestamp always exists, either set by the input plugin or discovered through a data parsing process.
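
As a small worked example of the SECONDS.NANOSECONDS layout, the sketch below splits a nanosecond count since the Unix epoch into its two parts. The function names and the sample value are assumptions for illustration only.

```python
import time

NANOS_PER_SECOND = 1_000_000_000


def split_timestamp(ns_since_epoch):
    """Split nanoseconds since the Unix epoch into (seconds, nanoseconds)."""
    return divmod(ns_since_epoch, NANOS_PER_SECOND)


def format_timestamp(ns_since_epoch):
    """Render the value as SECONDS.NANOSECONDS with a nine-digit fraction."""
    seconds, nanoseconds = split_timestamp(ns_since_epoch)
    return f"{seconds}.{nanoseconds:09d}"


# A fixed sample: 1398289291 seconds plus 123 nanoseconds.
print(format_timestamp(1_398_289_291_000_000_123))

# The current time, read from the system clock.
print(format_timestamp(time.time_ns()))
```

The first call prints 1398289291.000000123; the nine-digit padding keeps the nanosecond part from being misread as a shorter decimal fraction.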

Match

Fluent-bit lets you deliver your collected and processed events to one or multiple destinations. This is done through a routing phase. A match represents a simple rule to select events where a tag matches a defined rule.

To learn more about tags and matches, check out the Router section in the Fluent-bit Documentation.
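
The idea of routing by tag can be sketched as below. This is an approximation, not the router's implementation: Python's fnmatchcase accepts more pattern syntax than a Match rule needs, and the output names are hypothetical labels for the example.

```python
from fnmatch import fnmatchcase


def route(tag, outputs):
    """Return the names of all outputs whose match rule accepts the tag."""
    return [name for name, rule in outputs if fnmatchcase(tag, rule)]


# (output name, match rule) pairs; the names are made up for this sketch.
outputs = [
    ("opensearch_systemd", "systemd"),
    ("opensearch_syslog", "syslog"),
    ("stdout_debug", "*"),
]

for tag in ("systemd", "syslog", "kernel"):
    print(tag, "->", route(tag, outputs))
```

An event tagged systemd reaches both the exact-match output and the wildcard output, while a tag with no specific rule reaches only the wildcard output.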

Structured Messages

Source events may or may not have a structure. A structure defines a set of keys and values inside the event message. As an example, consider the following two messages:

Unstructured Message

"Project Fluent Bit created on 1398289291"

Structured Message

{"project": "Fluent Bit", "created": 1398289291}

At a low level, both are just an array of bytes, but the structured message defines keys and values. Having a structure helps to implement faster operations on data modifications.

Fluent-bit always handles every event message as a structured message. For performance reasons, a binary serialization data format called MessagePack is used.

Consider MessagePack as a binary version of JSON on steroids.
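
To make the comparison with JSON concrete, the sketch below hand-encodes the structured message from above using three MessagePack types: fixmap, fixstr, and unsigned integers. It is a deliberately tiny encoder written for this example, not the library Fluent-bit uses, and it always writes integers above 127 as uint32, which is valid but not always the most compact choice.

```python
import json
import struct


def pack(value):
    """Encode a small subset of MessagePack: fixmap, fixstr, fixint, uint32."""
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("this sketch only supports maps of up to 15 entries")
        out = bytes([0x80 | len(value)])          # fixmap header: 1000xxxx
        for key, item in value.items():
            out += pack(key) + pack(item)
        return out
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("this sketch only supports strings of up to 31 bytes")
        return bytes([0xA0 | len(raw)]) + raw     # fixstr header: 101xxxxx
    if isinstance(value, int) and not isinstance(value, bool):
        if 0 <= value <= 0x7F:
            return bytes([value])                 # positive fixint
        if 0 <= value <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", value)  # uint32, big-endian
    raise TypeError(f"unsupported value: {value!r}")


message = {"project": "Fluent Bit", "created": 1398289291}

packed = pack(message)
compact_json = json.dumps(message, separators=(",", ":"))

print(packed.hex())
print(len(packed), "bytes as MessagePack")
print(len(compact_json), "bytes as compact JSON")
```

For this message the packed form is 33 bytes against 45 bytes of compact JSON, because keys and values carry short binary length prefixes instead of quotes and separators, and the integer takes 5 bytes instead of 10 digits.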

Data pipeline

For a detailed explanation of the data pipeline concepts in Fluent-bit, see the Fluent-bit documentation.

Installation

For comprehensive instructions, see building and installing Fluent-bit.

Example Installation for Ubuntu

Server GPG key

The first step is to add our server GPG key to your keyring to ensure you can get our signed packages. Follow the official Debian wiki guidance: DebianRepository/UseThirdParty - Debian Wiki

curl -fsSL https://packages.fluentbit.io/fluentbit.key | gpg --dearmor | sudo tee /usr/share/keyrings/fluentbit-keyring.gpg > /dev/null

Updated key from March 2022

Starting with the 1.9.0 and 1.8.15 releases, the GPG key at https://packages.fluentbit.io/fluentbit.key has been updated, so make sure the new key is added.

The GPG Key fingerprint of the new key is:

C3C0 A285 34B9 293E AF51  FABD 9F9D DC08 3888 C1CD
Fluentbit releases (Releases signing key) <releases@fluentbit.io>

The previous key is still available at https://packages.fluentbit.io/fluentbit-legacy.key and may be required to install previous versions.

The GPG Key fingerprint of the old key is:

F209 D876 2A60 CD49 E680 633B 4FF8 368B 6EA0 722A

Refer to the supported platform documentation to see which platforms are supported in each release.

Update your sources list

On Ubuntu, you need to add our APT server entry to your sources list. Add the following content at the bottom of your /etc/apt/sources.list file, setting CODENAME to your specific Ubuntu release name (e.g. focal for Ubuntu 20.04):

deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/ubuntu/${CODENAME} ${CODENAME} main
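
If you are unsure of the release name, one way to fill in CODENAME is to read it from /etc/os-release. The sketch below assumes the file has a VERSION_CODENAME entry, as Ubuntu releases do; the function names are hypothetical, and the script only prints the line rather than editing /etc/apt/sources.list.

```python
def codename_from_os_release(text):
    """Extract VERSION_CODENAME from the contents of an os-release file."""
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator and key.strip() == "VERSION_CODENAME":
            return value.strip().strip('"')
    raise ValueError("VERSION_CODENAME not found")


def apt_source_line(codename,
                    keyring="/usr/share/keyrings/fluentbit-keyring.gpg"):
    """Render the sources.list entry for the given Ubuntu release name."""
    return (f"deb [signed-by={keyring}] "
            f"https://packages.fluentbit.io/ubuntu/{codename} {codename} main")


# Sample os-release content; on a real host, read /etc/os-release instead.
sample = 'NAME="Ubuntu"\nVERSION_ID="20.04"\nVERSION_CODENAME=focal\n'

print(apt_source_line(codename_from_os_release(sample)))
```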

Update your repositories database

Now let your system update the apt database:

sudo apt-get update

  • We recommend upgrading your system (sudo apt-get upgrade). This could avoid potential issues with expired certificates.

  • If you get the error "Certificate verification failed", check that the ca-certificates package is properly installed (sudo apt-get install ca-certificates).

Install Fluent-bit

Install the latest fluent-bit with the following apt-get command:

sudo apt-get install fluent-bit

Update the Fluent-bit configuration file

Edit the Fluent-bit configuration file, fluent-bit.conf, to collect logs and send them to the c3-exporter. See the attached complete fluent-bit.conf file.

# Example config

[SERVICE]
# Flush
# =====
# set an interval of seconds before to flush records to a destination
flush 15

# Daemon
# ======
# instruct Fluent Bit to run in foreground or background mode.
daemon Off

# Log_Level
# =========
# Set the verbosity level of the service, values can be:
#
# - error
# - warning
# - info
# - debug
# - trace
#
# by default 'info' is set, that means it includes 'error' and 'warning'.
log_level info

# Parsers File
# ============
# specify an optional 'Parsers' configuration file
parsers_file parsers.conf

# Plugins File
# ============
# specify an optional 'Plugins' configuration file to load external plugins.
plugins_file plugins.conf

# HTTP Server
# ===========
# Enable/Disable the built-in HTTP Server for metrics
http_server Off
http_listen 0.0.0.0
http_port 2020

# Storage
# =======
# Fluent Bit can use memory and filesystem buffering based mechanisms
#
# - https://docs.fluentbit.io/manual/administration/buffering-and-storage
#
# storage metrics
# ---------------
# publish storage pipeline metrics in '/api/v1/storage'. The metrics are
# exported only if the 'http_server' option is enabled.
#
storage.metrics on

# storage.path
# ------------
# absolute file system path to store filesystem data buffers (chunks).
#
# storage.path /tmp/storage

# storage.sync
# ------------
# configure the synchronization mode used to store the data into the
# filesystem. It can take the values normal or full.
#
# storage.sync normal

# storage.checksum
# ----------------
# enable the data integrity check when writing and reading data from the
# filesystem. The storage layer uses the CRC32 algorithm.
#
# storage.checksum off

# storage.backlog.mem_limit
# -------------------------
# if storage.path is set, Fluent Bit will look for data chunks that were
# not delivered and are still in the storage layer, these are called
# backlog data. This option configures a hint for the maximum amount of memory
# to use when processing these records.
#
# storage.backlog.mem_limit 5M


[INPUT]
name systemd
tag systemd
strip_underscores on
read_from_tail on

# Optional
[INPUT]
name syslog
parser syslog-rfc3164
listen 127.0.0.1
port 5140
mode tcp
tag syslog

# Adds a record field named host.name to correlate metrics to logs
[FILTER]
Name record_modifier
Match *
Record host.name ${HOSTNAME}

# Add your specific Circonus host, user, and password to the following
[OUTPUT]
name opensearch
host <circonusPipelineExporterIp>
port 9200
http_user <ingestUserName>
http_passwd <ingestUserPassword>
generate_id on
logstash_format on
logstash_prefix logs-systemd
logstash_dateformat %Y-%m-%d
tls off
match systemd

# Add your specific Circonus host, user, and password to the following
[OUTPUT]
name opensearch
host <circonusPipelineExporterIp>
port 9200
http_user <ingestUserName>
http_passwd <ingestUserPassword>
generate_id on
logstash_format on
logstash_prefix logs-syslog
logstash_dateformat %Y-%m-%d
tls off
match syslog

# Testing purposes only
#[OUTPUT]
# name stdout
# match *
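
With logstash_format on, the index name is built from logstash_prefix and the record date rendered with logstash_dateformat. The sketch below shows the names the two [OUTPUT] sections above would be expected to produce. It assumes a "-" separator between prefix and date and a UTC date, and the function name and sample time are hypothetical, so verify the result against your OpenSearch instance.

```python
from datetime import datetime, timezone


def index_name(prefix, dateformat, epoch_seconds):
    """Compose prefix + "-" + formatted date, as assumed for logstash_format."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{prefix}-{day.strftime(dateformat)}"


# 2023-02-13 15:27:25 UTC, expressed as seconds since the Unix epoch.
record_time = 1676302045

print(index_name("logs-systemd", "%Y-%m-%d", record_time))
print(index_name("logs-syslog", "%Y-%m-%d", record_time))
```

Under these assumptions, a systemd record from that day lands in logs-systemd-2023-02-13 and a syslog record in logs-syslog-2023-02-13.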

Optional: Adding syslogs as an [INPUT] collection

In the /etc/rsyslog.d/ directory, create a file called 60-c3opensearch.conf with the following content to start collecting syslogs using the configuration example above:

cat /etc/rsyslog.d/60-c3opensearch.conf
action(type="omfwd" target="127.0.0.1" port="5140" protocol="tcp")

After creating the file, restart rsyslog:

sudo systemctl restart rsyslog

The next step is to instruct systemd to start the service:

sudo systemctl start fluent-bit

If you do a status check, you should see output similar to this:

sudo systemctl status fluent-bit.service
● fluent-bit.service - Fluent Bit
Loaded: loaded (/lib/systemd/system/fluent-bit.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2023-02-13 15:27:25 UTC; 1 week 1 day ago
Docs: https://docs.fluentbit.io/manual/
Main PID: 1265874 (fluent-bit)
Tasks: 3 (limit: 19187)
Memory: 6.9M
CPU: 5min 11.783s
CGroup: /system.slice/fluent-bit.service
└─1265874 /opt/fluent-bit/bin/fluent-bit -c //etc/fluent-bit/fluent-bit.conf

Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [fluent bit] version=2.0.9, commit=, pid=1265874
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [cmetrics] version=0.5.8
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [ctraces ] version=0.2.7
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [input:systemd:systemd.0] initializing
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [input:systemd:systemd.0] storage_strategy='memory' (memory only)
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [input:syslog:syslog.1] initializing
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [input:syslog:syslog.1] storage_strategy='memory' (memory only)
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [in_syslog] TCP server binding 127.0.0.1:5140
Feb 13 15:27:25 ubuntu-c3playground-observability-v1 fluent-bit[1265874]: [2023/02/13 15:27:25] [ info] [sp] stream processor started
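
Once the log shows the syslog input bound to 127.0.0.1:5140, you can check it by hand with a test message. The sketch below builds an RFC 3164 style line and provides a helper that sends it over TCP. The function names, host name, tag, and message text are hypothetical, and it assumes the input reads newline-terminated messages in tcp mode.

```python
import socket

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rfc3164_line(facility, severity, month, day, clock, host, tag, message):
    """Build '<PRI>Mmm dd hh:mm:ss host tag: message' (day is space-padded)."""
    priority = facility * 8 + severity
    return (f"<{priority}>{MONTHS[month - 1]} {day:2d} {clock} "
            f"{host} {tag}: {message}")


def send_line(line, host="127.0.0.1", port=5140):
    """Send one newline-terminated message to the syslog TCP input."""
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(line.encode("utf-8") + b"\n")


# facility 1 (user) and severity 5 (notice) give priority 13.
line = rfc3164_line(1, 5, 2, 13, "15:30:00", "flb", "demo",
                    "hello from the syslog input test")
print(line)
```

Calling send_line(line) on the host running Fluent-bit delivers the message to the input. The call is left out of the script so that it can be run safely when nothing is listening on port 5140.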

With its default configuration, fluent-bit collects CPU usage metrics and sends the records to standard output; you can see the outgoing data in your /var/log/syslog file.

Configuring Fluent-bit

It's recommended that you configure Fluent-bit via its configuration file. See the Fluent-bit documentation.

CLI flags

Fluent-bit also supports a command-line interface with flags that correspond to the available configuration options.

$ docker run --rm -it fluent/fluent-bit --help
Usage: /fluent-bit/bin/fluent-bit [OPTION]

Available Options
-b --storage_path=PATH specify a storage buffering path
-c --config=FILE specify an optional configuration file
-d, --daemon run Fluent Bit in background mode
-D, --dry-run dry run
-f, --flush=SECONDS flush timeout in seconds (default: 1)
-C, --custom=CUSTOM enable a custom plugin
-i, --input=INPUT set an input
-F --filter=FILTER set a filter
-m, --match=MATCH set plugin match, same as '-p match=abc'
-o, --output=OUTPUT set an output
-p, --prop="A=B" set plugin configuration property
-R, --parser=FILE specify a parser configuration file
-e, --plugin=FILE load an external plugin (shared lib)
-l, --log_file=FILE write log info to a file
-t, --tag=TAG set plugin tag, same as '-p tag=abc'
-T, --sp-task=SQL define a stream processor task
-v, --verbose increase logging verbosity (default: info)
-w, --workdir set the working directory
-H, --http enable monitoring HTTP server
-P, --port set HTTP server TCP port (default: 2020)
-s, --coro_stack_size set coroutines stack size in bytes (default: 24576)
-q, --quiet quiet mode
-S, --sosreport support report for Enterprise customers
-V, --version show version number
-h, --help print this help

Inputs
cpu CPU Usage
mem Memory Usage
thermal Thermal
kmsg Kernel Log Buffer
proc Check Process health
disk Diskstats
systemd Systemd (Journal) reader
netif Network Interface Usage
docker Docker containers metrics
docker_events Docker events
node_exporter_metrics Node Exporter Metrics (Prometheus Compatible)
fluentbit_metrics Fluent Bit internal metrics
prometheus_scrape Scrape metrics from Prometheus Endpoint
tail Tail files
dummy Generate dummy data
dummy_thread Generate dummy data in a separate thread
head Head Input
health Check TCP server health
http HTTP
collectd collectd input plugin
statsd StatsD input plugin
opentelemetry OpenTelemetry
nginx_metrics Nginx status metrics
serial Serial input
stdin Standard Input
syslog Syslog
tcp TCP
mqtt MQTT, listen for Publish messages
forward Fluentd in-forward
random Random

Filters
alter_size Alter incoming chunk size
aws Add AWS Metadata
checklist Check records and flag them
record_modifier modify record
throttle Throttle messages using sliding window algorithm
type_converter Data type converter
kubernetes Filter to append Kubernetes metadata
modify modify records by applying rules
multiline Concatenate multiline messages
nest nest events by specified field values
parser Parse events
expect Validate expected keys and values
grep grep events by specified field values
rewrite_tag Rewrite records tags
lua Lua Scripting Filter
stdout Filter events to STDOUT
geoip2 add geoip information to records
nightfall scans records for sensitive content

Outputs
azure Send events to Azure HTTP Event Collector
azure_blob Azure Blob Storage
azure_kusto Send events to Kusto (Azure Data Explorer)
bigquery Send events to BigQuery via streaming insert
counter Records counter
datadog Send events to DataDog HTTP Event Collector
es Elasticsearch
exit Exit after a number of flushes (test purposes)
file Generate log file
forward Forward (Fluentd protocol)
http HTTP Output
influxdb InfluxDB Time Series
logdna LogDNA
loki Loki
kafka Kafka
kafka-rest Kafka REST Proxy
nats NATS Server
nrlogs New Relic
null Throws away events
opensearch OpenSearch
plot Generate data file for GNU Plot
pgsql PostgreSQL
skywalking Send logs into log collector on SkyWalking OAP
slack Send events to a Slack channel
splunk Send events to Splunk HTTP Event Collector
stackdriver Send events to Google Stackdriver Logging
stdout Prints events to STDOUT
syslog Syslog
tcp TCP Output
td Treasure Data
flowcounter FlowCounter
gelf GELF Output
websocket Websocket
cloudwatch_logs Send logs to Amazon CloudWatch
kinesis_firehose Send logs to Amazon Kinesis Firehose
kinesis_streams Send logs to Amazon Kinesis Streams
opentelemetry OpenTelemetry
prometheus_exporter Prometheus Exporter
prometheus_remote_write Prometheus remote write
s3 Send to S3

For comprehensive instructions on Fluent-bit configuration, see: Configuring Fluent-bit