Fluent Bit

Fluent Bit is a fast log processor and forwarder for Linux, Windows, embedded Linux, macOS, and the BSD family of operating systems. It's part of the graduated Fluentd ecosystem and a CNCF sub-project.

Fluent Bit allows you to collect log events or metrics from different sources, process them, and deliver them to different backends such as Fluentd, Elasticsearch, Splunk, Datadog, Kafka, New Relic, Azure services, AWS services, Google services, NATS, InfluxDB, or any custom HTTP endpoint.

Fluent Bit comes with full SQL stream processing capabilities: data manipulation and analytics using SQL queries.
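A stream processor task can be attached directly from the command line with the -T flag. The sketch below is illustrative only; the exact SQL dialect and the available selectors are covered in the Stream Processing section of the documentation:

$ fluent-bit -i dummy -T "SELECT * FROM TAG:'dummy.*';" -o stdout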

Fluent Bit runs on x86_64, x86, arm32v7, and arm64v8 architectures.

Key concepts

Before diving into Fluent Bit, it's best to get acquainted with a few key concepts that are essential to understanding how the service operates. This document provides a gentle introduction to those concepts as well as common terminology.

We’ve provided a list below of all the terms we’ll cover, but we recommend reading this document from start to finish to gain a more general understanding of this log and stream processor.

  • Event or Record
  • Filtering
  • Tag
  • Timestamp
  • Match
  • Structured Message

Event or Record

Every incoming piece of data that Fluent Bit retrieves, whether it belongs to a log or a metric, is considered an event or a record.

As an example, consider the following content of a Syslog file:
Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server
Jan 18 12:52:16 flb dbus-daemon[2243]: [session uid=1000 pid=2243] Successfully activated service 'org.gnome.Terminal'
Jan 18 12:52:16 flb systemd[2222]: Started GNOME Terminal Server.
Jan 18 12:52:16 flb gsd-media-keys[2640]: # watch_fast: "/org/gnome/terminal/legacy/" (establishing: 0, active: 0)

The file contains four lines, which represent four independent events.

Internally, an event always has two components (in an array form): [TIMESTAMP, MESSAGE]
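For example, the first Syslog line above could be represented internally as the following pair. The epoch value assumes the year 2019, and the key under which the message is stored depends on the parser in use:

[1547815936.000000000, {"message": "Jan 18 12:52:16 flb systemd[2222]: Starting GNOME Terminal Server"}]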

Filtering

In some cases, it's necessary to modify the content of events. The process of altering, enriching, or dropping events is called filtering.

There are many use cases where filtering is required, such as to:

  • Append specific information to an event, like an IP address or metadata.
  • Select a specific piece of the event content.
  • Drop events that match a certain pattern.
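As a sketch of the last case, the grep filter can drop events whose content matches a regular expression. The key name (log) and the pattern below are illustrative:

[FILTER]
    Name    grep
    Match   *
    Exclude log ^DEBUG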

Tag

Every event that enters Fluent Bit is assigned a tag. This tag is an internal string used in a later stage by the router to decide which filter or output phase the event must go through.

Most tags are assigned manually in the configuration. If a tag is not specified, Fluent Bit assigns the name of the input plugin instance that generated the event.

The only input plugin that does not assign a tag is the Forward input. This plugin speaks the Fluentd wire protocol, called Forward, where every event already comes with an associated tag, and Fluent Bit will always use the incoming tag set by the client.

A tagged record must always have a matching rule.
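For example, a tail input can be given an explicit tag in the configuration (the path and tag below are illustrative). If the Tag line were omitted, events would instead be tagged with the instance name, such as tail.0:

[INPUT]
    Name tail
    Path /var/log/syslog
    Tag  my_service.logs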

To learn more about tags and matches, check out the Router section in the Fluent Bit Documentation.

Timestamp

The timestamp represents the time when an event was created. Every event contains an associated timestamp. The timestamp is a numeric value with second and nanosecond parts, in the format:

SECONDS.NANOSECONDS

Seconds

The number of seconds that have elapsed since the Unix epoch.

Nanoseconds

The fractional part of the second, expressed in nanoseconds (one thousand-millionth of a second).

A timestamp always exists, either set by the input plugin or discovered through a data parsing process.
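For example, the timestamp 1547815936.000000000 used earlier breaks down as 1,547,815,936 seconds after the Unix epoch (Jan 18 12:52:16 UTC in 2019) and zero nanoseconds.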

Match

Fluent Bit lets you deliver your collected and processed events to one or multiple destinations. This is done through a routing phase. A match is a simple rule that selects events whose tag matches a defined pattern.
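As a sketch, the output sections below (plugin choices, tag pattern, and host are illustrative) route events whose tag starts with my_service. to Elasticsearch, while the wildcard rule prints every event; note that a single event can match more than one output:

[OUTPUT]
    Name  es
    Match my_service.*
    Host  127.0.0.1

[OUTPUT]
    Name  stdout
    Match *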

To learn more about tags and matches, check out the Router section in the Fluent Bit Documentation.

Structured Messages

Source events may or may not have a structure. A structure defines a set of keys and values inside the event message. As an example, consider the following two messages:

Unstructured Message

"Project Fluent Bit created on 1398289291"

Structured Message

{"project": "Fluent Bit", "created": 1398289291}

At a low level, both are just arrays of bytes, but the structured message defines keys and values. Having a structure enables faster operations on data modifications.

Fluent Bit always handles every event message as a structured message. For performance reasons, a binary serialization data format called MessagePack is used.

Consider MessagePack as a binary version of JSON on steroids.
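A quick way to observe this handling is the dummy input plugin, which emits a fixed JSON record. The following sketch ingests the structured message from the example above and prints each resulting event, with its tag and timestamp, to standard output:

$ fluent-bit -i dummy -p 'dummy={"project": "Fluent Bit", "created": 1398289291}' -o stdout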

Data pipeline

For a detailed explanation of the data pipeline concepts in Fluent Bit, see the Fluent Bit documentation.

Installation

For comprehensive instructions, see building and installing Fluent Bit.

Platform-specific installation instructions are also available in the Fluent Bit documentation.

Configuring Fluent Bit

It’s recommended that you configure Fluent Bit via its configuration file. See the Fluent Bit documentation.
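A minimal configuration file in the classic format looks like the following sketch (the path and tag are illustrative):

[SERVICE]
    Flush     1
    Log_Level info

[INPUT]
    Name tail
    Path /var/log/syslog
    Tag  host.syslog

[OUTPUT]
    Name  stdout
    Match host.*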

CLI flags

Fluent Bit also supports a CLI interface with flags that correspond to the available configuration options.

$ docker run --rm -it fluent/fluent-bit --help
Usage: /fluent-bit/bin/fluent-bit [OPTION]

Available Options
  -b  --storage_path=PATH specify a storage buffering path
  -c  --config=FILE       specify an optional configuration file
  -d, --daemon            run Fluent Bit in background mode
  -D, --dry-run           dry run
  -f, --flush=SECONDS     flush timeout in seconds (default: 1)
  -C, --custom=CUSTOM     enable a custom plugin
  -i, --input=INPUT       set an input
  -F  --filter=FILTER     set a filter
  -m, --match=MATCH       set plugin match, same as '-p match=abc'
  -o, --output=OUTPUT     set an output
  -p, --prop="A=B"        set plugin configuration property
  -R, --parser=FILE       specify a parser configuration file
  -e, --plugin=FILE       load an external plugin (shared lib)
  -l, --log_file=FILE     write log info to a file
  -t, --tag=TAG           set plugin tag, same as '-p tag=abc'
  -T, --sp-task=SQL       define a stream processor task
  -v, --verbose           increase logging verbosity (default: info)
  -w, --workdir           set the working directory
  -H, --http              enable monitoring HTTP server
  -P, --port              set HTTP server TCP port (default: 2020)
  -s, --coro_stack_size   set coroutines stack size in bytes (default: 24576)
  -q, --quiet             quiet mode
  -S, --sosreport         support report for Enterprise customers
  -V, --version           show version number
  -h, --help              print this help

Inputs
  cpu                     CPU Usage
  mem                     Memory Usage
  thermal                 Thermal
  kmsg                    Kernel Log Buffer
  proc                    Check Process health
  disk                    Diskstats
  systemd                 Systemd (Journal) reader
  netif                   Network Interface Usage
  docker                  Docker containers metrics
  docker_events           Docker events
  node_exporter_metrics   Node Exporter Metrics (Prometheus Compatible)
  fluentbit_metrics       Fluent Bit internal metrics
  prometheus_scrape       Scrape metrics from Prometheus Endpoint
  tail                    Tail files
  dummy                   Generate dummy data
  dummy_thread            Generate dummy data in a separate thread
  head                    Head Input
  health                  Check TCP server health
  http                    HTTP
  collectd                collectd input plugin
  statsd                  StatsD input plugin
  opentelemetry           OpenTelemetry
  nginx_metrics           Nginx status metrics
  serial                  Serial input
  stdin                   Standard Input
  syslog                  Syslog
  tcp                     TCP
  mqtt                    MQTT, listen for Publish messages
  forward                 Fluentd in-forward
  random                  Random

Filters
  alter_size              Alter incoming chunk size
  aws                     Add AWS Metadata
  checklist               Check records and flag them
  record_modifier         modify record
  throttle                Throttle messages using sliding window algorithm
  type_converter          Data type converter
  kubernetes              Filter to append Kubernetes metadata
  modify                  modify records by applying rules
  multiline               Concatenate multiline messages
  nest                    nest events by specified field values
  parser                  Parse events
  expect                  Validate expected keys and values
  grep                    grep events by specified field values
  rewrite_tag             Rewrite records tags
  lua                     Lua Scripting Filter
  stdout                  Filter events to STDOUT
  geoip2                  add geoip information to records
  nightfall               scans records for sensitive content

Outputs
  azure                   Send events to Azure HTTP Event Collector
  azure_blob              Azure Blob Storage
  azure_kusto             Send events to Kusto (Azure Data Explorer)
  bigquery                Send events to BigQuery via streaming insert
  counter                 Records counter
  datadog                 Send events to DataDog HTTP Event Collector
  es                      Elasticsearch
  exit                    Exit after a number of flushes (test purposes)
  file                    Generate log file
  forward                 Forward (Fluentd protocol)
  http                    HTTP Output
  influxdb                InfluxDB Time Series
  logdna                  LogDNA
  loki                    Loki
  kafka                   Kafka
  kafka-rest              Kafka REST Proxy
  nats                    NATS Server
  nrlogs                  New Relic
  null                    Throws away events
  opensearch              OpenSearch
  plot                    Generate data file for GNU Plot
  pgsql                   PostgreSQL
  skywalking              Send logs into log collector on SkyWalking OAP
  slack                   Send events to a Slack channel
  splunk                  Send events to Splunk HTTP Event Collector
  stackdriver             Send events to Google Stackdriver Logging
  stdout                  Prints events to STDOUT
  syslog                  Syslog
  tcp                     TCP Output
  td                      Treasure Data
  flowcounter             FlowCounter
  gelf                    GELF Output
  websocket               Websocket
  cloudwatch_logs         Send logs to Amazon CloudWatch
  kinesis_firehose        Send logs to Amazon Kinesis Firehose
  kinesis_streams         Send logs to Amazon Kinesis Streams
  opentelemetry           OpenTelemetry
  prometheus_exporter     Prometheus Exporter
  prometheus_remote_write Prometheus remote write
  s3                      Send to S3
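
For example, the following command uses CLI flags to collect CPU usage metrics and print them to standard output, flushing every second:

$ docker run --rm -it fluent/fluent-bit -i cpu -o stdout -f 1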

For comprehensive instructions on Fluent Bit configuration, see: Configuring Fluent Bit