Integrating Fluent Bit and RTI Connext DDS to Manage Real-time Log Data

An important aspect of distributed system operation is the ability to monitor its overall status. This typically requires collecting logs from various layers, including operating systems, security software, network equipment, middleware and user applications. After logs are collected from each layer, they need to be sent to multiple destinations for alerting, analysis, visualization and archiving. Since there are multiple sources and destinations for log data, integration complexity will increase dramatically if there is no unified interface between producers and consumers.

Fluent Bit is an open source log and data processor and forwarder that collects data from different sources through a unified interface and routes it to multiple destinations. It is often used as an edge log and data forwarder in an infrastructure based on Fluentd (an enterprise-grade, open source log collector).

Together, these tools are the workhorses behind the scenes of many DevOps dashboards based on Grafana or Kibana. They are responsible for collecting and processing large volumes of log data from many machines. Fluentd is one of the most popular log aggregators used in ELK-based logging pipelines. In fact, it’s so popular that the “EFK Stack” (Elasticsearch, Fluentd, Kibana) is an actual thing.

While Fluentd is a server-grade application (written in Ruby, with a large memory footprint), Fluent Bit was created with a specific use case in mind: highly distributed environments where limited capacity and low overhead (memory and CPU) are major considerations. Fluent Bit is written in C and was architected for high performance. It has a very small footprint (less than 500 KB) and can also be deployed on smaller embedded systems.

Because of its large set of input/output plug-ins, Fluent Bit can be used as a data processing pipeline to quickly integrate data from different sources. It can play a key role in operational monitoring of large distributed systems.

As part of a U.S. Department of Energy (DOE)-funded research project, RTI developed a set of plug-ins for Fluent Bit that enable it to be integrated with RTI Connext DDS. This blog post explores these plug-ins, and provides an example scenario in which this integration could prove useful.

Suppose you need to publish over DDS some information that is produced by an existing application (perhaps a legacy application for which you don't have the source code). With the help of Fluent Bit, you can quickly set up a system that publishes the data produced by the external application over DDS, without the need to write any code!


Using a simple text editor, you can create a configuration file for Fluent Bit that defines a pipeline as follows:

  • Input: use the "Tail" plug-in, which reads lines appended to a log file in real time
  • Parse: you can use regular expressions to define extraction rules
  • Optional: add some miscellaneous transformations (if necessary)
  • Output: RTI Connext DDS Fluent Bit Structured Output Plug-in

In our RTI Community project, we built two output plug-ins for Fluent Bit, both capable of publishing over DDS all events collected through the pipeline:

  • The Unstructured Output Plug-in:
    • Easiest to use (with fewer configuration options).
    • Transforms every Fluent Bit event into a DDS sample, using a predefined data type composed of a simple sequence of key-value pairs.
    • All the properties associated with a Fluent Bit event are published through DDS.
    • Through the configuration file, users can specify one (or more) XML files containing the QoS to use for publication.
  • The Structured Output Plug-in:
    • More advanced, since it allows finer control of the data published over DDS.
    • Publishes events using your own data type.
    • The plug-in reads a data mapping file (specified in the plug-in configuration) containing instructions on how to transform the properties of an event into members of a DDS data type.
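
To make the "sequence of key-value pairs" idea concrete, here is a minimal Python sketch of how an event could be flattened for the Unstructured Output Plug-in. The KeyValue type and the event_to_pairs helper are hypothetical names used only for illustration; the plug-in's actual predefined DDS type is not shown in this post.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class KeyValue:
    # Hypothetical pair type; the plug-in's real DDS type may differ.
    key: str
    value: str


def event_to_pairs(event: Dict[str, object]) -> List[KeyValue]:
    """Flatten every property of an event into a list of string pairs."""
    return [KeyValue(str(k), str(v)) for k, v in event.items()]


# A small event with three properties, all of which end up in the sample.
event = {"hostname": "ubuntu", "pid": 27304, "username": "root"}
pairs = event_to_pairs(event)
```

Because every property is carried along as a generic pair, a subscriber needs no knowledge of the event layout in advance, which is the trade-off that makes this plug-in the easier of the two to configure.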

As an example of such an application, suppose we want to publish the results of a McAfee monitoring agent (our "legacy" application) over DDS using a well-defined data structure (that we define through IDL).

Since McAfee can write the results of its analysis to syslog or to a log file, we can start building the Fluent Bit pipeline as depicted in the following diagram, using the "Syslog" or "Tail" input plug-in:

[Figure: Fluent Bit pipeline for the McAfee example]

When McAfee identifies a threat, it logs a line like this over syslog:

Oct 01 10:33:35 ubuntu ERROR AMOASScanner [27304] Infection caught
File Name: /home/jason/Desktop/malware/G1test.bin File Size: 11264
Infection Name: Generic BackDoor.agg Time: 1569951215 Process Name:
/usr/bin/scp User Name: root Profile Type: 1

After the line is processed by the RegEx parser, the event is broken down into properties containing all the extracted information:

time: Oct 01 10:33:35
hostname: ubuntu
appName: AMOASScanner
pid: 27304
filepath: /home/jason/Desktop/malware/G1test.bin
filesize: 11264
virusname: Generic BackDoor.agg
scantime: 1569951215
processname: /usr/bin/scp
username: root
profile: 1

tag: mcafee.found
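
The extraction step can be sketched in Python using the sample line and property names shown above. The pattern below is an assumption written for this one sample, not the expression used in the project, and parse_threat_line is a hypothetical helper. Fluent Bit's own regex parsers use a Ruby-style named-group syntax, so the pattern would need adapting before use in a parser definition.

```python
import re
from typing import Dict, Optional

# The sample syslog line from above, joined into a single line.
LINE = (
    "Oct 01 10:33:35 ubuntu ERROR AMOASScanner [27304] Infection caught "
    "File Name: /home/jason/Desktop/malware/G1test.bin File Size: 11264 "
    "Infection Name: Generic BackDoor.agg Time: 1569951215 Process Name: "
    "/usr/bin/scp User Name: root Profile Type: 1"
)

# Assumed pattern: one named group per property listed in the post.
PATTERN = re.compile(
    r"^(?P<time>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) \S+ "
    r"(?P<appName>\S+) \[(?P<pid>\d+)\] Infection caught "
    r"File Name: (?P<filepath>.+?) File Size: (?P<filesize>\d+) "
    r"Infection Name: (?P<virusname>.+?) Time: (?P<scantime>\d+) "
    r"Process Name: (?P<processname>.+?) User Name: (?P<username>\S+) "
    r"Profile Type: (?P<profile>\d+)$"
)


def parse_threat_line(line: str) -> Optional[Dict[str, str]]:
    """Return the extracted properties, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


fields = parse_threat_line(LINE)
```

Lines that do not describe an infection simply fail to match and yield None, which is what allows a later stage to discard them.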

The grep filter is then used to remove from the pipeline all the other, unrelated syslog messages that the RegEx parser did not match.
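
The effect of this filtering step can be illustrated with a short Python sketch. It is a simplified model of a keep-if-matching rule, not Fluent Bit's implementation; the grep function, the choice of "virusname" as the key to test, and the second sample record are all assumptions made for illustration.

```python
import re
from typing import Dict, Iterable, Iterator


def grep(records: Iterable[Dict[str, str]], key: str, pattern: str) -> Iterator[Dict[str, str]]:
    """Yield only the records whose value for `key` matches `pattern`."""
    rx = re.compile(pattern)
    for record in records:
        value = record.get(key)
        if value is not None and rx.search(value):
            yield record


records = [
    # A parsed event carries the extracted properties.
    {"appName": "AMOASScanner", "virusname": "Generic BackDoor.agg"},
    # An unparsed event only carries its raw text (made-up placeholder).
    {"log": "some unrelated syslog line"},
]

# Keep only events that have a non-empty "virusname" property.
kept = list(grep(records, "virusname", r"."))
```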

Finally, the dds_str output plug-in is configured to publish over DDS on a specific domain, and to map the event properties shown above into the following type (IDL):

struct Threat {
    string<100> product_name;
    string<1024> file_name;
    string<100> threat_name;
    string<100> date;
};

The map file (which must be specified through one of the required configuration parameters of the dds_str output plug-in) then tells the plug-in to perform the following mapping:

[Figure: mapping of event properties to members of the Threat type]
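
A property-to-member mapping of this kind can be sketched in Python. The specific pairs in FIELD_MAP are assumptions chosen because they look plausible (for example, virusname to threat_name); they are not taken from the project's map file, whose format is not shown here. Truncating values to the IDL string bounds is also a choice made for this sketch, not documented plug-in behavior.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Threat:
    product_name: str  # string<100> in the IDL
    file_name: str     # string<1024> in the IDL
    threat_name: str   # string<100> in the IDL
    date: str          # string<100> in the IDL


# ASSUMED mapping: Threat member -> (event property, IDL string bound).
FIELD_MAP: Dict[str, Tuple[str, int]] = {
    "product_name": ("appName", 100),
    "file_name": ("filepath", 1024),
    "threat_name": ("virusname", 100),
    "date": ("time", 100),
}


def to_threat(event: Dict[str, str]) -> Threat:
    """Build a Threat from an event, truncating values to the IDL bounds."""
    values = {
        member: str(event.get(prop, ""))[:bound]
        for member, (prop, bound) in FIELD_MAP.items()
    }
    return Threat(**values)


event = {
    "time": "Oct 01 10:33:35",
    "appName": "AMOASScanner",
    "filepath": "/home/jason/Desktop/malware/G1test.bin",
    "virusname": "Generic BackDoor.agg",
    "username": "root",  # not mapped, so it is not published
}
threat = to_threat(event)
```

Properties that have no entry in the mapping are dropped, which is the finer control the Structured Output Plug-in offers compared with publishing every property.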

As you can see, the Fluent Bit - DDS integration allows us to quickly publish the threats detected by McAfee over DDS, without the need to write a complex application!

The Fluent Bit Plug-ins project also includes a DDS Unstructured Input Plug-in that subscribes to DDS data (using the same data type used by the DDS Unstructured Output Plug-in), and inserts it into the Fluent Bit pipeline.

We ran some performance tests using the DDS Unstructured Plug-ins (input and output) as a transport between two Fluent Bit instances. To measure throughput, we progressively increased the number of messages per second, keeping the message size fixed at 100 bytes. We determined that our plug-ins can scale linearly up to 350,000 messages per second! To maximize throughput, we used the DDS Batch QoS.
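
To put these figures in perspective, the short Python calculation below converts the message rate into payload bandwidth and shows why batching helps. The batch size of 8,000 bytes is a hypothetical value picked for illustration, not the setting used in the tests, and the calculation ignores protocol headers.

```python
MESSAGES_PER_SECOND = 350_000  # peak rate reported in the post
MESSAGE_SIZE_BYTES = 100       # message size reported in the post

# Payload bandwidth at the peak rate (headers not included).
payload_bytes_per_second = MESSAGES_PER_SECOND * MESSAGE_SIZE_BYTES
payload_megabits_per_second = payload_bytes_per_second * 8 / 1_000_000

# ASSUMPTION: a hypothetical batch limit, for illustration only.
ASSUMED_BATCH_BYTES = 8_000
messages_per_batch = ASSUMED_BATCH_BYTES // MESSAGE_SIZE_BYTES
batches_per_second = MESSAGES_PER_SECOND / messages_per_batch

print(f"{payload_bytes_per_second / 1_000_000:.0f} MB/s payload")
print(f"{payload_megabits_per_second:.0f} Mbit/s payload")
print(f"{messages_per_batch} messages per batch, {batches_per_second:.0f} batches/s")
```

Under that assumed batch size, the middleware would send a few thousand batches per second instead of 350,000 individual samples, which reduces per-sample overhead considerably.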

To find out more about the Fluent Bit Plug-ins, please refer to the GitHub project.

 

About the author:

Fabrizio Bertocci is a principal engineer at RTI. He has been with RTI for over 10 years. During his tenure, he has contributed to several parts of the RTI Connext DDS core libraries and products. He is currently with RTI Research, exploring new solutions, technologies and applications for DDS. Before joining RTI, he worked with EDA company Mentor Graphics, embedded silicon vendor VLSI (now NXP), along with technology start-ups. Fabrizio holds a Computer Engineering degree from the University of Pisa (Italy).
