Typical distributed systems require data to be shared across multiple devices and multiple networks, from the edge to the fog to the cloud. This is challenging because the sheer volume of data – not to mention the stringent safety and security requirements – can easily overwhelm a network. These challenges require new ways to manage increased data volume, performance requirements, safety risk and security certifications. One of the most important changes is the databus and its unique ability to manage data flow.
A databus is a data-centric software framework for distributing and managing real-time data in intelligent distributed systems. It allows applications and devices to work together as one integrated system.
The databus simplifies application and integration logic with a powerful data-centric paradigm. Instead of exchanging messages, software components communicate via shared data objects. Applications directly read and write the value of these objects, which are cached in each participant.
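The shared-data-object idea can be illustrated with a minimal in-process sketch. It is a toy under stated assumptions, not the API of any real databus product: the names ToyDatabus, Participant, write and read are hypothetical, and propagation is a plain loop rather than a network protocol.

```python
from typing import Any, Dict, List


class ToyDatabus:
    """Hypothetical in-process stand-in for a databus (not a real DDS API)."""

    def __init__(self) -> None:
        self._participants: List["Participant"] = []

    def join(self, participant: "Participant") -> None:
        self._participants.append(participant)

    def propagate(self, topic: str, key: str, value: Any) -> None:
        # Update the local cache of every participant, including the writer.
        for participant in self._participants:
            participant.cache.setdefault(topic, {})[key] = value


class Participant:
    """An application that reads and writes data objects, not messages."""

    def __init__(self, name: str, bus: ToyDatabus) -> None:
        self.name = name
        self.cache: Dict[str, Dict[str, Any]] = {}  # topic -> key -> latest value
        self._bus = bus
        bus.join(self)

    def write(self, topic: str, key: str, value: Any) -> None:
        self._bus.propagate(topic, key, value)

    def read(self, topic: str, key: str) -> Any:
        # Reads are served from the local cache; no request is sent anywhere.
        return self.cache.get(topic, {}).get(key)


bus = ToyDatabus()
sensor = Participant("sensor", bus)
display = Participant("display", bus)

sensor.write("Temperature", "room-1", 21.5)
latest = display.read("Temperature", "room-1")
```

Here the display never addresses the sensor. It reads the current value of the "Temperature" object for "room-1" from its own cache, which is the decoupling the data-centric paradigm aims for.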
Key characteristics of a databus are:
- The participants/applications directly interface with the data
- The infrastructure understands the data and can therefore selectively filter it
- The infrastructure imposes rules and guarantees Quality of Service (QoS) parameters such as the rate, reliability and security of data flow
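The second and third characteristics can be sketched as a subscription that carries both a content filter and a rate limit, with the infrastructure (not the application) enforcing them. This is a simplified illustration: the Subscription class, the offer method and the min_separation_s parameter are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Sample = Dict[str, float]


@dataclass
class Subscription:
    """Hypothetical subscription with a content filter and a rate limit."""

    content_filter: Callable[[Sample], bool]
    min_separation_s: float = 0.0  # assumed QoS: minimum time between deliveries
    received: List[Sample] = field(default_factory=list)
    _last_delivery: Optional[float] = None

    def offer(self, sample: Sample, now: float) -> bool:
        """Called by the infrastructure; returns True if the sample is delivered."""
        if not self.content_filter(sample):
            return False  # filtered on content, never reaches the application
        if (
            self._last_delivery is not None
            and now - self._last_delivery < self.min_separation_s
        ):
            return False  # dropped to honor the requested rate
        self._last_delivery = now
        self.received.append(sample)
        return True


# Only temperatures above 38.0 are of interest, at most once per second.
alarm = Subscription(
    content_filter=lambda s: s["temp_c"] > 38.0,
    min_separation_s=1.0,
)

stream = [(0.0, 37.0), (0.2, 39.0), (0.5, 39.5), (1.3, 40.0)]
for timestamp, temp in stream:
    alarm.offer({"temp_c": temp}, now=timestamp)

delivered = [s["temp_c"] for s in alarm.received]
```

The application only states what it wants. The first sample fails the content filter, the third arrives too soon after the second, and the remaining two are delivered.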
Difference between database and databus
The databus provides for data in motion, whereas a database provides for data at rest.
A database implements data-centric storage. It saves old information that you can later search by relating properties of the stored data.
A databus implements data-centric interaction. It manages future information by letting you filter by properties of the incoming data. Data centricity can be defined by these properties:
- The interface is the data. There are no artificial wrappers or blockers to interface such as messages, objects, files or access patterns.
- The infrastructure understands that data. This enables filtering/searching, tools and selectivity. It decouples applications from the data and thereby removes much of the complexity from the applications.
- The system manages the data and imposes rules on how applications exchange data. This provides a notion of "truth". It enables data lifetimes, data model matching, CRUD interfaces, etc.
It is important to note that a databus is not just a database that you interact with via a pub-sub interface. There is no database. A database implies storage: the data physically resides somewhere. A databus implements a purely virtual concept called a "global data space" and implies data in motion.
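The contrast between searching old data and filtering future data can be shown side by side. The sketch below is a toy: db_insert, db_query and Stream are hypothetical names, and the Stream class keeps no history at all, which is a simplification made for the example.

```python
from typing import Callable, Dict, List, Tuple

Reading = Dict[str, float]
Predicate = Callable[[Reading], bool]
Handler = Callable[[Reading], None]

# Database style (data at rest): store first, search the stored data later.
stored: List[Reading] = []


def db_insert(reading: Reading) -> None:
    stored.append(reading)


def db_query(predicate: Predicate) -> List[Reading]:
    return [r for r in stored if predicate(r)]


# Databus style (data in motion): register a standing filter first,
# then matching data is pushed to the application as it is written.
class Stream:
    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Predicate, Handler]] = []

    def subscribe(self, predicate: Predicate, handler: Handler) -> None:
        self._subscriptions.append((predicate, handler))

    def write(self, reading: Reading) -> None:
        for predicate, handler in self._subscriptions:
            if predicate(reading):
                handler(reading)


def is_hot(reading: Reading) -> bool:
    return reading["temp_c"] > 30.0


readings = [{"temp_c": 25.0}, {"temp_c": 35.0}]

for r in readings:
    db_insert(r)
past_hot = db_query(is_hot)  # query runs after the data exists

stream = Stream()
alerts: List[Reading] = []
stream.subscribe(is_hot, alerts.append)  # filter exists before the data
for r in readings:
    stream.write(r)

late: List[Reading] = []
stream.subscribe(is_hot, late.append)  # joins after the writes
```

Both paths select the same reading, but at different times: the query looks back over stored data, while the subscription acts on data as it flows. In this toy the late subscriber receives nothing because nothing is stored.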
Why implement a databus?
Both database and databus technologies replace application-to-application interaction with application-data-application interaction. This change is critical: it decouples applications and greatly eases scaling, interoperability and system integration, all of which are crucial for intelligent distributed systems. The difference is one of old data stored in a (likely centralized) database versus future data sent directly to the applications from a distributed databus.
What is a layered databus?
The Industrial Internet Consortium (IIC) Industrial Internet Reference Architecture (IIRA) is a standards-based architectural guideline for developers to use in designing intelligent distributed systems based on a common framework. The IIRA recommends a new architectural pattern for intelligent distributed systems called the “layered databus” pattern.
In intelligent distributed systems, a common architecture pattern emerges that is made up of multiple databuses layered by communication QoS and data model needs. Typically, databuses will be implemented at the edge in the smart machines or lowest-level subsystems, such as in a car, an oil rig or a hospital room. Above that will be one or more databuses that integrate these smart machines or subsystems, facilitating data communications between them and with the higher-level control center or backend systems. The backend or control center layer could be the highest-layer databus in the system, but there can be more than these three layers.
Typical distributed systems require sharing data across multiple networks like this, from the edge to the fog to the cloud. For example, in a connected hospital, devices have to communicate within a patient or operating room, to nurses’ stations and off-site monitors, to real-time analytics applications for smart alarming and clinical decision support, and with IT health records. This is challenging for several reasons. The aggregate volume of streaming device data could easily overwhelm hospital networks; patient data must be securely tracked, even as patients and devices move between rooms and networks; and additionally, devices and applications have to interoperate, even when developed by different manufacturers. A layered databus architecture is the ideal framework for resolving these challenges and developing multi-tiered distributed systems of systems.
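One way to picture the layering is a gateway that sits between two databuses and forwards only a reduced view of the lower layer upward. The following is a minimal sketch under assumptions: Databus and bridge are hypothetical names, the buses are in-process objects, and the downsampling rule (forward every Nth sample) is chosen purely for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Sample = Dict[str, Any]
Handler = Callable[[Sample], None]


class Databus:
    """Hypothetical in-process databus for one layer of the system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, sample: Sample) -> None:
        for handler in self._subscribers[topic]:
            handler(sample)


def bridge(lower: Databus, upper: Databus, topic: str, keep_every: int) -> None:
    """Forward every keep_every-th sample of `topic` from lower to upper."""
    count = 0

    def forward(sample: Sample) -> None:
        nonlocal count
        if count % keep_every == 0:
            upper.publish(topic, sample)
        count += 1

    lower.subscribe(topic, forward)


room = Databus("patient-room")   # edge layer: full-rate device data
ward = Databus("ward")           # higher layer: nurses' station

bedside: List[Sample] = []
station: List[Sample] = []
room.subscribe("HeartRate", bedside.append)
ward.subscribe("HeartRate", station.append)

# Only HeartRate crosses the layer boundary, at one fifth of the rate.
bridge(room, ward, "HeartRate", keep_every=5)

for i in range(10):
    room.publish("HeartRate", {"patient": "p-17", "bpm": 60 + i})
    room.publish("Waveform", {"patient": "p-17", "value": i})  # stays in the room

station_bpm = [s["bpm"] for s in station]
```

The bedside monitor sees all ten heart-rate samples, the ward sees two, and the waveform topic never leaves the room. That isolation is what keeps aggregate device traffic from overwhelming the higher-level network.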
Benefits of a layered databus
The benefits of implementing a layered databus architecture include:
- Fast device-to-device integration - with delivery times in milliseconds or microseconds
- Automatic data and application discovery - within and between databuses
- Scalable integration - comprising hundreds of thousands of machines, sensors and actuators
- Natural redundancy - allowing extreme availability and resilience
- Hierarchical subsystem isolation - enabling the design and development of complex systems
The Connext Databus: a powerful data-centric paradigm
RTI Connext® DDS features a databus that allows applications to exchange data via a publish-subscribe, trusted peer-to-peer communication method. DDS applications do not rely on a centralized broker; instead, they discover each other through the databus and can join or leave the DDS domain at any time. A centralized broker would introduce not only a single point of failure but also a single path through which all data must be delivered, which in turn makes it a single point of security risk. The databus eliminates this bottleneck in the network. Connext handles the details of data distribution, synchronization and management, including serialization and lifecycle management. Its reliability, security, performance and scalability are proven in the most demanding industrial systems.
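Brokerless discovery can be sketched with peers that announce themselves and each keep their own table of known peers. This is a toy model, not the Connext API or the actual DDS discovery protocol: Peer, SimulatedNetwork and the "announce"/"goodbye" messages are hypothetical, and SimulatedNetwork only stands in for the network medium (for example, multicast on a LAN) without holding any application state.

```python
from typing import List, Set


class SimulatedNetwork:
    """Stand-in for the network medium; it only delivers messages."""

    def __init__(self) -> None:
        self._attached: List["Peer"] = []

    def attach(self, peer: "Peer") -> None:
        self._attached.append(peer)

    def detach(self, peer: "Peer") -> None:
        self._attached.remove(peer)

    def multicast(self, sender: "Peer", message: str) -> None:
        for peer in list(self._attached):
            if peer is not sender:
                peer.on_message(sender, message)


class Peer:
    """Each peer tracks the others itself; no central registry exists."""

    def __init__(self, name: str, network: SimulatedNetwork) -> None:
        self.name = name
        self.known_peers: Set[str] = set()
        self._network = network

    def join(self) -> None:
        self._network.attach(self)
        self._network.multicast(self, "announce")

    def leave(self) -> None:
        self._network.multicast(self, "goodbye")
        self._network.detach(self)
        self.known_peers.clear()

    def on_message(self, sender: "Peer", message: str) -> None:
        if message == "announce":
            is_new = sender.name not in self.known_peers
            self.known_peers.add(sender.name)
            if is_new:
                # Reply directly so the newcomer learns about this peer too.
                sender.on_message(self, "announce")
        elif message == "goodbye":
            self.known_peers.discard(sender.name)


network = SimulatedNetwork()
a, b, c = Peer("a", network), Peer("b", network), Peer("c", network)
for peer in (a, b, c):
    peer.join()

after_join = {p.name: sorted(p.known_peers) for p in (a, b, c)}

b.leave()  # a peer can leave at any time; the others simply forget it
after_leave = {p.name: sorted(p.known_peers) for p in (a, c)}
```

Because every peer holds its own view of the domain, removing any one of them leaves the rest able to keep communicating, which is the property a centralized broker cannot offer.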