Automatic QoS Enforcement with DDS & Software Defined Networking
Written by Andrew L. King (Guest Author)
February 16, 2016
Good abstractions drive progress in computing. The first Turing-complete computers, such as the ENIAC, exposed a register-level programming abstraction that let programmers reconfigure the machines for different tasks. From the 1950s to the 1970s, the maturation of compiled languages made it easier for programmers to port software from one machine architecture to another. The period from 1980 to the late 1990s saw the development of modularity abstractions (e.g., object-oriented programming), which made it easier to distribute the development of large software systems across large teams. From the 2000s to the present, we've seen the development of programming abstractions designed to make it easier to build distributed and concurrent software systems. DDS is a perfect example, and its success proves how useful and necessary good abstractions are.
I'd argue that good abstractions for programming the IIoT must capture timing behavior. For example, DDS lets the programmer specify the maximum interval between updates to a topic data-instance via the DEADLINE QoS parameter. Unfortunately, holistic abstractions for timing have so far been elusive. While DDS can automatically perform runtime checks to ensure that DataReaders and DataWriters have consistent DEADLINE QoS settings, the DEADLINE QoS can still be violated if the underlying network fabric delays or drops an update. Indeed, bringing timing fully into the programming abstraction is difficult because the timing behavior of a software system only emerges when you map the software onto a particular platform (i.e., processor, operating system, and networking fabric).
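To make the distinction concrete, here is a minimal Python sketch of the two checks involved. It is not the DDS API: the `DeadlineQos` class and both function names are hypothetical, chosen only for illustration. The first function mirrors the DDS requested/offered rule for DEADLINE (the period a DataWriter offers must be no longer than the period a DataReader requests). The second shows how a deadline can still be missed at runtime when the network delays an update, even though the QoS settings matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeadlineQos:
    """Hypothetical stand-in for a DEADLINE QoS setting (period in milliseconds)."""
    period_ms: int


def deadlines_compatible(offered: DeadlineQos, requested: DeadlineQos) -> bool:
    # Requested/offered rule: the writer must promise updates at least
    # as often as the reader requires.
    return offered.period_ms <= requested.period_ms


def missed_deadlines(arrival_times_ms, requested: DeadlineQos):
    """Return indices of updates that arrived later than the requested
    period after the previous update."""
    return [
        i
        for i in range(1, len(arrival_times_ms))
        if arrival_times_ms[i] - arrival_times_ms[i - 1] > requested.period_ms
    ]


writer = DeadlineQos(period_ms=10)   # writer offers an update every 10 ms
reader = DeadlineQos(period_ms=20)   # reader needs one at least every 20 ms

# The configuration check passes...
compatible = deadlines_compatible(writer, reader)

# ...but a network delay before the third update still violates the deadline.
late = missed_deadlines([0, 10, 45, 55], reader)
```

In this example the settings are consistent, yet the 35 ms gap before the third update exceeds the reader's 20 ms deadline. That gap is exactly what a configuration-time check cannot rule out on its own.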
In the context of a critical distributed hard real-time system, the developer must devote significant effort to provisioning and configuring the underlying network fabric to ensure the required timing behavior. The configuration process usually involves setting priorities or transmission schedules for each time-critical flow. If the underlying network fabric will be shared by more than one logically independent application, traffic policers and/or special partitioning features are sometimes set up to keep the different applications isolated. These network configurations are usually static and set up offline: if you want to add a new node to the network (e.g., a sensor or actuator), you have to bring the system down, manually reconfigure the network, and then bring the system back up. Wouldn't it be much nicer if the programmer could simply specify the logical timing behavior for the system and then trust that the specification would always be satisfied?
DDS and recent advances in Software Defined Networking (SDN) now make it possible to fully lift timing properties for distributed systems into a programming abstraction. The MIDdleware Assurance Substrate (MIDAS) is a research prototype developed in the PRECISE Center at the University of Pennsylvania. MIDAS uses the OpenFlow SDN protocol to capture the QoS specifications of DDS clients as they come online and then controls the packet switching behavior of the network fabric to ensure those specifications are always satisfied. Based on the captured QoS specifications, MIDAS generates new network configurations to support the QoS of the DDS dataflows. MIDAS then applies schedulability analysis to verify that the configurations will guarantee the specified QoS. Finally, if the requested QoS can be guaranteed, MIDAS pushes the new configuration onto the network transparently (the QoS of existing flows is not disrupted by the reconfiguration process itself).
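The capture, verify, and push sequence described above follows the general shape of an admission-control loop. The Python sketch below illustrates that shape only. It is not MIDAS code, and it does not use OpenFlow: the `Flow` and `Link` classes and the `try_admit` function are hypothetical, and a simple link-utilization bound stands in for the schedulability analysis, whose details this post does not describe.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    """Hypothetical description of one DDS dataflow on a link."""
    name: str
    payload_bits: int   # size of one update on the wire
    deadline_ms: int    # DEADLINE period requested for the flow


@dataclass
class Link:
    """Hypothetical model of a single network link and its admitted flows."""
    capacity_bps: int
    admitted: list = field(default_factory=list)


def utilization(flows, capacity_bps):
    # Each flow needs payload_bits delivered once per deadline period.
    demand_bps = sum(f.payload_bits * 1000 / f.deadline_ms for f in flows)
    return demand_bps / capacity_bps


def try_admit(link: Link, new_flow: Flow, bound: float = 1.0) -> bool:
    # Step 1: build a candidate configuration that includes the new flow.
    candidate = link.admitted + [new_flow]
    # Step 2: verify the candidate (assumed stand-in for schedulability analysis).
    if utilization(candidate, link.capacity_bps) > bound:
        return False  # rejected; the existing configuration is left untouched
    # Step 3: commit the verified configuration.
    link.admitted = candidate
    return True


link = Link(capacity_bps=1_000_000)
ok_a = try_admit(link, Flow("sensor", payload_bits=8000, deadline_ms=10))
ok_b = try_admit(link, Flow("video", payload_bits=4000, deadline_ms=10))
ok_c = try_admit(link, Flow("actuator", payload_bits=1000, deadline_ms=10))
admitted_names = [f.name for f in link.admitted]
```

Here the "video" flow is rejected because it would push demand to 120% of the link capacity, while "sensor" and "actuator" together fit at 90%. The key design point is that verification happens on a candidate configuration before anything is committed, so a rejected request never disturbs flows that were already admitted.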
Technologies like DDS make the development and operation of distributed real-time systems easier, faster, and overall less costly. Adding technologies like MIDAS to the mix takes these benefits to the next level:
- Reduce Development Time: Automation of network configuration removes a significant chunk of development effort, allowing developers to focus more on creating functionality (and hence value for their customers).
- Reduce Equipment Costs: Many common COTS Ethernet switches now support OpenFlow and can be used with MIDAS to support hard real-time systems. Many deployed Ethernet switches can be made OpenFlow capable with a simple firmware update.
- Increased Flexibility and Adaptability: MIDAS lets you add and remove network nodes at runtime. All resource provisioning is dynamic and ensures that no QoS is disrupted during configuration.
- Increased Reliability: MIDAS' network configurations ensure that traffic bursts won't overload buffers in networking equipment, which means packets will only be dropped if there is a hardware failure.
If you are interested in knowing more about MIDAS and how it works, please read the associated technical paper* or contact me (Andrew King, firstname.lastname@example.org). I'd also like to invite you to watch the following YouTube video. It demonstrates a distributed hard real-time system built using RTI's Connext DDS with real-time behavior enforced by MIDAS:
*The technical paper describes an earlier prototype that used a simple custom publish/subscribe middleware. The current prototype is designed to work with DDS, but the fundamental technical details are unchanged.