It was two weeks until the demo.
We had this single opportunity to build a working microgrid control system that needed to:
- Run on Intel and ARM processors
- Target Linux and Windows platforms
- Include applications written in C, C++, Java, Scala, Lua, and LabVIEW
- Talk to legacy equipment speaking Modbus and DNP3 protocols
- Perform real-time control while meeting all of the above requirements
In this post, I'll talk about the real-world problems we faced and how the tools included in RTI Connext DDS Pro helped us solve our integration problems in just a couple of days. I highlight issues that come up in most projects and name the specific RTI tools that address each one. Along the way you'll find links to supporting videos and articles for those who want a deeper dive. My hope is that you find this a useful jumping-off point for learning how to apply RTI tools to make your DDS development quicker and easier.
The Big Demo
This was the first working demo of the Smart Grid Interoperability Panel's Open Field Message Bus (OpenFMB), a new way of controlling devices at the edge of the power grid in real time by applying IoT technologies like DDS.
Here's a block diagram of the system showing hardware architectures, operating systems, and languages:
As we brought the individual participants onto the network, we encountered a number of problems. A description of challenges and the tools we used to address each follows. Scan the list of headings and see if you've had to debug any of these issues in your DDS system, then check out the links to learn a few new tips. As you do, think about how you would try to diagnose the problems without the tools mentioned.
Problem: Network configuration problems
Tools: RTI DDS Ping
The team from Oak Ridge National Laboratory (ORNL) was working on the LabVIEW GUI that would be the main display. Their laptop could not see data from any of the clients on the network. We checked the basics to make sure their machine was on the same subnet - always check the basics first! While the standard ping utility can confirm basic reachability between machines, it doesn't check that the ports necessary for DDS discovery are open. The rtiddsping utility does exactly that, and it told us in seconds that the firewall installed on their government-issued laptop was preventing DDS discovery traffic. For a great rundown on how to check the basics, see this community post.
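To make "the ports necessary for DDS discovery" concrete, here is a minimal Python sketch of the default port mapping defined by the DDS-RTPS wire protocol specification. It is not part of rtiddsping; the function name `rtps_ports` is my own, and the sketch assumes the default port parameters have not been overridden in QoS. The domain ID used at the demo is not stated here, so domain 0 is only an example.

```python
# Default RTPS port-mapping constants from the DDS-RTPS specification.
# Assumption: the application has not overridden them in its QoS.
PB, DG, PG = 7400, 250, 2        # port base, domain gain, participant gain
D0, D1, D2, D3 = 0, 10, 1, 11    # per-traffic-class offsets


def rtps_ports(domain_id, participant_id=0):
    """Return the default UDP ports for one participant in one domain.

    rtps_ports is a hypothetical helper name used only for this sketch.
    """
    base = PB + DG * domain_id
    return {
        "discovery_multicast": base + D0,
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_multicast": base + D2,
        "user_unicast": base + D3 + PG * participant_id,
    }


if __name__ == "__main__":
    # List the UDP ports a firewall would need to allow for domain 0.
    for name, port in rtps_ports(domain_id=0).items():
        print(f"{name}: UDP {port}")
```

A firewall that blocks these UDP ports will let an ordinary ping through while silently dropping discovery traffic, which matches the symptom we saw on the laptop.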
Problem: Is my app sending data?
Tools: Spy, Admin Console
A common question among the vendors using DDS for the first time was whether their application was behaving properly: Was it sending data at the proper intervals, and did the data make sense? For a quick check, we used the RTI DDS Spy utility. Spy provides a simple subscriber that can filter selectively for specific types and topics, and it can print the individual samples it receives, allowing you to quickly see the data your app is writing. Every vendor used DDS Spy as a sanity check after initially firing up their application.
Sometimes an update to the same topic can come from multiple publishers in the system. Not sure which one wrote the latest update? A command line switch for Spy ("-showSampleIdentity") allows you to see where an update originated.
Spy is a console app that can be deployed on embedded targets for basic testing. Its small size, quick startup, and simplicity are its main advantages. Details on usage are here.
Problem: Data type mismatch
Tools: Admin Console, Monitor
One vendor reported that in an earlier test they were seeing data from one of the other apps, and now they were not. Admin Console quickly showed us that a data type mismatch was to blame – that is, two topics with the same name but different data types. These types of mismatches can be difficult to diagnose, especially for large types with many members. Admin Console leverages the data-centricity of DDS to introspect the data types as understood by each application in your system. It then presents both a simplified view and an "equivalent IDL" view that makes it easy to compare the types in side-by-side panes. This is especially valuable in situations where you don't have the source IDL from every application.
In this case, one vendor had not synced up with the GitHub repository for the latest IDL, so they were working from an older version of the file. They pulled the latest files from GitHub, rtiddsgen created new type-specific code for them, and after a quick recompile their app was able to read and write the updated topics.
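The kind of comparison Admin Console makes visible can be illustrated with a small Python sketch. This is not how Admin Console is implemented; it only shows the idea of diffing two views of the "same" type member by member. The function `diff_types` and the member names in the example are hypothetical and are not taken from the OpenFMB data model.

```python
def diff_types(ours, theirs):
    """Compare two {member_name: type_name} mappings and list differences."""
    problems = []
    for name in sorted(set(ours) | set(theirs)):
        if name not in theirs:
            problems.append(f"{name}: missing from remote type")
        elif name not in ours:
            problems.append(f"{name}: missing from local type")
        elif ours[name] != theirs[name]:
            problems.append(f"{name}: {ours[name]} vs {theirs[name]}")
    return problems


if __name__ == "__main__":
    # Hypothetical members: one app built from the latest IDL, one from a stale copy.
    latest = {"deviceId": "string", "voltage": "double", "timestamp": "long long"}
    stale = {"deviceId": "string", "voltage": "float"}
    for line in diff_types(latest, stale):
        print(line)
```

With large types, a member-by-member listing like this is far easier to act on than the bare observation that two applications are not communicating.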
Problem: QoS mismatch
Tools: Admin Console, Monitor
Next to discovery, Quality of Service (QoS) mismatches are the most common problem experienced by DDS users during integration. With so many knobs to turn, how do you make sure that settings are compatible? The OpenFMB project had its fair share of QoS mismatches at first. Admin Console spots these quickly and tells you the specific QoS settings that are in conflict. You can even click on the QoS name and go directly to the documentation. QoS information shared during discovery is used by Admin Console to detect mismatches.
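For readers new to DDS, the rule behind most of these mismatches is the "requested versus offered" check in the DDS specification: a writer must offer at least what a reader requests. The Python sketch below covers three common policies. It is a simplified illustration with names of my own choosing (`qos_conflicts`, `deadline_s`), not an RTI API, and it ignores the many other policies a real system would compare.

```python
# Policy kinds ordered from weakest to strongest, as in the DDS specification.
RELIABILITY = ["BEST_EFFORT", "RELIABLE"]
DURABILITY = ["VOLATILE", "TRANSIENT_LOCAL", "TRANSIENT", "PERSISTENT"]


def qos_conflicts(offered, requested):
    """Return the names of policies where the writer offers less than the reader requests."""
    conflicts = []
    for policy, order in (("reliability", RELIABILITY), ("durability", DURABILITY)):
        if order.index(offered[policy]) < order.index(requested[policy]):
            conflicts.append(policy)
    # Deadline: the offered period must not be longer than the requested period.
    if offered["deadline_s"] > requested["deadline_s"]:
        conflicts.append("deadline")
    return conflicts


if __name__ == "__main__":
    writer = {"reliability": "BEST_EFFORT", "durability": "VOLATILE",
              "deadline_s": float("inf")}
    reader = {"reliability": "RELIABLE", "durability": "VOLATILE",
              "deadline_s": float("inf")}
    print(qos_conflicts(writer, reader))
```

In this example the reader asks for reliable delivery from a best-effort writer, so the pair never matches and no data flows, even though both applications look healthy on their own.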
Problem: Is the system functioning as expected?
Tools: Admin Console, Monitor
While Spy provides basic text output for live data, you can't beat a graph for seeing how data changes over time. For more sophisticated data visualization, we turned to Admin Console. The data visualization feature built into Admin Console was a huge help in quickly determining how the system as a whole was working. It even allowed us to scroll through historical data to better understand how we arrived at the current state. To find out more about data visualization, see this short intro video, or this deep dive video.
Problem: Performance tuning
Tools: Monitor, Admin Console
When it comes to performance tuning, Monitor should be your go-to tool. Monitor works with a special version of the DDS libraries that periodically publish real-time performance data from your application. These libraries are minimally intrusive, and the data they publish is collected and presented by Monitor.
Using Monitor, you can learn about:
- Transmission and reception statistics
- Missed deadlines
- High-water marks on caches
- QoS mismatches
- Data type conflicts
- Samples lost or rejected
- Loss of liveliness
It's important to note that not every QoS setting is advertised during discovery. Many QoS settings apply to an application's local resource management and performance tuning, and these are not sent during discovery. With Monitor you can inspect these, too. For a great introduction to Monitor, check out this video.
Problem: Transforming data in flight
Tools: Prototyper with Lua, DDS Toolkit for LabVIEW
We wanted a large GUI to show what was happening in the microgrid in real time. The team at Oak Ridge National Laboratory volunteered to create a GUI in LabVIEW. The DDS Toolkit for LabVIEW allows you to grab data from DDS applications and use it in LabVIEW Virtual Instruments (VIs). There are some limitations, however, as we found out. The Toolkit does not handle arrays of sequences, which some types in the OpenFMB data model use. We needed a quick solution that would allow the LabVIEW VI to read these complex data types.
One of the cool new tools in the Connext DDS Pro 5.2 toolbox is Prototyper with Lua. Prototyper allows you to quickly create DDS-enabled apps with little to no programming: define your topics and domain participants in XML, add a simple Lua script, and you can be up on a DDS domain in no time. (Check out Gianpiero's blog post on Prototyper.)
Back at the hotel one evening I wrote a simple Lua script that allows Prototyper to read the complex DDS topics containing arrays of sequences and then republish them to a different, flattened topic for use by the LabVIEW GUI. I was able to test it offline using live data recorded earlier in the lab, which brings us to...
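The flattening step itself is simple. The original bridge was a Lua script running inside Prototyper; the Python sketch below is only a stand-in that shows the transformation, not that script. The function `flatten`, the field-naming scheme, and the sample fields are all hypothetical and are not taken from the OpenFMB data model.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict of scalar fields.

    Nested positions are encoded in the key, e.g. readings[1][0] -> "readings_1_0".
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    else:
        return {prefix: value}  # scalar leaf
    flat = {}
    for key, child in items:
        name = f"{prefix}_{key}" if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


if __name__ == "__main__":
    # Hypothetical sample containing an array of sequences.
    sample = {"id": "meter1", "readings": [[1.0, 2.0], [3.0]]}
    print(flatten(sample))
```

A real flattened topic needs a fixed set of fields, so in practice the bridge also has to decide on maximum lengths and what to publish for missing elements. This sketch leaves that out.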
Problem: Disconnected development
Tools: Record, Replay, Prototyper with Lua
A geographically dispersed development team built the OpenFMB demo. With the exception of those few days in Knoxville, no one on the team had access to all the components in the microgrid at one time. So how do you write code for your piece of the puzzle when you don't have access to the other devices in the system?
When I worked on the Lua bridge for the LabVIEW GUI, I used the Connext Pro Record and Replay services. In the lab, I had recorded about 10 minutes of live data as we ran the system through all the use cases. Later that evening in the hotel, I was able to play this data back as I worked on the Lua scripts. Replay allows you to selectively play back topics, looping the playback so it runs continuously. You can also choose to play the data at an accelerated rate, a huge time saver that enables you to simulate hours or days' worth of runtime in just a few minutes.
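The idea behind looped, accelerated playback can be sketched in a few lines of Python. This is not the Replay service or its configuration format; it is a minimal illustration, assuming recorded samples are available as (timestamp, payload) pairs, and the name `replay_schedule` and its parameters are hypothetical.

```python
def replay_schedule(samples, rate=1.0, loops=1):
    """Yield (delay_seconds, payload) pairs for replaying recorded samples.

    samples: list of (timestamp_seconds, payload) sorted by timestamp.
    rate:    playback speed multiplier; 4.0 plays four times faster.
    loops:   how many times to play the whole recording.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    for _ in range(loops):
        previous = samples[0][0] if samples else 0.0
        for timestamp, payload in samples:
            # Scale the recorded gap between samples by the playback rate.
            yield (timestamp - previous) / rate, payload
            previous = timestamp


if __name__ == "__main__":
    recorded = [(0.0, "a"), (2.0, "b"), (6.0, "c")]
    for delay, payload in replay_schedule(recorded, rate=4.0, loops=2):
        # A real player would sleep for `delay` seconds, then publish the payload.
        print(f"wait {delay:.2f}s, publish {payload}")
```

At a rate of 4.0, the six seconds of recorded data above play back in 1.5 seconds per loop, which is the same effect that made offline development against recorded lab data so fast.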
One of the really neat things Prototyper does once it's running is to periodically reload the Lua script. This made developing the bridge to LabVIEW very quick: Replay played data continuously in an accelerated mode; I had an editor open on the Lua script; and as I made and saved changes they were instantly reflected in Prototyper which was running constantly – no need to restart to see changes to the script. The conversion script was done in just a couple of hours.
Prototyper also came in handy for quickly creating apps to generate simulated data. The LabVIEW GUI was developed entirely offline without any of the real-world devices, using some topics generated by the Replay services and others that were bridged or simulated with Prototyper. I'd email a simulator script to ORNL, they'd do some LabVIEW work and send me an updated VI, and then I'd run that locally to verify it. ORNL did an amazing job integrating real-time data from the DDS domain along with visual elements from the SGIP cartoons, and the GUI was the centerpiece of the demo.
When we showed up in New Orleans a couple of weeks later, the entire system was brought up in about 30 minutes, which is remarkable considering some of the applications (like the LabVIEW GUI) had never even been on a network with the actual hardware. Everything just worked.
The rich set of tools provided by RTI Connext DDS Pro allowed us to solve our integration problems quickly during the short week in Knoxville, and to carry on development at many remote locations. Admin Console, Monitor, DDS Ping, and DDS Spy got our system up and running. Record, Replay, and Prototyper made it possible for remote development teams to work in the absence of real hardware. DDS Toolkit for LabVIEW enabled us to create a sophisticated GUI quickly. And even after the event, we can continue to do development and virtual demos using these tools.