How to Debug DDS Applications with RTI Log Parser

June 6, 2018 by
|

We are proud to announce the latest addition to RTI Labs: RTI Log Parser! As a reminder, RTI Labs is a free program that provides our customers with early access to new technology we are developing for the Industrial IoT. Log Parser is a command-line tool that helps you easily understand RTI Connext® DDS logs to more efficiently debug DDS-related issues and identify misuse of the Connext API.

Log Parser is open-source under the Apache 2.0 license and is available in our RTI Community GitHub repository. To use Log Parser, you will need to have Python installed on your computer.

rti-diagram-log-parser-picture-of-system-0318

Figure 1. Get a picture of the DDS communication flow with Log Parser

With Log Parser, DDS internal information is presented in an easy, readable format (see Figure 1). It takes advantage of the highest level of log verbosity to analyze the behavior of a DDS application. You will be able to understand low-level details like the HeartBeat and ACK-NACK mechanism in just a few lines! But Log Parser’s real power is debugging DDS issues and understanding the flow of the communication in your system. Imagine that you are faced with a very common issue: missing the first samples of the communication. It’s unclear why this is happening because the subscriber application was already running before the publisher, but still, the first sample is not showing up.

To analyze this scenario, enable the highest log verbosity and output the content into a file. You can find several ways to do it in the LogParser’s README file.

Now, if you run the publisher application again, you will see thousands of log lines in an incomprehensible language (at least it’s not the Shakespeare programming language). The log lines look like this:

[1525870671.487724] COMMENDActiveFacadeReceiver_loop:rR03000456001 parsing message
[1525870671.487728] MIGInterpreter_parse:INFO_TS from 0XA4601F7,0X4560
[1525870671.487732] MIGInterpreter_parse:DATA from 0XA4601F7,0X4560
[1525870671.487776] COMMENDActiveFacadeReceiver_loop:rR03000456001 disowning receive resource
[1525870671.487784] NDDS_Transport_Shmem_receive_rEA:rR03000456001 blocking on 0X1CF4

The full log file has the solution to our issue. Redirect the standard output into a file and run Log Parser from the command line as follows:

python rtilogparser.py -i logfile.txt

The output starts with the following lines:

[1]|        |    |          | Created topic, name: 'Example MyType', type: 'MyType'
[2]| | | | Created writer for topic 'Example MyType'
[3]| | | | Warning: [LP-4] Try to register instance with no key field
[4]| | | | [App] Writing count 0
[5]|H3.A2 | | | Discovered new participant (H3.A2.P1)
[6]|H3.A2.P1| | | Discovered new reader R-K_800000
[7]|H3.A2.P1|--->|W-K_800000| Received ACKNACK [0] from reader R-K_800000 for 0 +0
[8]|H3.A2.P1|<---|W-K_800000| Sent preemptive GAP to volatile reader R-K_800000
[9]| |<---|W-K_800000| Sent HB [2] to verify GAP for samples in [1, 1]

By analyzing this log, you can clarify the following order of events:

  1. [1, 2] The publisher application creates a topic 'Example MyType' and a DataWriter.
  2. [4] The DataWriter sends the first sample.
  3. [5, 6] The publisher application discovers the remote DataReader.
  4. [7] The DataReader asks for samples to the DataWriter.
  5. [8, 9] The DataWriter sends a GAP and HB messages saying it doesn’t have the first sample anymore.

The problem is that you are sending the first sample before matching the remote DataReader. Since the Durability QoS was set to volatile, it wasn’t keeping written samples for late-joiner DataReaders. For this reason, the DataReader won’t receive the first sample because it didn’t know it was there when we were sending it!

Once this issue is understood, the fix is quite simple. You could wait to match the expected number of entities by checking the PublicationMatchedStatus of the DataWriter. Or, you could apply the following QoS changes:

  1. Set Durability QoS to TransientLocal. The DataWriter will keep past samples in its queue for late-joiner DataReaders.
  2. Set History QoS to KeepLast with depth > 1. The entities queue can store more than one sample.

If you run the application again with the QoS changes in the USER_QOS_PROFILES.xml file, the new parsed log contains the following line: Resent DATA[1] to reader R-K_800000. This means previous samples are being sent to the late-joiner DataReader as expected.

You may have noticed in the previous parsed log the line 3: Warning: [LP-4] Try to register instance with no key field.

Log Parser is able to detect common misuse of the Connext DDS API in any language and print a clear warning or error message. Some of these messages start with a code: [LP-4]. This is reference code to get more information about the issue along with possible solutions. The details of the message are contained in the file MessageCode.md. In our experience, we have seen customer logs reporting hundreds of warnings and errors that explained the unexpected behaviors they were having in their applications.

These aren’t the only features in Log Parser. It also includes:

  • Obfuscation via a cryptographic secure or custom seeds
  • Markdown output convertible into HTML
  • Network statistics for RTPS submessages and network traffic in different network interfaces
  • A number of DDS entities and applications per host with IP addresses
  • The ability to keep track of DDS events like entity creation and deletion
  • Data flow communication analysis, even in Shared Memory transport

We encourage you to give Log Parser a try to debug your issues and check if your application is throwing warnings. It is also a great way to send helpful information to the RTI Support team when opening up a support case – you can even obfuscate Log Parser’s output to protect the sensitive details of your system. And as said, this is an open-source project. Your contributions via Pull Request or Issues/Feature reports are more than welcome! Check it out now at: https://github.com/rticommunity/rticonnextdds-logparser 

If you want to learn more about Log Parser, we recommend our quick tutorial: https://github.com/rticommunity/rticonnextdds-logparser/tree/master/tutorial 

Also, listen to Episode 18 of The Connext Podcast where we talk in more detail about the benefits of RTI Labs and Log Parser. 

Let Us Know What You Thought of the Article.

Leave a Comment Below.

The Connext Podcast:
Delivering Emerging IIoT Technology with RTI Labs
rti-podcast-ep018-cover-250x250-0518.jpg
Listen Now