In episode 23, we continue our discussion of performance benchmarking with Sander Mertens. In part 2 we learn how to find the best optimization tools and which statistics are most important for measuring the performance of your IIoT system. We’ll also dive into Connext DDS Secure with a surprise host.
In Episode 23 of The Connext Podcast:
- [1:20] - What statistics are most important to look at when measuring the performance of an attribute like latency?
- [4:11] - How percentiles provide visibility on the behavior and predictability of your system.
- [6:30] - What’s the first step in finding the perfect tools to optimize your system?
- [9:39] - How to measure performance when scaling an entire system vs. a couple of applications.
- [12:16] - Diving into DDS Security in regards to performance
- [Blog] Benchmarking Connext DDS vs. Open Source DDS
- [Blog] Announcing the Latest RTI Perftest for Connext DDS
- [Datasheet] RTI Connext DDS Professional
- [Datasheet] RTI Connext DDS Secure
- [Podcast] Measuring Performance Metrics of Connext DDS vs. Open Source DDS, Part 1
Steven Onzo: Hi! Welcome to Episode 23 of the Connext podcast. In today's episode we'll be continuing a discussion with Principal Applications Engineer Sander Mertens about the statistics behind performance benchmarking and how to find the perfect system optimization tools. We'll also learn how percentiles provide deeper visibility into the behavior and predictability of your system. We hope you enjoy this episode.
Niheer Patel: Welcome to another episode of the Connext podcast, this is Niheer Patel and today with me I have Sander Mertens again. We talked last time about performance benchmarking and DDS in particular and we wanted to extend and meet some of our promises we made in that podcast to dig deeper into the statistics behind performance benchmarking. From there we'll talk a little bit more about tooling and comparisons between tools, and then at the end of it we're going to have a little twist for you so stay tuned for that.
Niheer Patel: Sander thanks for joining me again. Thanks for all of the insight into performance for DDS and just as a reminder to our audience, Sander is from the professional services team so for performance needs, make sure you reach out to Sander and the team. But Sander, thanks for joining.
Sander Mertens: Sure thing.
[1:20] - What statistics are most important to look at when measuring the performance of an attribute like latency?
Niheer Patel: So Sander, last time we talked about apples-to-apples comparisons, and extending that into this conversation is going to be really critical, especially when we start talking about statistics and digging into the fundamentals of what it means to really compare performance between DDS implementations or just applications and systems as a whole. If we take one of the attributes from last time, we talked about latency, jitter and throughput. Let's just start with latency and let's break it down. Is it important for us to look at max versus max or minimum versus minimum, or what about median and mean?
Sander Mertens: Right. So there's different ways in which you can measure latency. There's different statistics that you can keep track of so just to start, a lot of tools will give you the average latency. It basically means you take all the measurements and you add them all up and then you divide them by the number of measurements. And so it's useful to get a general sense of how your implementation is performing but it doesn't tell you a lot about how predictable the implementation is because one of the things that happens when you're measuring average is you're sort of getting rid of a lot of information on the highs and the lows.
Sander Mertens: Maybe my average performance is good but maybe there's a lot of variation in that performance and that's something that you lose in an average number. So average by itself is usually not enough. You need more than that and so what you'll see is that some other tools provide you with, for example, the minimum latency as well in addition to the average.
Sander Mertens: So in addition to average and minimum we also have maximum latency. And maximum is the opposite of minimum, so you measure the maximum of your benchmark. Maximum is actually quite useful because it tells you something about the variation between the average and the maximum, and usually when the difference between the average and the maximum is low, that is good because that means that your system as a whole is behaving pretty predictably. So the average and the maximum, those two together are usually pretty useful. If you only have minimum and average, that usually doesn't say a lot about performance as a whole.
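To make the point about averages concrete, here is a minimal sketch in Python. The two sample sets are invented for illustration (they are not measurements from any DDS implementation), and `summarize_latency` is a hypothetical helper, not part of any RTI tool. It shows two runs with the same average latency but very different gaps between average and maximum.

```python
from statistics import mean

def summarize_latency(samples_us):
    """Return the min, average, and max of latency samples (microseconds)."""
    if not samples_us:
        raise ValueError("need at least one sample")
    return {
        "min": min(samples_us),
        "avg": mean(samples_us),
        "max": max(samples_us),
    }

# Two hypothetical runs with the same average but different predictability.
steady = [100, 101, 99, 100, 100]
spiky = [60, 60, 60, 60, 260]

for name, run in (("steady", steady), ("spiky", spiky)):
    s = summarize_latency(run)
    # A small max-avg gap suggests predictable behavior; a large one hides spikes.
    print(name, s, "max-avg gap:", s["max"] - s["avg"])
```

Both runs average 100 microseconds, so the average alone cannot tell them apart; only the maximum reveals the spike in the second run.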
[4:11] - How percentiles provide visibility on the behavior and predictability of your system
Niheer Patel: So let me see if I can break this down. Average, it's an important attribute, but we need to see, okay, how many times are we really hitting that average, and we're talking about variation here. So going back to your comments on min and max, just because you have a very low min doesn't mean that it's a great system, that it's very performant. Just because you have a really high max doesn't mean that you have a really bad system, but that difference is what really matters: the difference between the average and the max, or the difference between the average and the min. Ideally, in a very highly performant system, you'd want to reduce that jitter. We're talking real time control systems here, distributed real time control systems, so it's really important that you can trust that jitter is going to be within your range of tolerance.
Sander Mertens: Exactly. Yes. You're exactly right. So the distance between the minimum and the maximum and the average. If that is very low then that means your system is behaving very predictably. Now often that's not the case however because if you're running on an operating system that is not real time and most operating systems aren't, then you will have some variation and typically the maximum, there are some spikes. So you will want to keep track of how often do the spikes occur.
Sander Mertens: If it's only one per 10,000 samples, that might be okay. The really good tools, and we provide one of those tools, provide you something more. They actually provide you with percentiles. And a percentile basically tells you how much of a given benchmark was below a certain latency. And so that way you can see that if 90% of my benchmark was below a certain number, then my system was behaving predictably for 90% of the time. If there are some very high latency measurements above the 99th percentile, then I know that maybe for 1% my system wasn't behaving very predictably, but at least I know that for 99% it was.
Sander Mertens: So percentiles give you a lot more visibility into the actual behavior of the system, and if you wanna get a really good feel for how it behaves and how predictable it is, then that's the way to go.
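The percentile idea can be sketched in a few lines of Python. This is an illustration only: it uses the nearest-rank definition of a percentile (other tools may interpolate differently), the data set is invented, and `percentile` and `spike_rate` are hypothetical helper names rather than anything PerfTest exposes.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p% of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= p <= 100:
        raise ValueError("p must be an integer between 1 and 100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p*n/100
    return ordered[rank - 1]

def spike_rate(samples, threshold):
    """Fraction of samples strictly above the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)

# Hypothetical run: 990 samples at 100 us plus 10 spikes at 5,000 us.
latencies_us = [100] * 990 + [5_000] * 10

for p in (50, 90, 99, 100):
    print(f"p{p}: {percentile(latencies_us, p)} us")
print("fraction of spikes:", spike_rate(latencies_us, 1_000))
```

In this made-up run the 99th percentile is still 100 microseconds while the maximum is 5,000, which is the situation described above: predictable for 99% of the samples, with the spikes confined to the last 1%.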
[6:30] - What’s the first step in finding the perfect tools to optimize your system?
Niheer Patel: Right, that's really important, right? It's really knowing how often your system is behaving in the way you want it to behave. For those of you who are completely enthralled with this conversation on statistics, Sander has written a blog describing all these things we've been talking about, with the different min, max, median, average and the percentiles, so make sure you go check that out on the RTI website. We'll have the link in the show notes.
Niheer Patel: But I think you had a really good point, well two really good points that I'd like to address. One is that we have an awesome tool that provides that and that's a tool called PerfTest and you'll hear us talking about this a lot so that's on our community site available for free. No need to even get a license for DDS but talk to the team, Sander, myself and we can get you hooked up.
Niheer Patel: And then the other good point is that there are these other tools. Even if these other tools are producing min, median, max, we need to really consider how they're measuring and providing that information before you start comparing performance analysis of one tool on one system with a different tool on a different system. We're no longer comparing apples to apples here. We're comparing apples to puppies as maybe our CEO might describe.
Niheer Patel: Can you talk a little bit more about how should our listeners reconcile the plethora of tools out there and the different technologies they're gonna use for their systems? What is maybe a first step they can take to assess if they're on the right path?
Sander Mertens: Right, so what is really important is that you know what your tool is doing. In a lot of cases, what I've seen is implementations provide tools to benchmark the performance and they don't necessarily tell the user what the tool is doing. You basically just run the tool, you provide a few parameters and then it just starts off doing its own thing and it will give you some numbers, right. And this is actually something that we've mentioned before and I've also mentioned in the blog. I mention that you essentially have to go into the source code to check what a tool is doing before you can really trust the metrics that come out of it.
Sander Mertens: However, we actually did a recent update to our PerfTest tool because this is actually an issue that was affecting us as well. Well we've made some improvement there so now when you start the tool it actually tells you exactly what the tool is doing, which quality of service settings it's using, how it's measuring the benchmarks and which benchmarks it's measuring. So that way when you start the tool you immediately know what's going on and that's really important. Not all the tools do that but at least when you're using PerfTest you can see what's happening.
[9:39] - How to measure performance when scaling an entire system vs. a couple applications
Niheer Patel: And this is an important point for the audience: if you are using RTI tools and you do feel like there's a way that PerfTest could provide you with more transparency, please reach out. The whole point is to get insight into your system, and that's what Sander is talking about here: how do we get into looking at the performance of our system, and how do we know and trust that data? So anything that we can do to help you trust the data that you generate, we are happy to do so.
Niheer Patel: Okay, so we've talked about statistics, we talked about performance tools. Talked about your blog and the update to PerfTest, so before we get into that twist I mentioned, let's just cap this performance discussion a little bit more by comparing benchmarking a couple of applications versus an entire system. So with a couple of applications you're just testing, I'm sending a data sample from application A to application B over some transport. What do I do now if I'm scaling a system? What are the considerations I have to make?
Sander Mertens: Right. So that's a whole different ball game. With our benchmark tools you can basically measure the ... Let's say the core benchmarks. Like how fast can I transfer data from one application to another application. When you're benchmarking a system there are a lot of things that you also have to take into account, like your application logic. These benchmark tools, they don't do anything besides pumping data around, but when you have an actual system then these applications are gonna do something with the data and that is gonna impact the timing.
Sander Mertens: Also what you typically see in the system is that communication can be staggered. So one message triggers another message which triggers some behavior. Those things are important and they can affect the overall performance of a system, so it's important that you measure the output of the system. And that's something that is difficult to measure with a generic tool. We do our best but some things are... You just can't measure them with a general tool. I guess what I'm saying is even though it is good to get a sense of the overall performance of an implementation before I get started with it, so PerfTest, definitely run it.
Sander Mertens: It is also important to keep in mind that an actual system might behave in unpredictable ways and you have to make sure that you also benchmark that part.
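As a rough illustration of measuring a staggered chain rather than a single hop, the Python sketch below takes timestamps recorded at each stage of one message chain and reports per-hop and end-to-end latency. The stage names, the timestamps, and the `path_latencies` helper are all hypothetical, and the sketch assumes the timestamps come from a common clock, which in a real distributed system needs clock synchronization.

```python
def path_latencies(hop_times):
    """Compute per-hop and end-to-end latency for one message chain.

    hop_times is an ordered list of (stage_name, timestamp_us) pairs,
    one per stage the data passed through.
    """
    if len(hop_times) < 2:
        raise ValueError("need at least two timestamps")
    hops = {}
    # Pair each stage with the next one to get the latency of every hop.
    for (src, t0), (dst, t1) in zip(hop_times, hop_times[1:]):
        hops[f"{src}->{dst}"] = t1 - t0
    total = hop_times[-1][1] - hop_times[0][1]
    return hops, total

# Hypothetical trace in microseconds: one message triggers the next stage,
# and each stage spends time in application logic before publishing.
trace = [("sensor", 0), ("filter", 120), ("controller", 480), ("actuator", 610)]

hops, total = path_latencies(trace)
print(hops)
print("end-to-end:", total, "us")
```

A point-to-point benchmark would only see something like the first hop; the middle hop in this invented trace is dominated by application logic, which is the part a generic tool cannot measure for you.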
[12:16] - DDS Security in regards to performance
Niheer Patel: So it sounds like you have to take into account the data flows of your system. It's not just two applications talking to each other; you have multiple applications talking to each other in complex manners, so identifying critical paths and trying to measure the performance from one end of that path to the complete output of that system is important. Certainly, again in real time control systems it's absolutely critical. So thanks Sander, I really appreciate that.
Niheer Patel: So now for our twist, ladies and gentlemen, Sander is going to interview me with regards to DDS security, so we'll talk a little bit about the high levels of DDS security in the context of performance. As a little bit of a teaser for you to reach out and ask more in-depth questions. So Sander, take it away.
Sander Mertens: Yeah, sure. Okay well so maybe you can start with explaining why is security relevant to this conversation at all?
Niheer Patel: When you apply security, and we're talking encryption and signing, there's a lot of cryptography involved. These are power hungry algorithms, so using them in a meaningful way helps make that trade off between security and performance. DDS security in particular offers fine-grained and data-centric security, so it's really extending the principles of DDS. It's part of the DDS specification. It really extends to focusing on protecting the data, and you as an architect of a system can apply DDS security in a meaningful way that still fits your performance requirements.
Sander Mertens: Can you maybe describe what kind of things does DDS security provide?
Niheer Patel: Yeah. There's a lot that it does provide, so I'll focus on the more performance intensive applications here. DDS security can encrypt, it can sign data packets or DDS topics. So if I have two participants, we'll keep it simple, two participants with three or four different topics going back and forth, I can choose to encrypt or sign any one of those topics and then I can choose not to do any kind of security for other topics. So if I have some temperature data, just ambient temperature going from a healthcare device to some sort of monitor, I could take a thermometer, I can read that data in that room. It may not make sense to waste compute cycles to encrypt that data, but maybe I need to trust that data for some other critical control systems, so I might want to sign that data to know that nobody's tampered with it. So I can sign that data and not exercise extra compute cycles trying to encrypt it.
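The sign-but-don't-encrypt idea can be illustrated with a small Python sketch. This is not the DDS Security API and not how Connext DDS Secure is configured. It is a stand-in that uses an HMAC-SHA256 tag from the Python standard library to show tamper detection on a payload that stays readable. The topic names, the `POLICY` table, the key, and the `protect` and `verify` helpers are all assumptions made for the example.

```python
import hashlib
import hmac

KEY = b"demo-shared-key"  # hypothetical key; real systems provision keys securely

# Hypothetical per-topic policy: which topics carry an authentication tag.
POLICY = {"AmbientTemperature": "sign", "DeviceStatus": "none"}

def protect(topic, payload):
    """Return (payload, tag). The payload is left in the clear;
    tag is None for topics with no protection."""
    if POLICY.get(topic) == "sign":
        return payload, hmac.new(KEY, payload, hashlib.sha256).digest()
    return payload, None

def verify(topic, payload, tag):
    """Return True if the payload is acceptable under the topic's policy."""
    if POLICY.get(topic) != "sign":
        return True  # unprotected topic: nothing to check
    if tag is None:
        return False  # a signed topic must carry a tag
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

payload, tag = protect("AmbientTemperature", b"21.5C")
print(verify("AmbientTemperature", payload, tag))   # untouched payload
print(verify("AmbientTemperature", b"99.9C", tag))  # tampered payload
```

Anyone on the network can still read the temperature in this sketch, but a modified value no longer matches its tag, which is the trade-off described above: trust in the data without paying for encryption.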
Sander Mertens: All right, so sounds like we have pretty granular control over what you can do with security and DDS?
Niheer Patel: Absolutely. That's our tagline: fine-grained control of security. I mentioned the topics, but if we go down to the packet level, we can get even finer grained there. Instead of encrypting the entire RTPS packet I could encrypt maybe the submessage or just the user data payload. You have a whole lot of choices in where you apply security so that you can still meet performance requirements. So not only can you choose which topics are protected, but you can go down to the packet level and choose what parts of the packet are protected.
Sander Mertens: A lot of systems today, they are using TLS for example. Can you maybe contrast the things that you get that people are doing today with TLS with DDS?
Niheer Patel: Yeah absolutely. With DDS security it's one layer of security in a system. There may be places for TLS so this by no means is to say one is right for your system or one is not right for your system but when we compare and contrast, TLS typically is a point to point solution and encrypts and signs everything. So you have basically an all or none security solution. For some systems that's appropriate. With our web systems nowadays, I don't trust a website that doesn't use TLS to encrypt my credit card data.
Niheer Patel: But when we're talking scaled to real time control systems, DDS can run over any transport for one and then you can encrypt or sign any fraction or component of that data. So you're not tied into this all or none security solution.
Sander Mertens: Can you maybe tell me a little bit about how we can leverage things like features that are provided by an operating system or by hardware acceleration?
Niheer Patel: Certainly. With a lot of hardware platforms coming out nowadays, there's hardware-based cryptography or crypto accelerators, so we can continue using software crypto libraries that just execute a lot faster. The instructions execute a lot faster on that hardware. For example, Intel has their AES-NI crypto acceleration, which we take advantage of and which really cuts down on the required compute cycles, so you're offloading that to this other engine. As for taking advantage of the operating system: if there's any kind of high-performance cryptography library in an operating system or in hardware, DDS security is really agnostic to it. It will take advantage of those offerings and then apply them to your data.
Niheer Patel: So you could plug in any kind of crypto engine. Hardware crypto engine, software crypto engine and DDS will simply make API calls out to that library or that hardware and apply it to security. So now not only are you taking advantage of fine grained security with DDS security but you're taking advantage on top of that with hardware acceleration, so you get that added bump in performance.
Sander Mertens: All right. Yeah that makes sense, I'm learning something.
Niheer Patel: Well thanks for being a good sport and interviewing me Sander. Wanted to give you guys a taste of DDS security in the context of performance. Again, there's a lot more to talk about with performance itself, with security and security and performance. So be sure to reach out to either your local RTI account team or to any of the folks here on the podcast. We're happy to help, get you in touch with anyone at RTI that can help you architect and scale your system in a performant way. Thank you.
Steven: Thanks for listening to this episode of the Connext podcast. If you have any questions or suggestions for future interviews, please be sure to reach out to us either on social media or at firstname.lastname@example.org. Thanks and have a great day.