Replication and persistence features of RTI Queuing Service
Written by Fernando Martin
August 14, 2015
There are many queuing services available, but few support both persistence and replication. If preserving data integrity is vital to your business and you also need high performance or full remote administration, RTI Queuing Service may be just what you need.
When considering replication, persistence or any other aspect of RTI Queuing Service, you should keep in mind its most basic and distinctive feature: it uses DDS as the underlying messaging technology. If you don't know RTI Queuing Service yet but you are familiar with DDS, you will find its concepts and APIs familiar. RTI Queuing Service seamlessly integrates with your existing DDS system. You can also use it to bring the power of DDS to your existing queue-based systems.
There are different scenarios where a replicated queuing service can help you. You may need a highly available service that will stay up when part of your system or network goes down. Additionally, you may also need to assure the integrity and consistency of your data when a failure occurs. If you are dealing with a queue of financial transactions, for example, you not only want your system to stay operative most of the time, you also need to know which transactions went through and which didn’t when a failure occurs. A replicated queuing service can keep your system running and at the same time provide a safe place for your data even under failure.
To provide a highly available service, RTI Queuing Service leverages the decentralized architecture and discovery capabilities of DDS. If part of your system fails, the surviving DDS subsystems will still communicate and keep you running. During a failure, RTI Queuing Service will allow you to remotely administer the surviving subsystem. You will be able to issue remote administration commands without fear of ending up in an inconsistent global state once the failing nodes recover.
Remarkably, all RTI Queuing Service replication features also apply to every aspect of its remote administration capabilities, and these capabilities are wide-ranging! You can remotely create and destroy queues, see their messages, filter messages using SQL expressions or empty a queue. You will be able to perform these operations during a failure, changing your system configuration with the guarantee that it will end up in a globally consistent state. Not many queuing services (if any) can offer this.
If you not only want a highly available service but also need to ensure data consistency under failure, RTI Queuing Service provides a robust replication protocol. The protocol is based on a level of redundancy provided by the user. This redundancy level applies to messages as well as message operations, such as the acknowledgment or the assignment of a message to a recipient. The focus of the replication protocol is not on redistributing messages to all operating nodes but on ensuring that only messages that are successfully replicated are transacted and that only message operations agreed upon by multiple nodes occur. DDS reliable Quality of Service already ensures that messages will eventually be delivered to the Queuing Service nodes when possible.
RTI Queuing Service makes sure that a queue message producer gets an acknowledgment for a message only if the message was successfully replicated with the desired redundancy level. The message will be negatively acknowledged if the enqueuing process fails to achieve the desired redundancy. Only acknowledged messages will be sent to consumers, and the rest will be consistently discarded across all the nodes. Optionally, you can also require that a message be delivered only once its recipient is known with the desired redundancy level; this is useful when a message has to be redelivered after a failure.
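The acknowledgment rule above can be sketched in a few lines of Python. This is only an illustration of the decision described in the text, not the RTI Queuing Service API: the names `EnqueueAttempt`, `producer_response` and `deliverable_ids` are hypothetical, and the count of confirming replicas is assumed to be available to whoever makes the decision.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnqueueAttempt:
    """Hypothetical record of one enqueue attempt (not an RTI type)."""
    message_id: str
    replicas_confirmed: int  # nodes that confirmed storing the message


def producer_response(attempt: EnqueueAttempt, redundancy_level: int) -> str:
    """Return ACK only if the message reached the desired redundancy level."""
    if attempt.replicas_confirmed >= redundancy_level:
        return "ACK"
    return "NACK"


def deliverable_ids(attempts: List[EnqueueAttempt], redundancy_level: int) -> List[str]:
    """Only acknowledged messages are eligible to be sent to consumers."""
    return [
        a.message_id
        for a in attempts
        if producer_response(a, redundancy_level) == "ACK"
    ]


if __name__ == "__main__":
    attempts = [EnqueueAttempt("m1", 2), EnqueueAttempt("m2", 1)]
    for a in attempts:
        print(a.message_id, producer_response(a, redundancy_level=2))
    print(deliverable_ids(attempts, redundancy_level=2))
```

In this sketch a negatively acknowledged message simply never appears in the deliverable list, which mirrors the idea that such messages are discarded rather than delivered.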
To configure RTI Queuing Service, you are asked for the maximum number of nodes you are planning to run, that is, the number of queue instances. From this number the replication level is calculated as the smallest integer greater than the number of queue instances divided by 2. With 3 queue instances, for example, the replication level is 2; with 4 it is 3.
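As a quick check of that arithmetic, here is a minimal Python sketch. The function name `replication_level` is hypothetical and is not part of any RTI configuration API; it only encodes the rule stated above, which is equivalent to integer division by 2 plus 1.

```python
def replication_level(queue_instances: int) -> int:
    """Smallest integer strictly greater than queue_instances / 2."""
    if queue_instances < 1:
        raise ValueError("queue_instances must be at least 1")
    return queue_instances // 2 + 1


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "queue instances ->", replication_level(n))
```

Because the result is always a strict majority of the configured instances, any two groups of nodes that each reach the replication level must share at least one node.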
Under the hood the queuing service nodes choose a master as the orchestra conductor for all operations. The master node is chosen automatically and it can change at any time depending on a variety of factors. You can set a master timeout period that controls for how long the nodes can remain unable to coordinate with others before they undergo an internal reconfiguration with a new master election process. The election is based on which nodes are most up to date so there is a high probability a node among your best nodes is elected master. You don’t need to worry about it.
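To make the idea of electing the most up-to-date node concrete, here is a simplified Python sketch. It is not the actual election algorithm, which depends on several factors not detailed here. It assumes each node can be summarized by a single hypothetical number, the sequence number of the last operation it has applied, and it breaks ties by node name purely to keep the result deterministic.

```python
from typing import Dict


def elect_master(last_applied: Dict[str, int]) -> str:
    """Pick the node with the highest last-applied sequence number.

    last_applied maps a (hypothetical) node name to the sequence number of
    the last operation that node has applied. Ties go to the name that
    sorts last, an arbitrary choice made only for determinism.
    """
    if not last_applied:
        raise ValueError("no candidate nodes")
    name, _ = max(last_applied.items(), key=lambda item: (item[1], item[0]))
    return name


if __name__ == "__main__":
    print(elect_master({"node-a": 41, "node-b": 57, "node-c": 12}))
```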
If replication is not sufficient, you may consider whether any of the various persistence modes of RTI Queuing Service fits your needs. Using replication does not guarantee the integrity of your data under catastrophic failure affecting all or most of your nodes. But your data will survive almost anything if persistence, perhaps combined with replication, is used across multiple nodes. RTI Queuing Service features configuration persistence and two modes of data persistence.
The two data persistence modes supported are quite different and may be useful in different situations. If you have many queues or you are keeping a very large number of messages in the queues (so large that they will not fit in your volatile memory), you can use a persistence mode based on an underlying database implementation. In this mode all the messages and their metadata (the message state) are at all times in your database (on your hard drive) and only there. On the other hand, if you do not have a large number of messages in the queues and want better performance, you can choose an alternative mode that keeps the messages’ metadata in both the volatile memory and the hard drive while messages themselves are kept only in a file system in the persistent storage.
Both persistence modes support the familiar hard drive synchronization modes that control whether the data goes straight to the hard drive or is allowed to live for a while in your OS buffers. Hard drive operations are slow, and it is common practice to combine multiple hard drive write operations into one. If you set synchronization to FULL, every single modification of your data or metadata will be on the hard drive the moment it takes place.
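The trade-off between the two behaviors can be illustrated with plain file I/O in Python. This is a generic sketch of the operating system mechanism, not RTI Queuing Service code: the function `append_record` and its `full_sync` flag are hypothetical stand-ins for a FULL versus relaxed synchronization setting.

```python
import os
import tempfile


def append_record(path: str, record: bytes, full_sync: bool) -> None:
    """Append a record to a file, optionally forcing it to stable storage."""
    with open(path, "ab") as f:
        f.write(record)
        if full_sync:
            # Push Python's buffer to the OS, then ask the OS to write its
            # own buffers to the device before returning.
            f.flush()
            os.fsync(f.fileno())
        # Without full_sync, closing the file hands the data to the OS,
        # which may keep it in its buffers for a while before writing it.


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "queue.log")
        append_record(log, b"message-1\n", full_sync=True)   # slower, safer
        append_record(log, b"message-2\n", full_sync=False)  # faster
        with open(log, "rb") as f:
            print(f.read())
```

Forcing every write to the device costs throughput, which is why relaxed modes that batch writes exist; the full mode trades that speed for the guarantee that a record is durable as soon as the call returns.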
RTI Queuing Service also allows you to persist the configuration of your entire system. This is very handy if you use remote administration to create queues or to dynamically configure a large system. If you end up with a very large and finely tuned system, you’ll want to ensure that the configuration isn’t lost.
Now you can benefit from the high-throughput, low-latency capabilities of DDS, the middleware used in many of the most sophisticated real-time systems around the world. And you can do queuing with the highest guarantees for the safety and integrity of your data.
For more information about RTI Queuing Service, please visit https://www.rti.com/products/dds/queuing-service.html.