Efficient Replication using Multicast Protocols

By Oscar Wahlberg, Director of Product Management

Replication as Data Protection

The use of replication as a data protection method is not new. Historically, it was relegated to disaster recovery for tier-1 applications and did not have a role in day-to-day backups.

Recently, however, many customers are using replication as their primary mechanism for backup and recovery for all tiers of systems – primarily due to the advent of object storage systems with built in replication and versioning. Unfortunately, the side effect is that this significantly increases the amount of data that needs to be replicated.

If one is to rely on replication as the primary method of data protection, one must replicate each version of every object to multiple nodes. Many systems transmit the entire object when it changes, then replicate the object multiple times if they are replicating it to multiple destinations. This exacerbates the issue by requiring even more bandwidth.

There is another way to protect your data…

NexentaEdge reduces the amount of bandwidth necessary to send the data to multiple nodes by doing two things: 1) using a multicast replication protocol that can send data to multiple nodes with a single transmission and 2) sending only portions that have been changed within an object instead of the entire object.

Replicast is our multicast replication protocol that allows us to transmit data to as many nodes as necessary without having to transmit it multiple times. An object is split into many chunks and each chunk transmitted by a single Replicast transmission to multiple destinations.
During transmission, the chunk is split into multiple UDP datagrams. Those destinations then reassemble received datagrams and compare them against a SHA-256 hash value for the chunk. This provides a much stronger check against data corruption than typical network checksums. Replicast transmission is intelligent enough to sustain the loss of UDP datagrams during chunk transmission and will retransmit only lost datagrams – hence, this greatly reduces bandwidth usage on busy networks.

Replicast also results in efficiencies when determining where to send data. Most replication systems decide which nodes to replicate on a round-robin basis without any consideration for how busy an individual node happens to be. Replicast uses multicast datagrams to dynamically select the storage servers for each chunk.

Instead of deciding amongst all servers, Replicast looks up a multicast group that is a subset of all nodes, then those nodes dynamically select which nodes will ultimately store the chunk. This allows fewer nodes to serve a higher number of IOPs by ensuring that the most available nodes are the ones that will receive new data, and other nodes that are congested may be given a break.

Splitting each object into chunks also gives us the ability to use cryptographic hashes to identify which chunks have changed and which haven’t. To store a new version of an object, we only need to replicate the chunks that have changed and update the metadata for that object. This results in significant bandwidth savings over alternative approaches.

Replication is playing a much greater role in day-to-day data protection, and there is significant room for improvement in its efficiency. The combined use of Replicast, our multicast replication protocol, and only transmitting changed chunks for new object versions is about as efficient as replication can get.


Nexenta is the global leader is Software-defined Storage.

Tagged with: , , , , , , ,
Posted in Software-defined data center, Software-defined storage

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: