Cassandra is designed to handle large volumes of data. Its main objective is to store data across multiple nodes with no single point of failure.
Hardware can fail at any time, and any node can be lost. When a node fails, the data is still available from a copy stored on another node.
For this reason, Cassandra is built on a distributed architecture.
It stores data across nodes in a peer-to-peer fashion, in which every node plays the same role.
All nodes exchange information with each other using a protocol known as Gossip.
Through Gossip, each node periodically shares state information about itself and about the other nodes it knows of.
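The idea behind Gossip can be illustrated with a small simulation. This is only a toy sketch, not Cassandra's actual implementation: the names merge, gossip_round and run_until_converged are hypothetical, each node talks to a single random peer per round, and the only state exchanged is a heartbeat counter per node.

```python
import random


def merge(local, remote):
    """Keep the highest heartbeat version seen so far for each node."""
    for node, version in remote.items():
        if version > local.get(node, -1):
            local[node] = version


def gossip_round(views, rng):
    """Each node bumps its own heartbeat and swaps state with one random peer."""
    names = list(views)
    for name in names:
        views[name][name] += 1
        peer = rng.choice([n for n in names if n != name])
        merge(views[peer], views[name])  # peer learns what this node knows
        merge(views[name], views[peer])  # and this node learns from the peer


def run_until_converged(names, seed=0, max_rounds=100):
    """Gossip until every node has heard about every other node."""
    rng = random.Random(seed)
    # At the start, each node knows only about itself.
    views = {n: {n: 0} for n in names}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(set(view) == set(names) for view in views.values()):
            return round_no, views
    return None, views


if __name__ == "__main__":
    rounds, views = run_until_converged(["n1", "n2", "n3", "n4", "n5"])
    print("rounds needed:", rounds)
    print("view of n1:", views["n1"])
```

No node acts as a coordinator here. Information spreads because every node repeats what it has heard, which is the property the peer-to-peer design relies on.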
Components of Cassandra:
Hardware can fail or network links can go down at any time while data is being processed, so a backup is needed when a problem occurs. Cassandra therefore replicates data so that there is no single point of failure.
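Cassandra places replicas of data on different nodes based on two factors: the replication strategy, which decides where replicas are placed, and the replication factor, which decides how many replicas are kept.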
A replication factor of one means that there is only a single copy of the data, while a replication factor of three means that there are three copies of the data on three different nodes.
To ensure there is no single point of failure, the replication factor should be greater than one; three is a common choice.
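The effect of the replication factor on availability can be shown with a short check. This is a simplified sketch: is_readable is a hypothetical helper, the node names are made up, and it ignores consistency levels, which in a real cluster also determine whether a read succeeds.

```python
def is_readable(replica_nodes, failed_nodes):
    """A row can still be read if at least one replica sits on a live node."""
    return any(node not in failed_nodes for node in replica_nodes)


# Replication factor 1: the only copy lives on node "A", and "A" fails.
print(is_readable(["A"], failed_nodes={"A"}))  # False

# Replication factor 3: copies on "A", "B" and "C", and two of them fail.
print(is_readable(["A", "B", "C"], failed_nodes={"A", "B"}))  # True
```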
There are two kinds of replication strategies in Cassandra: SimpleStrategy and NetworkTopologyStrategy.
SimpleStrategy is used when you have just one data center. It places the first replica on the node selected by the partitioner.
After that, the remaining replicas are placed on the next nodes in a clockwise direction around the node ring.
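The clockwise placement can be sketched in a few lines. This is an illustration under stated assumptions, not Cassandra's code: the ring has only 100 token positions, token_for is a stand-in partitioner based on CRC32 rather than the hash a real partitioner uses, and the node names and tokens are invented.

```python
import bisect
import zlib

RING_SIZE = 100  # hypothetical token space, far smaller than a real one


def token_for(key):
    """Stand-in partitioner: hash a row key to a position on the ring."""
    return zlib.crc32(key.encode("utf-8")) % RING_SIZE


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Return the nodes that hold replicas for key_token.

    ring is a list of (token, node_name) pairs sorted by token. The first
    replica goes to the first node whose token is >= key_token, wrapping
    around to the start of the ring; the rest go to the next nodes clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))  # never more replicas than nodes
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]

print(simple_strategy_replicas(ring, 30, 3))  # ['C', 'D', 'A']
print(simple_strategy_replicas(ring, 80, 3))  # ['A', 'B', 'C']
print(simple_strategy_replicas(ring, token_for("user:42"), 3))
```

The second call shows the wrap-around: a token past the last node on the ring is assigned to the first node, and placement continues clockwise from there.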