Because you’re about to create a multi-node Cassandra cluster, first decide how many servers you’d like to have in the cluster and configure each of them.
It is recommended, but not required, that the servers have the same or similar specifications.
To complete this tutorial, you’ll need the following prerequisites:
You need at least two Ubuntu 14.04 servers configured using this initial setup guide.
Every server must be secured with a firewall using this IPTables guide.
Every server must also have Cassandra installed by following this Cassandra installation guide.
Step 1 — Deleting Default Data
Servers in a Cassandra cluster are called nodes.
What you have on every server right now is a single-node Cassandra cluster. In this step, we will set up the nodes to function as a multi-node Cassandra cluster.
All the commands in this and subsequent steps must be repeated on every node in the cluster, so be sure to have as many terminals open as you have nodes in the cluster.
The first command you’ll run on every node will stop the Cassandra daemon.
sudo service cassandra stop
When that’s completed, delete the default dataset.
sudo rm -rf /var/lib/cassandra/data/system/*
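As an alternative to keeping one terminal open per node, the same command can be sent to each node in turn over SSH. The sketch below is a hypothetical helper, not part of Cassandra: the name run_on_nodes and the RUNNER variable are made up for illustration, and it assumes you have SSH access to every node and that sudo on the nodes does not prompt for a password.

```shell
#!/bin/sh
# Hypothetical helper: run one command string on every node given as an argument.
# RUNNER defaults to ssh; set RUNNER=echo to preview what would be sent
# without opening any connection.
run_on_nodes() {
    cmd=$1
    shift
    for node in "$@"; do
        # Stop at the first node where the command fails.
        ${RUNNER:-ssh} "$node" "$cmd" || return 1
    done
}
```

For example, run_on_nodes 'sudo service cassandra stop' followed by the IP addresses of your nodes would stop the daemon on each of them. Running it first with RUNNER=echo shows the node and command pairs before anything is executed.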
Step 2 — Configuring the Cluster
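Cassandra’s configuration file is cassandra.yaml. On a package-based installation on Ubuntu it is typically located in the /etc/cassandra directory; on a tarball installation it is in the conf directory of the installation folder. Open it for editing.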
Only the following directives are required to be modified to set up a multi-node Cassandra cluster:
cluster_name: This is the name of your cluster.
Note: Every node in the cluster must use exactly the same cluster name.
seeds: This is a comma-delimited list of the IP addresses of the seed nodes, which other nodes contact to discover the cluster. In this tutorial, list the IP address of every node in your cluster.
listen_address: This is the IP address that other nodes in the cluster will use to connect to this one. It defaults to localhost and needs to be changed to the IP address of the node.
rpc_address: This is the IP address for remote procedure calls. It defaults to localhost. If the server’s hostname is properly configured, leave this as is. Otherwise, change it to the server’s IP address or the loopback address (127.0.0.1).
endpoint_snitch: This is the name of the snitch, which tells Cassandra what its network looks like. It defaults to SimpleSnitch, which is used for networks in a single datacenter. Here, we’ll change it to GossipingPropertyFileSnitch, which is preferred for production setups.
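If you would rather script these changes than edit the file by hand on every node, the directives above can be rewritten with sed. This is a minimal sketch under several assumptions: the function name set_cluster_directives is hypothetical, each directive appears uncommented in the file in its default form, the values contain no slashes, ampersands or quotes, and rpc_address is set to the node’s IP address (one of the options described above).

```shell
#!/bin/sh
# Hypothetical helper: rewrite the cluster directives in a cassandra.yaml file.
# Usage: set_cluster_directives FILE CLUSTER_NAME SEEDS NODE_IP
set_cluster_directives() {
    file=$1 name=$2 seeds=$3 ip=$4
    tmp=$(mktemp) || return 1
    # The seeds entry is indented under seed_provider, so keep its leading
    # whitespace; the other directives start at the beginning of the line.
    sed \
        -e "s/^cluster_name:.*/cluster_name: '$name'/" \
        -e "s/^\( *- seeds:\).*/\1 \"$seeds\"/" \
        -e "s/^listen_address:.*/listen_address: $ip/" \
        -e "s/^rpc_address:.*/rpc_address: $ip/" \
        -e "s/^endpoint_snitch:.*/endpoint_snitch: GossipingPropertyFileSnitch/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Writing the result back with cat rather than mv keeps the original file’s owner and permissions. It is worth copying cassandra.yaml to a backup before running anything like this against it.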
Step 3 — Configuring the Firewall
At this point, the cluster has been configured, but nodes are not communicating. Here, we’ll configure the firewall to allow Cassandra traffic.
First, start the Cassandra daemon again on every node.
sudo service cassandra start
If you check the status of the cluster now, you’ll find that only the local node is listed, because it’s not yet able to communicate with the other nodes.
sudo nodetool status
To allow communication, we need to open the following network ports on every node:
- 7000: the TCP port for commands and data exchanged between nodes.
- 9042: the TCP port for the native transport server. cqlsh, the Cassandra command line utility, connects to the cluster through this port.
To modify the firewall rules, open the rules file for IPv4.
sudo vi /etc/iptables/rules.v4
Copy and paste the following line within the INPUT chain to allow traffic on the aforementioned ports. You can insert it just before the # Reject anything that’s fallen through to this point comment.
The IP address specified by -s must be the IP address of another node in the cluster. If you have two nodes with IP addresses 22.214.171.124 and 126.96.36.199, the rule on the 22.214.171.124 machine must use the IP address 126.96.36.199.
-A INPUT -p tcp -s your_other_server_ip -m multiport --dports 7000,9042 -m state --state NEW,ESTABLISHED -j ACCEPT
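After adding the rule, save and close the file, then restart IPTables. If the firewall rules are managed by the iptables-persistent package, as is common on Ubuntu 14.04, that is done with:
sudo service iptables-persistent restart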
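In a cluster with more than two nodes, each node needs one such rule for every other node. The following sketch prints the rule lines for a list of peer addresses so they can be pasted into the rules file. The function name print_cassandra_rules is hypothetical, and the addresses you pass in are whatever your other nodes use.

```shell
#!/bin/sh
# Hypothetical helper: print one INPUT-chain rule per peer IP address.
# It only writes to standard output; it does not touch the firewall.
print_cassandra_rules() {
    for peer in "$@"; do
        printf '%s\n' "-A INPUT -p tcp -s $peer -m multiport --dports 7000,9042 -m state --state NEW,ESTABLISHED -j ACCEPT"
    done
}
```

On a given node, call it with the IP addresses of all the other nodes, then paste the printed lines into /etc/iptables/rules.v4 at the position described above.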
Step 4 — Check the Cluster Status
We’ve now completed all the steps needed to make the nodes into a multi-node cluster. You can verify that they’re all communicating by checking their status.
sudo nodetool status
If you can now see all the nodes you configured, you’ve just successfully set up a multi-node Cassandra cluster.
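In the nodetool status output, a node that is up and in its normal state is listed with the code UN at the start of its line. As a rough check, the sketch below counts those lines; the function name count_up_nodes is hypothetical, and it assumes the status code is the first whitespace-separated field on each node’s line.

```shell
#!/bin/sh
# Hypothetical helper: count the nodes reported as Up/Normal ("UN")
# in `nodetool status` output read from standard input.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

You could run it as sudo nodetool status | count_up_nodes and compare the number it prints with the number of nodes you configured.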
You can also check whether you can connect to the cluster using cqlsh, the Cassandra command line client.
Note that you can specify the IP address of any node in the cluster for this command.
cqlsh your_server_ip 9042