Introduction
In this blog post we will build a clean and simple containerised Apache Cassandra cluster for local testing: a modern alternative to ccm (Cassandra Cluster Manager) that takes advantage of Docker containers while keeping full control of the Cassandra configuration.
This approach is based on the official Cassandra image (by Docker Official Images). Being able to rely on an official image is important, because it is trusted, maintained and scanned for security vulnerabilities.
We will manage the Cassandra configuration directly by attaching it via volumes. This allows us to change any configuration we need quickly, without having to rebuild the image every time.
docker-compose will be used to orchestrate the Cassandra containers, network and volumes.
Kubernetes could be used for local orchestration as well, e.g. minikube. The downside would be a much more complex and verbose configuration, and the control plane adds a performance overhead that can become critical for local testing. It is simply overkill for this purpose.
Step 0: Meet the requirements
- Make sure Docker is installed
- Make sure docker-compose is installed as well (see the quick checks below)
- The host machine needs at least 6 GB of free RAM (the three nodes are capped at 2 GB each)
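A quick way to verify the tooling, and the available memory on Linux (on macOS, check Docker Desktop's resource settings instead):
docker --version
docker-compose --version
free -h # Linux only; look at the "available" column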
Step 1: Get the config files for the Cassandra version you need
Pick a specific Cassandra version from the image tags. When it comes to databases, it is always better to pin a specific version rather than just picking the latest.
Pull the image first:
docker image pull cassandra:3.11.8
docker run --rm -d --name tmp cassandra:3.11.8 # start a throwaway container
docker cp tmp:/etc/cassandra etc_cassandra-3.11.8_vanilla # copy the default config directory out of it
docker stop tmp # --rm cleans the container up once it stops
Now we have the vanilla Cassandra config templates under etc_cassandra-3.11.8_vanilla, which we will use a bit later.
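To sanity-check the copy, list the directory; on a 3.11 image you should see files such as cassandra.yaml and cassandra-env.sh (the exact file set may vary between versions):
ls etc_cassandra-3.11.8_vanilla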
Step 2: docker-compose.yml file
Create a file named docker-compose.yml with the content below.
version: '2.4' # 2.4 is the last version that supports depends_on conditions for service health

networks:
  cassandra: # the Docker network where all Cassandra nodes will be placed

services:
  cass1:
    image: cassandra:3.11.8 # better to use a specific version, if you want to control upgrades
    container_name: cass1
    hostname: cass1
    mem_limit: 2g # not strictly required, but it's better to set a memory limit
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9042:9042" # expose the native binary CQL port for your apps
    volumes:
      - ./data/cass1:/var/lib/cassandra # the volume that will persist data for the cass1 node
      - ./etc/cass1:/etc/cassandra # use your own config files for full control
    environment: &environment # declare and save the environment variables into "environment"
      CASSANDRA_SEEDS: "cass1,cass2" # the first two nodes will be seeds
      CASSANDRA_CLUSTER_NAME: SolarSystem
      CASSANDRA_DC: Mars
      CASSANDRA_RACK: West
      CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch
      CASSANDRA_NUM_TOKENS: 128

  cass2:
    image: cassandra:3.11.8
    container_name: cass2
    hostname: cass2
    mem_limit: 2g
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9043:9042" # expose the native binary CQL port for your apps
    volumes:
      - ./data/cass2:/var/lib/cassandra # the volume that will persist data for the cass2 node
      - ./etc/cass2:/etc/cassandra # use your own config files for full control
    environment: *environment # point to "environment" to use the same environment variables as cass1
    depends_on:
      cass1: # start cass2 only after cass1 is healthy
        condition: service_healthy

  cass3:
    image: cassandra:3.11.8
    container_name: cass3
    hostname: cass3
    mem_limit: 2g
    healthcheck:
      test: ["CMD", "cqlsh", "-e", "describe keyspaces"]
      interval: 5s
      timeout: 5s
      retries: 60
    networks:
      - cassandra
    ports:
      - "9044:9042" # expose the native binary CQL port for your apps
    volumes:
      - ./data/cass3:/var/lib/cassandra # the volume that will persist data for the cass3 node
      - ./etc/cass3:/etc/cassandra # use your own config files for full control
    environment: *environment # point to "environment" to use the same environment variables as cass1
    depends_on:
      cass2: # start cass3 only after cass2 is healthy
        condition: service_healthy
This is going to create a Cassandra cluster of 3 nodes that start in a specific order: cass1, cass2, cass3.
Each node has two volumes set up: one for data and one for config files.
cass1 and cass2 are nominated as seed nodes.
Note that even though we now control the config files, you still need to set the CASSANDRA_* environment variables: the image's entrypoint script uses them to adjust the configuration on startup.
Ports are exposed to the host as well, so that your app can connect on localhost. Alternatively, you can run your app in a container on the same Docker network.
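For example, you can run a one-off cqlsh from a container on the same network. Note that docker-compose prefixes the network name with the project (directory) name, so <project>_cassandra below is a placeholder – check docker network ls for the actual name:
docker run --rm -it --network <project>_cassandra cassandra:3.11.8 cqlsh cass1 -e "describe cluster"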
Step 3: Copy config files for each node in the cluster
mkdir -p etc
cp -a etc_cassandra-3.11.8_vanilla etc/cass1
cp -a etc_cassandra-3.11.8_vanilla etc/cass2
cp -a etc_cassandra-3.11.8_vanilla etc/cass3
Step 4: Start and test the cluster
docker-compose up -d
Check that the Cassandra containers are starting:
docker ps
Monitor the cluster status until all three nodes report UN (Up/Normal):
docker exec cass1 nodetool status
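The nodes bootstrap one after another, so it can take a few minutes for all three to come up. If you want to script the wait, here is a minimal sketch assuming a POSIX shell on the host:
# loop until three nodes report UN (Up/Normal)
until [ "$(docker exec cass1 nodetool status | grep -c '^UN')" -eq 3 ]; do sleep 10; done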
Check that CQL is working:
docker exec -it cass1 cqlsh -e "describe keyspaces"
Congratulations! You have a working cluster!
Step 5: Do any configuration you need
The purpose of this approach is to be able to change ANY configuration you need. So, let’s change some!
We will enable Cassandra user authentication.
You will need to edit cassandra.yaml for *every* node, i.e. these files:
./etc/cass1/cassandra.yaml
./etc/cass2/cassandra.yaml
./etc/cass3/cassandra.yaml
In each of them, set these two options:
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
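If you prefer not to edit the files by hand, a quick loop can apply the change. This sketch uses GNU sed and assumes the 3.11 defaults of AllowAllAuthenticator and AllowAllAuthorizer (on macOS use sed -i ''):
for n in cass1 cass2 cass3; do
  sed -i 's/^authenticator: AllowAllAuthenticator/authenticator: PasswordAuthenticator/' etc/$n/cassandra.yaml
  sed -i 's/^authorizer: AllowAllAuthorizer/authorizer: CassandraAuthorizer/' etc/$n/cassandra.yaml
done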
And restart the cluster:
docker-compose restart
Try connecting with cqlsh again:
docker exec -it cass1 cqlsh -e "describe keyspaces"
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Remote end requires authentication.',)})
Oops! It isn’t working any more!
That is actually a good thing: we haven’t provided the default cassandra credentials yet. So, let’s do that:
docker exec -it cass1 cqlsh -u cassandra -p cassandra -e "describe keyspaces"
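From here you could, for example, create your own superuser; the role name and password below are placeholders, not part of the original setup:
docker exec -it cass1 cqlsh -u cassandra -p cassandra -e "CREATE ROLE admin WITH PASSWORD = 'changeme' AND SUPERUSER = true AND LOGIN = true;"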
This was just an example; you can make any changes you need to any Cassandra configuration file under etc/<node>/.
You can use your own configuration management, e.g. Ansible, to maintain those configs. Don’t forget to commit them (and docker-compose.yml as well) to version control.
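The node data directories do not belong in version control, though. A minimal .gitignore for the layout used in this post could look like this (adjust to taste):
# .gitignore
data/
etc_cassandra-3.11.8_vanilla/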
Conclusion
We are now able to bootstrap a beautiful, reproducible Cassandra cluster based on the official Docker image, while preserving the data and keeping the ability to change any configuration we need.
The automated and ready-to-use version of this approach is available on GitHub.
Such a cluster can also be used to test more complex things; a few examples:
- TLS setup – you can simply put keystore and truststore JKS files on the config volume and refer to them in the config
- Rolling cluster upgrade – change the image version tag for each container one by one and run docker-compose up -d again; you will also need to make sure you have copied the configs for the new version
- Replication factor and data distribution – see the example below
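For the last one, a quick way to experiment (a sketch; the keyspace name solar is just an example, and the credentials assume the Step 5 changes are in place):
docker exec -it cass1 cqlsh -u cassandra -p cassandra -e "CREATE KEYSPACE solar WITH replication = {'class': 'NetworkTopologyStrategy', 'Mars': 3};"
docker exec cass1 nodetool status solar # shows effective ownership for this keyspace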
You can also take advantage of the many goodies Docker provides, like resource limiting, healthchecks, or pausing the entire cluster when it’s not in use, e.g.
docker-compose pause
# to resume it
docker-compose unpause
And last but not least, this approach can be used not just for Cassandra, but for basically any service.

Stanislav Kelberg
DevOps Engineer