Key Cassandra metrics

By monitoring Cassandra performance, you can identify bottlenecks, slowdowns, or resource limitations and address them in a timely manner.

Node metrics

When inspecting Cassandra nodes for performance issues, the following metrics are the most helpful in determining the root cause:

The count of errors and warnings in the logs.
Free disk space and garbage collection (GC) metrics.
CPU metrics, such as system_io_wait and user_wait.
Disk metrics, such as disk queue length and throughput.
Network metrics, such as latency.
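
As a rough illustration of the first two checks, the following sketch counts ERROR and WARN lines in a Cassandra log and reports how full a filesystem is. The function names, the default log path `/var/log/cassandra/system.log`, and the assumption that each log line starts with its level are illustrative and may not match your installation.

```shell
#!/bin/sh
# Count ERROR and WARN entries in a Cassandra log file.
# Assumption: the log path below is typical for package installs and
# each log line begins with its level (the default logback layout).
count_log_levels() {
    log_file="${1:-/var/log/cassandra/system.log}"
    # grep -c prints the number of matching lines; "|| true" keeps a
    # zero count (exit status 1) from being treated as a failure.
    errors=$(grep -c '^ERROR' "$log_file" || true)
    warnings=$(grep -c '^WARN' "$log_file" || true)
    echo "errors=$errors warnings=$warnings"
}

# Print the used-space percentage of the filesystem that holds a path.
# Assumption: the caller passes the Cassandra data directory.
disk_used_pct() {
    # df -P guarantees one line per filesystem; column 5 is "Capacity".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (paths are examples only):
#   count_log_levels /var/log/cassandra/system.log
#   disk_used_pct /var/lib/cassandra
```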

Cassandra metrics

Monitor the following Cassandra metrics for troubleshooting and fault prevention:

Metric	Details	Threshold	Additional information
SSTable count	The `nodetool cfstats` command reports the SSTable count for each table. To list the counts in ascending order, run `./nodetool cfstats | grep "SSTable count" | awk '{print $3}' | sort -n`.	Less than 30
Partition size	The `nodetool tablehistograms` command reports partition sizes. The maximum partition size is 2 GB, and any value approaching that limit indicates an issue.	Less than 10 MB	`SizeTieredCompactionStrategy` requires at least 50% free disk space so that Cassandra can write data during compaction. Use `LeveledCompactionStrategy` only if 90% of requests are reads.
`nodetool tpstats`	This command reports dropped mutations or messages, that is, data that is held in memory but has not been saved to disk yet.	50 mutations or messages	If the data still cannot be saved after you run the `nodetool flush` command, there is an issue with the data model.
Node status	Run the `nodetool status` command to check the cluster status.
Compaction rate	The throughput at which Cassandra merges SSTables during compaction. The default value is 16 MB/s.
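
To apply the SSTable count threshold automatically, a small filter along the following lines could be used. It is only a sketch: the function name `sstables_over_limit` is hypothetical, and it assumes that `nodetool cfstats` prints a `Table:` line followed by an `SSTable count:` line for each table, which may differ between Cassandra versions.

```shell
#!/bin/sh
# Print every table whose SSTable count is at or above a limit.
# Reads `nodetool cfstats` output on standard input.
# Assumption: each table section contains "Table: <name>" and
# "SSTable count: <n>" lines.
sstables_over_limit() {
    limit="${1:-30}"   # the table above recommends fewer than 30
    awk -v limit="$limit" '
        $1 == "Table:" { table = $2 }            # remember the current table
        $1 == "SSTable" && $2 == "count:" && $3 + 0 >= limit {
            print table, $3                      # table name and its count
        }
    '
}

# Usage (requires a running node):
#   nodetool cfstats | sstables_over_limit 30
```

An empty result means that every table is under the limit.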

Useful commands

The following Cassandra commands are the most useful for keeping the cluster healthy:

nodetool flush

Writes data from memtables to SSTables in the file system. Run this command if the nodetool tpstats command reports a high count of dropped mutations or messages.
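
One possible way to decide when a flush is due is to total the dropped mutation counts from the tpstats output. The sketch below assumes that the dropped-message section lists one message type per line with the dropped count in the second column, and that mutation rows start with `MUTATION` (the exact row names vary between Cassandra versions). The function name `dropped_mutations` is hypothetical.

```shell
#!/bin/sh
# Sum the dropped counts of mutation message types.
# Reads `nodetool tpstats` output on standard input.
# Assumption: rows look like "MUTATION  <dropped> ..." and the match is
# case-sensitive, so thread pool rows such as "MutationStage" are ignored.
dropped_mutations() {
    awk '$1 ~ /^MUTATION/ { total += $2 } END { print total + 0 }'
}

# Usage (requires a running node); 50 is the threshold used in this document:
#   if [ "$(nodetool tpstats | dropped_mutations)" -gt 50 ]; then
#       nodetool flush
#   fi
```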

nodetool cleanup

Removes unwanted data, that is, data that the node no longer owns. Run this command after a new node joins the cluster and after data redistribution.

nodetool repair

Repairs one or more nodes in a cluster and provides options for restricting repair to a set of nodes. The following additional repair modes are available with the nodetool repair command:

incremental – Separates repaired data from unrepaired data. Examines all SSTables but repairs only the damaged ones.
full – Examines and repairs all SSTables, regardless of whether an SSTable is damaged.
seq – Sequential repair. Puts less load on the cluster during repair and takes more time.
par – Parallel repair. Puts more load on the cluster during repair and takes less time.
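
The modes above could map to command-line flags roughly as follows. This is a sketch under the assumption of Cassandra 2.2 or later, where incremental and parallel repair are the defaults and need no flag, while `-full` and `-seq` select the other modes; older releases use different flags. The helper name `repair_command` is hypothetical, and it only prints the command so that it can be reviewed before running.

```shell
#!/bin/sh
# Build a `nodetool repair` command line for a given repair mode.
# Assumption: Cassandra 2.2+ flag names; check `nodetool help repair`
# on your version before relying on them.
repair_command() {
    mode="$1"
    keyspace="$2"      # optional: restrict the repair to one keyspace
    case "$mode" in
        full)            flags="-full" ;;
        seq)             flags="-seq" ;;
        incremental|par) flags="" ;;   # assumed defaults, no flag needed
        *) echo "unknown repair mode: $mode" >&2; return 1 ;;
    esac
    # Print the command instead of executing it.
    echo "nodetool repair${flags:+ $flags}${keyspace:+ $keyspace}"
}

# Usage (my_keyspace is a placeholder name):
#   repair_command full my_keyspace
```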

nodetool bootstrap

Checks the status of adding a new node to the cluster. After the node joins, run nodetool cleanup on each existing node to remove the data that the node no longer owns. To prevent automatic token transfer as soon as you add a node, set the auto_bootstrap setting to false in the cassandra.yaml file. To start the transfer manually, run the nodetool bootstrap resume command.
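
Before adding a node, it can help to confirm what the bootstrap setting currently is. The following sketch reads the `auto_bootstrap` value from a cassandra.yaml file and falls back to `true` when the setting is absent, on the assumption that Cassandra treats a missing setting as enabled. The function name `auto_bootstrap_value` and the example path are hypothetical.

```shell
#!/bin/sh
# Print the effective auto_bootstrap value from a cassandra.yaml file.
# Assumption: the setting appears as a top-level, uncommented line such as
# "auto_bootstrap: false"; when it is missing, "true" is reported.
auto_bootstrap_value() {
    yaml_file="$1"
    # Keep only the word after "auto_bootstrap:"; the last match wins.
    value=$(sed -n 's/^auto_bootstrap:[[:space:]]*\([A-Za-z]*\).*/\1/p' "$yaml_file" | tail -n 1)
    echo "${value:-true}"
}

# Usage (the path is an example only):
#   auto_bootstrap_value /etc/cassandra/cassandra.yaml
```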