Key Cassandra metrics

By monitoring Cassandra performance, you can identify bottlenecks, slowdowns, or resource limitations and address them in a timely manner.

Node metrics

When inspecting Cassandra nodes for performance issues, the following metrics are the most helpful in determining the root cause:
  • The count of errors and warnings in the logs.
  • Free disk space and garbage collection metrics.
  • CPU metrics, such as system_io_wait and user_wait.
  • Disk metrics, such as disk queue length and throughput.
  • Network metrics, such as latency.
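
As a minimal sketch of the first check in the list, the following shell function counts error and warning lines in a Cassandra log file. It assumes the default log layout, in which each line starts with the log level (for example, ERROR or WARN); the function name count_log_levels is hypothetical, and the pattern needs adjusting if your logging configuration differs.

```shell
# Count ERROR and WARN lines in a Cassandra log file.
# Assumption: each log line starts with its level, as in the default layout.
count_log_levels() {
    awk '
        /^ERROR/ { errors++ }
        /^WARN/  { warnings++ }
        END { printf "errors=%d warnings=%d\n", errors, warnings }
    ' "$1"
}
```

For example, count_log_levels /var/log/cassandra/system.log prints a single summary line; the path shown is a common location for package installations and might differ on your nodes.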

Cassandra metrics

Monitor the following Cassandra metrics for troubleshooting and fault prevention:

SSTable count
  • Details: The nodetool cfstats command provides the SSTable count. The ./nodetool cfstats | grep "SSTable count" | awk '{print $3}' | sort -n command provides the SSTable counts in sorted order.
  • Threshold: Less than 30.
Partition size
  • Details: The nodetool tablehistograms command provides the partition sizes. The maximum partition size is 2GB, but any value approaching that limit indicates an issue.
  • Threshold: Less than 10MB.
  • Additional information: SizeTieredCompactionStrategy requires at least 50% free disk space so that Cassandra can write data during compaction. Use LeveledCompactionStrategy only if 90% of requests are reads.
nodetool tpstats
  • Details: This command provides details about dropped mutations or messages, that is, data that is held in memory but was not yet saved to disk.
  • Threshold: 50 mutations or messages.
  • Additional information: If the data still cannot be saved after you run the nodetool flush command, there is an issue with the data model.
Node status
  • Details: Run the nodetool status command to check the cluster status.
Compaction rate
  • Details: The rate at which Cassandra merges SSTables during compaction. The default value is 16 MB/sec.
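
The SSTable count threshold can be checked automatically. The following sketch reads nodetool cfstats output from standard input and prints every table whose SSTable count is 30 or higher. It assumes that each table section contains a "Table:" line followed by an "SSTable count:" line, which can vary between Cassandra versions; the function name sstables_over_limit is hypothetical.

```shell
# Print "<table> <count>" for each table at or above the SSTable limit.
# Assumption: cfstats output has "Table: <name>" and "SSTable count: <n>" lines.
sstables_over_limit() {
    limit="${1:-30}"
    awk -v limit="$limit" '
        /^[ \t]*Table:/         { table = $NF }
        /^[ \t]*SSTable count:/ { if ($NF + 0 >= limit) print table, $NF }
    '
}
```

For example, nodetool cfstats | sstables_over_limit 30 lists only the tables that need attention, and prints nothing when all tables are under the threshold.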

Useful commands

The following Cassandra commands are the most useful for keeping the cluster healthy:
nodetool flush
Writes data from memtables to SSTables in the file system. Run this command if the nodetool tpstats command reports a high count of dropped mutations or messages.
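
To decide whether a flush is warranted, you can scan the nodetool tpstats output for dropped messages. The following sketch prints every message type whose dropped count exceeds a limit (50 by default, matching the threshold for dropped mutations or messages). It assumes that the dropped-message section starts with a "Message type" header line and that the dropped count is the second column; the function name dropped_over_limit is hypothetical.

```shell
# Print "<message type> <dropped>" for each type above the dropped limit.
# Assumption: the section header starts with "Message type" and the
# dropped count is the second column of each following row.
dropped_over_limit() {
    limit="${1:-50}"
    awk -v limit="$limit" '
        /^Message type/ { in_dropped = 1; next }
        in_dropped && NF >= 2 && $2 + 0 > limit { print $1, $2 }
    '
}
```

For example, nodetool tpstats | dropped_over_limit 50 prints nothing when all dropped counts are within the threshold.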
nodetool cleanup
Removes unwanted data, that is, data that the node no longer owns. Run this command after a new node joins the cluster and the data is redistributed.
nodetool repair
Repairs one or more nodes in a cluster and provides options for restricting repair to a set of nodes. The following additional repair modes are available with the nodetool repair command:
  • incremental – Separates repaired data from data that still needs repair. Examines all SSTables but repairs only the damaged ones.
  • full – Examines and repairs all SSTables, regardless of whether an SSTable is damaged.
  • seq – Sequential repair. Puts less load on the cluster during repair and takes more time.
  • par – Parallel repair. Puts more load on the cluster during repair and takes less time.
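
The repair modes map to command-line flags. The following sketch builds the nodetool repair command for a chosen combination and prints it instead of running it. It assumes Cassandra 2.2 or later, where incremental and parallel repair are the defaults and the -full and -seq flags select the other modes; earlier versions use different flags. The function name repair_command and the keyspace name in the example are hypothetical.

```shell
# Build (but do not run) a nodetool repair command.
# Assumption: Cassandra 2.2+, where incremental and parallel are defaults.
repair_command() {
    scope="$1"      # "incremental" or "full"
    order="$2"      # "par" or "seq"
    keyspace="$3"   # optional keyspace name
    cmd="nodetool repair"
    if [ "$scope" = "full" ]; then cmd="$cmd -full"; fi
    if [ "$order" = "seq" ]; then cmd="$cmd -seq"; fi
    if [ -n "$keyspace" ]; then cmd="$cmd $keyspace"; fi
    echo "$cmd"
}
```

For example, repair_command full seq my_keyspace prints nodetool repair -full -seq my_keyspace, which you can review before you run it on a node.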
nodetool bootstrap
Checks the status of the addition of a new node to the cluster. After the new node joins, run the nodetool cleanup command on each existing node to remove the data that the node no longer owns. In the cassandra.yaml file, set the auto_bootstrap setting to false to prevent automatic token transfer as soon as you add a node. To start the transfer manually, run the nodetool bootstrap resume command.
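
Before you add a node, you might want to confirm that automatic bootstrapping is turned off. The following sketch checks a cassandra.yaml file for the setting, which is spelled auto_bootstrap in the file. It assumes that the setting appears on its own line; when the line is absent, Cassandra treats auto_bootstrap as true. The function name auto_bootstrap_disabled is hypothetical.

```shell
# Succeed (exit status 0) only if the file sets auto_bootstrap to false.
# Assumption: the setting is on its own line, for example "auto_bootstrap: false".
auto_bootstrap_disabled() {
    grep -Eq '^[[:space:]]*auto_bootstrap:[[:space:]]*false' "$1"
}
```

For example, auto_bootstrap_disabled /etc/cassandra/cassandra.yaml && echo "manual bootstrap" prints the message only when the setting is false; the path shown is a common location and might differ in your installation.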