What is Cassandra's anti-entropy repair?

Answer

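Anti-entropy repair is Cassandra's mechanism for synchronizing data across replica nodes so that every replica eventually holds the same data. Run it with: nodetool repair keyspace_name table_name

How it works: for each token range being repaired, every replica builds a Merkle tree (a hash tree in which each leaf is the hash of a range of rows, and each parent is the hash of its children's hashes). The replicas exchange and compare their trees; a mismatch at a leaf means the replicas disagree on that range. Cassandra then streams the mismatched ranges between the replicas, and normal timestamp-based reconciliation (last write wins) keeps the newest version of each cell.

Why repairs are needed: replicas drift apart when a node is down during a write and its hints expire or are lost before it returns, during network partitions, and after other partial failures.

Repair frequency: complete a repair of every node within each gc_grace_seconds period (default 864000 seconds, i.e. 10 days). Otherwise a tombstone can be garbage-collected before it reaches a replica that missed the delete, and the deleted data can reappear (data resurrection).

Incremental repair: only repairs data that has not already been marked as repaired, so it is usually much faster than a full repair.

Repair in production: use nodetool repair -pr (primary range only) so that each node repairs only the ranges it is primary for. This avoids repairing the same range once per replica, but it must be run on every node in the cluster to cover all data. Never let repairs lapse in production.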
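The Merkle tree comparison described above can be illustrated with a small Python sketch. This is a toy in-memory model, not Cassandra's implementation: the function names (bucket_of, leaf_hashes, build_tree, differing_leaves), the MD5-based bucketing, and the sample rows are all assumptions made for illustration. It shows the key idea that two replicas only need to compare hashes, descending into a subtree only when its hash differs, to find the ranges that need streaming.

```python
import hashlib

def bucket_of(key, num_leaves):
    """Toy stand-in for a partitioner: map a key to one of num_leaves ranges."""
    token = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return token % num_leaves

def leaf_hashes(rows, num_leaves):
    """Hash the rows of each range into one leaf hash per range."""
    buckets = [[] for _ in range(num_leaves)]
    for key, value in sorted(rows.items()):  # sorted for a deterministic hash
        buckets[bucket_of(key, num_leaves)].append(f"{key}={value}")
    return [hashlib.sha256("|".join(b).encode()).digest() for b in buckets]

def build_tree(leaves):
    """Return the tree as a list of levels: leaves first, root last.

    Assumes len(leaves) is a power of two.
    """
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        parents = [hashlib.sha256(prev[i] + prev[i + 1]).digest()
                   for i in range(0, len(prev), 2)]
        levels.append(parents)
    return levels

def differing_leaves(tree_a, tree_b):
    """Walk down from the root, following only subtrees whose hashes differ."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # roots match: the replicas agree on every range
    suspects = [0]
    for level in range(top - 1, -1, -1):
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if tree_a[level][child] != tree_b[level][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects

# Hypothetical data: replica_b missed the latest write to "user3".
replica_a = {"user1": "alice", "user2": "bob", "user3": "carol", "user4": "dave"}
replica_b = dict(replica_a, user3="stale")

tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
print("ranges to stream:", differing_leaves(tree_a, tree_b))
```

Only the range containing "user3" is reported, so only that range's data would need to be exchanged. A real repair works on token ranges from the cluster's partitioner rather than this toy bucketing.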