What is Cassandra's anti-entropy repair?
Answer
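Anti-entropy repair is Cassandra's mechanism for synchronizing data across replica nodes so that all replicas eventually converge on the same data. Run it with: nodetool repair keyspace_name table_name.

How it works: each replica builds a Merkle tree over the token ranges being repaired (a hash tree where each leaf is the hash of a range of rows, and each parent is the hash of its children's hashes). The replicas exchange and compare these trees; a mismatch at a leaf marks a range whose data differs. The replicas then stream the data for the mismatched ranges to each other, and normal timestamp-based reconciliation keeps the newest version of each cell.

Why repairs are needed: replicas drift apart when a node is down during a write and the hint is never delivered (hints are only kept for a limited window), during network partitions, and after partial write failures.

Repair frequency: repair every node at least once within each gc_grace_seconds period (default 864000 seconds, i.e. 10 days). Tombstones become eligible for removal once gc_grace_seconds has passed; if a replica that missed a delete has not been repaired by then, the tombstone can be compacted away on the other replicas and the deleted data can reappear (data resurrection).

Incremental repair: repairs only data that has not already been marked as repaired, so it is usually much faster than a full repair. In Cassandra 2.2 and later it is the default mode, and nodetool repair -full requests a full repair.

Repair in production: use nodetool repair -pr (primary range only) so that each node repairs only the ranges for which it is the primary replica. This avoids repairing the same range once per replica, but it only covers all data if it is run on every node in the cluster. Never let repairs lapse in production.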
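The Merkle tree comparison described above can be illustrated with a small Python sketch. It is a toy model, not Cassandra's implementation: the names toy_token, hash_ranges, build_tree and diff_ranges are hypothetical, the "token" is just a SHA-256 prefix taken modulo the number of ranges, and the number of ranges is assumed to be a power of two. It only shows why comparing trees from the root down lets two replicas find the mismatched ranges without comparing every row.

```python
import hashlib

NUM_RANGES = 8  # assumption: a power of two, so the tree is perfectly balanced


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toy_token(key: str, num_ranges: int) -> int:
    # Hypothetical stand-in for a partitioner: maps a key to a range index.
    return int.from_bytes(h(key.encode())[:4], "big") % num_ranges


def hash_ranges(rows: dict, num_ranges: int) -> list:
    # One leaf hash per range, computed over the rows that fall in that range.
    buckets = [[] for _ in range(num_ranges)]
    for key, value in sorted(rows.items()):
        buckets[toy_token(key, num_ranges)].append(f"{key}={value}".encode())
    return [h(b"\x00".join(bucket)) for bucket in buckets]


def build_tree(leaves: list) -> list:
    # Returns the tree as a list of levels: leaves first, root last.
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def diff_ranges(tree_a: list, tree_b: list) -> list:
    # Walk down from the root, descending only into subtrees whose hashes differ.
    mismatched = []
    stack = [(len(tree_a) - 1, 0)]
    while stack:
        level, idx = stack.pop()
        if tree_a[level][idx] == tree_b[level][idx]:
            continue  # identical subtree: nothing below it needs streaming
        if level == 0:
            mismatched.append(idx)
        else:
            stack.append((level - 1, 2 * idx))
            stack.append((level - 1, 2 * idx + 1))
    return sorted(mismatched)


# Two replicas that disagree on a single row.
replica_a = {"user:1": "alice", "user:2": "bob", "user:3": "carol", "user:4": "dave"}
replica_b = {**replica_a, "user:3": "carl"}  # replica_b missed an update

tree_a = build_tree(hash_ranges(replica_a, NUM_RANGES))
tree_b = build_tree(hash_ranges(replica_b, NUM_RANGES))
stale_ranges = diff_ranges(tree_a, tree_b)
print("ranges to stream:", stale_ranges)
```

Only the range containing user:3 is reported, so only that range would need to be streamed between the two replicas. When the root hashes match, the walk stops immediately and nothing is streamed.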