How does Cassandra's read path work?
Answer
Cassandra's read path has to merge data from several sources, because a partition's data can be spread across the memtable and multiple SSTables. When a read request arrives at a coordinator:

1. Route to replicas: the coordinator determines which nodes hold the partition (from the hash of the partition key) and contacts as many replicas as the consistency level requires.
2. On each replica:
   - Row cache: if enabled and the row is cached, return the cached row immediately.
   - Bloom filter: a per-SSTable probabilistic filter quickly rules out SSTables that definitely do not contain the partition.
   - Partition key cache: caches the mapping from a partition key to its offset in an SSTable.
   - SSTable read: read the partition from each SSTable that may contain it.
   - MemTable: check in-memory writes that have not yet been flushed to disk.
   - Merge: combine all sources, keeping the value with the latest timestamp (latest wins) and filtering out data deleted by tombstones.
3. Coordinator merges: the coordinator combines the replica responses and returns the most recent data. Read repair: if replicas disagree, the stale replicas are updated with the newest data.

Reads are typically slower than writes in Cassandra: a write only appends to the commit log and the memtable, while a read may have to merge data from multiple SSTables. Compaction reduces the number of SSTables, which improves read performance.
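
The replica-side steps above (Bloom filter check, SSTable and memtable reads, then a latest-timestamp merge that drops tombstones) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Cassandra's actual implementation: the names `Cell`, `SSTable`, `merge_cells`, and `read_partition` are hypothetical, a plain set stands in for the Bloom filter, caches are left out, and timestamp ties are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a deleted cell)
    timestamp: int        # write timestamp; the highest one wins


# A partition is modelled as {column_name: Cell} (an assumption of this sketch).
Partition = Dict[str, Cell]


@dataclass
class SSTable:
    partitions: Dict[str, Partition]
    # Stand-in for a Bloom filter. A real one is probabilistic: it can
    # report false positives but never false negatives.
    key_filter: Set[str]

    def might_contain(self, key: str) -> bool:
        return key in self.key_filter


def merge_cells(sources: List[Partition]) -> Partition:
    """Keep, for every column, the cell with the highest timestamp."""
    merged: Partition = {}
    for partition in sources:
        for column, cell in partition.items():
            current = merged.get(column)
            if current is None or cell.timestamp > current.timestamp:
                merged[column] = cell
    return merged


def read_partition(key: str, memtable: Dict[str, Partition],
                   sstables: List[SSTable]) -> Dict[str, str]:
    """Collect every source that holds the key, merge, and drop tombstones."""
    sources: List[Partition] = []
    if key in memtable:
        sources.append(memtable[key])
    for sstable in sstables:
        if not sstable.might_contain(key):
            continue  # the filter says "definitely absent": skip this SSTable
        partition = sstable.partitions.get(key)
        if partition is not None:
            sources.append(partition)
    merged = merge_cells(sources)
    return {col: cell.value for col, cell in merged.items()
            if cell.value is not None}


# Example: one partition spread over the memtable and two SSTables.
memtable = {"user:1": {"email": Cell("new@example.com", 30)}}
old = SSTable(
    {"user:1": {"name": Cell("Ada", 10),
                "email": Cell("old@example.com", 10),
                "phone": Cell("555-0100", 10)}},
    {"user:1"},
)
newer = SSTable({"user:1": {"phone": Cell(None, 20)}}, {"user:1"})  # tombstone
other = SSTable({"user:2": {"name": Cell("Bob", 5)}}, {"user:2"})   # skipped

result = read_partition("user:1", memtable, [old, newer, other])
print(result)
```

In this example the newer `email` in the memtable shadows the older one on disk, the `phone` tombstone hides the earlier value, and the third SSTable is never read because its filter rules the key out. The more SSTables that pass the filter, the more sources the merge has to visit, which is why compaction helps reads.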