What is the CAP theorem and how does it apply to database choice?
Why Interviewers Ask This
This tests whether you can apply system design knowledge to real-world scenarios. Interviewers are looking for clarity of thought and evidence that you have dealt with these trade-offs in production systems.
Answer
The CAP theorem states that a distributed system can guarantee at most two of three properties: Consistency, Availability, and Partition Tolerance. Since partition tolerance is non-negotiable in real distributed systems (networks fail), the practical choice is what to give up while a partition lasts: consistency or availability. That is the CP vs AP decision.
CP databases (Consistency + Partition Tolerance): During a network partition, the system refuses requests rather than return potentially stale data. All nodes agree or the request fails.
Examples and their CP properties:
HBase/ZooKeeper: strongly consistent, may be unavailable during a partition.
MongoDB (majority write concern): returns consistent data; the minority side of a partition cannot elect a primary, so it rejects writes.
etcd/Consul: Raft consensus ensures strong consistency; minority partitions become unavailable.
SQL databases with synchronous replication: a commit waits for the replica, so writes stall if the replica is unreachable.
Choose when: financial transactions, inventory counts, any case where stale data causes business problems.
AP databases (Availability + Partition Tolerance): During a partition, all reachable nodes keep serving requests but may return stale data.
Examples:
Cassandra: highly available, with consistency configurable per operation.
CouchDB/Couchbase: accept reads and writes on available nodes and reconcile conflicting versions later.
DynamoDB (default): eventually consistent reads.
Choose when: social media likes, shopping carts, product catalogs, or any case where data that is stale by a few seconds is acceptable.
Real-world nuance: many databases offer tunable consistency (DynamoDB strongly consistent reads, Cassandra quorum reads). The database isn't fixed to one side of CAP; specific operations can be configured to favor consistency or availability.
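The tunable consistency mentioned above can be made concrete with quorum arithmetic. The sketch below is a simplified model, not any database's actual API: QuorumConfig and its method names are hypothetical, and it assumes N replicas where a write needs W acknowledgements and a read needs R replies. When R + W > N, every read set overlaps every write set, so a read sees the latest acknowledged write; the cost is that a client that can reach fewer than W or R replicas is refused.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical model of per-operation quorum settings (illustration only)."""
    n: int  # total replicas holding the data
    w: int  # acknowledgements required before a write succeeds
    r: int  # replies required before a read succeeds

    def reads_see_latest_write(self) -> bool:
        # Read and write quorums share at least one replica when R + W > N.
        return self.r + self.w > self.n

    def can_write(self, reachable: int) -> bool:
        # A write is accepted only if enough replicas can acknowledge it.
        return reachable >= self.w

    def can_read(self, reachable: int) -> bool:
        # A read is served only if enough replicas can reply.
        return reachable >= self.r


# Quorum reads and writes: behaves like the CP side of the trade-off.
cp_style = QuorumConfig(n=3, w=2, r=2)
# Single-replica reads and writes: behaves like the AP side.
ap_style = QuorumConfig(n=3, w=1, r=1)

# A partition isolates one replica; a client on that side reaches 1 of 3.
minority_reachable = 1

for name, cfg in [("cp_style", cp_style), ("ap_style", ap_style)]:
    print(
        name,
        "consistent reads:", cfg.reads_see_latest_write(),
        "| write on minority side:", cfg.can_write(minority_reachable),
        "| read on minority side:", cfg.can_read(minority_reachable),
    )
```

With these assumed settings, the quorum configuration refuses both reads and writes on the minority side but never returns stale data, while the single-replica configuration stays available there at the risk of stale reads. The same cluster can serve both kinds of operation, which is why the CP or AP label often applies per request rather than per database.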
Pro Tip
Demonstrate both theoretical understanding and practical experience. Say what the theorem states, then give an example of a system where you actually had to make this trade-off.