How would you design Google's Bigtable?

Answer

Google Bigtable (described in a 2006 paper) is a distributed NoSQL store optimized for structured data at massive scale. It directly inspired HBase and shaped the data model of Cassandra.

Data model: a sparse, distributed, persistent, multidimensional sorted map. Each key is (row_key, column_family:column_qualifier, timestamp) and maps to an uninterpreted byte string. Row keys are sorted lexicographically, which makes range scans efficient. Column families group related columns and must be declared in the schema before use; qualifiers within a family can be added freely at write time. Each cell can hold multiple versions, distinguished by timestamp.

Architecture:
Master: assigns tablets to tablet servers, detects tablet server failures, and balances load. It is not on the critical read/write path because clients cache tablet locations.
Tablet servers: each serves many tablets (contiguous row ranges, roughly 100-200 MB each) and handles reads and writes directly from clients.
GFS (later succeeded by Colossus): the underlying distributed file system that holds all persistent data.

Tablet storage (LSM-tree): a write is appended to a commit log for durability and then applied to the memtable (an in-memory sorted buffer). When the memtable fills up, it is flushed to an SSTable (an immutable sorted string table on GFS). SSTables are compacted periodically (merge-sorted into a new SSTable) to limit read amplification. Bloom filters on SSTables let a read skip files that cannot contain the requested key.

Read path: check the memtable, then the SSTables from newest to oldest, and merge the results using timestamps. Compaction makes reads faster because fewer SSTables have to be consulted.

Locality groups: column families that are accessed together are stored together, and rarely accessed ones are stored separately, so a read does not touch data it does not need.

Modern equivalents: Cloud Bigtable (managed service), HBase (open source, Hadoop ecosystem), Cassandra (combines ideas from Bigtable and Dynamo).
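The storage and read path described above can be illustrated with a small in-memory sketch. This is a toy model for interview discussion, not Bigtable's actual implementation: the class name `TinyTablet`, its methods, and the `memtable_limit` parameter are all hypothetical, the "SSTables" are plain Python lists rather than files on GFS, and the commit log, Bloom filters, and deletion markers are left out.

```python
from typing import Dict, List, Optional, Tuple

# Cell key as in the data model: (row_key, "family:qualifier", timestamp).
Key = Tuple[str, str, int]
Run = List[Tuple[Key, bytes]]


class TinyTablet:
    """Toy single-tablet LSM store: one memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[Key, bytes] = {}
        self.sstables: List[Run] = []  # oldest first, newest last
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, row: str, column: str, ts: int, value: bytes) -> None:
        """Buffer the write in memory; flush once the memtable is full."""
        self.memtable[(row, column, ts)] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Freeze the memtable into a sorted, immutable run (an 'SSTable')."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def _sources(self) -> List[Run]:
        # Every place a cell version may live: memtable plus all SSTables.
        return [list(self.memtable.items())] + self.sstables

    def get(self, row: str, column: str) -> Optional[bytes]:
        """Merge all sources and return the version with the newest timestamp.

        A linear scan keeps the sketch short; a real store would use a block
        index and Bloom filters to avoid touching most SSTables.
        """
        best: Optional[Tuple[int, bytes]] = None
        for source in self._sources():
            for (r, c, ts), value in source:
                if r == row and c == column and (best is None or ts > best[0]):
                    best = (ts, value)
        return None if best is None else best[1]

    def scan(self, start_row: str, end_row: str) -> List[Tuple[str, str, bytes]]:
        """Newest version of each cell with start_row <= row < end_row, sorted."""
        latest: Dict[Tuple[str, str], Tuple[int, bytes]] = {}
        for source in self._sources():
            for (r, c, ts), value in source:
                if start_row <= r < end_row:
                    if (r, c) not in latest or ts > latest[(r, c)][0]:
                        latest[(r, c)] = (ts, value)
        return [(r, c, v) for (r, c), (_, v) in sorted(latest.items())]

    def compact(self) -> None:
        """Merge every SSTable into one sorted run to cut read amplification.

        All versions are kept here; a real compaction would also drop
        expired versions and deleted cells.
        """
        merged: Dict[Key, bytes] = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


# Usage: five writes with a flush threshold of two cells.
tablet = TinyTablet(memtable_limit=2)
tablet.put("com.cnn.www", "contents:html", 1, b"v1")
tablet.put("com.cnn.www", "anchor:cnnsi.com", 1, b"CNN")  # second cell: flush
tablet.put("com.cnn.www", "contents:html", 2, b"v2")
tablet.put("com.example", "contents:html", 1, b"hello")   # second cell: flush
tablet.put("org.apache", "contents:html", 1, b"x")         # stays in memtable

newest = tablet.get("com.cnn.www", "contents:html")
runs_before = len(tablet.sstables)
tablet.compact()
runs_after = len(tablet.sstables)
com_rows = tablet.scan("com.", "com.z")
```

Because row keys sort lexicographically, the reversed-domain keys in the usage lines place all `com.` pages next to each other, which is what lets `scan` return them as one contiguous range. After `compact()` a read consults a single run instead of two, which is the read-amplification benefit the answer mentions.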

