How would you design a global distributed database like Google Spanner?
Why Interviewers Ask This
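Senior engineers are expected to reason about architecture, performance, and edge cases in system design interviews. This question separates mid-level from senior candidates by testing how well they understand clock uncertainty, replication, and distributed transactions, and how those pieces fit together.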
Answer
Google Spanner combines properties that are usually traded off against each other: it is a globally distributed, strongly consistent, ACID-compliant SQL database with high availability. Key innovations:

TrueTime API: GPS receivers and atomic clocks in every datacenter provide a global time reference with bounded uncertainty. TrueTime.now() returns an interval [earliest, latest], and the actual time is guaranteed to lie within it. Spanner uses TrueTime to assign commit timestamps and guarantee external consistency: if transaction T1 commits before T2 starts, T2 gets a later timestamp and sees T1's writes. Without TrueTime, you would need logical clocks (such as Lamport clocks) or extra coordination between nodes, which adds latency.

Architecture: data is organized into tablets (shards holding contiguous key ranges), which are stored in Colossus (Google's distributed file system). Each tablet is replicated across zones by a Paxos group, and different Paxos groups cover different key ranges. Within a tablet, a directory (a set of contiguous keys that share a common prefix) is the unit of data placement and the smallest unit whose replication configuration an application can specify.

Read-write transactions: two-phase locking at the Paxos leader, plus two-phase commit (2PC) across Paxos groups when a transaction spans more than one. Commit wait: after choosing a commit timestamp, Spanner waits until TrueTime.now().earliest > commit_timestamp before making the writes visible. At that point the timestamp is guaranteed to be in the past, so no transaction that starts later can receive an earlier timestamp.

Snapshot reads: consistent reads at a specific timestamp. They take no locks and can be served by any replica that is sufficiently up to date for that timestamp, so read capacity scales with the number of replicas.

SQL: full SQL with JOINs and subqueries (originally provided by F1, Google's SQL database built on top of Spanner), plus interleaved table hierarchies, where child rows are co-located with their parent row to avoid cross-partition joins.

Impact: Spanner inspired CockroachDB (open source, uses hybrid logical clocks instead of TrueTime) and YugabyteDB.
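The commit-wait rule and timestamp-based snapshot reads can be made concrete with a small single-process sketch in Python. It is only an illustration under stated assumptions: the `TrueTime` class, the `VersionedStore` class, and the `EPSILON` bound are hypothetical names invented for this example. They are not Spanner's API, and the sketch leaves out Paxos, locking, and 2PC entirely.

```python
import time
from dataclasses import dataclass

EPSILON = 0.004  # assumed clock uncertainty bound in seconds (illustrative value)


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTime:
    """Toy stand-in for TrueTime: a local clock widened by an assumed error bound."""

    def __init__(self, epsilon=EPSILON):
        self.epsilon = epsilon

    def now(self):
        t = time.monotonic()
        return TTInterval(t - self.epsilon, t + self.epsilon)


class VersionedStore:
    """Single-node multi-version store; each key keeps (commit_ts, value) pairs."""

    def __init__(self, tt):
        self.tt = tt
        self.versions = {}  # key -> list of (commit_ts, value), in commit order

    def commit(self, writes):
        # Pick a timestamp that is no earlier than the true current time.
        commit_ts = self.tt.now().latest
        # Commit wait: block until commit_ts is guaranteed to be in the past.
        while self.tt.now().earliest <= commit_ts:
            time.sleep(self.tt.epsilon / 4)
        # Only now do the writes become visible.
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts

    def snapshot_read(self, key, ts):
        # Lock-free read: return the newest version with commit_ts <= ts.
        result = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts <= ts:
                result = value
        return result


store = VersionedStore(TrueTime())
ts1 = store.commit({"x": 1})
ts2 = store.commit({"x": 2})
old_value = store.snapshot_read("x", ts1)  # reads the version written at ts1
new_value = store.snapshot_read("x", ts2)  # reads the version written at ts2
```

In this sketch each commit blocks for roughly twice the uncertainty bound. That is the trade-off worth calling out in an interview: a tighter clock bound directly reduces write latency.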
Pro Tip
If you're unsure about a detail, say so honestly and explain your reasoning. Interviewers respect candidates who can think through uncertainty rather than bluffing.