How would you design a global distributed database like Google Spanner?

Answer

Google Spanner combines properties that are usually traded off against each other: it is a globally distributed, strongly consistent, ACID-compliant SQL database with high availability. Key innovations:

TrueTime API: GPS receivers and atomic clocks in every datacenter provide a global time reference with bounded uncertainty. TrueTime.now() returns an interval [earliest, latest], and the actual time is guaranteed to lie within it. Spanner uses TrueTime to assign commit timestamps and guarantee external consistency: if transaction T1 commits before T2 starts, T2 gets a later commit timestamp and sees T1's writes. Without TrueTime, you would need logical clocks such as Lamport clocks (which do not capture real-time order on their own) or extra coordination that adds latency.

Architecture: data is organized into tablets (shards of the key space); tablet state is stored in Colossus (Google's distributed file system); each tablet is replicated across zones by a Paxos group, and different Paxos groups cover different key ranges. Within this layout, a directory (a set of contiguous keys sharing a common prefix) is the unit of data placement and the smallest unit whose replication configuration can be specified.

Read-write transactions: two-phase locking, plus two-phase commit (2PC) when a transaction spans multiple Paxos groups. Commit wait: after choosing a commit timestamp, Spanner waits until TrueTime.now().earliest > commit_timestamp before making the commit visible. At that point the timestamp is guaranteed to be in the past, so no transaction that starts afterwards can receive an earlier timestamp.

Snapshot reads: consistent reads at a specific timestamp. No locks are needed, and any replica that is sufficiently up to date for that timestamp can serve the read, so read capacity scales with the number of replicas.

F1 SQL: full SQL with JOINs and subqueries (F1 is the SQL layer Google built on top of Spanner), plus interleaved table hierarchies, where child rows are co-located with their parent row to avoid cross-partition joins.

Impact: Spanner inspired CockroachDB (open source, uses hybrid logical clocks instead of TrueTime) and YugabyteDB.
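The commit-wait rule described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Spanner's implementation: `SimulatedTrueTime`, `TTInterval`, and `commit_with_wait` are hypothetical names, the clock is advanced manually instead of sleeping, the uncertainty (`epsilon_us`) is fixed, and the commit timestamp is taken to be `now().latest`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """Uncertainty interval returned by the toy clock (microseconds)."""
    earliest: int
    latest: int


class SimulatedTrueTime:
    """Hypothetical stand-in for TrueTime: the actual time is hidden and
    callers only ever see an interval of width 2 * epsilon_us around it."""

    def __init__(self, epsilon_us: int, start_us: int = 0) -> None:
        self.epsilon_us = epsilon_us
        self._actual_us = start_us

    def advance(self, delta_us: int) -> None:
        # Stands in for real time passing (a sleep in a real system).
        self._actual_us += delta_us

    def now(self) -> TTInterval:
        return TTInterval(self._actual_us - self.epsilon_us,
                          self._actual_us + self.epsilon_us)


def commit_with_wait(tt: SimulatedTrueTime, tick_us: int = 1000) -> tuple[int, int]:
    """Choose a commit timestamp, then wait until it is definitely in the past.

    Returns (commit_timestamp, microseconds_waited).
    """
    # Assumption: use the upper bound of the interval as the commit timestamp.
    commit_ts = tt.now().latest
    waited_us = 0
    # Commit wait: block until now().earliest > commit_ts.
    while not tt.now().earliest > commit_ts:
        tt.advance(tick_us)
        waited_us += tick_us
    return commit_ts, waited_us


if __name__ == "__main__":
    tt = SimulatedTrueTime(epsilon_us=4000)
    ts1, waited1 = commit_with_wait(tt)  # T1 commits and becomes visible
    ts2, waited2 = commit_with_wait(tt)  # T2 starts only after T1 finished
    print(ts1, waited1, ts2, waited2)
```

In this toy model the wait is roughly twice the clock uncertainty, which suggests why keeping the uncertainty interval small matters for write latency. Because T2 starts only after T1's commit wait has finished, its timestamp is necessarily larger than T1's, which is the external-consistency property in miniature.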
