What is database sharding and what are its challenges?

Answer

Sharding horizontally partitions data across multiple database servers (shards), each holding a subset of the rows. A shard key (e.g., user_id, tenant_id) determines which shard stores each row. Its main benefit is scaling write throughput and storage beyond what a single server can handle. Challenges: cross-shard queries (JOINs and aggregates that span shards require scatter-gather, where the query is sent to every relevant shard and the results are merged); cross-shard transactions (distributed transactions, e.g., two-phase commit, are complex and slow); hotspots (one shard receives disproportionate traffic; hashing the shard key spreads keys more evenly than range-based sharding, though a single very hot key still lands on one shard); re-sharding (redistributing data as you add shards is disruptive; consistent hashing limits how many keys have to move); and operational complexity (monitoring and maintaining N databases). Consider sharding only after exhausting simpler options such as indexing, query tuning, caching, and vertical scaling.
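
To make the shard-key routing and the re-sharding cost concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular database implements sharding: the names `stable_hash`, `modulo_shard`, `ConsistentHashRing`, and `fraction_moved`, the shard names, and the choice of 100 virtual nodes per shard are all hypothetical. It routes keys with plain modulo hashing and with a simple consistent-hash ring, then measures how many keys change shards when going from 4 to 5 shards.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_shard(shard_key: str, num_shards: int) -> int:
    """Hash-based sharding: shard index = hash(key) mod number of shards."""
    return stable_hash(shard_key) % num_shards


class ConsistentHashRing:
    """Toy consistent-hash ring; each shard owns several points (virtual nodes)."""

    def __init__(self, shard_names, vnodes=100):
        # Place vnodes points per shard on the ring, sorted by hash position.
        self._ring = sorted(
            (stable_hash(f"{name}#{i}"), name)
            for name in shard_names
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, shard_key: str) -> str:
        # A key belongs to the first point clockwise from its hash (wrapping around).
        idx = bisect.bisect_right(self._points, stable_hash(shard_key)) % len(self._ring)
        return self._ring[idx][1]


def fraction_moved(before, after, keys):
    """Fraction of keys whose shard assignment differs between two routing functions."""
    moved = sum(1 for k in keys if before(k) != after(k))
    return moved / len(keys)


# Hypothetical workload: 10,000 user IDs used as shard keys.
keys = [f"user_{i}" for i in range(10_000)]

# Re-sharding from 4 to 5 shards with modulo hashing.
modulo_moved = fraction_moved(
    lambda k: modulo_shard(k, 4), lambda k: modulo_shard(k, 5), keys
)

# The same change with a consistent-hash ring.
ring4 = ConsistentHashRing(["shard-0", "shard-1", "shard-2", "shard-3"])
ring5 = ConsistentHashRing(["shard-0", "shard-1", "shard-2", "shard-3", "shard-4"])
ring_moved = fraction_moved(ring4.shard_for, ring5.shard_for, keys)

print(f"modulo: {modulo_moved:.0%} of keys moved")
print(f"ring:   {ring_moved:.0%} of keys moved")
```

With modulo hashing, most keys change shards when the shard count changes, whereas the ring moves only roughly the share of keys that the new shard takes over. Neither scheme helps when a single key is itself hot, since all rows for that key still live on one shard.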