What is data partitioning?

Q: What is data partitioning?

Answer

Data partitioning divides a large dataset into smaller, more manageable pieces. When those pieces are distributed across multiple storage nodes, it is usually called sharding, and it enables horizontal scaling of storage and query performance.

Types:
(1) Horizontal partitioning (sharding): different rows of the same table go to different partitions, for example user IDs 1-1000 on shard 1 and 1001-2000 on shard 2. Every shard has the same schema. This is the most common type.
(2) Vertical partitioning: split a table by columns, for example user profile (ID, name, email) in one partition and user settings (ID, preferences, notifications) in another. Columns that are accessed together stay together.
(3) Functional partitioning: segregate data by functional area, for example orders data in one cluster and the product catalog in another. This is similar to data isolation between microservices.

Partition strategies: range (each partition holds a contiguous range of values), hash (a hash of the partition key distributes rows evenly), list (specific values map to specific partitions), composite (a combination of these).

Considerations:
Hotspots: a partition receiving disproportionate load. Hash-based partitioning helps by spreading keys evenly, although a single very hot key still lands on one partition.
Cross-partition queries: joining data from multiple partitions is expensive, so design the data model to minimize it.
Rebalancing: adding nodes requires moving data; consistent hashing minimizes how much has to move.
Referential integrity: foreign key constraints across partitions generally cannot be enforced by the database, so the application has to maintain them.

Partitioning is often the last resort after exhausting vertical scaling, caching, and read replicas.
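To make the range and hash strategies and the rebalancing point concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular database implements partitioning: the names `stable_hash`, `range_partition`, `hash_partition`, and `ConsistentHashRing`, the node names, the use of MD5 as the hash, and the choice of 100 virtual nodes per node are all hypothetical.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def range_partition(user_id: int, upper_bounds: list[int]) -> int:
    """Range strategy: upper_bounds[i] is the largest ID stored on shard index i.

    Shard indexes are zero-based here, so index 0 is "shard 1" in the text.
    """
    idx = bisect.bisect_left(upper_bounds, user_id)
    if idx == len(upper_bounds):
        raise ValueError(f"user_id {user_id} is outside every range")
    return idx


def hash_partition(key: str, num_shards: int) -> int:
    """Hash strategy: a stable hash of the key, modulo the shard count."""
    return stable_hash(key) % num_shards


class ConsistentHashRing:
    """Consistent hashing: nodes and keys share one hash ring.

    Each node is placed at several points (virtual nodes) to even out load.
    A key belongs to the first node point at or after the key's hash.
    """

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: list[tuple[int, str]] = []  # kept sorted by hash
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._points, (stable_hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        idx = bisect.bisect_right(self._points, (stable_hash(key), ""))
        return self._points[idx % len(self._points)][1]  # wrap around the ring


# Range routing for the example in the text: IDs 1-1000 and 1001-2000.
bounds = [1000, 2000]
first_shard = range_partition(1000, bounds)
second_shard = range_partition(1001, bounds)

# Rebalancing: count how many keys change owner when growing from 4 to 5 nodes.
keys = [f"user-{i}" for i in range(10_000)]
moved_modulo = sum(hash_partition(k, 4) != hash_partition(k, 5) for k in keys)

ring = ConsistentHashRing([f"node-{i}" for i in range(4)])
before = {k: ring.node_for(k) for k in keys}
ring.add_node("node-4")
after = {k: ring.node_for(k) for k in keys}
moved_ring = sum(before[k] != after[k] for k in keys)

print(f"modulo hashing moved {moved_modulo} of {len(keys)} keys")
print(f"consistent hashing moved {moved_ring} of {len(keys)} keys")
```

With plain modulo hashing, most keys change shards when the shard count changes, because almost every key's remainder changes. With the ring, only the keys claimed by the new node move, which should be roughly one fifth of them in this setup. Neither approach fixes a single hot key, which still maps to exactly one partition.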
