How would you design a distributed file system like HDFS?
Why Interviewers Ask This
This is a differentiating question used for senior and lead roles. Interviewers want to see if you can explain not just what happens, but why — and what the trade-offs are in different approaches.
Answer
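HDFS (Hadoop Distributed File System) is designed to store very large files (terabytes to petabytes) on commodity hardware. It is optimized for throughput over latency and assumes a write-once, read-many access pattern.

Architecture: The NameNode (master) stores the filesystem metadata: the namespace (directory tree), the file → block list mapping, and the block → DataNode mapping. It holds all of this in memory for fast access. The namespace and the file → block mapping are persisted to disk (fsimage plus edit log), while block locations are rebuilt from DataNode block reports. Only one NameNode is active at a time, with a standby for high availability. DataNodes (workers) store the actual data blocks. Each DataNode sends a block report to the NameNode on startup and periodically afterwards, and sends regular heartbeats.

File storage: Files are split into large blocks (128 MB by default). Each block is replicated to 3 DataNodes (configurable). Default replica placement: the first replica goes on the writer's node (local rack), and the other two go on two different nodes of a single remote rack. This survives the loss of a whole rack while requiring only one cross-rack transfer per block write.

Write flow: The client contacts the NameNode, which selects the DataNodes and returns them as a pipeline (an ordered list). The client streams the block to the first DataNode, which forwards it to the second, which forwards it to the third (pipeline replication). Acknowledgments flow back along the pipeline to the client, and the NameNode records the block once its replicas are confirmed.

Read flow: The client asks the NameNode for block locations. The NameNode returns the replicas sorted by network distance from the client, and the client reads the data directly from a DataNode. File data never passes through the NameNode.

Fault tolerance: The NameNode detects a DataNode failure through a heartbeat timeout, finds the blocks that are now under-replicated, and schedules copies from the surviving replicas to restore the replication factor.

HA NameNode: An active and a standby NameNode share the edit log through either NFS or the QJM (Quorum Journal Manager). ZooKeeper handles leader election and automatic failover.

Erasure coding: HDFS 3.x supports erasure coding (Reed-Solomon, similar in spirit to RAID 5/6). It gives better storage efficiency than 3× replication for cold data, at the cost of more expensive reads and reconstruction when a block is lost.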
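To make the file storage rules concrete, here is a minimal Python sketch of block splitting and rack-aware replica placement. It is an illustration under stated assumptions, not HDFS code: `DataNode`, `split_into_blocks`, and `place_replicas` are hypothetical names, and the placement logic only models the "one local, two on one remote rack" rule described above.

```python
import random
from dataclasses import dataclass

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the default block size mentioned above


@dataclass(frozen=True)
class DataNode:
    """Hypothetical record for a worker: its name and the rack it sits in."""
    name: str
    rack: str


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the size in bytes of each block for a file of `file_size` bytes."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(writer: DataNode, nodes: list[DataNode],
                   rng: random.Random) -> list[DataNode]:
    """Pick 3 replicas: the writer's node, then two distinct nodes on one remote rack."""
    remote_racks: dict[str, list[DataNode]] = {}
    for node in nodes:
        if node.rack != writer.rack:
            remote_racks.setdefault(node.rack, []).append(node)
    # Only racks with at least two DataNodes can hold the second and third replica.
    candidates = sorted(r for r, members in remote_racks.items() if len(members) >= 2)
    if not candidates:
        raise ValueError("need a remote rack with at least two DataNodes")
    rack = rng.choice(candidates)
    second, third = rng.sample(remote_racks[rack], 2)
    return [writer, second, third]  # order doubles as the write pipeline
```

The returned list is ordered so it can be read as the write pipeline: the only cross-rack hop is between the first and second replica.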
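The write flow can be sketched as a chain of nodes, each storing a packet and forwarding it downstream before acknowledging. This is a simplified, synchronous model: `PipelineNode`, `build_pipeline`, `write_block`, and the packet size are assumptions for illustration, and a real pipeline streams packets asynchronously rather than waiting for each acknowledgment.

```python
from typing import Optional


class PipelineNode:
    """Hypothetical DataNode in a write pipeline."""

    def __init__(self, name: str, downstream: Optional["PipelineNode"] = None):
        self.name = name
        self.downstream = downstream
        self.stored: list[bytes] = []

    def receive(self, packet: bytes) -> list[str]:
        """Store the packet, forward it, and return acks (last node first)."""
        self.stored.append(packet)
        acks = self.downstream.receive(packet) if self.downstream else []
        return acks + [self.name]


def build_pipeline(names: list[str]) -> PipelineNode:
    """Link nodes so that names[0] forwards to names[1], and so on."""
    head: Optional[PipelineNode] = None
    for name in reversed(names):
        head = PipelineNode(name, downstream=head)
    if head is None:
        raise ValueError("pipeline needs at least one DataNode")
    return head


def write_block(head: PipelineNode, data: bytes, packet_size: int = 4) -> int:
    """Send `data` packet by packet; return the number of packets fully acknowledged."""
    expected_acks = 0
    node: Optional[PipelineNode] = head
    while node is not None:  # count nodes in the pipeline
        expected_acks += 1
        node = node.downstream
    acked_packets = 0
    for start in range(0, len(data), packet_size):
        acks = head.receive(data[start:start + packet_size])
        if len(acks) != expected_acks:
            raise RuntimeError("a DataNode in the pipeline did not acknowledge")
        acked_packets += 1
    return acked_packets
```

Note that the client only ever talks to the head of the pipeline. Each replica is written by the node before it, so the client's outbound bandwidth is spent once per block rather than three times.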
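The fault tolerance loop (heartbeat timeout, then under-replicated blocks, then re-replication) can be sketched as below. `NameNodeSketch` and its methods are hypothetical, the 30-second timeout is an arbitrary illustrative value rather than an HDFS default, and time is passed in explicitly so the logic is easy to follow and test.

```python
from collections import defaultdict


class NameNodeSketch:
    """Hypothetical in-memory model of the NameNode's liveness and block tracking."""

    def __init__(self, replication: int = 3, heartbeat_timeout: float = 30.0):
        self.replication = replication
        self.heartbeat_timeout = heartbeat_timeout  # illustrative, not an HDFS default
        self.last_heartbeat: dict[str, float] = {}
        self.block_locations: dict[str, set[str]] = defaultdict(set)

    def heartbeat(self, datanode: str, now: float) -> None:
        self.last_heartbeat[datanode] = now

    def block_report(self, datanode: str, blocks: list[str], now: float) -> None:
        """Replace everything known about this DataNode with its reported blocks."""
        self.heartbeat(datanode, now)
        for locations in self.block_locations.values():
            locations.discard(datanode)
        for block in blocks:
            self.block_locations[block].add(datanode)

    def live_datanodes(self, now: float) -> set[str]:
        return {dn for dn, seen in self.last_heartbeat.items()
                if now - seen <= self.heartbeat_timeout}

    def under_replicated(self, now: float) -> dict[str, int]:
        """Map each under-replicated block to the number of replicas it is missing."""
        live = self.live_datanodes(now)
        missing = {}
        for block, locations in self.block_locations.items():
            live_count = len(locations & live)
            if live_count < self.replication:
                missing[block] = self.replication - live_count
        return missing

    def schedule_replication(self, now: float) -> list[tuple[str, str, str]]:
        """Return (block, source, target) copy tasks that restore the replication factor."""
        live = self.live_datanodes(now)
        tasks = []
        for block, needed in sorted(self.under_replicated(now).items()):
            holders = sorted(self.block_locations[block] & live)
            if not holders:
                continue  # every replica is gone; nothing to copy from
            targets = sorted(live - self.block_locations[block])[:needed]
            tasks.extend((block, holders[0], target) for target in targets)
        return tasks
```

This sketch picks sources and targets alphabetically for determinism. A real scheduler would also weigh rack placement, free space, and current load when choosing targets.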
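The storage efficiency claim for erasure coding is simple arithmetic, shown here as a short sketch. The function names are hypothetical, and the Reed-Solomon layout of 6 data units plus 3 parity units is used as one example configuration, not as the only option.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw bytes stored per logical byte under N-way replication."""
    return Fraction(replicas)


def erasure_coding_overhead(data_units: int, parity_units: int) -> Fraction:
    """Raw bytes stored per logical byte under a (data, parity) erasure code."""
    return Fraction(data_units + parity_units, data_units)


def raw_storage(logical_bytes: int, overhead: Fraction) -> Fraction:
    """Total raw capacity needed to hold `logical_bytes` of user data."""
    return logical_bytes * overhead


# Example: 100 TB of cold data (units are TB).
three_way = raw_storage(100, replication_overhead(3))    # 300 TB raw
rs_6_3 = raw_storage(100, erasure_coding_overhead(6, 3))  # 150 TB raw
```

With these numbers, 3× replication tolerates the loss of 2 copies of a block at 3× storage cost, while the 6+3 layout tolerates the loss of any 3 units of a stripe at 1.5× cost. The trade-off is that reconstructing lost data requires reading several surviving units and doing extra computation, which is why it suits cold data.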
Pro Tip
Demonstrate both theoretical understanding and practical experience. Explain what the design is, then give a concrete example of how you have actually worked with it in a production system.