How does MongoDB handle index builds on large collections?

Why Interviewers Ask This

This question reveals whether a candidate understands the operational cost of index builds well enough to plan them on a production system. Strong answers cover locking behavior, replica set coordination, and monitoring, and back each point with hands-on experience.

Answer

Building an index on a large collection is a heavyweight operation. MongoDB 4.2 introduced the hybrid index build, which replaced the two earlier approaches (foreground and background builds).

Hybrid index build (MongoDB 4.2+): the build holds an exclusive lock on the collection only briefly, at the start and at the end. For the bulk of the work (scanning and sorting every document) it holds only an intent lock, so reads and writes continue. In practice, the collection stays available for almost the entire build.

Index build flow:
(1) Take a brief exclusive lock on the collection and start the build.
(2) Scan all documents and insert their keys into a sorted buffer.
(3) While the scan runs, record incoming writes in a "side writes" table.
(4) Flush the sorted buffer to disk, creating the initial index structure.
(5) Drain the "side writes" table, applying the writes that arrived during the build.
(6) Take a brief exclusive lock, finalize the index, and commit.
(7) Release the lock.

Performance impact: an index build consumes significant I/O (a full collection scan plus a sort) and CPU, so it can slow other workloads even though it does not block them.

Replica sets: how the build is coordinated depends on the version. Before MongoDB 4.4, the primary builds the index first and the secondaries start their own builds once the completed build replicates. Starting in MongoDB 4.4, the primary and the data-bearing secondaries build the index at the same time, and the primary commits once a commit quorum of members (all voting data-bearing members by default) is ready. In neither case do secondaries build one at a time; that only happens in a manual rolling build.

Rolling index build (manual): to keep the impact on production minimal, build on one secondary at a time by taking it out of the replica set, building the index, and letting it rejoin and catch up. Then step down the primary and build on the former primary, which is now a secondary.

Monitoring: db.currentOp({ "command.createIndexes": { $exists: true } }) lists in-progress index builds along with their progress. db.adminCommand({ currentOp: 1, $all: true }) returns all operations, including idle connections and system operations.
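The build flow described above can be made concrete with a toy model. The sketch below is not MongoDB's implementation: the collection is an in-memory Map, locks are only log entries, and the names hybridIndexBuild and concurrentWrites are hypothetical. It only illustrates why a side writes table lets the scan run while writes continue.

```javascript
// Toy model of a hybrid index build (illustrative only, not MongoDB internals).
// collection: Map of _id -> document, with numeric _id and numeric indexed field.
// concurrentWrites: writes assumed to arrive while the scan is running.
function hybridIndexBuild(collection, field, concurrentWrites) {
  const byKeyThenId = (a, b) => a.key - b.key || a.id - b.id;
  const log = ["exclusive lock: start build", "intent lock: scan and drain"];

  // Step 2: scan a snapshot of the collection into a buffer of index keys.
  const buffer = [...collection.values()].map((doc) => ({
    key: doc[field],
    id: doc._id,
  }));

  // Step 3: writes that arrive during the scan are applied to the collection
  // and also recorded in the side writes table.
  const sideWrites = [];
  for (const w of concurrentWrites) {
    if (w.op === "insert") {
      collection.set(w.doc._id, w.doc);
      sideWrites.push({ op: "insert", key: w.doc[field], id: w.doc._id });
    } else if (w.op === "delete" && collection.has(w.id)) {
      collection.delete(w.id);
      sideWrites.push({ op: "delete", id: w.id });
    }
  }

  // Step 4: sort the buffer to form the initial index structure.
  let index = buffer.sort(byKeyThenId);

  // Step 5: drain the side writes table in arrival order.
  for (const w of sideWrites) {
    if (w.op === "insert") index.push({ key: w.key, id: w.id });
    else index = index.filter((entry) => entry.id !== w.id);
  }
  index.sort(byKeyThenId);

  // Steps 6 and 7: brief exclusive lock to commit, then release.
  log.push("exclusive lock: commit", "lock released");
  return { index, log };
}

// Usage with made-up data: one insert and one delete arrive mid-build.
const orders = new Map([
  [1, { _id: 1, total: 30 }],
  [2, { _id: 2, total: 10 }],
  [3, { _id: 3, total: 20 }],
]);
const result = hybridIndexBuild(orders, "total", [
  { op: "insert", doc: { _id: 4, total: 15 } },
  { op: "delete", id: 1 },
]);
console.log(result.index.map((entry) => entry.id)); // ids in key order: 2, 4, 3
```

The finished index reflects both the snapshot and the writes that arrived during the scan, which is the property the side writes table exists to guarantee.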
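Raw currentOp output is verbose, so a small helper can make build progress easier to read. The sketch below assumes each operation document carries the usual opid, ns, msg, and progress.done / progress.total fields; a build does not report progress in every phase, so missing values are shown as "n/a". The function name summarizeIndexBuilds and the sample data are hypothetical.

```javascript
// Summarize index builds from the `inprog` array of a currentOp result.
// Pure function: it needs no server connection, so it can be tried in Node.
function summarizeIndexBuilds(inprog) {
  return inprog
    // Keep only operations whose command is createIndexes.
    .filter((op) => op.command && op.command.createIndexes !== undefined)
    .map((op) => {
      const p = op.progress;
      const pct =
        p && p.total > 0 ? Math.floor((p.done / p.total) * 100) + "%" : "n/a";
      return `${op.opid} ${op.ns} ${pct} ${op.msg || ""}`.trim();
    });
}

// Made-up sample shaped like currentOp output.
const sampleOps = [
  {
    opid: 101,
    ns: "shop.orders",
    command: { createIndexes: "orders" },
    msg: "Index Build: scanning collection",
    progress: { done: 250, total: 1000 },
  },
  { opid: 102, ns: "shop.orders", command: { find: "orders" } },
];
console.log(summarizeIndexBuilds(sampleOps));
// prints [ '101 shop.orders 25% Index Build: scanning collection' ]
```

In mongosh, the same function could be fed live data with summarizeIndexBuilds(db.currentOp({ "command.createIndexes": { $exists: true } }).inprog).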

Pro Tip

Demonstrate both theoretical understanding and practical experience. Explain how the hybrid build works, then describe a time you built an index on a large production collection and how you limited its impact.