🍃 MongoDB Intermediate

What is MongoDB aggregation pipeline optimization?

Why Interviewers Ask This

Mid-level MongoDB roles involve writing aggregation pipelines that run against large collections. Interviewers ask this to separate candidates who understand how the optimizer reorders and merges stages, and how stage order affects index use, from those who only know the stage syntax.

Answer

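MongoDB performs several automatic optimizations on aggregation pipelines, and there are best practices you should follow on top of them.

Automatic optimizations:
(1) $sort + $match reordering: when a $match follows a $sort, the optimizer moves the $match ahead of the $sort so that fewer documents have to be sorted.
(2) $limit and $skip coalescence: two consecutive $limit stages collapse into one with the smaller value, and two consecutive $skip stages collapse into one with the summed value. A $limit that directly follows a $sort is folded into the sort, which then only tracks the top N documents.
(3) $match + $match coalescence: two consecutive $match stages are merged into one, with their conditions combined by $and.
(4) Stage reordering: MongoDB may move a $match, or the parts of it that do not depend on computed fields, ahead of stages such as $project, $addFields, or $unwind to reduce the document count early. A filter on a value that an earlier stage computes, such as a $group accumulator result, has to stay after that stage.

Manual optimizations:
(1) Put $match early: filter documents as soon as possible to reduce the data volume flowing through the pipeline.
(2) Use indexes: $match and $sort can use a collection index only when they run at the start of the pipeline, before stages that reshape documents such as $project, $unwind, or $group. A $match placed after those stages cannot use an index unless the optimizer is able to move it forward.
(3) Project early: use $project to drop unneeded fields, which reduces memory use and processing in later stages.
(4) $limit right after $sort: if only N results are needed, place $limit directly after $sort so the sort keeps only the top N documents in memory instead of the full sorted set.
(5) Avoid $unwind + $group when a per-document expression works: to sum the elements of an array, apply $sum to the array field in a $project or $addFields stage. Inside $group, $sum treats an array value as non-numeric, so the array has to be summed per document first.
(6) Design indexes for $match: create indexes, typically compound ones, that cover the fields in the $match filter instead of relying on index intersection.
(7) allowDiskUse: each stage is limited to 100 MB of memory. For large aggregations that exceed the limit, allow stages to spill to disk: db.orders.aggregate([...], { allowDiskUse: true }).

Monitor with explain(): db.orders.explain("executionStats").aggregate([...]) shows the plan after optimization and whether an index was used.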
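The manual practices can be sketched as a single pipeline. This is a minimal illustration under stated assumptions, not a reference implementation: the orders collection layout (status, createdAt, customerId, and an items array of subdocuments with a numeric price), the index on { status: 1, createdAt: -1 }, and the function name buildTopCustomersPipeline are all hypothetical.

```javascript
// Sketch only: collection, field, and index names below are hypothetical.
// Assumed documents: { status, createdAt, customerId, items: [{ price: Number }] }
// Assumed index:     db.orders.createIndex({ status: 1, createdAt: -1 })

function buildTopCustomersPipeline(status, since, topN) {
  return [
    // Filter first: as the first stage, $match can use the assumed index.
    { $match: { status: status, createdAt: { $gte: since } } },

    // Drop unneeded fields early and sum the array per document.
    // In $project, $sum traverses the array that "$items.price" resolves to,
    // so no $unwind is needed.
    { $project: { _id: 0, customerId: 1, orderTotal: { $sum: "$items.price" } } },

    // Group the already-reduced documents.
    { $group: { _id: "$customerId", revenue: { $sum: "$orderTotal" } } },

    // $limit directly after $sort lets the sort keep only the top N groups.
    { $sort: { revenue: -1 } },
    { $limit: topN },
  ];
}

const pipeline = buildTopCustomersPipeline("shipped", new Date("2024-01-01"), 5);

// Print the stage order to confirm the shape of the pipeline.
console.log(JSON.stringify(pipeline.map((stage) => Object.keys(stage)[0])));
// prints ["$match","$project","$group","$sort","$limit"]
```

In mongosh, the resulting array could be passed to db.orders.aggregate(pipeline, { allowDiskUse: true }), or to db.orders.explain("executionStats").aggregate(pipeline) to check whether the first stage actually uses the index.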

Pro Tip

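Name a MongoDB-specific detail in your answer, such as the 100 MB per-stage memory limit or the fact that a $match can only use an index at the start of the pipeline, and mention that you would verify the plan with explain(). Highlighting those nuances shows expertise rather than generic knowledge.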