🍃 MongoDB Intermediate

What are the MongoDB aggregation $bucket and $bucketAuto stages?

Why Interviewers Ask This

Mid-level MongoDB roles require a solid understanding of the aggregation pipeline. Interviewers ask this to separate candidates who understand how range-based grouping works from those who only know the stage names.

Answer

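$bucket categorizes documents into user-defined groups (buckets) based on the value of an expression, much like a histogram or a GROUP BY over value ranges.

Syntax: { $bucket: { groupBy: "$price", boundaries: [0, 50, 100, 200, 500], default: "Other", output: { count: { $sum: 1 }, avgPrice: { $avg: "$price" }, products: { $push: "$name" } } } }

Each document is placed in the bucket where lower boundary ≤ value < upper boundary, and each output document's _id is that bucket's lower boundary. The boundaries array must be in ascending order, with all values of the same type. Documents whose value falls outside every boundary go to the bucket named by default. The default field is optional, but without it a document that falls outside the boundaries makes the stage fail with an error. If output is omitted, each bucket contains only a count field.

$bucketAuto determines the bucket boundaries automatically, aiming to distribute documents evenly across a specified number of buckets.

Syntax: { $bucketAuto: { groupBy: "$price", buckets: 5, output: { count: { $sum: 1 }, avgPrice: { $avg: "$price" } } } }

MongoDB computes the boundaries so that each bucket holds approximately the same number of documents. Each output document's _id is a document with min and max fields, and the stage can return fewer buckets than requested when there are too few distinct values.

Use cases:
(1) Price range analysis: "how many products in each price range?"
(2) Age distribution: "user count by age group"
(3) Performance histograms: "request count by latency bucket"
(4) Grade distribution

The granularity option in $bucketAuto rounds the computed boundaries to a preferred number series (POWERSOF2, E12, R5, etc.), which gives conventional, round-looking ranges. It can be used only when all groupBy values are numeric.

$bucket vs $group with $switch: both achieve similar results. $bucket is more concise for range-based grouping, while $group + $switch offers more flexibility, for example uneven or non-contiguous conditions.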
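The two stages above can be sketched as pipeline objects, as you might pass them to aggregate() in mongosh. This is a minimal sketch, not a definitive implementation: the collection name products and the fields name and price are assumptions for illustration. The helper bucketIdFor is hypothetical and not part of MongoDB. It only mimics, in plain JavaScript, the lower ≤ value < upper rule for numeric values so the boundary behavior can be checked without a database.

```javascript
// Assumed schema (hypothetical): a "products" collection with "name" and "price" fields.
// In mongosh these would run as db.products.aggregate(bucketPipeline), etc.

const bucketPipeline = [
  {
    $bucket: {
      groupBy: "$price",
      boundaries: [0, 50, 100, 200, 500], // ascending, all the same type
      default: "Other",                   // _id for values outside every range
      output: {
        count: { $sum: 1 },
        avgPrice: { $avg: "$price" },
        products: { $push: "$name" },
      },
    },
  },
];

const bucketAutoPipeline = [
  {
    $bucketAuto: {
      groupBy: "$price",
      buckets: 5, // requested number of buckets; MongoDB picks the boundaries
      output: { count: { $sum: 1 }, avgPrice: { $avg: "$price" } },
    },
  },
];

// Illustrative helper, NOT a MongoDB API: returns the bucket _id that $bucket
// would assign to a numeric value, i.e. the lower boundary of the range
// where lower <= value < upper, or defaultId when no range matches.
function bucketIdFor(value, boundaries, defaultId) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (value >= boundaries[i] && value < boundaries[i + 1]) {
      return boundaries[i];
    }
  }
  return defaultId;
}

const boundaries = bucketPipeline[0].$bucket.boundaries;
console.log(bucketIdFor(49.99, boundaries, "Other")); // 0
console.log(bucketIdFor(50, boundaries, "Other"));    // 50
console.log(bucketIdFor(500, boundaries, "Other"));   // Other
```

Note that 50 lands in the bucket with _id 50 rather than 0, and 500 falls into the default bucket, because the upper boundary of each range is exclusive.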

Pro Tip

If you're unsure about a detail, say so honestly and explain your reasoning. Interviewers respect candidates who can think through uncertainty rather than bluffing.