How would you design a notification system?

Why Interviewers Ask This

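Notification systems come up in mid-level system design interviews because they exercise several core mechanics at once: queue-based fan-out, third-party provider integration, delivery guarantees, and per-user preferences. Interviewers ask this to separate candidates who can reason about those mechanics from those who only know surface-level concepts.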

Answer

A notification system delivers messages to users across multiple channels (push notifications, email, SMS, in-app) based on events.

Requirements: multiple channel types; high throughput (millions of notifications per day); at-least-once delivery; extensibility for new channels; user preferences (do not disturb, channel preferences).

Architecture: Event sources (payment service, social service, etc.) publish events to a message queue (Kafka). The notification service consumes events, applies user preferences, and determines which channels to use. Channel workers (email worker, push worker, SMS worker) each consume from their own dedicated queue and deliver via third-party providers.

Channels: Push: APNs (iOS), FCM (Android). Email: SendGrid, SES, Mailgun. SMS: Twilio, AWS SNS.

Flow: event → Kafka → notification service (look up user preferences + device tokens) → fan out to channel-specific queues → workers call provider APIs → provider delivers to device.

User preferences: store in the DB as (user_id, channel, enabled, quiet_hours). Check before sending.

Rate limiting: don't spam users; limit notifications per user per hour.

Delivery tracking: store a notification record (id, user_id, channel, status: sent/delivered/read, timestamp). Update the status via provider delivery receipts/webhooks.

Retry: providers may fail, so retry with exponential backoff and route persistent failures to a dead letter queue.

Template engine: store notification templates in the DB and render them with user-specific data.

Unsubscribe/opt-out: honor immediately, store in the DB, and check before any send.
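The preference check, per-user rate limit, fan-out, and retry steps described above can be sketched in a few dozen lines. This is a minimal in-memory illustration, not a production design: every name here (Preference, HourlyRateLimiter, NotificationService, deliver_with_retry) is hypothetical, plain Python lists stand in for Kafka topics and the dead letter queue, and treating a missing preference row as "not enabled" is an assumption rather than something the answer specifies.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Preference:
    """One (user_id, channel) preference row; field names are illustrative."""
    enabled: bool = True
    quiet_hours: Optional[Tuple[int, int]] = None  # (start_hour, end_hour), e.g. (22, 7)


def in_quiet_hours(hour, quiet_hours):
    """True if `hour` (0-23) falls inside the quiet window; handles wrap past midnight."""
    if quiet_hours is None:
        return False
    start, end = quiet_hours
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end


class HourlyRateLimiter:
    """Sliding one-hour window of send timestamps per user."""

    def __init__(self, max_per_hour):
        self.max_per_hour = max_per_hour
        self.sent = defaultdict(deque)

    def allow(self, user_id, now):
        window = self.sent[user_id]
        while window and now - window[0] >= 3600:
            window.popleft()  # drop sends older than one hour
        if len(window) >= self.max_per_hour:
            return False
        window.append(now)
        return True


class NotificationService:
    """Applies preferences and rate limits, then fans out to per-channel queues."""

    def __init__(self, prefs, limiter):
        self.prefs = prefs                       # {(user_id, channel): Preference}
        self.limiter = limiter
        self.channel_queues = defaultdict(list)  # stand-in for channel-specific queues

    def handle_event(self, user_id, channels, payload, now, hour):
        eligible = []
        for channel in channels:
            pref = self.prefs.get((user_id, channel))
            # Assumption: no preference row means the user has not opted in.
            if pref is None or not pref.enabled:
                continue
            if in_quiet_hours(hour, pref.quiet_hours):
                continue
            eligible.append(channel)
        # Only spend rate-limit budget when something would actually be sent.
        if not eligible or not self.limiter.allow(user_id, now):
            return []
        for channel in eligible:
            self.channel_queues[channel].append((user_id, payload))
        return eligible


def deliver_with_retry(send, message, dead_letters, max_attempts=4,
                       base_delay=0.5, sleep=time.sleep):
    """Call `send(message)`, retrying with exponential backoff on any exception.

    After `max_attempts` failures the message is appended to `dead_letters`.
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    dead_letters.append(message)
    return False
```

In a real deployment each channel worker would wrap its provider call (APNs, SendGrid, Twilio, and so on) in something like deliver_with_retry. Because delivery is at-least-once, a retried send can reach the provider twice, so deduplicating on the notification id is a common addition that this sketch leaves out.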

Pro Tip

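Notification design has system-design-specific nuances that a generic programming answer misses, such as at-least-once delivery, retries with backoff and a dead letter queue, per-user rate limiting, and immediate opt-out handling. Highlighting those nuances in your answer shows expertise rather than generic knowledge.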