What are AWS multi-region and disaster recovery strategies?

Why Interviewers Ask This

Interviewers ask this to evaluate whether you have the depth of knowledge needed to mentor others and lead technical decisions. The expected answer goes beyond definitions into practical implications and real-world consequences.

Answer

AWS disaster recovery (DR) strategies are chosen against two targets. RTO (Recovery Time Objective) is the maximum acceptable downtime. RPO (Recovery Point Objective) is the maximum acceptable data loss, expressed as a window of time. The four standard strategies, from cheapest to most expensive:

1. Backup and Restore (RTO: hours, RPO: hours). Back up data to another region (S3 Cross-Region Replication, RDS cross-region snapshot copies, EC2 AMI copies). On disaster, restore from backup in the DR region. You pay only for backup storage, so this is the lowest-cost option with the longest recovery time. Use it for non-critical systems.

2. Pilot Light (RTO: minutes to hours, RPO: minutes). A minimal version of the core infrastructure is always running in the DR region (for example an RDS replica and the most critical servers). On disaster, scale the pilot light up to production capacity. The database is already nearly in sync because replication is near real-time. Slightly more expensive than Backup and Restore.

3. Warm Standby (RTO: minutes, RPO: seconds to minutes). A scaled-down but fully functional copy of the system runs in the DR region and receives real-time data replication. On disaster, scale it up to full production capacity. Route 53 health checks with failover routing switch DNS automatically. A good balance of cost and recovery speed.

4. Multi-Site Active/Active (RTO: near zero, RPO: near zero). Multiple regions serve production traffic simultaneously, with Route 53 latency-based or geolocation routing distributing requests. Every region is fully sized, so expect roughly double the infrastructure cost. Use it for mission-critical applications.

Key services for DR: Route 53 health checks and failover routing; S3 Cross-Region Replication (CRR); RDS cross-region read replicas and replica promotion; Aurora Global Database (cross-region replication, typically with an RPO under 1 second); DynamoDB global tables (active-active, multi-region); CloudFormation (recreate infrastructure from code); AMI copying across regions; AWS Elastic Disaster Recovery (DRS, continuous server replication).
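One way to make the cost-versus-recovery trade-off concrete is to treat the four strategies as an ordered list and pick the cheapest one whose RTO and RPO fit the business targets. The sketch below is illustrative only: the `Strategy` class, the `cheapest_strategy` helper, and the numeric ceilings (for example, reading "hours" as 24 hours) are assumptions made for this example, not AWS-published guarantees. Real figures depend on your workload and should come from DR drills.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rto_seconds: int  # assumed worst-case time to recover
    rpo_seconds: int  # assumed worst-case window of lost data


# Ordered cheapest first. The numbers are illustrative assumptions that turn
# the qualitative ranges ("hours", "minutes", "near zero") into ceilings.
STRATEGIES = [
    Strategy("Backup and Restore", 24 * 3600, 24 * 3600),
    Strategy("Pilot Light", 4 * 3600, 15 * 60),
    Strategy("Warm Standby", 15 * 60, 5 * 60),
    Strategy("Multi-Site Active/Active", 60, 5),
]


def cheapest_strategy(rto_target_seconds: int,
                      rpo_target_seconds: int) -> Optional[Strategy]:
    """Return the cheapest strategy meeting both targets, or None if none does."""
    for strategy in STRATEGIES:  # walk from cheapest to most expensive
        if (strategy.rto_seconds <= rto_target_seconds
                and strategy.rpo_seconds <= rpo_target_seconds):
            return strategy
    return None


if __name__ == "__main__":
    # Example: the business tolerates 1 hour of downtime and 10 minutes of data loss.
    choice = cheapest_strategy(rto_target_seconds=3600, rpo_target_seconds=600)
    print(choice.name if choice else "No strategy meets these targets")
```

In an interview, the useful point is the direction of the search: start from the cheapest option and move up only when a target forces it, rather than defaulting to active/active.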

Common Mistake

Candidates often recite the four strategies as a textbook list. Interviewers are more impressed when you tie the choice of strategy to a specific problem you solved in a real AWS project, such as the RTO and RPO targets you were given and the trade-offs you made to meet them.