What is horizontal scaling in microservices?

Answer

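Horizontal scaling (scaling out) means adding more instances of a service to handle increased load, rather than upgrading the hardware of a single instance (vertical scaling, or scaling up). Microservices lend themselves to horizontal scaling: each service is independently deployable, so you can scale only the services that are under load without touching the others. This works best when a service is stateless, meaning session and persistent data live in an external store such as a database or cache, so that any instance behind the load balancer can handle any request. For example, if the image-processing service is the bottleneck, you scale it from 2 to 20 instances while leaving the user service unchanged. The Kubernetes Horizontal Pod Autoscaler (HPA) automates this by monitoring metrics such as CPU or memory utilization (or custom metrics) and adding or removing pods to stay near a target value, within a configured minimum and maximum replica count. A monolith can also be scaled out, but only as a whole, so every copy carries the resource cost of the entire application. Scaling per service is therefore one of the primary cost and performance advantages of microservices over monoliths.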
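
To make the autoscaling step concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The function name, parameter names, and default bounds are hypothetical and chosen for illustration. The 0.1 tolerance mirrors the documented default. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, which this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative sketch only).

    current_metric and target_metric share a unit, for example average
    CPU utilization in percent across the service's pods.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Skip scaling when the metric is already close to the target,
    # which avoids flapping on small fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Image-processing service: 2 pods averaging 90% CPU, target 50%.
    print(desired_replicas(2, 90, 50))
    # Severe overload: the result is capped at max_replicas.
    print(desired_replicas(2, 500, 50))
```

With 2 pods averaging 90% CPU against a 50% target, the ratio is 1.8, so the sketch asks for ceil(3.6) = 4 pods. Under much heavier load the result is capped at max_replicas, which corresponds to the ceiling of 20 instances in the example above.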