What is distributed logging and how do you correlate logs across microservices?

Answer

Distributed logging is the practice of collecting, centralizing, and querying log output from every microservice instance in a system. Because a single user request can touch dozens of services, correlating log entries across those services is critical for debugging.

The key enabler is a correlation ID (also called a request ID or trace ID): a unique identifier generated at the system entry point, typically the API gateway, and propagated in a request header through every downstream service call. Each service includes the correlation ID in every log line it emits, so an operator can search the central log store (Elasticsearch with Kibana in the ELK stack, or Grafana Loki) for that one ID and retrieve every log line a specific request produced, across all services.

Structured logging (JSON with consistent fields such as service, level, trace_id, and timestamp) makes the logs machine-queryable. Log aggregation agents (Fluentd, Filebeat, Vector) collect logs from container stdout streams and ship them to the central store, so the collection step itself requires no application code changes as long as each service writes its logs to stdout.
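
The flow described above can be sketched with Python's standard library alone. This is a minimal illustration under stated assumptions, not a reference implementation: the header name X-Correlation-ID and the helpers begin_request, outbound_headers, and make_logger are hypothetical names chosen for this example, and the inbound headers are modeled as a plain dict rather than a real web framework's request object.

```python
import contextvars
import json
import logging
import sys
import uuid

# Assumption: the header name is illustrative; teams pick their own convention.
CORRELATION_HEADER = "X-Correlation-ID"

# Holds the ID for the request currently being handled in this context.
_correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with consistent fields."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": self.service,
            "trace_id": _correlation_id.get(),
            "message": record.getMessage(),
        })


def begin_request(headers):
    """Reuse the caller's ID if one arrived, otherwise generate a new one.

    Simplification: the lookup is case-sensitive, unlike real HTTP headers.
    """
    cid = headers.get(CORRELATION_HEADER) or uuid.uuid4().hex
    _correlation_id.set(cid)
    return cid


def outbound_headers():
    """Headers to attach to every downstream call made for this request."""
    return {CORRELATION_HEADER: _correlation_id.get()}


def make_logger(service, stream=None):
    """Build a logger that writes JSON lines to stdout (or a given stream)."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter(service))
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("gateway")
    begin_request({})  # entry point: no inbound ID, so one is generated
    log.info("request accepted")
    # A downstream service would call begin_request(outbound_headers())
    # and its log lines would then carry the same trace_id.
    log.info("forwarding with headers %s", outbound_headers())
```

Because every line is a JSON object carrying trace_id, filtering the central store on that one field returns the full path of a request. Writing to stdout keeps the service unaware of how its logs are collected, which is what lets an agent pick them up without further changes.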