How do you monitor and operate Kafka in production?

Question

Accepted Answer

Production Kafka monitoring stack:

JMX metrics: Kafka exposes 700+ metrics via JMX. The critical ones are UnderReplicatedPartitions (should be 0), OfflinePartitionsCount (should be 0), ActiveControllerCount (should sum to exactly 1 across the cluster), BytesInPerSec/BytesOutPerSec (throughput), and RequestQueueSize (a persistently growing queue indicates broker overload).

Prometheus + Grafana: use the JMX Exporter to expose broker metrics for Prometheus to scrape, and visualize them with Kafka-specific Grafana dashboards.

Consumer lag: use kafka-consumer-groups.sh, or Burrow / Kafka Lag Exporter for continuous tracking.

Operational tools: kafk
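The thresholds listed above can be turned into a simple alerting check. The following is a minimal sketch, not a definitive implementation: it assumes the metric values have already been collected (for example from a JMX Exporter scrape) into plain dictionaries keyed by the short metric names used in this answer. The function names, the dictionary layout, and the `max_request_queue` default are all hypothetical and chosen only for illustration. Consumer lag is computed per partition as the log end offset minus the committed offset.

```python
from typing import Dict, List

# Metrics that the answer says should be exactly 0 on every broker.
ZERO_METRICS = ("UnderReplicatedPartitions", "OfflinePartitionsCount")


def check_cluster_health(
    brokers: Dict[str, Dict[str, float]],
    max_request_queue: float = 100,  # hypothetical threshold; tune per cluster
) -> List[str]:
    """Return human-readable alerts for a snapshot of per-broker metrics.

    `brokers` maps a broker id to {metric name: value}. A missing metric is
    treated as 0 here, which is an assumption; a real check would likely
    alert on missing data as well.
    """
    alerts: List[str] = []
    active_controllers = 0.0
    for broker_id in sorted(brokers):
        metrics = brokers[broker_id]
        for name in ZERO_METRICS:
            value = metrics.get(name, 0)
            if value != 0:
                alerts.append(f"{broker_id}: {name}={value} (expected 0)")
        queue = metrics.get("RequestQueueSize", 0)
        if queue > max_request_queue:
            alerts.append(
                f"{broker_id}: RequestQueueSize={queue} (limit {max_request_queue})"
            )
        # Each broker reports 0 or 1; the cluster-wide sum should be 1.
        active_controllers += metrics.get("ActiveControllerCount", 0)
    if active_controllers != 1:
        alerts.append(
            f"cluster: ActiveControllerCount={active_controllers} (expected 1)"
        )
    return alerts


def consumer_lag(
    end_offsets: Dict[int, int], committed: Dict[int, int]
) -> Dict[int, int]:
    """Per-partition lag: log end offset minus committed offset.

    Partitions without a committed offset are skipped rather than guessed.
    """
    return {p: end_offsets[p] - committed[p] for p in committed if p in end_offsets}
```

In practice the same conditions are usually expressed as Prometheus alerting rules rather than application code; the sketch only makes the logic behind those rules explicit.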
