What is Kafka Streams?

Answer

Kafka Streams is a client library for building real-time stream processing applications that read their input from Kafka topics and write their output back to Kafka topics. Unlike batch processing, which collects a bounded dataset and then processes it as a whole, Kafka Streams processes each record continuously as it arrives.

Key features:

- Topology: a graph of processing nodes (source, processor, sink).
- High-level DSL: operations like filter(), map(), groupBy(), aggregate(), join().
- State stores: local state for stateful operations (counts, aggregates), kept in RocksDB by default.
- Windowing: time-window operations (tumbling, hopping, session).
- No separate cluster: Kafka Streams is a library that runs inside your application, so there is no external processing cluster to operate (unlike Flink or Spark).
- Fault tolerance: state is backed up to Kafka changelog topics, so a restarted or relocated instance can rebuild its local state stores.

Kafka Streams is a good fit for Java/Scala applications that already use Kafka and need stream processing without the operational complexity of a separate processing cluster.
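To make the windowing and state-store ideas concrete, here is a minimal plain-Java sketch of a tumbling-window count. It deliberately does not use the Kafka Streams API: the class TumblingWindowCount and its methods are hypothetical names made up for this illustration, and the in-memory map only stands in for a state store. It assumes windows are aligned to the epoch, so each timestamp falls into exactly one fixed-size window.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; this is NOT the Kafka Streams API.
// It mimics what a windowed count keeps as state: one counter per (key, window).
final class TumblingWindowCount {
    private final long windowSizeMs;
    // key -> (window start timestamp -> number of records seen)
    private final Map<String, Map<Long, Long>> counts = new HashMap<>();

    TumblingWindowCount(long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        this.windowSizeMs = windowSizeMs;
    }

    // Start of the epoch-aligned window that contains the given timestamp.
    long windowStart(long timestampMs) {
        return Math.floorDiv(timestampMs, windowSizeMs) * windowSizeMs;
    }

    // Record one event for the key, incrementing the counter of its window.
    void add(String key, long timestampMs) {
        counts.computeIfAbsent(key, k -> new HashMap<>())
              .merge(windowStart(timestampMs), 1L, Long::sum);
    }

    // Current count for the key in the window starting at windowStartMs (0 if none).
    long count(String key, long windowStartMs) {
        return counts.getOrDefault(key, Map.of()).getOrDefault(windowStartMs, 0L);
    }

    public static void main(String[] args) {
        TumblingWindowCount clicks = new TumblingWindowCount(60_000L); // 1-minute windows
        clicks.add("page-a", 5_000L);
        clicks.add("page-a", 59_999L);
        clicks.add("page-a", 60_000L); // first record of the next window
        System.out.println(clicks.count("page-a", 0L));      // prints 2
        System.out.println(clicks.count("page-a", 60_000L)); // prints 1
    }
}
```

In a real Kafka Streams application the per-window counters would live in a state store managed by the library and backed by a changelog topic, rather than in a HashMap held by your own code.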