What is Kafka tiered storage?
Answer
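Kafka Tiered Storage (KIP-405) lets Kafka move older log segments to cheap remote storage (for example S3, Azure Blob Storage, or GCS) while keeping recent data on local broker disk. It shipped as an early-access feature in Kafka 3.6 and was declared production-ready in Kafka 3.9. Kafka defines the remote storage plugin interface; the backend for a particular object store is supplied by a plugin. Brokers retain only recent segments locally for fast consumption; older segments are fetched transparently from remote storage when a consumer reads them.

Benefits:
- Cost reduction: object storage is typically far cheaper per GB than broker-attached SSD (roughly one tenth is the commonly quoted figure, though it varies by provider and storage class).
- Broker scalability: broker disks only need to hold recent data, enabling smaller, cheaper brokers.
- Longer retention: data can be retained for months or years affordably.
- Independent scaling: compute and storage are decoupled, so brokers or partitions can be added for throughput without provisioning local disk for the full retention window.

Before tiered storage, the only options for long-term retention were large, expensive broker disks or a separate S3 sink connector running alongside Kafka. Tiered storage makes Kafka a viable long-term event store without those operational workarounds.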
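To make the local/remote split concrete, here is a minimal Python sketch, not an official tool. It builds the topic-level settings for a tiered topic and estimates how much data lands on broker disk versus remote storage. The config keys `remote.storage.enable`, `retention.ms`, and `local.retention.ms` are the KIP-405 topic configs (the broker must also set `remote.log.storage.system.enable=true`). The function names, the workload numbers, and the sizing model are assumptions made for illustration; the model ignores the active segment, indexes, and compression.

```python
GIB = 1024 ** 3
DAY_MS = 24 * 60 * 60 * 1000


def tiered_topic_config(total_retention_days, local_retention_days):
    """Return topic-level configs for a tiered topic (hypothetical helper).

    The keys are real Kafka topic configs from KIP-405; the values are
    strings because topic configs are passed as strings.
    """
    if local_retention_days > total_retention_days:
        raise ValueError("local retention cannot exceed total retention")
    return {
        "remote.storage.enable": "true",
        # Total time data stays readable (local + remote).
        "retention.ms": str(int(total_retention_days * DAY_MS)),
        # How long rolled segments are also kept on broker disk.
        "local.retention.ms": str(int(local_retention_days * DAY_MS)),
    }


def storage_split(ingest_bytes_per_day, total_retention_days,
                  local_retention_days, replication_factor=3):
    """Rough steady-state footprint in bytes (simplified, assumed model).

    - Local disk holds only the local retention window, once per replica.
    - Remote storage holds roughly one copy of the full retention window,
      since segments are uploaded after they roll and replication is left
      to the object store.
    - Without tiering, every replica holds the full retention window.
    """
    return {
        "local_tiered": ingest_bytes_per_day * local_retention_days * replication_factor,
        "remote_tiered": ingest_bytes_per_day * total_retention_days,
        "local_untiered": ingest_bytes_per_day * total_retention_days * replication_factor,
    }


if __name__ == "__main__":
    # Assumed workload: 100 GiB/day, 30 days total, 1 day kept locally.
    print(tiered_topic_config(30, 1))
    for name, size in storage_split(100 * GIB, 30, 1).items():
        print(f"{name}: {size / GIB:.0f} GiB")
```

The resulting dictionary could be supplied as topic configs when creating or altering a topic with whichever admin client or CLI you already use. The sizing function is only a first-order estimate for comparing broker disk requirements with and without tiering.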