What is Kafka log compaction?
Answer
Log compaction is a Kafka retention policy that retains the latest value for each message key indefinitely, discarding older records with the same key. Enable it with cleanup.policy=compact on a topic. Unlike time-based or size-based retention (which deletes old messages based on age/size), compaction keeps at least the last message for every key — like a key-value store embedded in Kafka. Compaction runs in the background as a periodic cleanup job. Use cases: Database change data capture (CDC) — a compacted topic of userId → userRecord gives the latest state for every user. Application configuration topics. Event sourcing materialized views. A key with a null value is a tombstone — after compaction, both the tombstone and the key are deleted, effectively deleting the entry. Compaction and time-based retention can be combined.