What is Kafka topic retention?
Answer
Kafka topic retention defines how long messages are kept before the broker deletes them. With the default cleanup.policy=delete, there are two retention limits:

- Time-based: retention.ms. Delete segments older than the configured duration. The default is 604800000 ms (7 days).
- Size-based: retention.bytes. Delete the oldest segments when the partition's log exceeds the limit. The limit applies per partition, not per topic, and the default is -1 (no size limit).

Both can be combined; whichever limit is hit first triggers deletion.

Key nuances:

- Messages are deleted at the segment level, not individually. An entire segment file is deleted once its newest message is past the retention threshold, so individual messages can outlive retention.ms until the rest of their segment expires.
- The active segment (the one being written to) is generally not deleted. It becomes eligible only after it is rolled and closed, which is governed by segment.bytes and segment.ms.
- Log compaction (cleanup.policy=compact) is an alternative: it keeps the latest value per key instead of deleting by time or size.
- For compliance scenarios, use very long retention or tiered storage backed by an object store such as S3.
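To make the segment-level behavior concrete, here is a minimal Python sketch of how a delete policy might pick segments for one partition. It is a simplified model, not Kafka's actual code: the Segment class, the segments_to_delete function, and the sample numbers are hypothetical names and values chosen for illustration, and the active segment is treated as always exempt, as described above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment of a partition."""
    base_offset: int
    size_bytes: int
    max_timestamp_ms: int  # timestamp of the newest message in the segment
    active: bool = False   # True for the segment currently being written to


def segments_to_delete(segments: List[Segment], now_ms: int,
                       retention_ms: int = -1,
                       retention_bytes: int = -1) -> List[int]:
    """Return base offsets of segments a delete policy would remove.

    `segments` must be ordered oldest first. A value of -1 disables a limit.
    """
    deletable = []
    total = sum(s.size_bytes for s in segments)
    for seg in segments:
        if seg.active:
            break  # simplification: the active segment is never removed
        # Time limit: the whole segment expires only when its newest
        # message is older than retention_ms.
        expired = retention_ms >= 0 and now_ms - seg.max_timestamp_ms > retention_ms
        # Size limit: drop the oldest segment only if the partition would
        # still hold at least retention_bytes afterwards.
        oversize = retention_bytes >= 0 and total - seg.size_bytes >= retention_bytes
        if not (expired or oversize):
            break  # deletion always proceeds from the oldest end
        deletable.append(seg.base_offset)
        total -= seg.size_bytes
    return deletable


example_log = [
    Segment(base_offset=0, size_bytes=100, max_timestamp_ms=1_000),
    Segment(base_offset=100, size_bytes=100, max_timestamp_ms=5_000),
    Segment(base_offset=200, size_bytes=100, max_timestamp_ms=9_000),
    Segment(base_offset=300, size_bytes=50, max_timestamp_ms=9_500, active=True),
]

# Time limit only: just the first segment is older than 6000 ms.
print(segments_to_delete(example_log, now_ms=10_000, retention_ms=6_000))
# Both limits: the size limit removes one more segment than time alone.
print(segments_to_delete(example_log, now_ms=10_000,
                         retention_ms=6_000, retention_bytes=150))
```

With both limits set, the second call selects segments 0 and 100: the first is past the time limit, and the second is removed because the partition would still hold at least 150 bytes without it. Even with retention_ms=0, the sketch leaves the active segment in place.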