What is a time-series database and when should you use one?

Why Interviewers Ask This

This tests whether you can apply system design knowledge to real-world scenarios. Interviewers are looking for clarity of thought and evidence that you've worked with this in production systems.

Answer

A time-series database (TSDB) is optimized for storing and querying data indexed by time: sequences of timestamped data points. Unlike general-purpose databases, which treat time as just another column, TSDBs have specialized storage engines, compression, and query capabilities designed for temporal data.

Characteristics of time-series data: data arrives mostly in time order (append-heavy, with roughly increasing timestamps); write throughput is high (thousands of data points per second or more); queries usually aggregate over time ranges (for example, average CPU over the last hour); retention policies apply (delete data older than N days or months); and recent data is accessed far more often than old data.

Optimizations in TSDBs: chunked storage by time window (recent chunks in memory, older chunks compressed on disk); time-aware compression (delta encoding for timestamps and XOR compression for values, both of which exploit the small differences between consecutive points); efficient range queries through time-based indexing; automatic downsampling (for example, keep raw data for 7 days and hourly aggregates for 1 year); and TTL-based retention.

Examples: InfluxDB, TimescaleDB (a PostgreSQL extension), Prometheus, VictoriaMetrics, Druid, and ClickHouse (an OLAP database that handles time-series workloads well).

Use cases: infrastructure metrics (CPU, memory, network), IoT sensor data, financial tick data, application performance monitoring (APM), user analytics (page views per minute), and server logs.

When NOT to use a TSDB: data without a strong time dimension, or data that needs complex joins with other data models.
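
To make the compression and downsampling ideas above concrete in an interview, a small sketch can help. The Python below is illustrative only and is not how any particular TSDB implements these steps: all function names are hypothetical, and real engines typically go further by packing the resulting small numbers into variable-length bit fields, which is omitted here.

```python
import struct
from collections import defaultdict
from statistics import mean


def delta_of_delta_encode(timestamps):
    """Store the first timestamp, then the change in delta between points.

    Regularly spaced samples produce mostly zeros, which compress well.
    (Hypothetical helper, for illustration only.)
    """
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev_ts, prev_delta = timestamps[0], 0
    for ts in timestamps[1:]:
        delta = ts - prev_ts
        out.append(delta - prev_delta)
        prev_ts, prev_delta = ts, delta
    return out


def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out


def float_bits(x):
    """Return the 64-bit IEEE 754 pattern of a float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_encode(values):
    """XOR each value's bit pattern with the previous value's.

    Identical consecutive values yield 0; similar values yield integers
    with many zero bits. The first entry is the raw bit pattern.
    """
    out, prev = [], 0
    for v in values:
        bits = float_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


def downsample(points, bucket_seconds):
    """Average (timestamp, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: mean(vals) for start, vals in sorted(buckets.items())}
```

For example, the timestamps 1000, 1010, 1020, 1030, 1041 encode to [1000, 10, 0, 0, 1], and a sensor reporting 21.5 three times in a row produces XOR results of 0 after the first value. Calling downsample([(0, 1.0), (30, 3.0), (60, 5.0)], 60) returns one averaged value per minute: {0: 2.0, 60: 5.0}.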

Pro Tip

Back up your answer with a specific project or situation. Saying 'In my last System Design project, I used this when...' immediately makes your answer more credible and memorable.