What is a time-series database and when should you use one?

Answer

A time-series database (TSDB) is optimized for storing and querying data indexed by time: sequences of data points, each with a timestamp. Unlike general-purpose databases, which treat time as just another column, TSDBs have specialized storage engines, compression, and query capabilities designed for temporal data.

Characteristics of time-series data: data arrives roughly in time order (timestamps generally increase, though late or out-of-order points can occur), write throughput is high (thousands of data points per second or more), queries usually aggregate over time ranges (for example, average CPU over the last hour), retention policies apply (delete data older than N days or months), and recent data is accessed far more often than old data.

Optimizations in TSDBs: chunked storage by time window (recent chunks in memory, older chunks compressed on disk), time-aware compression (delta encoding for timestamps and XOR compression for values, both of which exploit the small differences between consecutive points), efficient range queries through time-based indexing, automatic downsampling (for example, keep raw data for 7 days and hourly aggregates for 1 year), and TTL-based retention.

Examples: InfluxDB, TimescaleDB (a PostgreSQL extension), Prometheus, VictoriaMetrics, Druid, and ClickHouse (an OLAP database that handles time-series workloads well).

Use cases: infrastructure metrics (CPU, memory, network), IoT sensor data, financial tick data, application performance monitoring (APM), user analytics (page views per minute), and server logs.

When NOT to use a TSDB: the data has no strong time dimension, or it needs complex joins with other data models. In those cases a general-purpose database is usually the better fit.
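The optimizations listed above can be illustrated with a small sketch. This is not how any particular TSDB implements them: production engines typically add bit-packing, delta-of-delta encoding, and on-disk chunk formats. The function names (delta_encode, xor_bits, downsample, apply_retention) and the sample data are hypothetical, chosen only for illustration.

```python
import struct
from collections import defaultdict
from statistics import mean


def delta_encode(timestamps):
    """Store the first timestamp, then only the gap to the previous one."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        encoded.append(cur - prev)
    return encoded


def delta_decode(encoded):
    """Rebuild the original timestamps with a running sum."""
    decoded = []
    total = 0
    for value in encoded:
        total += value
        decoded.append(total)
    return decoded


def xor_bits(a, b):
    """XOR the 64-bit IEEE 754 bit patterns of two floats.

    Similar consecutive values share most bits, so the result is mostly
    zeros, which a real encoder would then store compactly.
    """
    (ia,) = struct.unpack(">Q", struct.pack(">d", a))
    (ib,) = struct.unpack(">Q", struct.pack(">d", b))
    return ia ^ ib


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def apply_retention(points, now, max_age_seconds):
    """Drop points older than the retention window (a TTL in miniature)."""
    cutoff = now - max_age_seconds
    return [(ts, value) for ts, value in points if ts >= cutoff]


# Hypothetical samples taken every 10 seconds.
timestamps = [1700000000, 1700000010, 1700000020, 1700000030]
print(delta_encode(timestamps))  # [1700000000, 10, 10, 10]

# Identical consecutive values XOR to zero.
print(xor_bits(21.5, 21.5))  # 0

# Two one-minute buckets from three points.
print(downsample([(0, 1.0), (30, 3.0), (60, 5.0)], 60))  # {0: 2.0, 60: 5.0}
```

Regularly sampled timestamps collapse to a run of small, repeated deltas, and slowly changing values XOR to mostly zero bits, which is why these encodings compress metric data so well. Downsampling and retention trade resolution and history for storage, so the bucket width and maximum age are policy choices rather than fixed constants.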
