How do you model time-series data in Cassandra?

Answer

Time-series data is one of Cassandra's strongest use cases. The recommended pattern is to partition by entity plus time bucket:

CREATE TABLE sensor_readings (sensor_id UUID, day DATE, recorded_at TIMESTAMP, value DOUBLE, PRIMARY KEY ((sensor_id, day), recorded_at)) WITH CLUSTERING ORDER BY (recorded_at DESC);

The partition key combines the entity ID (sensor_id) and the time bucket (day), so each sensor's data for each day lives in its own partition and partition size stays bounded. Clustering by recorded_at DESC stores the newest rows first, which makes retrieving the most recent readings efficient.

Query: SELECT * FROM sensor_readings WHERE sensor_id = ? AND day = ? LIMIT 100;

Because of the descending clustering order, this returns the 100 most recent readings for that sensor on that day.

Why time bucketing: a single partition holding a sensor's entire history would grow without bound. Bucketing by day, week, or month caps partition size.

Bucket granularity: choose based on data volume. High-frequency sensors need smaller buckets (hour); low-frequency sensors can use larger ones (month).

Use TTLs to expire old data automatically: INSERT ... USING TTL 2592000 (2,592,000 seconds = 30 days).

Never use ALLOW FILTERING for time-range queries. Design the schema so that every query names the full partition key (sensor_id, day) and restricts only the clustering column (recorded_at).
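One consequence of day bucketing is that a time range crossing midnight touches more than one partition, so the client has to issue one query per day bucket. The following is a minimal Python sketch of that fan-out, not a definitive implementation. The helper names day_buckets and range_query_plan are hypothetical, the sketch assumes the day column was written as the UTC date of recorded_at, and it only builds (statement, parameters) pairs; executing them is left to whatever driver is in use.

```python
from datetime import date, datetime, timedelta, timezone

# CQL for the sensor_readings table described above. The bounds on recorded_at
# restrict the clustering column inside one partition, so ALLOW FILTERING
# is not needed.
RANGE_QUERY = (
    "SELECT recorded_at, value FROM sensor_readings "
    "WHERE sensor_id = ? AND day = ? "
    "AND recorded_at >= ? AND recorded_at < ?"
)


def day_buckets(start: datetime, end: datetime) -> list[date]:
    """Return every day bucket touched by the half-open range [start, end), newest first."""
    if end <= start:
        return []
    first = start.date()
    # Step back one microsecond so an end of exactly midnight does not
    # pull in an extra, empty day bucket.
    last = (end - timedelta(microseconds=1)).date()
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return days[::-1]


def range_query_plan(sensor_id, start: datetime, end: datetime):
    """Build one (statement, parameters) pair per day bucket in the range."""
    return [
        (RANGE_QUERY, (sensor_id, day, start, end))
        for day in day_buckets(start, end)
    ]


if __name__ == "__main__":
    start = datetime(2024, 3, 1, 22, 0, tzinfo=timezone.utc)
    end = datetime(2024, 3, 3, 0, 0, tzinfo=timezone.utc)
    for statement, params in range_query_plan("sensor-1", start, end):
        print(params[1], statement)
```

Buckets are returned newest first to match the DESC clustering order, so a caller that needs only the latest N rows can stop once it has collected enough. With hourly or monthly buckets the same idea applies, with the bucket step changed accordingly.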
