How do you model time-series data in Cassandra?
Answer
Time-series data is one of Cassandra's strongest use cases. The recommended pattern is to partition by entity plus time bucket:

  CREATE TABLE sensor_readings (
    sensor_id UUID,
    day DATE,
    recorded_at TIMESTAMP,
    value DOUBLE,
    PRIMARY KEY ((sensor_id, day), recorded_at)
  ) WITH CLUSTERING ORDER BY (recorded_at DESC);

The partition key combines the entity ID (sensor_id) and the time bucket (day), so each sensor's data for each day gets its own partition and partition size stays bounded. Clustering by recorded_at DESC stores the newest rows first within a partition, which makes retrieving the most recent readings efficient:

  SELECT * FROM sensor_readings WHERE sensor_id = ? AND day = ? LIMIT 100;

Why time bucketing: a single partition holding a sensor's entire history would grow without bound. Bucketing by day, week, or month caps partition size.

Bucket granularity: choose it based on data volume. High-frequency sensors need smaller buckets (hour); low-frequency sensors can use larger ones (month).

Use TTLs to expire old data automatically: INSERT ... USING TTL 2592000 (the TTL is in seconds; 2592000 is 30 days).

Never use ALLOW FILTERING for time-range queries; design the schema to avoid it. With this schema, a time range inside one bucket is a bound on the clustering column (recorded_at >= ? AND recorded_at < ?) with the full partition key specified, and a range that spans several buckets takes one query per bucket.
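Two pieces of arithmetic fall out of this design: how many rows one bucket will hold at a given write rate, and which day buckets a time range touches (one query per bucket). The following is a minimal Python sketch of both, not part of any Cassandra driver. The helper names rows_per_partition and day_buckets are hypothetical, and it assumes the day column holds the UTC date of recorded_at and that timestamps are timezone-aware.

```python
from datetime import date, datetime, timedelta, timezone
from typing import List


def rows_per_partition(readings_per_second: float, bucket: timedelta) -> int:
    """Estimate how many rows one (sensor_id, bucket) partition will hold."""
    return int(round(readings_per_second * bucket.total_seconds()))


def day_buckets(start: datetime, end: datetime) -> List[date]:
    """Return the UTC day buckets touched by [start, end), newest first.

    Assumption: the `day` partition column is the UTC date of `recorded_at`.
    Both arguments must be timezone-aware datetimes.
    """
    if end <= start:
        return []
    first = start.astimezone(timezone.utc).date()
    # `end` is exclusive, so step back one microsecond before taking its date.
    last = (end.astimezone(timezone.utc) - timedelta(microseconds=1)).date()
    span = (last - first).days
    # Newest first, matching CLUSTERING ORDER BY (recorded_at DESC).
    return [last - timedelta(days=i) for i in range(span + 1)]
```

A caller would loop over day_buckets(start, end) and run the per-partition SELECT once for each returned day, stopping early once enough rows have been collected. Comparing rows_per_partition for an hour, a day, and a month at the expected write rate is one way to pick the bucket granularity.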