What are Cassandra's anti-patterns and how do you avoid them?

Answer

Common Cassandra anti-patterns:

1. Using ALLOW FILTERING: forces Cassandra to scan partitions across the cluster and filter the rows afterwards. Fix: redesign the table so the queried column is part of the partition or clustering key.
2. Unbounded partitions: writing all data under the same partition key (e.g., status = 'active'), so the partition grows without limit. Fix: add a time bucket or UUID to the partition key.
3. Large partitions: partitions exceeding roughly 100MB cause compaction and read performance issues. Fix: add time bucketing to spread data across partitions.
4. Too many tombstones: frequent deletes accumulate tombstones, which slow down reads. Fix: use TTL instead of explicit deletes; reduce gc_grace_seconds if appropriate; avoid delete-heavy patterns.
5. Misusing batches: using LOGGED BATCH across multiple partitions for performance. This actually hurts performance and adds coordinator overhead. Batches exist for atomicity and are cheapest when every statement targets a single partition; they are not a bulk-loading optimization. Fix: keep batches to one partition where possible and send bulk writes as individual asynchronous statements.
6. Secondary indexes on very high or very low cardinality columns: queries fan out across the cluster. Fix: use materialized views or query tables.
7. Using LWT everywhere: roughly a 10x performance hit, because each lightweight transaction runs a Paxos round with extra round trips. Fix: redesign to avoid compare-and-set requirements.
8. Skipping repairs: causes data inconsistency and can let deleted data reappear once its tombstones are purged. Fix: run repairs regularly, completing each cycle within gc_grace_seconds.
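
The time-bucketing fix for unbounded and large partitions can be sketched roughly as below. This is a minimal illustration, not a definitive implementation: the table name events_by_status, its columns, the one-day bucket size, and the helper names day_bucket, partition_key, and buckets_between are all assumptions made for this example and do not come from the answer above. The code only computes keys and does not talk to a cluster.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table, shown only for context (not executed here).
# The composite partition key (status, day_bucket) caps each partition
# at one day of data for a given status.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS events_by_status (
    status     text,
    day_bucket text,
    event_time timestamp,
    event_id   uuid,
    payload    text,
    PRIMARY KEY ((status, day_bucket), event_time, event_id)
);
"""


def day_bucket(ts: datetime) -> str:
    """Return the UTC calendar day of a timezone-aware timestamp as 'YYYY-MM-DD'."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(status: str, ts: datetime) -> tuple:
    """Build the (status, day_bucket) pair used as the partition key on write."""
    return (status, day_bucket(ts))


def buckets_between(start: datetime, end: datetime) -> list:
    """List every day bucket a time-range read has to query, oldest first."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    buckets = []
    while day <= last:
        buckets.append(day.isoformat())
        day += timedelta(days=1)
    return buckets


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(partition_key("active", now))
    print(buckets_between(now - timedelta(days=2), now))
```

The trade-off in this sketch is that a read spanning several days must issue one query per bucket, so the bucket size should be chosen to keep each partition comfortably small without making typical reads touch too many partitions.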
