Skip to content

Cassandra

What is Apache Cassandra?

Apache Cassandra is a distributed, wide-column NoSQL database designed for high availability and linear scalability with no single point of failure.

  • Type: Wide-column / Column-family NoSQL database
  • Written in: Java
  • License: Apache 2.0
  • Protocol: CQL (Cassandra Query Language) over TCP
  • Default Ports: 9042 (CQL), 7000 (inter-node)
  • Originally developed by: Facebook (2008)

Core Concepts

Data Model

Data Model

Primary Key = Partition Key + Clustering Columns

Terminology

Concept Description
Keyspace Top-level namespace (like database)
Table Collection of rows
Partition Key Determines which node stores data
Clustering Column Sorts rows within a partition
Primary Key Partition key + clustering columns
Partition Unit of data distribution
Node Single Cassandra server
Ring Cluster topology
Token Hash value for data distribution

Architecture

Ring Architecture

All nodes are equal - no master/slave. Data distributed by consistent hashing. Each node owns a range of tokens.

Replication

Replication

Data replicated to RF nodes. Coordinator handles client requests.


Core Features

Core Features


Common Use Cases

1. Time-Series Data

-- User activity timeline
CREATE TABLE user_activity (
    user_id UUID,
    activity_time TIMESTAMP,
    activity_type TEXT,
    details MAP<TEXT, TEXT>,
    PRIMARY KEY (user_id, activity_time)
) WITH CLUSTERING ORDER BY (activity_time DESC);

-- Insert activity
INSERT INTO user_activity (user_id, activity_time, activity_type, details)
VALUES (123e4567-e89b-12d3-a456-426614174000,
        '2024-01-15 10:30:00',
        'page_view',
        {'page': '/products', 'duration': '30'});

-- Get recent activities
SELECT * FROM user_activity
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
LIMIT 50;

2. IoT Sensor Data

-- Sensor readings with time-based partitioning
CREATE TABLE sensor_readings (
    sensor_id TEXT,
    date DATE,
    reading_time TIMESTAMP,
    temperature DOUBLE,
    humidity DOUBLE,
    PRIMARY KEY ((sensor_id, date), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Query day's readings
SELECT * FROM sensor_readings
WHERE sensor_id = 'sensor-001'
AND date = '2024-01-15'
AND reading_time >= '2024-01-15 00:00:00'
AND reading_time < '2024-01-15 12:00:00';

3. Messaging / Chat

-- Messages table
CREATE TABLE messages (
    conversation_id UUID,
    message_time TIMEUUID,
    sender_id UUID,
    content TEXT,
    PRIMARY KEY (conversation_id, message_time)
) WITH CLUSTERING ORDER BY (message_time DESC);

-- Get recent messages
SELECT * FROM messages
WHERE conversation_id = ?
LIMIT 50;

-- With TTL (auto-delete after 30 days)
INSERT INTO messages (conversation_id, message_time, sender_id, content)
VALUES (?, now(), ?, ?)
USING TTL 2592000;

4. User Profiles / Sessions

-- User sessions
CREATE TABLE user_sessions (
    session_id TEXT PRIMARY KEY,
    user_id UUID,
    created_at TIMESTAMP,
    last_accessed TIMESTAMP,
    session_data TEXT
);

-- With TTL for auto-expiration
INSERT INTO user_sessions (session_id, user_id, created_at, last_accessed, session_data)
VALUES ('sess_abc123', 123e4567-e89b-12d3-a456-426614174000,
        toTimestamp(now()), toTimestamp(now()), '{}')
USING TTL 86400;  -- 24 hours

5. Product Catalog

-- Products by category
CREATE TABLE products_by_category (
    category TEXT,
    product_id UUID,
    name TEXT,
    price DECIMAL,
    description TEXT,
    attributes MAP<TEXT, TEXT>,
    PRIMARY KEY (category, product_id)
);

-- Products by ID (for direct lookup)
CREATE TABLE products_by_id (
    product_id UUID PRIMARY KEY,
    category TEXT,
    name TEXT,
    price DECIMAL,
    description TEXT,
    attributes MAP<TEXT, TEXT>
);

-- Query by category
SELECT * FROM products_by_category WHERE category = 'electronics';

6. Leaderboards

-- Gaming leaderboard (with bucketing)
CREATE TABLE leaderboard (
    game_id TEXT,
    bucket INT,  -- Top N in each bucket
    score BIGINT,
    user_id UUID,
    username TEXT,
    PRIMARY KEY ((game_id, bucket), score, user_id)
) WITH CLUSTERING ORDER BY (score DESC);

-- Get top players
SELECT * FROM leaderboard
WHERE game_id = 'game1' AND bucket = 0
LIMIT 100;

Data Modeling Principles

1. Query-First Design

❌ Traditional (normalize, then query)
   Entities → Relationships → Tables → Queries

✓ Cassandra (know queries first)
   Queries → Tables (one table per query pattern)

2. Denormalization

-- Instead of joins, denormalize data
-- Query 1: Get orders by customer
CREATE TABLE orders_by_customer (
    customer_id UUID,
    order_time TIMESTAMP,
    order_id UUID,
    product_name TEXT,  -- Denormalized from products
    amount DECIMAL,
    PRIMARY KEY (customer_id, order_time)
) WITH CLUSTERING ORDER BY (order_time DESC);

-- Query 2: Get orders by product
CREATE TABLE orders_by_product (
    product_id UUID,
    order_time TIMESTAMP,
    order_id UUID,
    customer_name TEXT,  -- Denormalized from customers
    amount DECIMAL,
    PRIMARY KEY (product_id, order_time)
) WITH CLUSTERING ORDER BY (order_time DESC);

3. Partition Sizing

Partition too large:
- Hotspots
- Slow queries
- Memory pressure

Partition too small:
- Too many partitions
- Overhead

Sweet spot: 10MB - 100MB per partition
Max: ~2 billion cells per partition

4. Bucketing for Large Partitions

-- Problem: All user's data in one partition
CREATE TABLE user_events (
    user_id UUID,
    event_time TIMESTAMP,
    data TEXT,
    PRIMARY KEY (user_id, event_time)
);

-- Solution: Bucket by time
CREATE TABLE user_events (
    user_id UUID,
    month TEXT,  -- "2024-01"
    event_time TIMESTAMP,
    data TEXT,
    PRIMARY KEY ((user_id, month), event_time)
);

Consistency Levels

Write Consistency

Level Description Use Case
ANY Write to any node (including hints) Maximum availability
ONE Write to one replica Fast writes, eventual consistency
TWO Write to two replicas Moderate consistency
THREE Write to three replicas Higher consistency
QUORUM Write to majority (RF/2 + 1) Strong consistency
LOCAL_QUORUM Quorum in local DC Multi-DC with local consistency
EACH_QUORUM Quorum in each DC Strong multi-DC consistency
ALL Write to all replicas Maximum consistency, lowest availability

Read Consistency

Level Description Use Case
ONE Read from one replica Fast reads
QUORUM Read from majority Strong consistency
LOCAL_QUORUM Quorum in local DC Multi-DC
ALL Read from all replicas Maximum consistency

Consistency Formula

For strong consistency:
READ_CL + WRITE_CL > REPLICATION_FACTOR

Example (RF=3):
- QUORUM (2) + QUORUM (2) = 4 > 3 ✓
- ONE (1) + QUORUM (2) = 3 = 3 ✗
- ONE (1) + ALL (3) = 4 > 3 ✓

Write Path

Write Path

Components

Component Description
Commit Log Append-only WAL for durability
Memtable In-memory write buffer
SSTable Immutable on-disk sorted files
Compaction Merges SSTables, removes tombstones

Java Driver Usage

// Connect
CqlSession session = CqlSession.builder()
    .addContactPoint(new InetSocketAddress("cassandra", 9042))
    .withLocalDatacenter("datacenter1")
    .withKeyspace("myapp")
    .build();

// Prepared statements (reuse!)
PreparedStatement insertUser = session.prepare(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)");

PreparedStatement getUser = session.prepare(
    "SELECT * FROM users WHERE id = ?");

// Insert
BoundStatement bound = insertUser.bind(
    UUID.randomUUID(), "John Doe", "[email protected]");
session.execute(bound);

// Query
ResultSet rs = session.execute(getUser.bind(userId));
Row row = rs.one();
if (row != null) {
    String name = row.getString("name");
    String email = row.getString("email");
}

// Async
CompletionStage<AsyncResultSet> future = session.executeAsync(getUser.bind(userId));
future.thenAccept(resultSet -> {
    Row row = resultSet.one();
    // process row
});

// Batch (for same partition only!)
BatchStatement batch = BatchStatement.newInstance(BatchType.LOGGED)
    .add(insertUser.bind(id1, "User1", "[email protected]"))
    .add(insertUser.bind(id1, "User1 Updated", "[email protected]"));
session.execute(batch);

Trade-offs

Pros Cons
No single point of failure No joins
Linear scalability Limited query flexibility
High write throughput Query-first data modeling
Multi-DC replication Denormalization required
Tunable consistency Eventual consistency by default
Automatic partitioning Operational complexity
TTL support No aggregations (native)
Time-series optimized Secondary indexes limited

Performance Characteristics

Metric Typical Value
Write latency 1-5ms
Read latency 1-10ms
Throughput 10,000+ ops/sec/node
Storage Petabytes
Nodes 3 to 1000+
Partition size 10-100MB recommended

Anti-Patterns

-- ❌ SELECT without partition key (full table scan)
SELECT * FROM users WHERE email = '[email protected]';

-- ❌ High cardinality partition key (too many small partitions)
PRIMARY KEY (uuid_for_each_row)

-- ❌ Unbounded partition growth
PRIMARY KEY (user_id)  -- All user's data forever

-- ❌ Using IN with many values
SELECT * FROM users WHERE user_id IN (1,2,3,4,5,...100);

-- ❌ Cross-partition batch
BEGIN BATCH
  INSERT INTO users (id, name) VALUES (1, 'A');
  INSERT INTO users (id, name) VALUES (2, 'B');  -- Different partition!
APPLY BATCH;

-- ❌ Using ALLOW FILTERING in production
SELECT * FROM users WHERE age > 25 ALLOW FILTERING;

When to Use Cassandra

Good For: - Time-series data - IoT sensor data - User activity/events - Messaging applications - Product catalogs - Session management - High write throughput - Multi-datacenter deployments

Not Good For: - Ad-hoc queries - Complex relationships (use RDBMS) - Transactions across partitions - Aggregations (use Spark on top) - Small datasets (overkill) - Frequently changing schemas


Cassandra vs Alternatives

Feature Cassandra MongoDB DynamoDB ScyllaDB
Data Model Wide-column Document Key-value Wide-column
Consistency Tunable Tunable Strong/Eventual Tunable
Scalability Linear Horizontal Automatic Linear
Joins No Limited No No
Query Language CQL MQL API CQL
Managed DataStax Astra Atlas AWS Scylla Cloud
Written In Java C++ - C++

Best Practices

  1. Design for queries - Know access patterns first
  2. Right-size partitions - 10-100MB, avoid hotspots
  3. Use prepared statements - Always, for performance
  4. Avoid ALLOW FILTERING - Design tables to avoid it
  5. Use LOCAL_QUORUM - For multi-DC deployments
  6. Set appropriate TTL - For time-bound data
  7. Use UUIDs or TIMEUUIDs - For unique IDs
  8. Monitor compaction - Can impact performance
  9. Repair regularly - Ensure consistency
  10. Limit partition tombstones - Can cause performance issues

CQL Cheat Sheet

-- Keyspace
CREATE KEYSPACE myapp WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3, 'dc2': 3
};

-- Table
CREATE TABLE users (
    id UUID PRIMARY KEY,
    name TEXT,
    email TEXT
);

-- Table with clustering
CREATE TABLE events (
    user_id UUID,
    event_time TIMESTAMP,
    event_type TEXT,
    PRIMARY KEY (user_id, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Insert
INSERT INTO users (id, name, email) VALUES (uuid(), 'John', '[email protected]');

-- Insert with TTL
INSERT INTO users (id, name, email) VALUES (uuid(), 'John', '[email protected]')
USING TTL 86400;

-- Update
UPDATE users SET name = 'Jane' WHERE id = ?;

-- Delete
DELETE FROM users WHERE id = ?;

-- Select
SELECT * FROM users WHERE id = ?;
SELECT * FROM events WHERE user_id = ? AND event_time > '2024-01-01';

-- Secondary index (use sparingly)
CREATE INDEX ON users (email);

-- Materialized view
CREATE MATERIALIZED VIEW users_by_email AS
    SELECT * FROM users
    WHERE email IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (email, id);