Fundamentals¶
What is a Distributed System?¶
A collection of independent computers that appears to users as a single coherent system.
CAP Theorem¶
Of the three guarantees (Consistency, Availability, Partition tolerance), a system can provide at most two at the same time; during a network partition it must give up either consistency or availability.
Reality: Network partitions WILL happen, so you're really choosing between CP and AP.
PACELC Theorem¶
Extension of CAP: if there is a Partition, trade off Availability vs. Consistency; Else (during normal operation), trade off Latency vs. Consistency.
Consistency Models¶
Strong Consistency¶
Every read returns the most recent write. Appears as if there's only one copy of data.
// Strong consistency example
// After write completes, ALL readers see new value
client1.write("x", 10); // Completes
client2.read("x"); // Always returns 10
client3.read("x"); // Always returns 10
Eventual Consistency¶
Given enough time without new updates, all replicas converge to the same value.
// Eventual consistency example
client1.write("x", 10); // Returns immediately
client2.read("x"); // Might return old value
// ... time passes ...
client2.read("x"); // Eventually returns 10
Causal Consistency¶
Operations that are causally related are seen in the same order by all nodes; concurrent operations may be observed in different orders.
// Causal consistency
client1.write("x", 1); // A
client1.write("y", 2); // B (causally after A)
// All nodes see A before B
client2.write("z", 3); // C (concurrent with A, B)
// C can be seen in any order relative to A, B
Read-Your-Writes Consistency¶
A process always sees its own writes.
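A minimal in-memory sketch of one way to get this guarantee: the session remembers the version of its last write and falls back to the leader whenever a follower has not yet caught up. The Replica and Session classes (and the version counter) are illustrative, not any particular database's API.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative replica: applies writes tagged with a monotonically increasing version.
class Replica {
    final Map<String, String> data = new ConcurrentHashMap<>();
    volatile long version = 0;

    void apply(String key, String value, long newVersion) {
        data.put(key, value);
        version = Math.max(version, newVersion);
    }
}

// Session that reads its own writes by remembering the version of its last write.
class Session {
    private final Replica leader;
    private final Replica follower;
    private long lastWrittenVersion = 0;

    Session(Replica leader, Replica follower) {
        this.leader = leader;
        this.follower = follower;
    }

    void write(String key, String value) {
        long v = leader.version + 1;     // sketch only: real systems assign versions atomically
        leader.apply(key, value, v);
        lastWrittenVersion = v;
    }

    String read(String key) {
        // If the follower has not replicated our own writes yet, read from the leader instead.
        Replica source = follower.version >= lastWrittenVersion ? follower : leader;
        return source.data.get(key);
    }
}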
Monotonic Reads¶
If a process reads value v, subsequent reads won't return older values.
// Monotonic reads
client1.read("x"); // Returns version 5
client1.read("x"); // Returns version >= 5
Linearizability¶
The strongest consistency model: every operation appears to take effect instantaneously at some point between its invocation and its response.
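There is no short distributed implementation to show here, but as a single-process analogy, java.util.concurrent atomics behave linearizably: each operation takes effect atomically at one instant between invocation and response, so every thread observes the same single order of operations.
import java.util.concurrent.atomic.AtomicInteger;

public class LinearizableCounter {
    // Each get/compareAndSet on an AtomicInteger appears to occur atomically
    // at a single point between its invocation and its response.
    private static final AtomicInteger x = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> x.compareAndSet(0, 10)); // "write x = 10"
        Thread reader = new Thread(() -> {
            // The read is consistent with one global order: it sees either 0
            // (CAS not yet taken effect) or 10 (CAS already taken effect), never a mix.
            System.out.println("read x = " + x.get());
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println("final x = " + x.get()); // always 10
    }
}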
Consensus Algorithms¶
Why Consensus?¶
Distributed nodes must agree on:
- Leader election
- Transaction commit
- State machine replication
- Configuration changes
Paxos¶
The original consensus algorithm: correct, but famously hard to understand and implement, which is what motivated Raft.
Raft (Easier to Understand)¶
Raft Guarantees:
- Election Safety: at most one leader can be elected in a given term
- Leader Append-Only: a leader never overwrites or deletes entries in its log; it only appends
- Log Matching: if two logs contain an entry with the same index and term, all preceding entries are identical
- Leader Completeness: once an entry is committed, it is present in the logs of all future leaders
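A simplified sketch of the vote-granting rule a follower applies when it receives a RequestVote RPC (field names are illustrative; persistence and RPC plumbing are omitted). The one-vote-per-term check enforces Election Safety, and the log up-to-dateness check supports Leader Completeness.
// Simplified Raft follower state (sketch only).
class RaftFollower {
    long currentTerm = 0;
    Integer votedFor = null;   // candidateId voted for in currentTerm, or null
    long lastLogIndex = 0;
    long lastLogTerm = 0;

    // Decide whether to grant a vote for a RequestVote RPC.
    synchronized boolean handleRequestVote(long candidateTerm, int candidateId,
                                           long candidateLastLogIndex, long candidateLastLogTerm) {
        if (candidateTerm < currentTerm) {
            return false;                       // stale candidate
        }
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm;        // newer term: reset our vote
            votedFor = null;
        }
        // Election Safety: at most one vote per term.
        boolean notYetVoted = (votedFor == null || votedFor == candidateId);
        // Leader Completeness: only vote for candidates whose log is at least as up to date.
        boolean logUpToDate = candidateLastLogTerm > lastLogTerm
                || (candidateLastLogTerm == lastLogTerm && candidateLastLogIndex >= lastLogIndex);
        if (notYetVoted && logUpToDate) {
            votedFor = candidateId;
            return true;
        }
        return false;
    }
}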
Vector Clocks¶
Track causality in distributed systems without synchronized clocks.
Use cases: conflict detection in eventually consistent stores (Amazon's Dynamo, Riak)
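A minimal vector clock sketch: each node increments its own slot on a local event, takes element-wise maxima when it receives a message, and two clocks are concurrent (a conflict to resolve) when neither dominates the other. Node IDs are plain strings here for illustration.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class VectorClock {
    private final Map<String, Long> counters = new HashMap<>();

    // Local event (or send): increment this node's own counter.
    void tick(String nodeId) {
        counters.merge(nodeId, 1L, Long::sum);
    }

    // On receive: take the element-wise maximum, then tick the receiving node.
    void mergeFrom(VectorClock other, String receivingNodeId) {
        other.counters.forEach((node, count) -> counters.merge(node, count, Math::max));
        tick(receivingNodeId);
    }

    // a happened-before b  <=>  every counter in a is <= its counter in b, and they differ.
    static boolean happenedBefore(VectorClock a, VectorClock b) {
        boolean strictlySmallerSomewhere = false;
        Set<String> nodes = new HashSet<>(a.counters.keySet());
        nodes.addAll(b.counters.keySet());
        for (String node : nodes) {
            long av = a.counters.getOrDefault(node, 0L);
            long bv = b.counters.getOrDefault(node, 0L);
            if (av > bv) return false;
            if (av < bv) strictlySmallerSomewhere = true;
        }
        return strictlySmallerSomewhere;
    }

    // Concurrent (a conflict to resolve) when neither happened before the other.
    static boolean concurrent(VectorClock a, VectorClock b) {
        return !happenedBefore(a, b) && !happenedBefore(b, a);
    }
}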
Replication Strategies¶
Single-Leader (Primary-Backup)¶
Multi-Leader¶
Leaderless (Dynamo-style)¶
Partitioning (Sharding)¶
Hash Partitioning¶
// Simple hash partitioning
int partition = Math.floorMod(hash(key), numPartitions); // floorMod keeps the result non-negative
// Problem: adding/removing nodes reshuffles most keys
// Example for one key: N=3 gives hash(key) % 3 == 1
//                      N=4 gives hash(key) % 4 == 2 // same key, different partition!
Consistent Hashing¶
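A minimal sketch of a consistent-hash ring using a sorted map and virtual nodes: a key is owned by the first node clockwise from its hash, so adding or removing a node only remaps roughly K/N keys. The hash function and virtual-node count below are arbitrary illustration choices.
import java.util.SortedMap;
import java.util.TreeMap;

class ConsistentHashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // Place each physical node at several points on the ring for smoother distribution.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // A key belongs to the first node clockwise from its hash position.
    String nodeFor(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes");
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        int position = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(position);
    }

    // Any reasonably uniform hash works; FNV-1a is used here just for illustration.
    private static int hash(String s) {
        int h = 0x811c9dc5;
        for (byte b : s.getBytes()) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h & 0x7fffffff;                  // keep positions non-negative
    }
}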
Range Partitioning¶
Failure Detection¶
Heartbeat¶
Phi Accrual Failure Detector¶
Outputs a continuous suspicion level (φ) instead of a binary alive/dead verdict.
φ = -log10(P(heartbeat arrives more than t after the previous one))
- φ = 1: ~10% chance the node is actually alive and the heartbeat is merely delayed
- φ = 2: ~1% chance the node is actually alive
- φ = 3: ~0.1% chance the node is actually alive
The application picks the threshold that matches its tolerance for false positives.
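A toy sketch that tracks the mean heartbeat interval and, assuming exponentially distributed intervals for simplicity (production detectors such as Akka's and Cassandra's fit a normal distribution instead), turns the time since the last heartbeat into φ.
class PhiAccrualDetector {
    private double meanIntervalMs = 1000.0;   // running estimate of the heartbeat interval
    private long lastHeartbeatMs = System.currentTimeMillis();
    private static final double ALPHA = 0.1;  // exponential moving average weight

    // Record an incoming heartbeat and update the interval estimate.
    synchronized void heartbeat() {
        long now = System.currentTimeMillis();
        long interval = now - lastHeartbeatMs;
        meanIntervalMs = (1 - ALPHA) * meanIntervalMs + ALPHA * interval;
        lastHeartbeatMs = now;
    }

    // phi = -log10(P(heartbeat arrives later than the time already elapsed)).
    // With exponentially distributed intervals, P(delay > t) = exp(-t / mean),
    // so phi = (t / mean) * log10(e).
    synchronized double phi() {
        double elapsed = System.currentTimeMillis() - lastHeartbeatMs;
        return (elapsed / meanIntervalMs) * Math.log10(Math.E);
    }

    // The application picks its own threshold, e.g. suspect the node when phi > 8.
    boolean isSuspected(double threshold) {
        return phi() > threshold;
    }
}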
Gossip Protocol¶
Distributed Transactions¶
Two-Phase Commit (2PC)¶
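A bare-bones sketch of the coordinator side, under the assumption of an illustrative Participant interface: phase 1 asks every participant to prepare and vote, phase 2 commits only if all voted yes, otherwise aborts. Real implementations also log every vote and decision durably; the blocking problem appears when the coordinator crashes between the two phases.
import java.util.List;

// Illustrative participant interface; real systems persist votes and decisions to a log.
interface Participant {
    boolean prepare(String txId);   // vote yes (true) or no (false)
    void commit(String txId);
    void abort(String txId);
}

class TwoPhaseCommitCoordinator {
    // Returns true if the transaction committed, false if it aborted.
    boolean execute(String txId, List<Participant> participants) {
        // Phase 1: prepare. All participants must vote yes.
        boolean allPrepared = true;
        for (Participant p : participants) {
            try {
                if (!p.prepare(txId)) {
                    allPrepared = false;
                    break;
                }
            } catch (RuntimeException unreachable) {
                allPrepared = false;            // a crashed/unreachable participant counts as a no vote
                break;
            }
        }

        // Phase 2: commit everywhere or abort everywhere.
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit(txId);
            } else {
                p.abort(txId);
            }
        }
        // If the coordinator crashes between the phases, prepared participants
        // stay blocked holding locks until it recovers: 2PC's main weakness.
        return allPrepared;
    }
}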
Three-Phase Commit (3PC)¶
Adds "pre-commit" phase to reduce blocking, but still not partition-tolerant.
Saga Pattern¶
Common Interview Questions¶
1. Explain the CAP theorem with examples
   - Network partitions happen; choose CP (consistency) or AP (availability)
   - Examples: ZooKeeper (CP), Cassandra (AP)
2. How does Raft leader election work?
   - Followers time out and become candidates
   - Candidates request votes; a majority wins
   - The leader holds its position by sending heartbeats
3. Difference between strong and eventual consistency?
   - Strong: all reads see the latest write
   - Eventual: given time, all replicas converge
4. How does consistent hashing help scaling?
   - Only about K/N keys move when adding or removing a node
   - Virtual nodes improve distribution
5. 2PC vs. the Saga pattern?
   - 2PC: ACID, but blocking and vulnerable to coordinator failure
   - Saga: eventually consistent, uses compensating transactions