Database Internals¶
Storage Engines¶
B-Tree (Most Common)¶
B+ Tree (Variation)¶
LSM Tree (Log-Structured Merge Tree)¶
B-Tree vs LSM Tree¶
Indexing¶
Index Types¶
Composite Index¶
-- Order matters!
CREATE INDEX idx_composite ON orders(customer_id, status, created_at);
-- This index can be used for:
WHERE customer_id = ? -- ✓
WHERE customer_id = ? AND status = ? -- ✓
WHERE customer_id = ? AND status = ? AND created_at > ? -- ✓
WHERE status = ? -- ✗ (need leftmost)
WHERE customer_id = ? AND created_at > ? -- Partial (skips status)
-- Index-only scan (covering index)
CREATE INDEX idx_covering ON orders(customer_id)
INCLUDE (status, total); -- Avoids table lookup
Index Selection Guidelines¶
When to Create Indexes: - Column frequently in WHERE clauses - Column used in JOIN conditions - Column used in ORDER BY - High selectivity (many distinct values)
Avoid index when: - Small tables (full scan is fast) - Low selectivity (few distinct values) - Heavy write workload (index maintenance) - Column rarely queried
Index Maintenance Costs: - INSERT: Update all relevant indexes - UPDATE: Update if indexed column changes - DELETE: Update all relevant indexes - Storage: Each index uses disk space
Query Execution¶
Query Processing Pipeline¶
EXPLAIN ANALYZE¶
EXPLAIN ANALYZE
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending'
AND o.created_at > '2024-01-01';
-- Output:
Hash Join (cost=10.50..100.25 rows=50 width=200) (actual time=0.5..1.2 rows=45 loops=1)
Hash Cond: (o.customer_id = c.id)
-> Index Scan using idx_orders_status on orders o (actual time=0.1..0.4 rows=100 loops=1)
Index Cond: (status = 'pending')
Filter: (created_at > '2024-01-01')
Rows Removed by Filter: 20
-> Hash (actual time=0.3..0.3 rows=1000 loops=1)
-> Seq Scan on customers c (actual time=0.01..0.2 rows=1000 loops=1)
Planning Time: 0.5 ms
Execution Time: 1.5 ms
Join Algorithms¶
Nested Loop Join
O(n x m), good for small tables or indexed lookupsHash Join 1. Build hash table from smaller table 2. Probe hash table with larger table
O(n + m), good for equality joins, needs memory
Merge Join (Sort-Merge) 1. Sort both tables on join key 2. Merge sorted lists
O(n log n + m log m), good if already sorted
Transactions & ACID¶
ACID Properties¶
Isolation Levels¶
MVCC (Multi-Version Concurrency Control)¶
Write-Ahead Logging (WAL)¶
Buffer Pool / Page Cache¶
Partitioning¶
Common Interview Questions¶
- B-Tree vs LSM Tree?
- B-Tree: Better reads, in-place updates
-
LSM: Better writes, append-only, compaction
-
What is MVCC?
- Keep multiple versions of rows
- Readers see consistent snapshot
-
No read locks needed
-
How does WAL work?
- Log changes before data pages
- Sequential writes (fast)
-
Enables crash recovery
-
When to add an index?
- Frequent WHERE conditions
- JOIN columns
- High selectivity
-
Avoid on write-heavy, low-selectivity columns
-
Explain isolation levels
- Trade-off between consistency and concurrency
- Read Committed most common default
- Serializable for full isolation