Amazon ElastiCache

Introduction

Amazon ElastiCache is a fully managed in-memory data store service compatible with Redis and Memcached. It provides sub-millisecond latency for caching and real-time use cases.

ElastiCache Architecture

Supported Engines

  • Redis - Rich data structures, persistence, replication
  • Memcached - Simple, multi-threaded, volatile cache

Key Features

  • Sub-millisecond latency - In-memory performance
  • Fully managed - Patching, backup, failover
  • Scalability - Cluster mode, read replicas
  • High availability - Multi-AZ, automatic failover
  • Security - VPC, encryption, IAM auth

When to Use

Ideal Use Cases

  • Session store - User session management
  • Database caching - Reduce database load
  • Real-time leaderboards - Gaming, competitions
  • Rate limiting - API throttling
  • Message queues - Pub/sub messaging
  • Geospatial data - Location-based services
  • Real-time analytics - Counting, aggregation
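The rate-limiting use case above can be sketched as a fixed-window counter. In Redis this is an INCR plus an EXPIRE on the first hit in each window; the stub below mimics those two commands in memory so the sketch runs without a server (the class and function names are illustrative, not part of any API):

```python
import time

class InMemoryStub:
    """Tiny stand-in for a Redis client so the sketch runs standalone.
    Against ElastiCache you would issue INCR and EXPIRE via redis-py."""
    def __init__(self):
        self._data = {}  # key -> [count, window_expires_at]

    def incr_with_window(self, key, window_seconds):
        now = time.time()
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            # New window: the first hit sets the expiry (EXPIRE in Redis)
            entry = [0, now + window_seconds]
            self._data[key] = entry
        entry[0] += 1  # INCR in Redis
        return entry[0]

def allow_request(client, user_id, limit=5, window_seconds=60):
    """Fixed-window rate limiter: allow at most `limit` calls per window."""
    count = client.incr_with_window(f"rate:{user_id}", window_seconds)
    return count <= limit
```

A sliding-window or token-bucket variant trades a little more bookkeeping for smoother limits at window boundaries.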

Signs ElastiCache is Right for You

  • Need sub-millisecond response times
  • Database is the bottleneck
  • Have read-heavy workloads
  • Need distributed caching
  • Require session management

Redis vs Memcached

Feature          Redis                                       Memcached
Data structures  Strings, lists, sets, hashes, sorted sets   Simple key-value
Persistence      Yes (snapshots, AOF)                        No
Replication      Yes (read replicas)                         No
Cluster mode     Yes (sharding)                              Yes (multiple nodes)
Multi-threaded   No (single-threaded)                        Yes
Pub/Sub          Yes                                         No
Lua scripting    Yes                                         No
Transactions     Yes                                         No
Backup/Restore   Yes                                         No
Auto failover    Yes (Multi-AZ)                               No

Choose Redis When

  • Need data persistence
  • Need rich data structures
  • Need replication/HA
  • Need pub/sub messaging
  • Need atomic operations
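As an example of the richer data structures, a leaderboard maps directly onto a Redis sorted set: ZINCRBY awards points and ZREVRANGE reads the top entries. This sketch mirrors those semantics on a plain dict so it runs standalone; the function names are illustrative:

```python
def record_score(board, player, points):
    """ZINCRBY leaderboard <points> <player>: add points to a player's score."""
    board[player] = board.get(player, 0) + points

def top_players(board, n=3):
    """ZREVRANGE leaderboard 0 n-1 WITHSCORES: highest scores first,
    ties broken alphabetically for a stable ordering."""
    return sorted(board.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

With a real sorted set, reads stay O(log N + n) even for millions of players, which is why this pattern is hard to replicate in Memcached.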

Choose Memcached When

  • Simple caching only
  • Need multi-threaded performance
  • Can tolerate data loss
  • Simpler operational model

What to Be Careful About

Performance

  • Memory sizing - Insufficient memory causes evictions
  • Network - Use same AZ for lowest latency
  • Connection limits - Monitor max connections
  • Key design - Avoid hot keys
  • Serialization - Affects performance

Cost Management

  • Node sizing - Right-size for memory and network
  • Reserved nodes - Up to 55% savings
  • Data tiering - Redis on SSD for larger datasets
  • Replicas - Each replica costs

Operational

  • Eviction policies - Understand behavior when full
  • TTL strategy - Set appropriate expiration
  • Cluster mode - Plan sharding strategy
  • Backup window - Schedule during low traffic
  • Parameter groups - Tune for workload
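One common TTL tactic worth making concrete: add jitter so keys written in the same burst do not all expire in the same instant, which would produce a wave of synchronized cache misses. A minimal sketch (the function name is illustrative):

```python
import random

def jittered_ttl(base_seconds=3600, spread=0.10):
    """Return the base TTL +/- `spread` (10% by default), so entries cached
    together expire at slightly different times rather than all at once."""
    delta = int(base_seconds * spread)
    return base_seconds + random.randint(-delta, delta)
```

Pass the result as the expiration when setting each key instead of a fixed constant.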

Security

  • VPC deployment - Always use private subnets
  • Security groups - Restrict access
  • Encryption - At-rest and in-transit
  • IAM authentication - Redis 7.0+ (RBAC available from Redis 6.0)
  • AUTH token - Password protection
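Putting the security items together, a redis-py connection to an encrypted, AUTH-protected ElastiCache endpoint looks roughly like this. The endpoint and token are placeholders, and the exact parameters should be checked against your client library's documentation:

```python
import redis  # redis-py

# Placeholder endpoint and credentials - substitute your cluster's values.
client = redis.Redis(
    host="my-cluster.xxxxxx.use1.cache.amazonaws.com",  # primary endpoint
    port=6379,
    ssl=True,                   # requires in-transit encryption on the cluster
    password="my-auth-token",   # AUTH token (or RBAC user credentials)
    socket_timeout=2,           # fail fast instead of hanging on network issues
)
```

Because the cluster lives in private subnets, this code must run from inside the VPC (or over a VPN/peering connection) with a security group that allows port 6379.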

Data Management

  • Cache invalidation - Famously one of the two hard problems in computer science
  • Consistency - Cache vs database consistency
  • Serialization format - JSON, MessagePack, etc.
  • Key naming - Namespaced, predictable keys
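The key-naming and invalidation points can be made concrete together. A small helper builds namespaced keys, and an invalidate-on-write update deletes the cached copy (a DEL in Redis) so the next read repopulates it lazily via cache-aside. The cache here is a plain dict standing in for a Redis client, and all names are illustrative:

```python
def cache_key(namespace, *parts):
    """Build predictable, namespaced keys such as "user:123:profile"."""
    return ":".join([namespace, *map(str, parts)])

def update_user(db, cache, user_id, data):
    # Write to the source of truth first, then invalidate (rather than
    # update) the cached copy; the next read repopulates it lazily.
    db.update_user(user_id, data)
    cache.pop(cache_key("user", user_id), None)  # DEL user:<id> in Redis
```

Deleting instead of overwriting avoids caching a value that a concurrent writer is about to change again.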

Caching Patterns

Cache-Aside (Lazy Loading)

def get_user(user_id):
    # Check cache first
    user = cache.get(f"user:{user_id}")
    if user is not None:  # empty values are still valid cache hits
        return user

    # Cache miss - load from the database and populate the cache
    user = db.get_user(user_id)
    cache.set(f"user:{user_id}", user, ttl=3600)
    return user

Pros: Only caches what's needed
Cons: Cache miss penalty, stale data possible

Write-Through

def update_user(user_id, data):
    # Update database
    db.update_user(user_id, data)
    # Update cache
    cache.set(f"user:{user_id}", data)

Pros: Cache always current
Cons: Write latency, unused data cached

Write-Behind (Write-Back)

def update_user(user_id, data):
    # Update cache immediately
    cache.set(f"user:{user_id}", data)
    # Async update to database
    queue.send({"user_id": user_id, "data": data})

Pros: Fast writes
Cons: Data loss risk, complexity
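Write-behind also needs a consumer that drains the queue and applies the writes durably. A minimal in-process sketch, using the standard-library queue as a stand-in for the durable queue (in production this would be SQS, Kinesis, or a Redis list read by a worker; all names are illustrative):

```python
import queue

write_queue = queue.Queue()  # stands in for a durable message queue

def flush_writes(db, batch_size=100):
    """Drain up to batch_size pending writes and apply them in order."""
    applied = 0
    while applied < batch_size:
        try:
            msg = write_queue.get_nowait()
        except queue.Empty:
            break  # queue drained
        db.update_user(msg["user_id"], msg["data"])  # durable write
        applied += 1
    return applied
```

The batch size bounds how long one flush cycle can run; anything left over is picked up on the next cycle.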


Common Interview Questions

  1. When would you choose Redis over Memcached?
       • Need data structures beyond simple strings
       • Need data persistence
       • Need replication/high availability
       • Need pub/sub messaging
       • Need atomic operations (transactions)

  2. How do you handle cache invalidation?
       • TTL-based expiration
       • Event-driven invalidation (update or delete on write)
       • Cache-aside pattern (lazy loading)
       • Use a message queue for distributed invalidation

  3. What is Redis Cluster mode?
       • Data is sharded across multiple nodes
       • Each shard has a primary plus replicas
       • Automatic data distribution
       • Higher throughput and capacity
       • Some commands are limited (multi-key operations must stay in one hash slot)

  4. How do you handle a cache stampede?
       • Locking (only one process refreshes)
       • Probabilistic early expiration
       • Background refresh before expiration
       • Circuit breaker pattern

  5. What happens when ElastiCache runs out of memory?
       • Keys are evicted according to the configured policy (LRU, LFU, etc.)
       • Writes fail if maxmemory-policy is noeviction
       • Monitor evictions and scale up if needed
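The locking mitigation for a cache stampede can be sketched with Redis's SET key value NX EX ttl, which lets exactly one process win the right to refresh an expired entry while the rest serve stale data or retry. Here the lock table is a plain dict of key to expiry time so the sketch runs without a server; the names are illustrative:

```python
import time

def try_acquire_lock(locks, key, ttl=10):
    """SET <key> 1 NX EX <ttl> in Redis: returns True only for the single
    caller that grabs the refresh lock. Losers should serve the stale value
    or wait briefly and re-check the cache rather than hit the database."""
    now = time.time()
    if locks.get(key, 0) > now:
        return False  # someone else holds a live lock
    locks[key] = now + ttl
    return True
```

The TTL on the lock matters: if the refreshing process dies, the lock expires on its own and another process can take over.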

Cluster Configurations

Redis Non-Cluster Mode

  • Single shard
  • 1 primary + up to 5 replicas
  • All data on single node
  • Automatic failover with Multi-AZ

Redis Cluster Mode

  • Multiple shards (up to 500)
  • Each shard: 1 primary + up to 5 replicas
  • Data distributed by key hash
  • Horizontal scaling
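Cluster mode distributes keys by hash slot: per the Redis Cluster specification, slot = CRC16(key) mod 16384, and if the key contains a {...} hash tag, only the text inside the first tag is hashed, which lets related keys land on the same shard. A self-contained sketch of the slot calculation:

```python
def crc16_xmodem(data):
    """CRC16-CCITT (XMODEM), the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key):
    """Slot 0-16383 for a key, honoring {hash tag} grouping."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # non-empty tag between the first { and next }
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

This is why multi-key commands (MGET, transactions, Lua scripts) only work in cluster mode when all keys share a slot, typically arranged via hash tags like user:{42}:profile and user:{42}:settings.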

Scaling Options

Scaling type   Cluster mode disabled   Cluster mode enabled
Vertical       Change node type        Change node type
Add replicas   Yes (up to 5)           Yes (up to 5 per shard)
Add shards     No                      Yes (online)

Alternatives

AWS Alternatives

Service        When to use instead
DynamoDB DAX   DynamoDB-specific caching
MemoryDB       Redis-compatible with durability
CloudFront     Edge caching for HTTP content

External Alternatives

Provider       Service
Redis          Redis Cloud (managed Redis)
Upstash        Serverless Redis
Momento        Serverless cache
Google Cloud   Memorystore
Azure          Azure Cache for Redis

Best Practices

  1. Use Redis for most cases - Rich features, HA
  2. Size for memory + overhead - 25-30% buffer
  3. Enable encryption - At-rest and in-transit
  4. Deploy in VPC - Private subnets only
  5. Use Multi-AZ - For production workloads
  6. Set appropriate TTLs - Prevent stale data
  7. Monitor evictions - Scale before evicting
  8. Use connection pooling - Reduce connection overhead
  9. Implement cache-aside - Standard pattern
  10. Plan key naming - Namespace keys (user:123)

Pricing

Component        Cost factors
Node hours       Instance type and size
Reserved nodes   1- or 3-year term, partial/all upfront
Backup storage   Per GB over the free tier
Data transfer    Cross-AZ, internet egress

Example Node Types

Type                vCPU   Memory     Network
cache.t3.micro      2      0.5 GB     Low
cache.r6g.large     2      13.07 GB   Up to 10 Gbps
cache.r6g.4xlarge   16     105 GB     Up to 10 Gbps