# System Design Interview Delivery Framework
A structured approach to navigating system design interviews confidently and effectively.
## Interview Timeline (45-60 minutes)
| Phase | Time | Focus |
|---|---|---|
| 1. Requirements | 5-7 min | Clarify scope, constraints, scale |
| 2. Estimations | 3-5 min | Traffic, storage, bandwidth |
| 3. High-Level Design | 10-15 min | Core components, data flow |
| 4. Deep Dive | 15-20 min | Detailed design of critical components |
| 5. Bottlenecks & Trade-offs | 5-10 min | Scaling, failure handling |
| 6. Wrap-up | 2-3 min | Summary, future improvements |
## Phase 1: Requirements Clarification (5-7 min)

### Why This Matters
- Shows structured thinking
- Prevents designing the wrong system
- Demonstrates communication skills
- Sets scope to avoid rabbit holes
### Functional Requirements

Ask about core features:

- "What are the core features we need to support?"
- "Who are the users of this system?"
- "What actions can users perform?"
Example (Twitter):

- Post tweets
- Follow users
- View timeline/feed
- Search tweets

### Non-Functional Requirements
Always clarify:
| Requirement | Questions to Ask |
|---|---|
| Scale | How many users? DAU? Requests/sec? |
| Performance | Latency requirements? (p50, p99) |
| Availability | 99.9%? 99.99%? Can we have downtime? |
| Consistency | Strong or eventual? |
| Durability | Can we lose data? How critical? |
### Scope Definition
Explicitly state what you will/won't cover:
"For this design, I'll focus on:
- Core posting and reading flow
- Feed generation
- Basic scaling
I'll leave out of scope:
- Analytics dashboard
- Ad serving
- Content moderation"
### Sample Questions by System
| System | Key Questions |
|---|---|
| URL Shortener | Custom URLs? Expiration? Analytics? |
| Chat App | Group chat? Read receipts? Media? |
| Rate Limiter | Per user? Per IP? Distributed? |
| News Feed | Real-time? Ranking algorithm? |
| File Storage | Max file size? Sharing? Versioning? |
## Phase 2: Back-of-Envelope Estimations (3-5 min)

### When to Do This
- After requirements are clear
- Before diving into design
- Shows you think about scale
### What to Estimate
1. Traffic (QPS)
- Write QPS = DAU × writes_per_user / 86,400
- Read QPS = DAU × reads_per_user / 86,400
- Peak QPS = Average × 2-3
2. Storage
- Daily = requests × data_per_request
- Yearly = daily × 365
- 5-year = yearly × 5 × replication_factor
3. Bandwidth
- Incoming = Write_QPS × request_size
- Outgoing = Read_QPS × response_size
### Example Calculation (Verbalize This)
"Let's estimate for a URL shortener:
- 100M new URLs/month → ~40 writes/sec
- 100:1 read ratio → 4,000 reads/sec
- Peak: ~10K reads/sec
- Storage: 500 bytes/URL × 100M × 12 × 5 years = 3TB
So we need a system handling 10K QPS reads with 3TB storage."
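Those numbers are quick to sanity-check in code; a throwaway sketch (the constants are just the assumptions stated above):

```python
# Sanity check for the URL-shortener estimate above.
SECONDS_PER_MONTH = 30 * 86_400                     # ~2.6M; round freely in the room

new_urls_per_month = 100_000_000
write_qps = new_urls_per_month / SECONDS_PER_MONTH  # ~39 -> call it 40
read_qps = 100 * write_qps                          # 100:1 read ratio -> ~4,000
peak_read_qps = 2.5 * read_qps                      # peak = 2-3x average -> ~10K

bytes_per_url = 500
five_year_bytes = bytes_per_url * new_urls_per_month * 12 * 5  # 3e12 bytes = 3 TB

print(f"writes ~{write_qps:.0f}/s, reads ~{read_qps:.0f}/s, "
      f"peak ~{peak_read_qps:.0f}/s, storage ~{five_year_bytes / 1e12:.0f} TB")
```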
### Pro Tip

Don't spend too long here. Round numbers are fine:

- 86,400 seconds/day ≈ 100,000
- Use powers of 10
## Phase 3: High-Level Design (10-15 min)

### Start with the Basics

Draw the core components first.
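A minimal starting sketch (the exact boxes depend on the system):

```
Client ──> Load Balancer ──> App Servers ──> Database
                                  │
                                  └──> Cache
```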
### Build Incrementally
- Single server → Add components as needed
- Identify bottlenecks → Address with scaling
- Explain each addition → Show your reasoning
### Core Components to Consider
| Component | Purpose | When to Add |
|---|---|---|
| Load Balancer | Distribute traffic | Multiple servers |
| CDN | Serve static content | Global users, media |
| Cache | Reduce DB load | Read-heavy systems |
| Message Queue | Async processing | Decouple components |
| Database | Persistent storage | Always needed |
| Search Index | Full-text search | Search features |
| Blob Storage | Files, media | Images, videos |
### Data Flow
Always explain the data flow:
"When a user posts a tweet:
1. Request hits load balancer
2. Routed to application server
3. Server validates and stores in DB
4. Async job updates follower feeds
5. Invalidates relevant caches
6. Returns success to user"
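A skeletal version of that flow, with in-memory stand-ins for the database, queue, and cache (all names here are illustrative, not a specific framework):

```python
# Illustrative "post a tweet" write path; db/cache/feed_queue stand in
# for a real database, Redis, and a message broker.
import time
import uuid
from queue import Queue

db: dict[str, dict] = {}
cache: dict[str, object] = {}
feed_queue: Queue = Queue()

def post_tweet(user_id: str, content: str) -> dict:
    if not content or len(content) > 280:          # validate (step 3)
        raise ValueError("invalid tweet")
    tweet = {"id": str(uuid.uuid4()), "user_id": user_id,
             "content": content, "created_at": time.time()}
    db[tweet["id"]] = tweet                        # store in DB (step 3)
    feed_queue.put(tweet["id"])                    # async feed fan-out (step 4)
    cache.pop(f"feed:{user_id}", None)             # invalidate caches (step 5)
    return {"tweet_id": tweet["id"],               # success response (step 6)
            "created_at": tweet["created_at"]}
```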
### API Design

Define key endpoints:

```
POST /tweets
- body: { content, media_ids }
- returns: { tweet_id, created_at }

GET /feed?user_id=123&cursor=xxx
- returns: { tweets: [...], next_cursor }

GET /tweets/{tweet_id}
- returns: { tweet }
```
### Database Schema

Sketch key tables:

```
users
- id (PK)
- username
- email
- created_at

tweets
- id (PK)
- user_id (FK)
- content
- created_at

follows
- follower_id (PK)
- followee_id (PK)
- created_at
```
## Phase 4: Deep Dive (15-20 min)

### How to Choose What to Deep Dive
Option 1: Ask the interviewer
"I can deep dive into feed generation, real-time updates,
or the caching strategy. Which interests you most?"
Option 2: Pick the most critical/complex component

- Feed ranking for social media
- Consistency for financial systems
- Real-time for chat apps
### Deep Dive Topics

#### Database Design
"For the database, I'd use:
Primary DB: PostgreSQL for users/tweets (ACID, relational)
- Partition tweets by user_id
- Index on (user_id, created_at) for timeline
Cache: Redis
- User profiles (high read, low write)
- Hot tweets (viral content)
- Session data
Why not NoSQL for tweets?
- Need joins for mentions, replies
- Strong consistency for tweet creation
- But could shard by user_id at scale"
#### Caching Strategy
"I'd implement multi-layer caching:
L1: CDN for static assets
L2: Redis for computed feeds
L3: Local cache on app servers
Cache invalidation:
- TTL-based (5 min for feeds)
- Event-driven (new tweet invalidates author's feed)
Cache-aside pattern:
1. Check cache
2. If miss, query DB
3. Populate cache
4. Return data"
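That cache-aside flow is only a few lines in practice; a minimal sketch assuming a redis-py-style client (`fetch_feed_from_db` is a placeholder for the real query):

```python
import json

FEED_TTL_SECONDS = 300  # the 5-minute feed TTL from above

def get_feed(redis_client, user_id: str) -> list:
    key = f"feed:{user_id}"
    cached = redis_client.get(key)        # 1. check cache
    if cached is not None:
        return json.loads(cached)
    feed = fetch_feed_from_db(user_id)    # 2. on miss, query the DB
    redis_client.setex(key, FEED_TTL_SECONDS, json.dumps(feed))  # 3. populate
    return feed                           # 4. return data

def fetch_feed_from_db(user_id: str) -> list:
    ...  # placeholder: SELECT recent tweets from user_id's followees
```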
#### Scaling Approach
"To scale reads:
1. Add read replicas
2. Cache hot data
3. CDN for static content
To scale writes:
1. Shard database
2. Async processing via queue
3. Rate limiting
Sharding strategy:
- Shard by user_id (even distribution)
- Consistent hashing (easy to add nodes)"
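A toy version of that consistent-hash ring (the virtual-node count of 100 is arbitrary; a real system would also handle replication and rebalancing):

```python
# Consistent hashing: user_id -> shard, where adding a node only
# remaps the keys that land on its arc of the ring.
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100):
        self._ring: list[tuple[int, str]] = []   # sorted (hash, node) pairs
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        for i in range(vnodes):                  # virtual nodes smooth the load
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:         # first node clockwise from key
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = ConsistentHashRing(["shard-0", "shard-1", "shard-2"])
print(ring.get_node("user:12345"))  # same shard every time for this user_id
```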
## Phase 5: Bottlenecks & Trade-offs (5-10 min)

### Identify Bottlenecks
Proactively address:
"Potential bottlenecks I see:
1. Database writes at scale
→ Solution: Sharding by user_id
2. Feed generation for users with many followers
→ Solution: Pre-compute feeds, push model
3. Hot partitions (celebrity users)
→ Solution: Separate hot/cold data, caching"
### Discuss Trade-offs
Show you understand there's no perfect solution:
| Decision | Trade-off |
|---|---|
| Push vs Pull feed | Storage vs Latency |
| SQL vs NoSQL | Consistency vs Scale |
| Sync vs Async | Latency vs Complexity |
| Strong vs Eventual consistency | Correctness vs Availability |
| Cache vs No cache | Speed vs Staleness |
### Example Trade-off Discussion
"For feed generation, we have two options:
PUSH Model (Fan-out on write):
+ Fast reads (feed pre-computed)
- Slow writes for popular users
- Storage cost for duplicate data
→ Best for: most users
PULL Model (Fan-out on read):
+ Fast writes
+ Less storage
- Slow reads (compute on demand)
→ Best for: celebrity users
Hybrid approach:
- Push for regular users
- Pull for celebrities (>1M followers)"
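Sketched as code, the hybrid model might look like this (the `store` methods and the exact threshold are illustrative assumptions):

```python
CELEBRITY_THRESHOLD = 1_000_000  # the >1M-follower cutoff above

def fan_out_on_write(tweet: dict, store) -> None:
    """PUSH half: pre-compute followers' feeds, except for celebrities."""
    author = tweet["user_id"]
    if store.follower_count(author) >= CELEBRITY_THRESHOLD:
        return  # skip fan-out; readers will pull these tweets instead
    for follower_id in store.get_followers(author):
        store.append_to_feed(follower_id, tweet)

def read_feed(user_id: str, store) -> list:
    """PULL half: merge the pre-computed feed with celebrities' recent tweets."""
    feed = store.get_precomputed_feed(user_id)
    for celeb_id in store.get_followed_celebrities(user_id):
        feed.extend(store.recent_tweets(celeb_id))
    return sorted(feed, key=lambda t: t["created_at"], reverse=True)
```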
### Failure Handling
"For fault tolerance:
1. Server failure
→ Multiple instances behind LB
→ Health checks, auto-replacement
2. Database failure
→ Multi-AZ replication
→ Automated failover
3. Cache failure
→ Fall back to database
→ Cache warming on recovery
4. Datacenter failure
→ Cross-region replication
→ DNS failover"
## Phase 6: Wrap-up (2-3 min)

### Summarize Key Points
"To summarize, we designed a Twitter-like system that:
- Handles 500K read QPS, 5K write QPS
- Uses PostgreSQL sharded by user_id
- Implements hybrid push/pull for feeds
- Caches with Redis for hot data
- Achieves 99.99% availability with multi-AZ"
### Mention Future Improvements
"If I had more time, I'd explore:
- Search functionality with Elasticsearch
- Real-time notifications with WebSockets
- Analytics pipeline with Kafka
- ML-based feed ranking"
## Communication Best Practices

### DO ✅
| Practice | Example |
|---|---|
| Think out loud | "I'm considering two options here..." |
| Draw diagrams | Always visualize architecture |
| State assumptions | "I'm assuming 80% reads, 20% writes" |
| Ask for feedback | "Does this direction make sense?" |
| Acknowledge trade-offs | "This approach sacrifices X for Y" |
| Use numbers | "This handles ~10K QPS" |
### DON'T ❌
| Anti-pattern | Why It's Bad |
|---|---|
| Jumping to solution | Shows lack of process |
| Over-engineering | Adds unnecessary complexity |
| Ignoring scale | Missing the point of the interview |
| One-way monologue | Not collaborative |
| Getting stuck in details | Loses the big picture |
| Using buzzwords without understanding | Easily exposed |
### Handling Uncertainty
Good: "I'm not 100% sure about X, but my understanding is..."
Good: "I'd need to research X, but I'd approach it by..."
Bad: Making up facts or pretending to know
### Handling Pushback
Interviewer: "What if this doesn't scale?"
Response: "Good point. At higher scale, we could:
1. Add more shards
2. Implement read replicas
3. Use a distributed cache
Shall I explore any of these?"
## Common Patterns Reference

### Scaling Reads
- Caching (Redis, Memcached)
- Read replicas
- CDN
- Database denormalization
### Scaling Writes
- Sharding/partitioning
- Message queues
- Batch processing
- Write-behind caching
### High Availability
- Load balancing
- Multi-AZ deployment
- Database replication
- Circuit breakers (see the sketch after this list)
- Graceful degradation
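Of these, the circuit breaker is the easiest to sketch; a minimal version (the thresholds are arbitrary, and a production breaker would add a half-open state):

```python
import time

class CircuitBreaker:
    """Fail fast after repeated errors; retry only after a cooldown."""

    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None              # cooldown over: allow a trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0                      # any success resets the count
        return result
```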
### Low Latency
- Caching
- CDN
- Edge computing
- Connection pooling
- Async processing
## System Design Template
Use this mental checklist:
```
□ Requirements
  □ Functional (what features?)
  □ Non-functional (scale, latency, availability?)
  □ Scope (what's in/out?)
□ Estimations
  □ Traffic (QPS read/write)
  □ Storage (how much data?)
  □ Bandwidth (network needs?)
□ High-Level Design
  □ Core components
  □ Data flow
  □ API design
  □ Data model
□ Deep Dive
  □ Critical component details
  □ Database choices
  □ Caching strategy
  □ Scaling approach
□ Trade-offs
  □ Identified bottlenecks
  □ Discussed alternatives
  □ Justified decisions
□ Wrap-up
  □ Summary
  □ Future improvements
```
## Quick Reference: Component Selection
| Need | Solution |
|---|---|
| Distribute traffic | Load Balancer (ALB/NLB) |
| Serve static content globally | CDN (CloudFront) |
| Reduce database load | Cache (Redis/ElastiCache) |
| Handle async tasks | Message Queue (SQS/Kafka) |
| Store structured data | SQL (PostgreSQL/MySQL) |
| Store unstructured data | NoSQL (DynamoDB/MongoDB) |
| Store files | Object Storage (S3) |
| Full-text search | Search Engine (Elasticsearch) |
| Real-time communication | WebSockets/Pub-Sub |
| Workflow orchestration | Step Functions/Airflow |
| Rate limiting | Redis + Token Bucket (sketch below) |
| Service discovery | DNS/Consul/Cloud Map |
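For the rate-limiting row, a single-process token bucket is a common whiteboard answer; the distributed version moves this state into Redis (often via a Lua script), but the sketch below is local-only:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                      # over the limit: reject or queue

bucket = TokenBucket(rate=10, capacity=20)  # 10 req/s, bursts up to 20
print(bucket.allow())                       # True until the bucket drains
```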
## Sample Opening Statement
"Before diving in, I'd like to understand the requirements.
Let me start with a few clarifying questions:
1. What's the expected scale - daily active users?
2. What are the core features we must support?
3. Are there specific latency or availability requirements?
4. Should I focus on any particular aspect?
[After getting answers]
Great. Based on that, let me outline our scope:
- We'll design X, Y, Z features
- Targeting N users with K QPS
- Aiming for 99.9% availability
Let me start with some quick estimations, then move
to the high-level architecture. Sound good?"
## Final Tips
- Practice drawing - Diagrams should be clear and quick
- Know your numbers - Latencies, capacities, common scales
- Have opinions - "I prefer X because..." shows experience
- Stay calm - It's a conversation, not an interrogation
- Be collaborative - Treat interviewer as a teammate
- Manage time - Don't spend 20 min on requirements
- Show depth - Anyone can draw boxes; show you understand why