
System Design Interview Delivery Framework

A structured approach to delivering system design interviews confidently and effectively.


Interview Timeline (45-60 minutes)

Phase | Time | Focus
------|------|------
1. Requirements | 5-7 min | Clarify scope, constraints, scale
2. Estimations | 3-5 min | Traffic, storage, bandwidth
3. High-Level Design | 10-15 min | Core components, data flow
4. Deep Dive | 15-20 min | Detailed design of critical components
5. Bottlenecks & Trade-offs | 5-10 min | Scaling, failure handling
6. Wrap-up | 2-3 min | Summary, future improvements

Phase 1: Requirements Clarification (5-7 min)

Why This Matters

  • Shows structured thinking
  • Prevents designing the wrong system
  • Demonstrates communication skills
  • Sets scope to avoid rabbit holes

Functional Requirements

Ask about core features:

"What are the core features we need to support?"
"Who are the users of this system?"
"What actions can users perform?"

Example (Twitter):
  • Post tweets
  • Follow users
  • View timeline/feed
  • Search tweets

Non-Functional Requirements

Always clarify:

Requirement | Questions to Ask
------------|-----------------
Scale | How many users? DAU? Requests/sec?
Performance | Latency requirements? (p50, p99)
Availability | 99.9%? 99.99%? Can we tolerate downtime?
Consistency | Strong or eventual?
Durability | Can we lose data? How critical is it?

Scope Definition

Explicitly state what you will/won't cover:

"For this design, I'll focus on:
 - Core posting and reading flow
 - Feed generation
 - Basic scaling

I'll leave out of scope:
 - Analytics dashboard
 - Ad serving
 - Content moderation"

Sample Questions by System

System | Key Questions
-------|--------------
URL Shortener | Custom URLs? Expiration? Analytics?
Chat App | Group chat? Read receipts? Media?
Rate Limiter | Per user? Per IP? Distributed?
News Feed | Real-time? Ranking algorithm?
File Storage | Max file size? Sharing? Versioning?

Phase 2: Back-of-Envelope Estimations (3-5 min)

When to Do This

  • After requirements are clear
  • Before diving into design
  • Shows you think about scale

What to Estimate

1. Traffic (QPS)
   - Write QPS = DAU × writes_per_user / 86,400
   - Read QPS = DAU × reads_per_user / 86,400
   - Peak QPS = Average × 2-3

2. Storage
   - Daily = requests × data_per_request
   - Yearly = daily × 365
   - 5-year = yearly × 5 × replication_factor

3. Bandwidth
   - Incoming = Write_QPS × request_size
   - Outgoing = Read_QPS × response_size

Example Calculation (Verbalize This)

"Let's estimate for a URL shortener:
- 100M new URLs/month → ~40 writes/sec
- 100:1 read ratio → 4,000 reads/sec
- Peak: ~10K reads/sec
- Storage: 500 bytes/URL × 100M × 12 × 5 years = 3TB

So we need a system handling 10K QPS reads with 3TB storage."
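The arithmetic above can be double-checked with a quick script; the inputs are exactly the assumptions stated in the walkthrough (100M URLs/month, 100:1 reads, 500 bytes/URL, 5-year retention), and the 2.5× peak multiplier is one point in the "average × 2-3" range:

```python
# Back-of-envelope numbers for the URL-shortener example above.
SECONDS_PER_MONTH = 30 * 86_400

new_urls_per_month = 100_000_000
read_write_ratio = 100
peak_multiplier = 2.5          # "average x 2-3"
bytes_per_url = 500
retention_years = 5

write_qps = new_urls_per_month / SECONDS_PER_MONTH   # ~40 writes/sec
read_qps = write_qps * read_write_ratio              # ~4,000 reads/sec
peak_read_qps = read_qps * peak_multiplier           # ~10K reads/sec
storage_bytes = bytes_per_url * new_urls_per_month * 12 * retention_years

print(f"write QPS: {write_qps:.0f}")
print(f"peak read QPS: {peak_read_qps:.0f}")
print(f"5-year storage: {storage_bytes / 1e12:.1f} TB")  # ~3 TB
```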

Pro Tip

Don't spend too long here. Round numbers are fine:
  • 86,400 seconds/day ≈ 100,000
  • Stick to powers of 10


Phase 3: High-Level Design (10-15 min)

Start with the Basics

Draw core components:

[Diagram: basic system architecture]

Build Incrementally

  1. Single server → Add components as needed
  2. Identify bottlenecks → Address with scaling
  3. Explain each addition → Show your reasoning

Core Components to Consider

Component | Purpose | When to Add
----------|---------|------------
Load Balancer | Distribute traffic | Multiple servers
CDN | Serve static content | Global users, media
Cache | Reduce DB load | Read-heavy systems
Message Queue | Async processing | Decouple components
Database | Persistent storage | Always needed
Search Index | Full-text search | Search features
Blob Storage | Files, media | Images, videos

Data Flow

Always explain the data flow:

"When a user posts a tweet:
1. Request hits load balancer
2. Routed to application server
3. Server validates and stores in DB
4. Async job updates follower feeds
5. Invalidates relevant caches
6. Returns success to user"

API Design

Define key endpoints:

POST /tweets
  - body: { content, media_ids }
  - returns: { tweet_id, created_at }

GET /feed?user_id=123&cursor=xxx
  - returns: { tweets: [...], next_cursor }

GET /tweets/{tweet_id}
  - returns: { tweet }

Database Schema

Sketch key tables:

users
  - id (PK)
  - username
  - email
  - created_at

tweets
  - id (PK)
  - user_id (FK)
  - content
  - created_at

follows
  - follower_id (PK)
  - followee_id (PK)
  - created_at
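As a sanity check, the schema sketch above can be expressed as real DDL. Here is a minimal SQLite version; the column types, constraints, and the (user_id, created_at) index (suggested later in the deep dive) are illustrative assumptions, since the sketch doesn't specify them:

```python
import sqlite3

# In-memory DB just to validate the schema sketch above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    username   TEXT NOT NULL UNIQUE,
    email      TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE tweets (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    content    TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Supports the "timeline for one user, newest first" query pattern.
CREATE INDEX idx_tweets_user_time ON tweets (user_id, created_at);

CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    created_at  TEXT DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);
""")
conn.execute("INSERT INTO users (username, email) VALUES ('alice', 'a@example.com')")
conn.execute("INSERT INTO tweets (user_id, content) VALUES (1, 'hello world')")
```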

Phase 4: Deep Dive (15-20 min)

How to Choose What to Deep Dive

Option 1: Ask the interviewer

"I can deep dive into feed generation, real-time updates,
or the caching strategy. Which interests you most?"

Option 2: Pick the most critical or complex component
  • Feed ranking for social media
  • Consistency for financial systems
  • Real-time delivery for chat apps

Deep Dive Topics

Database Design

"For the database, I'd use:

Primary DB: PostgreSQL for users/tweets (ACID, relational)
- Partition tweets by user_id
- Index on (user_id, created_at) for timeline

Cache: Redis
- User profiles (high read, low write)
- Hot tweets (viral content)
- Session data

Why not NoSQL for tweets?
- Need joins for mentions, replies
- Strong consistency for tweet creation
- But could shard by user_id at scale"

Caching Strategy

"I'd implement multi-layer caching:

L1: CDN for static assets
L2: Redis for computed feeds
L3: Local cache on app servers

Cache invalidation:
- TTL-based (5 min for feeds)
- Event-driven (new tweet invalidates author's feed)

Cache-aside pattern:
1. Check cache
2. If miss, query DB
3. Populate cache
4. Return data"
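The cache-aside steps above can be sketched in a few lines of Python; plain dicts stand in for Redis and the database here, and the 5-minute TTL matches the invalidation policy described:

```python
import time

db = {"user:1": {"name": "alice"}}  # stand-in for the real database
cache = {}                          # stand-in for Redis: key -> (value, expires_at)
TTL_SECONDS = 300                   # the 5-minute TTL mentioned above

def get(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():             # 1. check cache
        return entry[0]
    value = db.get(key)                              # 2. on miss, query DB
    cache[key] = (value, time.time() + TTL_SECONDS)  # 3. populate cache
    return value                                     # 4. return data

def invalidate(key):
    # Event-driven invalidation, e.g. a new tweet invalidates a feed.
    cache.pop(key, None)
```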

Scaling Approach

"To scale reads:
1. Add read replicas
2. Cache hot data
3. CDN for static content

To scale writes:
1. Shard database
2. Async processing via queue
3. Rate limiting

Sharding strategy:
- Shard by user_id (even distribution)
- Consistent hashing (easy to add nodes)"
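The consistent-hashing idea above can be sketched as a small hash ring. Virtual nodes smooth out the key distribution; the 100 vnodes per shard and the shard names are illustrative choices, not prescribed values:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node appears `vnodes` times on the ring.
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}:{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first vnode at or after the key's hash.
        i = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["shard-a", "shard-b", "shard-c"])
shard = ring.node_for("user:12345")   # deterministic shard assignment
```

Adding a node only remaps the keys that fall between the new vnodes and their predecessors, which is what makes it "easy to add nodes."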

Phase 5: Bottlenecks & Trade-offs (5-10 min)

Identify Bottlenecks

Proactively address:

"Potential bottlenecks I see:

1. Database writes at scale
   → Solution: Sharding by user_id

2. Feed generation for users with many followers
   → Solution: Pre-compute feeds, push model

3. Hot partitions (celebrity users)
   → Solution: Separate hot/cold data, caching"

Discuss Trade-offs

Show you understand there's no perfect solution:

Decision | Trade-off
---------|----------
Push vs Pull feed | Storage vs Latency
SQL vs NoSQL | Consistency vs Scale
Sync vs Async | Latency vs Complexity
Strong vs Eventual consistency | Correctness vs Availability
Cache vs No cache | Speed vs Staleness

Example Trade-off Discussion

"For feed generation, we have two options:

PUSH Model (Fan-out on write):
+ Fast reads (feed pre-computed)
- Slow writes for popular users
- Storage cost for duplicate data
- Best for: Most users

PULL Model (Fan-out on read):
+ Fast writes
+ Less storage
- Slow reads (compute on demand)
- Best for: Celebrity users

Hybrid approach:
- Push for regular users
- Pull for celebrities (>1M followers)"
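The hybrid approach can be sketched in-process. The 1M-follower threshold comes from the discussion above; the data structures, the `followed_celebrities` argument (in reality derived from the follow graph), and everything else are illustrative:

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 1_000_000   # ">1M followers" from above

followers = defaultdict(set)      # author -> follower ids
precomputed = defaultdict(list)   # follower -> pushed tweet ids (push path)
author_tweets = defaultdict(list) # author -> own tweet ids (pull path)

def post_tweet(author, tweet_id):
    author_tweets[author].append(tweet_id)
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        # PUSH: fan out on write to each follower's precomputed feed.
        for f in followers[author]:
            precomputed[f].append(tweet_id)
    # Celebrities skip fan-out; their tweets are merged at read time.

def read_feed(user, followed_celebrities):
    # PULL: merge the precomputed feed with celebrity tweets on demand.
    feed = list(precomputed[user])
    for celeb in followed_celebrities:
        feed.extend(author_tweets[celeb])
    return feed
```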

Failure Handling

"For fault tolerance:

1. Server failure
   → Multiple instances behind LB
   → Health checks, auto-replacement

2. Database failure
   → Multi-AZ replication
   → Automated failover

3. Cache failure
   → Fall back to database
   → Cache warming on recovery

4. Datacenter failure
   → Cross-region replication
   → DNS failover"

Phase 6: Wrap-up (2-3 min)

Summarize Key Points

"To summarize, we designed a Twitter-like system that:
- Handles 500K read QPS, 5K write QPS
- Uses PostgreSQL sharded by user_id
- Implements hybrid push/pull for feeds
- Caches with Redis for hot data
- Achieves 99.99% availability with multi-AZ"

Mention Future Improvements

"If I had more time, I'd explore:
- Search functionality with Elasticsearch
- Real-time notifications with WebSockets
- Analytics pipeline with Kafka
- ML-based feed ranking"

Communication Best Practices

DO ✅

Practice | Example
---------|--------
Think out loud | "I'm considering two options here..."
Draw diagrams | Always visualize the architecture
State assumptions | "I'm assuming 80% reads, 20% writes"
Ask for feedback | "Does this direction make sense?"
Acknowledge trade-offs | "This approach sacrifices X for Y"
Use numbers | "This handles ~10K QPS"

DON'T ❌

Anti-pattern | Why It's Bad
-------------|-------------
Jumping to a solution | Shows lack of process
Over-engineering | Adds unnecessary complexity
Ignoring scale | Misses the point of the interview
One-way monologue | Not collaborative
Getting stuck in details | Loses the big picture
Buzzwords without understanding | Easily exposed

Handling Uncertainty

Good: "I'm not 100% sure about X, but my understanding is..."
Good: "I'd need to research X, but I'd approach it by..."
Bad:  Making up facts or pretending to know

Handling Pushback

Interviewer: "What if this doesn't scale?"

Response: "Good point. At higher scale, we could:
1. Add more shards
2. Implement read replicas
3. Use a distributed cache

Shall I explore any of these?"

Common Patterns Reference

Scaling Reads

  1. Caching (Redis, Memcached)
  2. Read replicas
  3. CDN
  4. Database denormalization

Scaling Writes

  1. Sharding/partitioning
  2. Message queues
  3. Batch processing
  4. Write-behind caching

High Availability

  1. Load balancing
  2. Multi-AZ deployment
  3. Database replication
  4. Circuit breakers
  5. Graceful degradation
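Of the patterns above, the circuit breaker is the least self-explanatory: after repeated failures it "opens" and fails fast instead of hammering a struggling dependency, then allows a trial call after a cooldown. A toy in-process version (thresholds are illustrative):

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.time()   # trip the breaker
            raise
        self.failures = 0   # a success closes the circuit again
        return result
```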

Low Latency

  1. Caching
  2. CDN
  3. Edge computing
  4. Connection pooling
  5. Async processing

System Design Template

Use this mental checklist:

□ Requirements
  □ Functional (what features?)
  □ Non-functional (scale, latency, availability?)
  □ Scope (what's in/out?)

□ Estimations
  □ Traffic (QPS read/write)
  □ Storage (how much data?)
  □ Bandwidth (network needs?)

□ High-Level Design
  □ Core components
  □ Data flow
  □ API design
  □ Data model

□ Deep Dive
  □ Critical component details
  □ Database choices
  □ Caching strategy
  □ Scaling approach

□ Trade-offs
  □ Identified bottlenecks
  □ Discussed alternatives
  □ Justified decisions

□ Wrap-up
  □ Summary
  □ Future improvements

Quick Reference: Component Selection

Need | Solution
-----|---------
Distribute traffic | Load Balancer (ALB/NLB)
Serve static content globally | CDN (CloudFront)
Reduce database load | Cache (Redis/ElastiCache)
Handle async tasks | Message Queue (SQS/Kafka)
Store structured data | SQL (PostgreSQL/MySQL)
Store unstructured data | NoSQL (DynamoDB/MongoDB)
Store files | Object Storage (S3)
Full-text search | Search Engine (Elasticsearch)
Real-time communication | WebSockets/Pub-Sub
Workflow orchestration | Step Functions/Airflow
Rate limiting | Redis + Token Bucket
Service discovery | DNS/Consul/Cloud Map
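The "Redis + Token Bucket" entry above combines two pieces: the token-bucket algorithm and shared counters. An in-process sketch of the algorithm (a real deployment would keep the bucket state in Redis so the limit holds across servers; the rate and capacity here are illustrative):

```python
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1        # spend one token for this request
            return True
        return False                # bucket empty: reject the request

bucket = TokenBucket(rate=5, capacity=10)   # 5 req/s, bursts up to 10
```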

Sample Opening Statement

"Before diving in, I'd like to understand the requirements.
Let me start with a few clarifying questions:

1. What's the expected scale - daily active users?
2. What are the core features we must support?
3. Are there specific latency or availability requirements?
4. Should I focus on any particular aspect?

[After getting answers]

Great. Based on that, let me outline our scope:
- We'll design X, Y, Z features
- Targeting N users with K QPS
- Aiming for 99.9% availability

Let me start with some quick estimations, then move
to the high-level architecture. Sound good?"

Final Tips

  1. Practice drawing - Diagrams should be clear and quick
  2. Know your numbers - Latencies, capacities, common scales
  3. Have opinions - "I prefer X because..." shows experience
  4. Stay calm - It's a conversation, not an interrogation
  5. Be collaborative - Treat interviewer as a teammate
  6. Manage time - Don't spend 20 min on requirements
  7. Show depth - Anyone can draw boxes; show you understand why