Facebook News Feed
Quick Reference Guide for System Design Interviews
Problem Statement
Design Facebook's News Feed system that aggregates and ranks posts from friends, groups, pages, and ads, presenting a personalized feed to each user.
Requirements
Functional Requirements
- Display personalized feed of posts from friends, pages, groups
- Support various content types (text, images, videos, links)
- Like, comment, share functionality
- Real-time updates for new posts
- Infinite scroll with pagination
Non-Functional Requirements
- Latency: < 500ms feed load
- Availability: 99.99%
- Scale: 2B users, 500M DAU
- Freshness: New posts appear within minutes
Back of Envelope Estimation
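A rough sizing sketch, starting from the 500M DAU figure in the requirements above. Only that number comes from this guide; the per-user rates and the peak factor are illustrative assumptions chosen to show the arithmetic, and should be replaced with whatever the interviewer gives you.

```python
# Back-of-envelope sizing. Only DAU comes from the requirements;
# every other input is an assumed, illustrative value.
DAU = 500_000_000            # from the requirements
FEED_LOADS_PER_USER = 10     # assumption: feed loads per user per day
POSTS_PER_USER = 0.5         # assumption: one post every two days
PEAK_FACTOR = 3              # assumption: peak traffic vs. daily average
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


feed_reads_per_day = DAU * FEED_LOADS_PER_USER
post_writes_per_day = DAU * POSTS_PER_USER

avg_read_qps = per_second(feed_reads_per_day)
avg_write_qps = per_second(post_writes_per_day)
peak_read_qps = avg_read_qps * PEAK_FACTOR

print(f"avg feed reads/s:  {avg_read_qps:,.0f}")
print(f"peak feed reads/s: {peak_read_qps:,.0f}")
print(f"avg post writes/s: {avg_write_qps:,.0f}")
print(f"read:write ratio:  {feed_reads_per_day / post_writes_per_day:.0f}:1")
```

Under these assumptions the system is heavily read-dominated, which is the usual argument for spending effort on the read path (candidate gathering, caching) rather than on write fan-out.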
High-Level Architecture
Feed Generation Approaches
Facebook's Hybrid Approach
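The discussion points at the end of this guide describe the approach as pull with pre-aggregated candidates. A minimal sketch of that read path, assuming a hypothetical in-memory index `recent_posts_by_author` that holds each author's recent posts newest first. The function name and data shapes are illustrative, not Facebook's actual design.

```python
import heapq
from itertools import islice
from typing import Dict, Iterable, List, Tuple

Post = Tuple[int, int]  # (created_at, post_id); hypothetical shape


def gather_candidates(
    followees: Iterable[int],
    recent_posts_by_author: Dict[int, List[Post]],
    limit: int,
) -> List[Post]:
    """Merge each followee's newest-first post list into one newest-first pool."""
    per_author = (recent_posts_by_author.get(a, []) for a in followees)
    # heapq.merge streams the k sorted lists lazily, so we never sort the
    # full union; reverse=True expects every input sorted newest first.
    merged = heapq.merge(*per_author, reverse=True)
    return list(islice(merged, limit))


index = {
    1: [(300, 13), (100, 11)],
    2: [(250, 22), (50, 21)],
}
pool = gather_candidates([1, 2, 3], index, limit=3)
print(pool)
```

The candidate pool is deliberately larger than one page of feed: ranking, not recency, decides what the user finally sees.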
Ranking System
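The discussion points in this guide summarize ranking as a model that predicts engagement from affinity, recency, and content type, with diversity enforced after ranking. The sketch below stands in for the ML model with a hand-written score; the weights, the half-life, and the "no more than `max_run` consecutive posts per author" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    post_id: int
    author_id: int
    content_type: str   # 'text', 'photo', 'video', or 'link'
    age_hours: float
    affinity: float     # viewer-to-author affinity in [0, 1]


# Hypothetical weights; a real system would learn these from feed_actions.
CONTENT_WEIGHT: Dict[str, float] = {"video": 1.2, "photo": 1.1, "text": 1.0, "link": 0.9}
HALF_LIFE_HOURS = 6.0


def score(c: Candidate) -> float:
    """Toy engagement score: affinity, decayed by age, scaled by content type."""
    recency = 0.5 ** (c.age_hours / HALF_LIFE_HOURS)
    return c.affinity * recency * CONTENT_WEIGHT.get(c.content_type, 1.0)


def rank_with_diversity(cands: List[Candidate], max_run: int = 2) -> List[Candidate]:
    """Sort by score, then avoid more than max_run consecutive posts by one author."""
    pending = sorted(cands, key=score, reverse=True)
    feed: List[Candidate] = []
    while pending:
        pick = 0  # fall back to the top-scored post if nothing breaks the run
        tail = feed[-max_run:]
        for i, c in enumerate(pending):
            run_is_full = len(tail) == max_run and all(
                t.author_id == c.author_id for t in tail
            )
            if not run_is_full:
                pick = i
                break
        feed.append(pending.pop(pick))
    return feed
```

Keeping the diversity pass separate from scoring means the model can be retrained without touching the feed-composition rules.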
Social Graph (TAO)
Feed Service Architecture
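The data model in this guide shards posts by `author_id`, and the discussion points list parallel fetching from shards as a scaling technique. A minimal sketch of that scatter-gather step, assuming a hypothetical `fetch_from_shard(shard, author_ids)` callable and simple modulo sharding; neither is specified by the guide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(author_id: int) -> int:
    """Assumed routing rule: modulo on author_id."""
    return author_id % NUM_SHARDS


def fetch_posts_parallel(
    author_ids: List[int],
    fetch_from_shard: Callable[[int, List[int]], List[dict]],
) -> List[dict]:
    """Group authors by shard, query the shards concurrently, merge newest first."""
    by_shard: Dict[int, List[int]] = {}
    for author_id in author_ids:
        by_shard.setdefault(shard_for(author_id), []).append(author_id)

    # One request per shard touched, issued in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(by_shard))) as pool:
        futures = [
            pool.submit(fetch_from_shard, shard, ids)
            for shard, ids in by_shard.items()
        ]
        batches = [f.result() for f in futures]

    posts = [post for batch in batches for post in batch]
    posts.sort(key=lambda post: post["created_at"], reverse=True)
    return posts
```

With this shape, feed latency is bounded by the slowest shard rather than the sum of all shard calls, which is why per-shard timeouts usually accompany it in practice.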
Data Models
-- Posts table (sharded by author_id)
CREATE TABLE posts (
    post_id       BIGINT PRIMARY KEY,  -- Snowflake ID
    author_id     BIGINT NOT NULL,
    content_type  ENUM('text', 'photo', 'video', 'link'),
    content       TEXT,
    media_ids     JSON,
    privacy       ENUM('public', 'friends', 'only_me'),
    like_count    INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    share_count   INT DEFAULT 0,
    created_at    TIMESTAMP NOT NULL,
    INDEX idx_author_time (author_id, created_at DESC)
);

-- Feed actions (for ranking features)
CREATE TABLE feed_actions (
    user_id       BIGINT NOT NULL,
    post_id       BIGINT NOT NULL,
    action_type   ENUM('view', 'like', 'comment', 'share', 'hide'),
    dwell_time_ms INT,
    created_at    TIMESTAMP NOT NULL,
    INDEX idx_user_time (user_id, created_at DESC)
);

-- User affinity scores (computed offline)
CREATE TABLE user_affinity (
    user_id        BIGINT NOT NULL,
    target_id      BIGINT NOT NULL,
    target_type    ENUM('user', 'page', 'group'),
    affinity_score FLOAT,
    updated_at     TIMESTAMP,
    PRIMARY KEY (user_id, target_type, target_id)
);
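The schema says affinity scores are computed offline but does not say how. One plausible sketch, assuming rows from `feed_actions` are aggregated per (viewer, author) pair with hand-picked action weights. The weights, the squashing function, and the restriction to `target_type = 'user'` are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical weights per action_type; 'hide' counts against the author.
ACTION_WEIGHT = {"view": 0.1, "like": 1.0, "comment": 2.0, "share": 3.0, "hide": -5.0}

Action = Tuple[int, int, str]  # (user_id, post_id, action_type)


def compute_affinity(
    actions: Iterable[Action],
    post_author: Dict[int, int],
) -> Dict[Tuple[int, int], float]:
    """Return {(user_id, author_id): affinity_score} with scores in [0, 1)."""
    raw: Dict[Tuple[int, int], float] = defaultdict(float)
    for user_id, post_id, action_type in actions:
        author_id = post_author.get(post_id)
        if author_id is None or author_id == user_id:
            continue  # unknown post, or the user acting on their own post
        raw[(user_id, author_id)] += ACTION_WEIGHT.get(action_type, 0.0)

    # Clamp negatives to 0, then squash with x / (1 + x) so scores stay below 1.
    return {
        pair: max(total, 0.0) / (1.0 + max(total, 0.0))
        for pair, total in raw.items()
    }
```

Each resulting pair maps onto one `user_affinity` row, with `target_id` set to the author and `updated_at` set to the batch run time.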
Caching Strategy
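The discussion points in this guide mention caching ranked feeds with a short TTL. A minimal in-process sketch of that idea follows; a production system would more likely use a distributed cache, and the class name, the injectable clock, and the TTL value are assumptions.

```python
import time
from typing import Callable, Dict, List, Tuple


class RankedFeedCache:
    """Per-user cache of ranked post IDs that expires after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[int, Tuple[float, List[int]]] = {}

    def get_or_compute(
        self,
        user_id: int,
        compute: Callable[[int], List[int]],
    ) -> List[int]:
        """Return the cached feed if still fresh, otherwise rebuild and store it."""
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        feed = compute(user_id)
        self._entries[user_id] = (now + self._ttl, feed)
        return feed
```

The short TTL is the trade-off knob: a longer value cuts ranking cost, a shorter one keeps the feed closer to the freshness target in the requirements.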
Real-Time Updates
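According to the discussion points in this guide, only high-affinity updates are pushed to connected users, while everything else surfaces as a "New posts" badge that the user refreshes. The sketch below models just that routing decision. The threshold is an assumed value, and appending to `pushed` stands in for an actual WebSocket send.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

PUSH_AFFINITY_THRESHOLD = 0.8  # hypothetical cut-off for "high affinity"


@dataclass
class Session:
    """A connected user; `pushed` stands in for messages sent over a WebSocket."""
    user_id: int
    pushed: List[int] = field(default_factory=list)
    badge_count: int = 0


def route_new_post(
    post_id: int,
    author_id: int,
    sessions: Iterable[Session],
    affinity: Dict[Tuple[int, int], float],
) -> None:
    """Push the post to high-affinity viewers; bump the badge for everyone else."""
    for session in sessions:
        score = affinity.get((session.user_id, author_id), 0.0)
        if score >= PUSH_AFFINITY_THRESHOLD:
            session.pushed.append(post_id)
        else:
            session.badge_count += 1
```

Routing most updates to the badge keeps push volume low and lets the next pull-based refresh run the full ranking pipeline.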
Ad Integration
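The discussion points in this guide describe ads as ranked by auction and inserted at fixed positions. A small sketch of both steps, assuming a simplified auction that ranks by bid times predicted click rate, and assumed slot positions (`first`, `every`). Real ad auctions involve pricing rules that are out of scope here.

```python
from typing import List, Tuple

Ad = Tuple[str, float, float]  # (ad_id, bid, predicted_click_rate); hypothetical shape


def run_auction(ads: List[Ad], slots: int) -> List[str]:
    """Rank ads by expected value (bid * predicted click rate); keep the top ones."""
    ranked = sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)
    return [ad_id for ad_id, _, _ in ranked[:slots]]


def insert_ads(
    organic: List[str],
    ad_ids: List[str],
    first: int = 2,
    every: int = 4,
) -> List[str]:
    """Place ads at fixed 0-based feed positions: first, first + every, ..."""
    feed = list(organic)
    position = first
    for ad_id in ad_ids:
        if position > len(feed):
            break  # not enough organic posts to reach the next slot
        feed.insert(position, ad_id)
        position += every
    return feed


winners = run_auction([("a1", 2.0, 0.01), ("a2", 1.0, 0.05), ("a3", 0.5, 0.02)], slots=2)
feed = insert_ads([f"p{i}" for i in range(7)], winners)
print(feed)
```

Because the ad service runs in parallel with feed generation, this merge is the only point where the two pipelines meet.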
Interview Discussion Points
- Pull vs Push for News Feed?
  - Facebook uses Pull with pre-aggregated candidates
  - Unlike Twitter, ranking is critical
  - Fresh posts can wait minutes
- How does ranking work?
  - ML model predicts engagement
  - Features: affinity, recency, content type
  - Diversity enforced post-ranking
- How do you handle the social graph?
  - TAO: Distributed graph database
  - Cached aggressively (99% hit rate)
  - Sharded by user ID
- How do you handle real-time updates?
  - WebSocket for connected users
  - Only high-affinity updates pushed
  - "New posts" badge, user refreshes
- How do you scale feed generation?
  - Pre-compute candidate pool
  - Cache ranked feeds (short TTL)
  - Parallel fetching from shards
- How do you mix ads with organic?
  - Ad service parallel to feed
  - Auction-based ranking
  - Inserted at fixed positions