
Twitter Timeline System

Quick Reference Guide for System Design Interviews


Problem Statement

Design Twitter's core functionality: users can post tweets, follow other users, and see a timeline of tweets from users they follow. The system should handle high read and write throughput with low latency.


Requirements

Functional Requirements

  • Post tweets (280 characters, media optional)
  • Follow/unfollow users
  • Home timeline (tweets from followed users)
  • User timeline (user's own tweets)
  • Like, retweet, reply to tweets
  • Search tweets

Non-Functional Requirements

  • Latency: < 200ms for timeline
  • Availability: 99.99%
  • Scale: 500M users, 300M DAU
  • Consistency: Eventually consistent (acceptable)

Back of Envelope Estimation

Capacity Estimation
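
As a rough illustration, the estimate can be worked through in a few lines of Python. Only the 300M DAU figure comes from the requirements above; the tweets per user, reads per user, and bytes per tweet are placeholder assumptions chosen to make the arithmetic concrete, not measured values.

```python
# Back-of-envelope sketch. Only DAU is taken from the requirements;
# every other constant below is an illustrative assumption.
DAU = 300_000_000                      # from the non-functional requirements
TWEETS_PER_USER_PER_DAY = 2            # assumption
TIMELINE_READS_PER_USER_PER_DAY = 10   # assumption
BYTES_PER_TWEET = 300                  # assumption: text plus basic metadata
SECONDS_PER_DAY = 86_400

tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY
write_qps = tweets_per_day / SECONDS_PER_DAY
read_qps = DAU * TIMELINE_READS_PER_USER_PER_DAY / SECONDS_PER_DAY
storage_gb_per_day = tweets_per_day * BYTES_PER_TWEET / 1e9

print(f"tweets/day:       {tweets_per_day:,}")
print(f"avg write QPS:    {write_qps:,.0f}")
print(f"avg read QPS:     {read_qps:,.0f}")
print(f"text storage/day: {storage_gb_per_day:,.0f} GB")
```

Under these assumptions reads outnumber writes by about 5 to 1, which is the usual argument for optimizing the read path. Peak traffic would sit well above these averages.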


High-Level Architecture

Twitter Architecture


Timeline Generation Approaches

Approach 1: Pull Model (Fan-out on Read)

Pull Model - Fan-out on Read
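
A minimal sketch of the pull model, assuming hypothetical in-memory dictionaries (`following`, `user_tweets`) in place of the real follows and tweets stores. The timeline is assembled at read time by merging each followee's recent tweets, so writes are cheap and reads do the work.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the follows and tweets tables.
following = {"alice": ["bob", "carol"]}
# Each author's tweets as (tweet_id, text), newest first. IDs are time-ordered.
user_tweets = {
    "bob": [(105, "bob 2"), (101, "bob 1")],
    "carol": [(103, "carol 1")],
}

def home_timeline_pull(user, limit=20):
    """Build the home timeline at read time by merging followees' tweets."""
    feeds = (user_tweets.get(f, []) for f in following.get(user, []))
    # Each feed is already sorted newest first, so a k-way merge is enough.
    merged = heapq.merge(*feeds, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))

print(home_timeline_pull("alice"))
```

The read cost grows with the number of accounts a user follows, which is why this approach struggles to meet a tight latency target on its own.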

Approach 2: Push Model (Fan-out on Write)

Push Model - Fan-out on Write
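
A minimal sketch of the push model, again with hypothetical in-memory structures. Each new tweet ID is written into every follower's precomputed timeline, so a read is a single lookup. The `TIMELINE_CAP` value is an assumption used only to show that cached timelines are usually bounded.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 800  # assumption: maximum entries kept per cached timeline

followers = {"bob": ["alice", "dave"]}  # followee -> followers (hypothetical)
# A bounded deque drops the oldest entry once the cap is reached.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))

def post_tweet_push(author, tweet_id):
    """Fan out on write: copy the tweet ID into each follower's timeline."""
    for follower in followers.get(author, []):
        timelines[follower].appendleft(tweet_id)  # newest first

def home_timeline_push(user, limit=20):
    """Read is a plain lookup of the precomputed list."""
    return list(timelines.get(user, ()))[:limit]

post_tweet_push("bob", 101)
post_tweet_push("bob", 105)
print(home_timeline_push("alice"))
```

The write cost is proportional to the author's follower count, which is what makes accounts with very large followings expensive under this model.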

Hybrid Model
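
One way to sketch the hybrid model: push for regular accounts, pull for accounts flagged as celebrities (mirroring the `is_celebrity` column in the schema). The data structures and names here are hypothetical, and the sketch assumes tweet IDs are posted in increasing order.

```python
import heapq
from collections import defaultdict

celebrities = {"star"}  # assumption: mirrors users.is_celebrity
followers = {"bob": ["alice"], "star": ["alice"]}
following = {"alice": ["bob", "star"]}

timelines = defaultdict(list)         # user -> pushed tweet IDs, newest first
celebrity_tweets = defaultdict(list)  # celebrity -> own tweet IDs, newest first

def post_tweet(author, tweet_id):
    if author in celebrities:
        # No fan-out: store once, followers fetch it at read time.
        celebrity_tweets[author].insert(0, tweet_id)
        return
    for follower in followers.get(author, []):
        timelines[follower].insert(0, tweet_id)

def home_timeline(user, limit=20):
    """Merge the precomputed timeline with followed celebrities' tweets."""
    feeds = [timelines.get(user, [])]
    feeds += [celebrity_tweets.get(c, [])
              for c in following.get(user, []) if c in celebrities]
    return list(heapq.merge(*feeds, reverse=True))[:limit]

post_tweet("bob", 101)
post_tweet("star", 102)
post_tweet("bob", 103)
print(home_timeline("alice"))
```

The follower-count threshold that marks an account as a celebrity is a tuning decision and is not specified here.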


Data Models

Data Models


Database Schema

-- Users table
CREATE TABLE users (
    user_id         UUID PRIMARY KEY,
    username        VARCHAR(50) UNIQUE NOT NULL,
    display_name    VARCHAR(100),
    bio             VARCHAR(500),
    follower_count  BIGINT DEFAULT 0,
    following_count BIGINT DEFAULT 0,
    is_celebrity    BOOLEAN DEFAULT FALSE,
    created_at      TIMESTAMP NOT NULL
);

-- Tweets table (sharded by user_id)
CREATE TABLE tweets (
    tweet_id        BIGINT PRIMARY KEY,  -- Snowflake ID
    user_id         UUID NOT NULL,
    content         VARCHAR(280) NOT NULL,
    media_urls      TEXT[],
    reply_to        BIGINT,
    retweet_of      BIGINT,
    like_count      INT DEFAULT 0,
    retweet_count   INT DEFAULT 0,
    reply_count     INT DEFAULT 0,
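    created_at      TIMESTAMP NOT NULL
);

-- UUID and TEXT[] are PostgreSQL types, and PostgreSQL declares
-- secondary indexes outside CREATE TABLE
CREATE INDEX idx_user_time ON tweets (user_id, created_at DESC);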

-- Follower relationship (graph database recommended)
CREATE TABLE follows (
    follower_id     UUID NOT NULL,
    followee_id     UUID NOT NULL,
    created_at      TIMESTAMP NOT NULL,

    PRIMARY KEY (follower_id, followee_id)
);

-- Reverse lookup: list the followers of a given user (needed for fan-out)
CREATE INDEX idx_followee ON follows (followee_id);

-- Likes
CREATE TABLE likes (
    user_id         UUID NOT NULL,
    tweet_id        BIGINT NOT NULL,
    created_at      TIMESTAMP NOT NULL,

    PRIMARY KEY (user_id, tweet_id)
);

Snowflake ID Generation

Snowflake ID
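
A minimal sketch of a Snowflake-style generator, assuming the commonly cited layout of 41 bits of millisecond timestamp, 10 bits of worker ID, and 12 bits of per-millisecond sequence. The class and function names are illustrative, and the epoch constant is the value usually quoted for Twitter's original implementation; treat both as assumptions rather than a specification.

```python
import threading
import time

EPOCH_MS = 1288834974657  # assumption: commonly quoted Snowflake epoch (Nov 2010)
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

class SnowflakeGenerator:
    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self.sequence)

def timestamp_ms(snowflake_id):
    """Recover the creation time (Unix ms) embedded in an ID."""
    return (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS

gen = SnowflakeGenerator(worker_id=7)
print(gen.next_id())
```

Because the timestamp occupies the high bits, sorting by ID approximates sorting by creation time, which is what lets `tweet_id` double as a pagination cursor.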


Fanout Service

Fanout Service


Timeline Service

Timeline Service
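
Timeline reads are typically paginated with a cursor on `tweet_id` ("get tweets with ID < cursor"). A minimal sketch, assuming the timeline is already available as a list of IDs sorted newest first; the function name and return shape are hypothetical.

```python
# Hypothetical cached home timeline: tweet IDs, newest first.
timeline_ids = [110, 108, 105, 103, 101]

def timeline_page(ids_desc, cursor=None, limit=2):
    """Return (page, next_cursor). The page holds IDs strictly below cursor."""
    if cursor is None:
        start = 0
    else:
        # First position whose ID is smaller than the cursor.
        start = next((i for i, t in enumerate(ids_desc) if t < cursor),
                     len(ids_desc))
    page = ids_desc[start:start + limit]
    # A short page means the end of the timeline was reached.
    next_cursor = page[-1] if len(page) == limit else None
    return page, next_cursor

page, cursor = timeline_page(timeline_ids)
print(page, cursor)
```

Unlike offset-based paging, a cursor stays stable when new tweets arrive at the head of the timeline between requests. Against the SQL schema the same idea would be a `WHERE tweet_id < :cursor ORDER BY tweet_id DESC LIMIT :limit` query.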


Caching Strategy

Caching Strategy


Search Architecture

Search Architecture


Data Partitioning

Data Partitioning
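
The schema notes that the tweets table is sharded by `user_id`. A minimal sketch of hash-based shard selection follows; `NUM_SHARDS` and the function name are assumptions, and a production system might prefer consistent hashing, since plain modulo placement moves most keys when the shard count changes.

```python
import hashlib
import uuid

NUM_SHARDS = 16  # assumption: illustrative shard count

def shard_for_user(user_id: uuid.UUID, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard with a stable hash.

    Python's built-in hash() is avoided because it is not guaranteed
    to be stable across processes.
    """
    digest = hashlib.sha256(user_id.bytes).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

user = uuid.uuid4()
print(shard_for_user(user))
```

Sharding by `user_id` keeps one user's tweets on a single shard, so a user timeline query touches one node, at the cost of hot shards for very active accounts.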


Interview Discussion Points

  1. Push vs Pull for timeline?
     • Push for regular users (fast reads)
     • Pull for celebrities (avoid slow writes)
     • Hybrid approach is best

  2. How do you handle celebrities?
     • Don't fan-out on write
     • Fetch their tweets on timeline read
     • Cache their recent tweets

  3. How do you generate unique tweet IDs?
     • Snowflake IDs: time-ordered, distributed
     • 64-bit, contains timestamp

  4. How do you handle timeline pagination?
     • Cursor-based with tweet_id
     • "Get tweets with ID < cursor"

  5. How do you rank the timeline?
     • Chronological (simple)
     • ML-based ranking (engagement, relevance)
     • Mix of followed + recommended

  6. How do you handle deletes?
     • Soft delete in DB
     • Async removal from timeline caches
     • Eventual consistency acceptable
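
The delete flow in the last discussion point can be sketched as follows. All names are hypothetical: a dictionary stands in for the tweets table, another for the timeline caches, and a `deque` for the message queue that would drive the asynchronous purge. Filtering deleted tweets at read time is an added assumption, not something the points above require.

```python
from collections import deque

tweets = {101: {"text": "hello", "deleted": False}}  # stand-in for the DB
timelines = {"alice": [101], "dave": [101]}          # stand-in for caches
purge_queue = deque()                                # stand-in for a queue

def delete_tweet(tweet_id):
    """Soft delete: flag the row and schedule cache cleanup."""
    tweets[tweet_id]["deleted"] = True
    purge_queue.append(tweet_id)

def read_timeline(user):
    # Assumption: filter at read time so a deleted tweet is hidden
    # even before the purge worker has caught up.
    return [t for t in timelines.get(user, []) if not tweets[t]["deleted"]]

def run_purge_worker():
    """Asynchronously remove deleted IDs from cached timelines."""
    while purge_queue:
        tweet_id = purge_queue.popleft()
        for ids in timelines.values():
            if tweet_id in ids:
                ids.remove(tweet_id)

delete_tweet(101)
print(read_timeline("alice"))  # cache entry still present, but filtered
run_purge_worker()
print(timelines)
```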
