YouTube Video Streaming

Quick Reference Guide for System Design Interviews


Problem Statement

Design a video streaming platform like YouTube that supports video upload, transcoding, storage, and streaming to millions of concurrent users.


Requirements

Functional Requirements

  • Upload videos
  • Stream/playback videos
  • Video transcoding (multiple resolutions)
  • Adaptive bitrate streaming
  • Comments, likes, subscriptions
  • Search and recommendations

Non-Functional Requirements

  • Availability: 99.99%
  • Latency: video start time under 200 ms
  • Scale: 2B users, 1B hours watched/day
  • Storage: Petabytes of video

Back of Envelope Estimation

[Figure: Capacity Estimation]
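
The stated requirements (1B hours watched per day, 99.99% availability) can be turned into rough capacity numbers. The sketch below is one way to do that arithmetic; the 3 Mbps average bitrate is an assumption chosen for illustration and does not come from this guide, and the function names are hypothetical.

```python
# Back-of-envelope numbers derived from the stated requirements.
HOURS_WATCHED_PER_DAY = 1_000_000_000  # from the non-functional requirements
AVAILABILITY = 0.9999                  # from the non-functional requirements
AVG_BITRATE_MBPS = 3.0                 # ASSUMPTION: blended average across qualities


def avg_concurrent_viewers(hours_per_day: float) -> float:
    """Average concurrent streams = viewer-hours per day / 24 hours."""
    return hours_per_day / 24


def egress_tbps(concurrent: float, bitrate_mbps: float) -> float:
    """Aggregate egress in terabits per second (1 Tbps = 1,000,000 Mbps)."""
    return concurrent * bitrate_mbps / 1_000_000


def daily_egress_petabytes(hours_per_day: float, bitrate_mbps: float) -> float:
    """Total bytes served per day, expressed in petabytes (10^15 bytes)."""
    seconds_watched = hours_per_day * 3600
    bits = seconds_watched * bitrate_mbps * 1_000_000
    return bits / 8 / 1e15


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per 365-day year for a given availability target."""
    return (1 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    viewers = avg_concurrent_viewers(HOURS_WATCHED_PER_DAY)
    print(f"avg concurrent viewers: {viewers:,.0f}")
    print(f"avg egress: {egress_tbps(viewers, AVG_BITRATE_MBPS):.0f} Tbps")
    print(f"daily egress: {daily_egress_petabytes(HOURS_WATCHED_PER_DAY, AVG_BITRATE_MBPS):,.0f} PB")
    print(f"downtime budget: {downtime_minutes_per_year(AVAILABILITY):.2f} min/year")
```

Under that bitrate assumption, the averages work out to roughly 42 million concurrent streams and about 125 Tbps of egress; peak traffic would be higher than these averages.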


High-Level Architecture

[Figure: YouTube Architecture]


Video Upload Pipeline

[Figure: Video Upload Pipeline]
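
The discussion points in this guide mention chunked, resumable uploads for large files. A minimal sketch of the server-side bookkeeping might look like the following. The `UploadSession` class, its method names, and the 8 MiB chunk size are all hypothetical choices for illustration, not a description of any real upload API.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 8 * 1024 * 1024  # ASSUMPTION: 8 MiB chunks


@dataclass
class UploadSession:
    """Tracks which chunks of a single upload have arrived."""
    total_size: int
    chunk_size: int = CHUNK_SIZE
    received: dict = field(default_factory=dict)  # chunk index -> sha256 hex digest

    @property
    def chunk_count(self) -> int:
        # Ceiling division: the last chunk may be shorter than chunk_size.
        return (self.total_size + self.chunk_size - 1) // self.chunk_size

    def chunk_range(self, index: int) -> tuple:
        """Byte range [start, end) covered by the given chunk."""
        start = index * self.chunk_size
        end = min(start + self.chunk_size, self.total_size)
        return start, end

    def put_chunk(self, index: int, data: bytes) -> None:
        """Record a chunk. Re-sending the same chunk is harmless (idempotent)."""
        if not 0 <= index < self.chunk_count:
            raise IndexError(f"chunk {index} out of range")
        start, end = self.chunk_range(index)
        if len(data) != end - start:
            raise ValueError(f"chunk {index} has wrong length")
        self.received[index] = hashlib.sha256(data).hexdigest()

    def missing_chunks(self) -> list:
        """Chunks the client still has to send; this is what makes resume possible."""
        return [i for i in range(self.chunk_count) if i not in self.received]

    def is_complete(self) -> bool:
        return not self.missing_chunks()
```

After a dropped connection, a client would ask for `missing_chunks()` and send only those, rather than restarting the whole file. Once `is_complete()` is true, the upload can be handed to background transcoding.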


Video Transcoding

[Figure: Video Transcoding]
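
The discussion points in this guide list distributed workers, parallel segment encoding, and priority queues as ways to scale transcoding. One way to sketch the queueing side is shown below: each upload fans out into one job per (segment, quality) pair, and workers always pull the highest-priority job. The rendition list, class names, and priority scheme are assumptions for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field

RENDITIONS = ["1080p", "720p", "480p", "360p"]  # ASSUMPTION: example quality ladder


@dataclass(order=True)
class TranscodeJob:
    priority: int                         # lower number = served sooner
    seq: int                              # tie-breaker: FIFO within a priority
    video_id: str = field(compare=False)
    segment: int = field(compare=False)
    quality: str = field(compare=False)


class TranscodeQueue:
    """In-memory stand-in for a distributed priority queue."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, video_id: str, segment_count: int, priority: int) -> None:
        # Fan out: every segment is encoded independently at every quality,
        # so many workers can process one video in parallel.
        for segment in range(segment_count):
            for quality in RENDITIONS:
                job = TranscodeJob(priority, next(self._counter),
                                   video_id, segment, quality)
                heapq.heappush(self._heap, job)

    def next_job(self):
        """Called by a worker; returns None when the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None

    def __len__(self) -> int:
        return len(self._heap)
```

With this layout, a video submitted later at a better priority (for example, from a popular channel) is picked up before the remaining jobs of an earlier, lower-priority upload.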


Adaptive Bitrate Streaming

[Figure: Adaptive Bitrate Streaming]
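
In adaptive bitrate streaming, the video is split into segments, encoded at several qualities, and the player switches quality based on available bandwidth. A simple throughput-based selection rule can be sketched as follows. The bitrate ladder, the smoothing factor, and the safety margin are illustrative assumptions; production players typically also consider buffer level.

```python
# (quality, bitrate in kbps), sorted ascending.
# ASSUMPTION: illustrative bitrates, not taken from any real encoder profile.
LADDER = [("360p", 800), ("480p", 1400), ("720p", 2800), ("1080p", 5000)]
SAFETY_FACTOR = 0.8  # ASSUMPTION: use only 80% of the estimated bandwidth


def estimate_throughput_kbps(samples, alpha=0.3):
    """Exponentially weighted moving average of recent segment download speeds.

    `samples` must be non-empty, oldest first.
    """
    estimate = samples[0]
    for sample in samples[1:]:
        estimate = alpha * sample + (1 - alpha) * estimate
    return estimate


def choose_quality(throughput_kbps, ladder=LADDER, safety=SAFETY_FACTOR):
    """Pick the highest quality whose bitrate fits within the bandwidth budget.

    Falls back to the lowest quality when nothing fits.
    """
    budget = throughput_kbps * safety
    chosen = ladder[0][0]
    for quality, kbps in ladder:
        if kbps <= budget:
            chosen = quality
    return chosen
```

The player would call `choose_quality` before requesting each segment, so a drop in measured throughput leads to a lower-quality next segment instead of a stall.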


CDN Architecture

[Figure: CDN Architecture]
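
The discussion points in this guide describe video delivery in terms of a multi-tier CDN, edge caching, and pre-warming popular content. The toy model below illustrates the lookup order (edge, then regional, then origin) using LRU caches. The two-tier layout, class names, and capacities are assumptions for illustration only.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry


class TieredCDN:
    def __init__(self, origin: dict, edge_capacity: int, regional_capacity: int):
        self.origin = origin  # stands in for object storage: segment key -> bytes
        self.edge = LRUCache(edge_capacity)
        self.regional = LRUCache(regional_capacity)

    def fetch(self, key):
        """Return (data, name of the tier that served the request)."""
        data = self.edge.get(key)
        if data is not None:
            return data, "edge"
        data = self.regional.get(key)
        if data is not None:
            self.edge.put(key, data)  # fill the edge on the way back
            return data, "regional"
        data = self.origin[key]
        self.regional.put(key, data)
        self.edge.put(key, data)
        return data, "origin"

    def prewarm(self, keys) -> None:
        """Pull expected-popular segments into the caches ahead of demand."""
        for key in keys:
            self.fetch(key)
```

Only requests that miss both cache tiers reach the origin, which is what keeps origin load manageable when most traffic concentrates on a small set of popular videos.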


Video Playback Flow

[Figure: Video Playback Flow]


Data Models

-- MySQL-style DDL (ENUM columns, inline INDEX clauses)

-- Videos table
CREATE TABLE videos (
    video_id        VARCHAR(20) PRIMARY KEY,
    user_id         CHAR(36) NOT NULL,     -- UUID stored as text; MySQL has no native UUID type
    title           VARCHAR(500) NOT NULL,
    description     TEXT,
    duration        INT,                   -- length in seconds (assumed, matching watch_time)
    status          ENUM('uploading', 'processing', 'published', 'failed'),
    privacy         ENUM('public', 'unlisted', 'private'),
    view_count      BIGINT DEFAULT 0,
    like_count      BIGINT DEFAULT 0,
    created_at      TIMESTAMP NOT NULL,
    published_at    TIMESTAMP NULL,        -- NULL until the video is published

    INDEX idx_user_id (user_id),
    INDEX idx_created (created_at DESC)
);

-- Video files (one row per quality)
CREATE TABLE video_files (
    video_id        VARCHAR(20) NOT NULL,
    quality         VARCHAR(10) NOT NULL,  -- '1080p', '720p', etc.
    codec           VARCHAR(20),
    file_size       BIGINT,                -- bytes
    manifest_url    VARCHAR(500),

    PRIMARY KEY (video_id, quality)
);

-- View history (for analytics and recommendations)
CREATE TABLE view_events (
    event_id        CHAR(36) PRIMARY KEY,  -- UUID stored as text
    video_id        VARCHAR(20) NOT NULL,
    user_id         CHAR(36),              -- NULL for anonymous viewers
    watch_time      INT,                   -- seconds watched
    timestamp       TIMESTAMP NULL,
    device_type     VARCHAR(20),

    INDEX idx_video (video_id, timestamp),
    INDEX idx_user (user_id, timestamp)
);

Recommendations

[Figure: Recommendation System]


Storage Architecture

[Figure: Storage Architecture]
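
The discussion points in this guide mention tiered storage (hot/warm/cold) and lifecycle policies as ways to store petabytes of video cost-effectively. A lifecycle policy can be as simple as classifying each video by how recently it was viewed. The sketch below assumes 30-day and 180-day thresholds and hypothetical function names; real thresholds would be tuned against access patterns and storage prices.

```python
from datetime import datetime, timedelta

HOT_WINDOW = timedelta(days=30)    # ASSUMPTION: viewed within 30 days -> hot
WARM_WINDOW = timedelta(days=180)  # ASSUMPTION: viewed within 180 days -> warm


def storage_tier(last_viewed: datetime, now: datetime) -> str:
    """Map time since last view to a storage tier."""
    age = now - last_viewed
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"


def plan_migrations(videos: dict, now: datetime) -> list:
    """Return the (video_id, from_tier, to_tier) moves a lifecycle job should make.

    `videos` maps video_id -> (current_tier, last_viewed).
    """
    moves = []
    for video_id, (current_tier, last_viewed) in videos.items():
        target = storage_tier(last_viewed, now)
        if target != current_tier:
            moves.append((video_id, current_tier, target))
    return moves
```

Note that the policy works in both directions: an old video that suddenly gets traffic is promoted back to the hot tier on the next run.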


Interview Discussion Points

  1. How do you handle video upload for large files?
       • Chunked upload (resumable)
       • Direct to S3 with presigned URLs
       • Background transcoding

  2. How does adaptive bitrate streaming work?
       • Video split into segments
       • Multiple quality versions
       • Player switches based on bandwidth

  3. How do you scale video delivery?
       • Multi-tier CDN
       • Edge caching
       • Pre-warm popular content

  4. How do you handle transcoding at scale?
       • Distributed workers
       • Parallel segment encoding
       • Priority queues (popular channels first)

  5. How do you store petabytes of video?
       • Tiered storage (hot/warm/cold)
       • Lifecycle policies
       • Cost optimization

  6. How do you handle live streaming?
       • Different pipeline (RTMP ingest)
       • Near real-time transcoding
       • Ultra-low latency CDN