Skip to content

Price Tracking Service (like CamelCamelCamel)

Quick Reference Guide for System Design Interviews


Problem Statement

Design a price tracking service that monitors product prices across e-commerce sites, stores historical pricing data, and alerts users when prices drop to their desired levels.


Requirements

Functional Requirements

  • Track prices for millions of products
  • Historical price graphs
  • Price drop alerts (email, push)
  • Support multiple retailers (Amazon, Walmart, etc.)
  • User watchlists

Non-Functional Requirements

  • Freshness: Prices updated every 1-6 hours
  • Availability: 99.9%
  • Scale: 100M products, 10M users
  • Accuracy: Correct price extraction

High-Level Architecture

High-Level Architecture


Crawler Service

Crawler Strategy


Price Extraction

Price Extraction


Price History Storage

Price History Storage


Alert System

Alert System


Data Models

-- Products
CREATE TABLE products (
    product_id      UUID PRIMARY KEY,
    name            VARCHAR(500),
    brand           VARCHAR(200),
    category        VARCHAR(100),
    image_url       VARCHAR(500),
    current_price   DECIMAL(10,2),
    lowest_price    DECIMAL(10,2),
    highest_price   DECIMAL(10,2),
    avg_price       DECIMAL(10,2),
    last_updated    TIMESTAMP,
    created_at      TIMESTAMP DEFAULT NOW()
);

-- Product URLs (same product on multiple retailers)
CREATE TABLE product_urls (
    product_id      UUID REFERENCES products(product_id),
    retailer        VARCHAR(50),
    url             VARCHAR(2000),
    external_id     VARCHAR(100),  -- ASIN for Amazon
    crawl_priority  INT DEFAULT 1,
    last_crawled    TIMESTAMP,
    next_crawl      TIMESTAMP,

    PRIMARY KEY (product_id, retailer)
);

-- User watchlists
CREATE TABLE watchlists (
    user_id         UUID,
    product_id      UUID REFERENCES products(product_id),
    target_price    DECIMAL(10,2),
    notify_on_drop  BOOLEAN DEFAULT TRUE,
    added_at        TIMESTAMP DEFAULT NOW(),

    PRIMARY KEY (user_id, product_id)
);

-- Price history (TimescaleDB hypertable)
CREATE TABLE price_history (
    time            TIMESTAMPTZ NOT NULL,
    product_id      UUID NOT NULL,
    retailer        VARCHAR(50),
    price           DECIMAL(10,2),
    in_stock        BOOLEAN,
    currency        VARCHAR(3) DEFAULT 'USD'
);

API Endpoints

API Endpoints


Interview Discussion Points

  1. How do you crawl millions of products?
  2. Priority-based scheduling
  3. Distributed workers
  4. Rate limiting per domain

  5. How do you handle anti-scraping?

  6. Rotate user agents/IPs
  7. Respect robots.txt
  8. Use APIs when available

  9. How do you store price history efficiently?

  10. TimescaleDB for time-series
  11. Compression and downsampling
  12. Retention policies

  13. How do you check alerts efficiently?

  14. Event-driven on price update
  15. Indexed lookup by product_id
  16. Batch processing

  17. How do you handle layout changes?

  18. Monitor extraction success rate
  19. Alert on anomalies
  20. Quick parser updates

  21. How do you ensure freshness?

  22. Higher priority for watched items
  23. Adaptive crawl frequency
  24. Real-time for high-value products