Fraud Detection System

Problem Statement

Design a real-time fraud detection system for a payment platform that identifies and blocks fraudulent transactions while minimizing false positives. The system should score each transaction in real time, inside the payment's critical path, and support both rules-based and ML-based detection.

Requirements

Functional Requirements

Score transactions in real time (< 100 ms)
Support rules-based fraud detection
Support ML-based fraud detection
Provide explanations for fraud decisions
Allow manual review workflow
Support merchant-configurable rules
Track and learn from confirmed fraud
Provide fraud analytics and reporting

Non-Functional Requirements

Latency: P99 < 100ms (in payment critical path)
Availability: 99.99% (payment dependency)
Accuracy: High precision (minimize false positives)
Scalability: Handle 10,000+ TPS
Adaptability: Detect new fraud patterns quickly

High-Level Architecture

[Figure: Fraud Detection Architecture]

Core Components

1. Feature Store

Pre-computed features for real-time lookup
Historical aggregations (velocity, patterns)
Device fingerprints
User behavior profiles

2. Rules Engine

Configurable business rules
Velocity checks
Blocklists/Allowlists
Geographic rules
Amount thresholds

3. ML Scoring Service

Real-time model inference
Multiple model ensemble
Feature vector construction
Model versioning and A/B testing

4. Decision Engine

Combines rule and ML scores
Applies merchant-specific policies
Determines final action (approve/review/decline)
Provides explanation for decision

5. Review Queue

Manual review workflow
Case assignment
Decision recording
Feedback loop for model training

Feature Engineering

Real-Time Features

[Figure: Feature Categories]

Feature Store Architecture

[Figure: Feature Store Architecture]

Rules Engine

Rule Types

[Figure: Rule Categories]

Rule Definition

# Example rule definitions
rules:
  - id: velocity_card_hourly
    name: "Card velocity - hourly limit"
    condition: |
      velocity.card_transactions_1h > 5
    action: REVIEW
    score: 30

  - id: high_amount_new_user
    name: "High amount from new user"
    condition: |
      transaction.amount > 500 AND
      user.account_age_days < 7 AND
      user.transaction_count < 3
    action: REVIEW
    score: 40

  - id: impossible_travel
    name: "Impossible travel detection"
    # distance_from_last_transaction is assumed to be in kilometres
    condition: |
      geo.distance_from_last_transaction > 500 AND
      time.hours_since_last_transaction < 2
    action: DECLINE
    score: 80

  - id: blocklist_ip
    name: "Blocked IP address"
    condition: |
      blocklist.contains(request.ip)
    action: DECLINE
    score: 100
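
The impossible_travel rule depends on a distance between the current and previous transaction locations. One common way to compute it is the haversine great-circle formula. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the distance is measured in kilometres and that each transaction carries a latitude/longitude pair.

```java
final class ImpossibleTravel {

    private static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in kilometres between two lat/lon points (haversine formula). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Mirrors the rule condition: more than 500 km travelled in under 2 hours. */
    static boolean isImpossibleTravel(double distanceKm, double hoursSinceLast) {
        return distanceKm > 500 && hoursSinceLast < 2;
    }
}
```

With these thresholds, a London-to-Paris hop (roughly 340 km) an hour after the previous transaction would not trigger the rule, while a 600 km jump in the same hour would.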

Rule Evaluation

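import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class RulesEngine {

    private final List<Rule> globalRules;
    private final Map<String, List<Rule>> merchantRules;

    public RulesEngine(List<Rule> globalRules, Map<String, List<Rule>> merchantRules) {
        this.globalRules = globalRules;
        this.merchantRules = merchantRules;
    }

    public RuleResult evaluate(Transaction tx, Features features) {
        List<TriggeredRule> triggered = new ArrayList<>();
        int totalScore = 0;  // unbounded sum of triggered rule scores
        Action worstAction = Action.APPROVE;

        // Evaluate global rules
        for (Rule rule : globalRules) {
            if (rule.evaluate(tx, features)) {
                triggered.add(new TriggeredRule(rule));
                totalScore += rule.getScore();
                worstAction = max(worstAction, rule.getAction());
            }
        }

        // Evaluate merchant-specific rules
        List<Rule> merchantSpecific = merchantRules.get(tx.getMerchantId());
        if (merchantSpecific != null) {
            for (Rule rule : merchantSpecific) {
                if (rule.evaluate(tx, features)) {
                    triggered.add(new TriggeredRule(rule));
                    totalScore += rule.getScore();
                    worstAction = max(worstAction, rule.getAction());
                }
            }
        }

        return new RuleResult(triggered, totalScore, worstAction);
    }

    // Returns the more severe action.
    // Assumes Action is an enum declared in ascending severity: APPROVE, REVIEW, DECLINE.
    private static Action max(Action a, Action b) {
        return a.compareTo(b) >= 0 ? a : b;
    }
}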
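
The engine above relies on `Rule` and `Action` types that are not shown. The following is a minimal sketch of what they might look like, not the actual implementation: `RuleInput` is a hypothetical flattened stand-in for the `Transaction` and `Features` pair, the rule conditions are plain Java predicates rather than parsed expressions, and records require Java 16 or later.

```java
import java.util.List;
import java.util.function.Predicate;

/** Actions declared in ascending severity so that compareTo() orders them. */
enum Action {
    APPROVE, REVIEW, DECLINE;

    /** Returns the more severe of two actions. */
    static Action max(Action a, Action b) {
        return a.compareTo(b) >= 0 ? a : b;
    }
}

/** Hypothetical flattened view of the fields the example rules read. */
record RuleInput(double amount, int cardTransactions1h, int accountAgeDays, int transactionCount) {}

/** A rule pairs a boolean condition with the action and score it contributes. */
record Rule(String id, Predicate<RuleInput> condition, Action action, int score) {
    boolean evaluate(RuleInput input) {
        return condition.test(input);
    }
}

final class RuleSketch {

    // Two of the example rules from the definitions above, written as predicates
    static final List<Rule> RULES = List.of(
        new Rule("velocity_card_hourly",
                 in -> in.cardTransactions1h() > 5,
                 Action.REVIEW, 30),
        new Rule("high_amount_new_user",
                 in -> in.amount() > 500 && in.accountAgeDays() < 7 && in.transactionCount() < 3,
                 Action.REVIEW, 40));

    /** Most severe action among the triggered rules, or APPROVE if none trigger. */
    static Action worstAction(RuleInput input) {
        Action worst = Action.APPROVE;
        for (Rule rule : RULES) {
            if (rule.evaluate(input)) {
                worst = Action.max(worst, rule.action());
            }
        }
        return worst;
    }

    /** Sum of the scores of all triggered rules. */
    static int totalScore(RuleInput input) {
        return RULES.stream().filter(rule -> rule.evaluate(input)).mapToInt(Rule::score).sum();
    }
}
```

Declaring the enum constants in severity order keeps the "worst action wins" logic to a single comparison, at the cost of making the declaration order significant.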

ML Scoring

Model Architecture

[Figure: ML Scoring Pipeline]

Model Serving

[Figure: Model Serving Options]

Model Training Pipeline

[Figure: Model Training Pipeline]

Decision Engine

Combining Scores

public class DecisionEngine {

    public FraudDecision decide(
            Transaction tx,
            RuleResult ruleResult,
            MLScore mlScore,
            MerchantConfig config
    ) {
        // Early exit for hard blocks
        if (ruleResult.getAction() == Action.DECLINE) {
            return FraudDecision.decline(
                ruleResult.getTriggeredRules(),
                "Rule-based decline"
            );
        }

        // Calculate combined score
        double combinedScore = calculateCombinedScore(ruleResult, mlScore, config);

        // Apply thresholds
        if (combinedScore >= config.getDeclineThreshold()) {
            return FraudDecision.decline(combinedScore, getReasons(ruleResult, mlScore));
        } else if (combinedScore >= config.getReviewThreshold()) {
            return FraudDecision.review(combinedScore, getReasons(ruleResult, mlScore));
        } else {
            return FraudDecision.approve(combinedScore);
        }
    }

    private double calculateCombinedScore(
            RuleResult ruleResult,
            MLScore mlScore,
            MerchantConfig config
    ) {
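        // Weighted combination. Both scores must be on the same scale as the
        // thresholds (e.g., 0-1), and the weights are expected to sum to 1.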
        double ruleWeight = config.getRuleWeight();      // e.g., 0.3
        double mlWeight = config.getMlWeight();          // e.g., 0.7

        return (ruleResult.getNormalizedScore() * ruleWeight) +
               (mlScore.getScore() * mlWeight);
    }
}
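
To make the weighted combination concrete, here is a small worked sketch. It assumes rule scores are summed on a 0-100 scale and clamped before rescaling to 0-1, that the ML score is a probability in 0-1, and that the review and decline thresholds are 0.5 and 0.8. All of these values, and the `ScoreCombiner` name, are illustrative assumptions rather than part of the design above.

```java
final class ScoreCombiner {

    /** Clamps the summed rule score to [0, 100] and rescales it to [0, 1]. */
    static double normalizeRuleScore(int totalRuleScore) {
        return Math.min(Math.max(totalRuleScore, 0), 100) / 100.0;
    }

    /** Weighted sum of the normalized rule score and the ML probability. */
    static double combine(int totalRuleScore, double mlProbability,
                          double ruleWeight, double mlWeight) {
        return normalizeRuleScore(totalRuleScore) * ruleWeight + mlProbability * mlWeight;
    }

    /** Maps a combined score to an action using the two thresholds. */
    static String decide(double combined, double reviewThreshold, double declineThreshold) {
        if (combined >= declineThreshold) {
            return "DECLINE";
        }
        if (combined >= reviewThreshold) {
            return "REVIEW";
        }
        return "APPROVE";
    }
}
```

For example, triggered rules totalling 70 points with an ML probability of 0.60 and weights 0.3/0.7 give 0.70 × 0.3 + 0.60 × 0.7 = 0.63, which falls between the assumed thresholds and lands in review.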

Threshold Configuration

[Figure: Score Thresholds]

Data Flow

Real-Time Scoring

[Figure: Real-Time Scoring Flow]

Feedback Loop

[Figure: Feedback Loop]

Performance Optimization

Latency Budget

[Figure: Latency Budget]

Optimization Strategies

1. Parallel Fetching
   • Fetch all features concurrently
   • Use async/await or CompletableFuture

2. Feature Caching
   • Cache computed features in Redis
   • TTL based on feature volatility

3. Model Optimization
   • Quantization (reduce model size)
   • Pruning (remove low-impact weights, neurons, or trees)
   • Distillation (smaller student model)

4. Precomputation
   • Pre-compute entity risk scores offline
   • Aggregate features in batch

5. Circuit Breakers
   • Skip ML scoring if latency exceeds budget
   • Fall back to rules-only scoring
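
Strategies 1 and 5 can be sketched with `CompletableFuture` from the Java standard library. This is a simplified illustration under stated assumptions: `LatencyGuards`, `fetchAll`, and `scoreWithFallback` are hypothetical names, feature lookups are modelled as suppliers of feature maps, and a per-call timeout stands in for a full circuit breaker (which would also track failure rates and stop calling the model for a while).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class LatencyGuards {

    /** Starts every feature lookup before waiting on any of them, then merges the results. */
    static Map<String, Double> fetchAll(List<Supplier<Map<String, Double>>> lookups) {
        List<CompletableFuture<Map<String, Double>>> futures = new ArrayList<>();
        for (Supplier<Map<String, Double>> lookup : lookups) {
            futures.add(CompletableFuture.supplyAsync(lookup)); // runs on the common pool
        }
        Map<String, Double> merged = new HashMap<>();
        for (CompletableFuture<Map<String, Double>> future : futures) {
            merged.putAll(future.join()); // lookups overlap instead of running back to back
        }
        return merged;
    }

    /** Returns the ML score if it arrives within the budget, otherwise the rules-only score. */
    static double scoreWithFallback(Supplier<Double> mlScorer, double rulesOnlyScore,
                                    long budgetMillis) {
        return CompletableFuture.supplyAsync(mlScorer)
                .completeOnTimeout(rulesOnlyScore, budgetMillis, TimeUnit.MILLISECONDS)
                .exceptionally(error -> rulesOnlyScore) // scorer failure also falls back
                .join();
    }
}
```

Note that `completeOnTimeout` (Java 9+) only completes the future with the fallback value; it does not cancel the underlying model call, so a production version would also need to bound or cancel in-flight work.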

Data Models

Transaction Risk Assessment

CREATE TABLE fraud_assessments (
    id                  UUID PRIMARY KEY,
    transaction_id      UUID NOT NULL,

    -- Scores
    rule_score          DECIMAL(5,2),
    ml_score            DECIMAL(5,2),
    combined_score      DECIMAL(5,2),

    -- Decision
    decision            VARCHAR(20) NOT NULL,  -- approve, review, decline
    decision_reason     TEXT,

    -- Triggered rules
    triggered_rules     JSONB,

    -- Features used
    features_snapshot   JSONB,

    -- Timing
    processing_time_ms  INT,
    created_at          TIMESTAMP NOT NULL,

    -- Review (if applicable)
    reviewed_by         UUID,
    reviewed_at         TIMESTAMP,
    review_decision     VARCHAR(20),
    review_notes        TEXT,

    -- Outcome (feedback)
    is_fraud            BOOLEAN,
    fraud_confirmed_at  TIMESTAMP,
    fraud_source        VARCHAR(50)  -- chargeback, manual_report, etc.
);

CREATE INDEX idx_fraud_assessments_tx ON fraud_assessments(transaction_id);
CREATE INDEX idx_fraud_assessments_decision ON fraud_assessments(decision, created_at);
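-- Pending review queue, oldest first. The partial index already restricts rows to
-- decision = 'review', so decision is not needed as a key column.
CREATE INDEX idx_fraud_assessments_review ON fraud_assessments(created_at)
    WHERE decision = 'review' AND reviewed_at IS NULL;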

Velocity Counters (Redis)

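# Card velocity (transactions per hour)
# NX sets the TTL only if the key has none, so later increments do not keep
# extending the window (EXPIRE ... NX requires Redis 7.0+)
INCR fraud:velocity:card:{card_hash}:1h
EXPIRE fraud:velocity:card:{card_hash}:1h 3600 NX

# User velocity (amount per day)
# INCRBY takes an integer, so pass the amount in minor units (e.g., cents)
INCRBY fraud:velocity:user:{user_id}:amount:1d {amount}
EXPIRE fraud:velocity:user:{user_id}:amount:1d 86400 NX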

# Device velocity (users per device)
SADD fraud:velocity:device:{fingerprint}:users {user_id}
EXPIRE fraud:velocity:device:{fingerprint}:users 604800  # 7 days
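
For reasoning about counter behaviour without a Redis instance, the following is a minimal in-memory model of a fixed-window counter: the window starts at the first increment and later increments do not extend it, which is what a Redis counter does when its expiry is set only once per key. The class name and the explicit `nowMillis` parameter are assumptions made for the sketch so that time can be controlled in tests; it is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

final class FixedWindowCounter {

    /** Count plus the instant at which the window (and the count) expires. */
    private static final class Window {
        long count;
        final long expiresAtMillis;

        Window(long expiresAtMillis) {
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final Map<String, Window> windows = new HashMap<>();

    FixedWindowCounter(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Increments the counter for key and returns the count in the current window. */
    long increment(String key, long nowMillis) {
        Window window = windows.get(key);
        if (window == null || nowMillis >= window.expiresAtMillis) {
            // First event, or the previous window has expired: start a fresh window
            window = new Window(nowMillis + ttlMillis);
            windows.put(key, window);
        }
        return ++window.count;
    }
}
```

A fixed window is cheap but approximate: a burst that straddles a window boundary is split across two counts, so a rule such as "more than 5 per hour" can be evaded at the boundary. A sliding window closes that gap at higher storage cost.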

Technology Choices

Component	Technology Options
Feature Store	Redis, DynamoDB, Feast
Rules Engine	Drools, Custom, JSON-based
ML Serving	TensorFlow Serving, TorchServe, SageMaker
ML Framework	XGBoost, LightGBM, PyTorch
Message Queue	Kafka (for async updates)
Graph Analysis	Neo4j, Amazon Neptune
Monitoring	Datadog, Prometheus + Grafana

Metrics & Monitoring

Key Metrics

Metric	Description	Target
Precision	% of declined transactions that were actually fraud	> 50%
Recall	% of fraudulent transactions correctly identified	> 90%
False Positive Rate	% of legitimate transactions declined	< 1%
Latency P99	Scoring latency	< 100ms
Review Rate	% of transactions sent to review	< 5%
Auto-Approval Rate	% approved without review	> 90%
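
The precision, recall, and false positive rate targets above can look inconsistent until they are computed from the same confusion matrix. The sketch below uses the standard definitions; the class name and the example counts are made up for illustration and are not measurements from any system.

```java
final class FraudMetrics {

    /** Share of declined transactions that were actually fraud: TP / (TP + FP). */
    static double precision(long truePositives, long falsePositives) {
        long declined = truePositives + falsePositives;
        return declined == 0 ? 0.0 : (double) truePositives / declined;
    }

    /** Share of fraudulent transactions that were caught: TP / (TP + FN). */
    static double recall(long truePositives, long falseNegatives) {
        long fraud = truePositives + falseNegatives;
        return fraud == 0 ? 0.0 : (double) truePositives / fraud;
    }

    /** Share of legitimate transactions that were declined: FP / (FP + TN). */
    static double falsePositiveRate(long falsePositives, long trueNegatives) {
        long legitimate = falsePositives + trueNegatives;
        return legitimate == 0 ? 0.0 : (double) falsePositives / legitimate;
    }
}
```

Take a hypothetical 100,000 transactions of which 1,000 are fraud, with 1,500 declined and 920 of those truly fraudulent (TP = 920, FP = 580, FN = 80, TN = 98,420). Precision is about 61%, recall is 92%, and the false positive rate is about 0.59%. Because fraud is rare, a sub-1% false positive rate can still mean that a large share of declines are legitimate customers, which is why the precision target is so much looser than the recall target.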

Monitoring Dashboard

[Figure: Monitoring Dashboard]

Interview Discussion Points

How do you handle cold start for new users?
   • Use network features, device fingerprint, email analysis, behavioral signals
How do you minimize false positives?
   • Multiple signals, ML calibration, merchant-specific thresholds, 3D Secure
How do you detect new fraud patterns?
   • Anomaly detection, unsupervised learning, rule suggestions from analysts
How do you handle latency requirements?
   • Parallel processing, feature caching, model optimization, fallback rules
How do you deal with label delay (chargebacks can take up to 90 days to arrive)?
   • Use proxy labels (reviews), semi-supervised learning, synthetic labels
How do you prevent adversarial attacks?
   • Feature diversity, model ensemble, continuous monitoring, behavioral analysis