Synthetic Control Method

Advanced Causal Inference for Policy Evaluation

Learning Objectives

  • Understand the conceptual foundation of the Synthetic Control Method
  • Learn when and how to apply synthetic control in business contexts
  • Interpret synthetic control results for policy evaluation
  • Recognize advantages and limitations of the methodology

DATA5000 - Building on Causal Machine Learning Foundations

What is the Synthetic Control Method?

The Synthetic Control Method is a statistical technique for estimating the causal effect of an intervention or policy when there is one treated unit and several potential control units.

Core Concept

Instead of using a single control unit, we create a "synthetic control" - a weighted combination of multiple control units that best resembles the treated unit before the intervention.

Traditional Approach

  • Find one similar control unit
  • May not match perfectly
  • Limited comparability
  • Potential bias from mismatch

Synthetic Control

  • Combine multiple control units
  • Optimal weighted average
  • Better pre-treatment fit
  • More credible counterfactual

Why Do We Need Synthetic Control?

The Challenge of Single-Unit Interventions

Business Scenario

Problem: A retail company implements a new pricing strategy in its Melbourne store. How do we measure the strategy's impact on sales?

Challenge: We cannot find a single store that perfectly matches Melbourne's characteristics.

Common Business Cases

  • Store-specific policy changes
  • Regional marketing campaigns
  • City-level regulations
  • Product launches in specific markets

Research Applications

  • Economic policy evaluation
  • Healthcare interventions
  • Educational program assessment
  • Environmental policy impact

Solution: Create a synthetic Melbourne using weighted combinations of Sydney, Brisbane, Perth, and Adelaide stores.

How Synthetic Control Works: Visual Concept

Treated Unit

Melbourne Store

Received intervention

Sydney (w₁ = 0.4)

Brisbane (w₂ = 0.3)

Perth (w₃ = 0.2)

Adelaide (w₄ = 0.1)

Synthetic Control

Weighted Combination

Best counterfactual

Synthetic Control Formula:
Synthetic Melbourne = 0.4×Sydney + 0.3×Brisbane + 0.2×Perth + 0.1×Adelaide

Where weights (w₁, w₂, w₃, w₄) are chosen to minimize pre-intervention differences
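
As a small illustration of the formula above, the sketch below applies the slide's weights to donor revenue series. The revenue figures are hypothetical values invented for this example; only the weights come from the slide.

```python
# Hypothetical monthly revenue ($M) for three pre-intervention months.
# These figures are made up for illustration; they are not case-study data.
donor_revenue = {
    "Sydney":   [70.0, 72.0, 71.0],
    "Brisbane": [55.0, 56.0, 58.0],
    "Perth":    [50.0, 49.0, 51.0],
    "Adelaide": [40.0, 41.0, 42.0],
}

# Weights from the slide: non-negative and summing to 1.
weights = {"Sydney": 0.4, "Brisbane": 0.3, "Perth": 0.2, "Adelaide": 0.1}
assert abs(sum(weights.values()) - 1.0) < 1e-9

n_months = len(next(iter(donor_revenue.values())))

# Synthetic Melbourne at month t is the weighted sum of donor revenue at t.
synthetic_melbourne = [
    sum(weights[city] * donor_revenue[city][t] for city in weights)
    for t in range(n_months)
]

print([round(v, 1) for v in synthetic_melbourne])  # [58.5, 59.5, 60.2]
```

In practice the weights are not chosen by hand; they are estimated from pre-intervention data.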

Mathematical Foundation

Objective: Find Optimal Weights

Optimization Problem:

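Minimize: \sum_{t=1}^{T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2

Subject to: \sum_{j=2}^{J+1} w_j = 1 and w_j \ge 0 for j = 2, \dots, J+1

Where:
• Y_{1t} = outcome for the treated unit (unit 1) at time t
• Y_{jt} = outcome for control unit j at time t
• T_0 = number of pre-intervention periods (the intervention takes effect after period T_0)
• J = number of control units, indexed j = 2, \dots, J+1
• w_j = weight for control unit j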

Intuition: We find weights that make the synthetic control track the treated unit as closely as possible before the intervention occurs.
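
One way to solve this optimization is sketched below, assuming NumPy is available. It uses projected gradient descent onto the weight simplex, which is a choice made for this example and not necessarily what SparseSC or SyntheticControlMethods do internally. The function names and the simulated data are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_weights(y_treated_pre, Y_donors_pre, n_iter=20000):
    """Minimise the pre-period squared error by projected gradient descent.

    y_treated_pre: shape (T0,)   treated unit outcomes before the intervention
    Y_donors_pre:  shape (T0, J) donor outcomes before the intervention
    """
    J = Y_donors_pre.shape[1]
    w = np.full(J, 1.0 / J)                    # start from equal weights
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * np.linalg.norm(Y_donors_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * Y_donors_pre.T @ (Y_donors_pre @ w - y_treated_pre)
        w = project_to_simplex(w - step * grad)
    return w


# Hypothetical data: 24 pre-intervention months, 4 donor stores.
rng = np.random.default_rng(0)
Y_donors_pre = rng.normal(loc=50.0, scale=5.0, size=(24, 4))
true_w = np.array([0.4, 0.3, 0.2, 0.1])
y_treated_pre = Y_donors_pre @ true_w          # treated unit is an exact blend

w_hat = fit_weights(y_treated_pre, Y_donors_pre)
print(np.round(w_hat, 3))
```

Because the simulated treated series is an exact blend of the donors, the recovered weights should be close to `true_w`. With real data the fit is only approximate, and a dedicated quadratic programming solver would usually be faster than this loop.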

Synthetic Control: Step-by-Step Process

  1. Data Preparation: Collect pre- and post-intervention data for the treated unit and potential controls
  2. Weight Optimization: Find optimal weights that minimize pre-intervention differences
  3. Synthetic Construction: Create the synthetic control using the optimized weights
  4. Pre-treatment Validation: Verify that the synthetic control fits the treated unit well before the intervention
  5. Treatment Effect: Calculate the difference between the treated and synthetic units post-intervention
  6. Inference & Testing: Conduct placebo tests and assess statistical significance
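
The pre-treatment validation step is often summarised with the root mean squared prediction error (RMSPE) over the pre-intervention period. The sketch below is one minimal way to compute it. The series are hypothetical, and what counts as a "good" fit is a judgment call that depends on the scale of the outcome.

```python
import math


def rmspe(actual, synthetic):
    """Root mean squared prediction error between two equal-length series."""
    if len(actual) != len(synthetic):
        raise ValueError("series must have the same length")
    squared = [(a - s) ** 2 for a, s in zip(actual, synthetic)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical pre-intervention revenue ($M); not case-study data.
melbourne_pre = [58.0, 60.0, 61.0, 59.0]
synthetic_pre = [58.5, 59.5, 60.2, 59.4]

pre_rmspe = rmspe(melbourne_pre, synthetic_pre)

# Express the error relative to the treated unit's average level.
relative_error = pre_rmspe / (sum(melbourne_pre) / len(melbourne_pre))

print(f"pre-period RMSPE: {pre_rmspe:.3f} ($M)")
print(f"relative to mean revenue: {relative_error:.2%}")
```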

Business Example: E-commerce Pricing Strategy

Scenario: Dynamic Pricing Implementation

Company: RetailTech Australia

Intervention: Implemented AI-driven dynamic pricing in Melbourne market (January 2024)

Question: Did dynamic pricing increase monthly revenue?

Data: Monthly revenue data from January 2022 to June 2024
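
To connect this data window to the method's notation, the sketch below counts the pre- and post-intervention months. It assumes that January 2024 is the first treated month; the slide does not say whether the launch month counts as treated, so this is an assumption of the example.

```python
from datetime import date


def month_index(d):
    """Number of months since year 0, used to count months between dates."""
    return d.year * 12 + (d.month - 1)


first_month = date(2022, 1, 1)         # start of the data window
intervention_month = date(2024, 1, 1)  # assumed to be the first treated month
last_month = date(2024, 6, 1)          # end of the data window

T0 = month_index(intervention_month) - month_index(first_month)  # 24 pre-intervention months
T = month_index(last_month) - month_index(first_month) + 1       # 30 months in total
n_post = T - T0                                                  # 6 post-intervention months

print(T0, T, n_post)  # 24 30 6
```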

Available Control Markets

Market      Size      Demographics
Sydney      Large     Similar
Brisbane    Medium    Similar
Perth       Medium    Different
Adelaide    Small     Similar

Synthetic Control Results: Revenue Impact

[Figure: Monthly Revenue: Melbourne vs Synthetic Control. Line chart of revenue ($M) against time (months) for Melbourne (treated) and the synthetic control, with the dynamic pricing implementation date marked.]

Key Finding: After implementation, Melbourne's monthly revenue was approximately $15M (about 25%) higher than the synthetic control's, which is the estimated effect of dynamic pricing.

Interpreting Synthetic Control Results

Treatment Effect Calculation

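Treatment Effect at time t (for t > T_0):
Effect_t = Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt}

The second term is the synthetic control's outcome at time t.

Average Treatment Effect (Post-intervention):
ATE = \frac{1}{T - T_0} \sum_{t=T_0+1}^{T} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)

Here T is the total number of periods, so T - T_0 is the number of post-intervention periods being averaged.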
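
A worked example of these two calculations, using hypothetical post-intervention figures chosen only to keep the arithmetic easy to follow (they are not the case-study results):

```python
# Hypothetical post-intervention monthly revenue ($M); not case-study data.
melbourne_post = [64.0, 66.0, 65.0]
synthetic_post = [60.0, 61.0, 59.0]

# Per-period effect: treated outcome minus synthetic control outcome.
effects = [y - s for y, s in zip(melbourne_post, synthetic_post)]

# Average over the post-intervention periods only.
average_effect = sum(effects) / len(effects)

# Effect relative to the synthetic (counterfactual) level.
relative_effect = sum(effects) / sum(synthetic_post)

print(effects)                   # [4.0, 5.0, 6.0]
print(average_effect)            # 5.0
print(f"{relative_effect:.1%}")  # 8.3%
```

Note that the average divides by the number of post-intervention periods, not by the length of the whole series.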

What to Look For

  • Pre-treatment fit: Synthetic closely tracks treated unit
  • Post-treatment divergence: Clear separation after intervention
  • Persistent effect: Difference persists over time
  • Magnitude: Economically meaningful impact

Red Flags

  • Poor pre-fit: Synthetic doesn't match before intervention
  • Sudden jumps: Effects appear before intervention date
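  • Unstable weights: Results change markedly when a heavily weighted control is dropped (weight concentrated on a few units is common and not a problem by itself)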
  • Limited controls: Insufficient donor pool

Validation: Placebo Tests

Ensuring Results Are Not Due to Chance

Placebo Test Logic: Apply synthetic control to units that did NOT receive treatment. If we find similar "effects," our results may be spurious.

  1. In-Time Placebo: Test a fake intervention at different time points before the actual treatment
  2. In-Space Placebo: Apply the same analysis to control units that never received treatment
  3. Statistical Inference: Compare the actual effect size to the distribution of placebo effects
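
The in-space placebo and the comparison against the placebo distribution can be sketched as below, assuming NumPy. Each unit is treated in turn as if it had received the intervention, and units are ranked by the ratio of post- to pre-intervention RMSPE, a commonly used test statistic. The panel is simulated, the +10 effect is injected by hand, and the weight solver is a simple projected gradient loop chosen for this example.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > css - 1.0)[0][-1]
    return np.maximum(v - (css[rho] - 1.0) / (rho + 1), 0.0)


def fit_weights(y_pre, donors_pre, n_iter=5000):
    """Approximately minimise pre-period squared error over the simplex."""
    w = np.full(donors_pre.shape[1], 1.0 / donors_pre.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(donors_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * donors_pre.T @ (donors_pre @ w - y_pre)
        w = project_to_simplex(w - step * grad)
    return w


def rmspe(gap):
    return float(np.sqrt(np.mean(gap ** 2)))


def rmspe_ratio(panel, unit, donors, T0):
    """Post/pre RMSPE ratio when `unit` is analysed as if it were treated."""
    y = panel[unit]
    D = np.column_stack([panel[d] for d in donors])
    w = fit_weights(y[:T0], D[:T0])
    gap = y - D @ w                        # treated minus synthetic
    return rmspe(gap[T0:]) / rmspe(gap[:T0])


# Simulated panel (hypothetical): a shared trend plus unit-level noise.
rng = np.random.default_rng(1)
T0, T = 24, 30
common = 50.0 + np.cumsum(rng.normal(size=T))
cities = ["Melbourne", "Sydney", "Brisbane", "Perth", "Adelaide"]
panel = {c: common + rng.normal(scale=1.0, size=T) for c in cities}
panel["Melbourne"][T0:] += 10.0            # injected treatment effect

controls = cities[1:]
ratios = {"Melbourne": rmspe_ratio(panel, "Melbourne", controls, T0)}
for c in controls:
    # Placebo run: the truly treated unit is kept out of the donor pool.
    others = [d for d in controls if d != c]
    ratios[c] = rmspe_ratio(panel, c, others, T0)

# Share of units whose ratio is at least as extreme as the treated unit's.
p_value = sum(r >= ratios["Melbourne"] for r in ratios.values()) / len(ratios)
print(ratios, p_value)
```

With only five units the smallest attainable p-value is 1/5, which is why a larger donor pool gives placebo inference more resolution.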

Business Example: Placebo Test Results

Finding: When we apply the same analysis to Sydney, Brisbane, Perth, and Adelaide (control cities), none show revenue increases of similar magnitude.

Conclusion: Melbourne's revenue increase is likely due to dynamic pricing, not random market fluctuations.

Advantages and Limitations

✓ Advantages

  • Transparent: Clear methodology and assumptions
  • No parametric model: Avoids functional form assumptions
  • Optimal matching: Data-driven control selection
  • Visual interpretation: Easy to understand results
  • Robust inference: Built-in placebo tests

⚠ Limitations

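  • Single treated unit: Designed for one (or a few) treated units; many treated units require extensions of the method
  • Long pre-period needed: Requires substantial historical data
  • Convex hull restriction: With non-negative weights that sum to 1, the synthetic unit cannot reproduce a treated unit that lies outside the range of the controls
  • Time-varying confounders: Assumes no other shock affects only the treated unit around the time of the intervention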
  • Spillover effects: Controls must be unaffected by treatment
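
The convex hull restriction can be screened for with a simple range check, sketched below on hypothetical figures. Because the weights are non-negative and sum to 1, the synthetic value in any period lies between the smallest and largest control value in that period. Passing this check is necessary but not sufficient for the treated unit to lie inside the convex hull.

```python
# Hypothetical revenue ($M): one row per month, one column per control store.
donor_revenue_by_month = [
    [70.0, 55.0, 50.0, 40.0],
    [72.0, 56.0, 49.0, 41.0],
    [71.0, 58.0, 51.0, 42.0],
]
treated_revenue = [75.0, 60.0, 39.0]

# Flag months where the treated unit is outside the controls' range,
# so no convex combination of controls can match it in that month.
outside_range = [
    not (min(row) <= y <= max(row))
    for y, row in zip(treated_revenue, donor_revenue_by_month)
]

print(outside_range)  # [True, False, True]
```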

When to Use Synthetic Control Method

Ideal Scenarios

  • Single unit receives treatment
  • Multiple potential control units available
  • Long pre-intervention time series
  • Clear intervention timing
  • No spillover between units
  • Stable relationships pre-treatment

Business Applications

  • Store-specific interventions
  • Regional policy changes
  • Market entry strategies
  • Product launch evaluations
  • Regulatory impact assessment
  • Marketing campaign effectiveness

Implementation in Python

Libraries: Use SparseSC or SyntheticControlMethods packages

Key Steps: Data preparation → Weight optimization → Effect estimation → Placebo testing

Next Week: Hands-on implementation with real business data

Remember: Synthetic Control is most powerful when combined with domain expertise and theoretical understanding of the intervention mechanism.