Understand the conceptual foundation of the Synthetic Control Method
Learn when and how to apply synthetic control in business contexts
Interpret synthetic control results for policy evaluation
Recognize advantages and limitations of the methodology
DATA5000 - Building on Causal Machine Learning Foundations
What is the Synthetic Control Method?
The Synthetic Control Method is a statistical technique for estimating the causal effect of an intervention or policy when you have one treated unit and multiple potential control units.
Core Concept
Instead of using a single control unit, we create a "synthetic control" - a weighted combination of multiple control units that best resembles the treated unit before the intervention.
Traditional Approach
Find one similar control unit
May not match perfectly
Limited comparability
Potential bias from mismatch
Synthetic Control
Combine multiple control units
Optimal weighted average
Better pre-treatment fit
More credible counterfactual
Why Do We Need Synthetic Control?
The Challenge of Single-Unit Interventions
Business Scenario
Problem: A retail company implements a new pricing strategy in their Melbourne store. How do we measure its impact on sales?
Challenge: We cannot find a single store that perfectly matches Melbourne's characteristics.
Common Business Cases
Store-specific policy changes
Regional marketing campaigns
City-level regulations
Product launches in specific markets
Research Applications
Economic policy evaluation
Healthcare interventions
Educational program assessment
Environmental policy impact
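Solution: Create a synthetic Melbourne as a weighted combination of the Sydney, Brisbane, Perth, and Adelaide stores:
Synthetic Melbourne = w₁ × Sydney + w₂ × Brisbane + w₃ × Perth + w₄ × Adelaide
where the weights (w₁, w₂, w₃, w₄) are chosen to minimize the pre-intervention differences between Melbourne and its synthetic counterpart.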
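To make the weighted combination concrete, here is a minimal sketch. The sales figures and the weights are invented for illustration; in practice the weights come out of the optimization described in the next section.

```python
import numpy as np

# Hypothetical monthly sales (in $ thousands); rows are months.
# Columns: Sydney, Brisbane, Perth, Adelaide
donor_sales = np.array([
    [120.0, 80.0, 75.0, 50.0],
    [125.0, 82.0, 78.0, 51.0],
    [130.0, 85.0, 80.0, 53.0],
])

# Hypothetical weights (w1..w4): non-negative and summing to one.
weights = np.array([0.5, 0.3, 0.0, 0.2])
assert np.isclose(weights.sum(), 1.0) and (weights >= 0).all()

# Synthetic Melbourne is the weighted average of the donor stores in each month.
synthetic_melbourne = donor_sales @ weights
print(synthetic_melbourne)
```

A weight of zero (Perth here) simply means that store does not contribute to the synthetic unit.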
Mathematical Foundation
Objective: Find Optimal Weights
Optimization Problem:
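Minimize: $\sum_{t=1}^{T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2$
Subject to: $\sum_{j=2}^{J+1} w_j = 1$ and $w_j \ge 0$ for every $j$
Where:
• $Y_{1t}$ = outcome for the treated unit (unit 1) at time t
• $Y_{jt}$ = outcome for control unit j at time t, for j = 2, ..., J+1
• $T_0$ = number of pre-intervention periods (the intervention takes effect in period $T_0 + 1$)
• $w_j$ = weight for control unit j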
Intuition: We find weights that make the synthetic control track the treated unit as closely as possible before the intervention occurs.
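The optimization can be sketched with a general-purpose constrained solver. This is one possible approach under stated assumptions, not the algorithm used by any particular synthetic control package: it uses SciPy's SLSQP method, and the function name `fit_weights` and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def fit_weights(y_treated_pre, Y_donors_pre):
    """Weights that minimise pre-period squared error, with w >= 0 and sum(w) = 1."""
    n_donors = Y_donors_pre.shape[1]

    def loss(w):
        return np.sum((y_treated_pre - Y_donors_pre @ w) ** 2)

    result = minimize(
        loss,
        x0=np.full(n_donors, 1.0 / n_donors),   # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,          # w_j >= 0
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-10, "maxiter": 1000},
    )
    return result.x

# Simulated check: the treated series is built as a known mix of four donors,
# so the solver should recover weights close to that mix.
rng = np.random.default_rng(0)
Y_donors_pre = rng.normal(100.0, 10.0, size=(24, 4))   # 24 pre-periods, 4 donors
true_weights = np.array([0.6, 0.3, 0.1, 0.0])
y_treated_pre = Y_donors_pre @ true_weights

weights = fit_weights(y_treated_pre, Y_donors_pre)
print(np.round(weights, 3))
```

With real data the treated unit is rarely an exact mix of the donors, so the minimized loss will be positive and should be inspected as part of pre-treatment validation.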
Synthetic Control: Step-by-Step Process
1. Data Preparation: Collect pre- and post-intervention data for the treated unit and potential controls
2. Weight Optimization: Find optimal weights that minimize pre-intervention differences
3. Synthetic Construction: Create the synthetic control using the optimized weights
4. Pre-treatment Validation: Verify that the synthetic control fits the treated unit well before the intervention
5. Treatment Effect: Calculate the difference between the treated and synthetic units post-intervention
6. Inference & Testing: Conduct placebo tests and assess statistical significance
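The data preparation step often takes the most effort in practice. The sketch below assumes the data arrive as a long-format table with one row per market and month, and reshapes it into the pre- and post-intervention arrays that the later steps need. The column names (`market`, `month`, `revenue`) and all figures are hypothetical.

```python
import pandas as pd

# Hypothetical long-format panel: one row per (market, month).
panel = pd.DataFrame({
    "market": ["Melbourne"] * 3 + ["Sydney"] * 3 + ["Brisbane"] * 3,
    "month": ["2023-11-01", "2023-12-01", "2024-01-01"] * 3,
    "revenue": [60.0, 61.0, 75.0, 90.0, 92.0, 93.0, 40.0, 41.0, 42.0],
})
panel["month"] = pd.to_datetime(panel["month"])

# Wide layout: rows are months, columns are markets.
wide = panel.pivot(index="month", columns="market", values="revenue").sort_index()

treated_name = "Melbourne"
intervention = pd.Timestamp("2024-01-01")   # first treated month

# Split the panel at the intervention date.
pre = wide[wide.index < intervention]
post = wide[wide.index >= intervention]

# Arrays for the weight optimization: treated series and donor matrix.
y_treated_pre = pre[treated_name].to_numpy()
Y_donors_pre = pre.drop(columns=treated_name).to_numpy()
print(pre)
```

Keeping the treated unit and the donors in one wide table until the final split makes it harder to misalign months across markets.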
Business Example: E-commerce Pricing Strategy
Scenario: Dynamic Pricing Implementation
Company: RetailTech Australia
Intervention: Implemented AI-driven dynamic pricing in Melbourne market (January 2024)
Question: Did dynamic pricing increase monthly revenue?
Data: Monthly revenue data from January 2022 to June 2024
Available Control Markets
Sydney: Size Large, Demographics Similar
Brisbane: Size Medium, Demographics Similar
Perth: Size Medium, Demographics Different
Adelaide: Size Small, Demographics Similar
Synthetic Control Results: Revenue Impact
[Figure: Monthly Revenue, Melbourne vs Synthetic Control, with the dynamic pricing implementation date marked]
Key Finding: Dynamic pricing increased monthly revenue by an estimated $15M (about 25%) relative to the synthetic control
Interpreting Synthetic Control Results
Treatment Effect Calculation
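Treatment Effect at time t:
$\text{Treatment Effect}_t = Y_{1t} - \text{Synthetic Control}_t$
Average Treatment Effect (Post-intervention):
$\text{ATE} = \frac{1}{T - T_0} \sum_{t=T_0+1}^{T} \left( Y_{1t} - \text{Synthetic Control}_t \right)$
where T is the total number of periods, so the average runs over the $T - T_0$ post-intervention periods.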
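A short worked example of both calculations, using invented numbers. The series and the value of `T0` are hypothetical; the point is that the average is taken over the post-intervention periods only.

```python
import numpy as np

# Hypothetical outcomes over T = 7 periods; the first T0 = 4 are pre-intervention.
y_treated = np.array([60.0, 62.0, 61.0, 63.0, 75.0, 77.0, 79.0])
y_synthetic = np.array([60.5, 61.5, 61.0, 63.5, 64.0, 65.0, 66.0])
T0 = 4

# Period-by-period gap between the treated unit and its synthetic control.
gap = y_treated - y_synthetic

# Treatment effects are the gaps after the intervention.
post_gap = gap[T0:]

# Average effect: sum of post-period gaps divided by the number of post periods.
average_effect = post_gap.mean()
print(post_gap, average_effect)
```

Here the post-intervention gaps are 11, 12, and 13, so the average effect is 12.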
What to Look For
Pre-treatment fit: Synthetic closely tracks treated unit
Post-treatment divergence: Clear separation after intervention
Persistent effect: Difference persists over time
Magnitude: Economically meaningful impact
Red Flags
Poor pre-fit: Synthetic doesn't match before intervention
Sudden jumps: Effects appear before intervention date
Unstable weights: Results hinge on one or two donor units and change when they are dropped
Limited controls: Insufficient donor pool
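Pre-treatment fit can be summarized with a single number. A common choice is the root mean squared prediction error (RMSPE) over the pre-intervention periods. The sketch below uses invented series and a hypothetical helper name; there is no universal threshold for a "good" fit, so the value is best judged relative to the typical level of the outcome.

```python
import numpy as np

def pre_fit_rmspe(y_treated_pre, y_synthetic_pre):
    """Root mean squared prediction error over the pre-intervention periods."""
    gap = np.asarray(y_treated_pre, dtype=float) - np.asarray(y_synthetic_pre, dtype=float)
    return float(np.sqrt(np.mean(gap ** 2)))

# Hypothetical pre-intervention series for the treated unit and its synthetic control.
y_treated_pre = [60.0, 62.0, 61.0, 63.0]
y_synthetic_pre = [60.5, 61.5, 61.0, 63.5]

rmspe = pre_fit_rmspe(y_treated_pre, y_synthetic_pre)

# Express the error as a share of the average outcome level to aid interpretation.
relative_rmspe = rmspe / np.mean(y_treated_pre)
print(f"pre-period RMSPE: {rmspe:.3f} ({relative_rmspe:.1%} of the mean outcome)")
```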
Validation: Placebo Tests
Ensuring Results Are Not Due to Chance
Placebo Test Logic: Apply synthetic control to units that did NOT receive treatment. If we find similar "effects," our results may be spurious.
1. In-Time Placebo: Test the intervention at different time points before the actual treatment
2. In-Space Placebo: Apply the same analysis to control units that never received treatment
3. Statistical Inference: Compare the actual effect size to the distribution of placebo effects
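An in-space placebo test can be sketched as follows. Every unit takes a turn as the pseudo-treated unit, and units are compared on the ratio of post-period to pre-period RMSPE, which is one common test statistic. The data are simulated with a known effect, the helper names are hypothetical, and SciPy's SLSQP solver stands in for whatever optimizer a dedicated package would use.

```python
import numpy as np
from scipy.optimize import minimize

def fit_weights(y_pre, donors_pre):
    """Non-negative weights summing to one that minimise pre-period squared error."""
    n = donors_pre.shape[1]
    result = minimize(
        lambda w: np.sum((y_pre - donors_pre @ w) ** 2),
        x0=np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x

def rmspe_ratio(y, donors, T0):
    """Post-period RMSPE divided by pre-period RMSPE for one (pseudo-)treated unit."""
    w = fit_weights(y[:T0], donors[:T0])
    gap = y - donors @ w
    pre_rmspe = np.sqrt(np.mean(gap[:T0] ** 2))
    post_rmspe = np.sqrt(np.mean(gap[T0:] ** 2))
    return post_rmspe / pre_rmspe

# Simulated panel (illustrative assumption): column 0 is the treated unit.
rng = np.random.default_rng(42)
T, T0, n_controls = 30, 20, 8
trend = np.linspace(0.0, 10.0, T)
loadings = np.concatenate(([1.0], np.linspace(0.5, 1.5, n_controls)))
Y = 100.0 + np.outer(trend, loadings) + rng.normal(0.0, 1.0, size=(T, n_controls + 1))
Y[T0:, 0] += 15.0   # true effect on the treated unit after period T0

controls = Y[:, 1:]

# Statistic for the actually treated unit.
ratios = [rmspe_ratio(Y[:, 0], controls, T0)]

# Placebo runs: pretend each control was treated, using the other controls as donors.
# The truly treated unit is kept out of the placebo donor pools.
for j in range(n_controls):
    other_controls = np.delete(controls, j, axis=1)
    ratios.append(rmspe_ratio(controls[:, j], other_controls, T0))

ratios = np.array(ratios)

# Permutation-style p-value: share of units with a ratio at least as large as the treated unit's.
p_value = np.mean(ratios >= ratios[0])
print(np.round(ratios, 2), p_value)
```

With J control units the smallest attainable p-value is 1/(J+1), so a small donor pool limits how strong the statistical evidence can be.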
Business Example: Placebo Test Results
Finding: When we apply the same analysis to Sydney, Brisbane, Perth, and Adelaide (control cities), none show revenue increases of similar magnitude.
Conclusion: Melbourne's revenue increase is likely due to dynamic pricing, not random market fluctuations.
Advantages and Limitations
✓ Advantages
Transparent: Clear methodology and assumptions
No parametric model: Avoids functional form assumptions
Optimal matching: Data-driven control selection
Visual interpretation: Easy to understand results
Robust inference: Built-in placebo tests
⚠ Limitations
Single treated unit: Designed for one treated unit; multiple treated units require extensions of the method
Long pre-period needed: Requires substantial historical data
Convex hull restriction: Synthetic unit cannot extrapolate beyond the range spanned by the control units
Time-varying confounders: Assumes no other shock affects only the treated unit around the time of the intervention
Spillover effects: Controls must be unaffected by treatment
When to Use Synthetic Control Method
Ideal Scenarios
Single unit receives treatment
Multiple potential control units available
Long pre-intervention time series
Clear intervention timing
No spillover between units
Stable relationships pre-treatment
Business Applications
Store-specific interventions
Regional policy changes
Market entry strategies
Product launch evaluations
Regulatory impact assessment
Marketing campaign effectiveness
Implementation in Python
Libraries: Use SparseSC or SyntheticControlMethods packages