Synthetic Control Method

Advanced Causal Inference for Policy Evaluation

Learning Objectives

Understand the conceptual foundation of the Synthetic Control Method
Learn when and how to apply synthetic control in business contexts
Interpret synthetic control results for policy evaluation
Recognize advantages and limitations of the methodology

DATA5000 - Building on Causal Machine Learning Foundations

What is the Synthetic Control Method?

The Synthetic Control Method (SCM) is a statistical technique for estimating the causal effect of an intervention or policy when one unit is treated and several untreated units are available as potential controls.

Core Concept

Instead of using a single control unit, we create a "synthetic control" - a weighted combination of multiple control units that best resembles the treated unit before the intervention.

Traditional Approach

Find one similar control unit
May not match perfectly
Limited comparability
Potential bias from mismatch

Synthetic Control

Combine multiple control units
Optimal weighted average
Better pre-treatment fit
More credible counterfactual

Why Do We Need Synthetic Control?

The Challenge of Single-Unit Interventions

Business Scenario

Problem: A retail company implements a new pricing strategy in its Melbourne store. How do we measure the impact on sales?

Challenge: We cannot find a single store that perfectly matches Melbourne's characteristics.

Common Business Cases

Store-specific policy changes
Regional marketing campaigns
City-level regulations
Product launches in specific markets

Research Applications

Economic policy evaluation
Healthcare interventions
Educational program assessment
Environmental policy impact

Solution: Create a synthetic Melbourne as a weighted combination of the Sydney, Brisbane, Perth, and Adelaide stores.

How Synthetic Control Works: Visual Concept

Treated Unit: Melbourne Store (received intervention)

→

Control Units: Sydney (w₁ = 0.4), Brisbane (w₂ = 0.3), Perth (w₃ = 0.2), Adelaide (w₄ = 0.1)

→

Synthetic Control: Weighted combination (best counterfactual)

Synthetic Control Formula:
Synthetic Melbourne = 0.4×Sydney + 0.3×Brisbane + 0.2×Perth + 0.1×Adelaide

Where weights (w₁, w₂, w₃, w₄) are chosen to minimize pre-intervention differences
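The formula above is a weighted average taken month by month. The sketch below applies the slide's illustrative weights to made-up revenue figures; the revenue numbers and variable names are assumptions for illustration only, not data from the case study.

```python
# Illustrative weights from the slide (taken as given, not estimated here).
weights = {"Sydney": 0.4, "Brisbane": 0.3, "Perth": 0.2, "Adelaide": 0.1}

# Hypothetical monthly revenue ($M) for three pre-intervention months.
revenue = {
    "Sydney": [80.0, 82.0, 81.0],
    "Brisbane": [50.0, 51.0, 52.0],
    "Perth": [45.0, 44.0, 46.0],
    "Adelaide": [30.0, 31.0, 30.0],
}

# Weights must be non-negative and sum to one.
assert all(w >= 0 for w in weights.values())
assert abs(sum(weights.values()) - 1.0) < 1e-9

n_months = len(revenue["Sydney"])

# Synthetic Melbourne in month t is the weighted sum of the donors in month t.
synthetic_melbourne = [
    sum(weights[city] * revenue[city][t] for city in weights)
    for t in range(n_months)
]

print([round(value, 2) for value in synthetic_melbourne])
```

Because the weights sum to one, synthetic Melbourne always lies between the smallest and largest donor value in each month.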

Mathematical Foundation

Objective: Find Optimal Weights

Optimization Problem:

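Minimize: $\sum_{t=1}^{T_0} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)^2$

Subject to: $\sum_{j=2}^{J+1} w_j = 1$ and $w_j \ge 0$ for $j = 2, \dots, J+1$

Where:
• $Y_{1t}$ = outcome for the treated unit (indexed 1) at time t
• $Y_{jt}$ = outcome for control unit j at time t
• $T_0$ = number of pre-intervention periods (the intervention takes effect after period $T_0$)
• $J$ = number of control units in the donor pool
• $w_j$ = weight for control unit j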

Intuition: We find weights that make the synthetic control track the treated unit as closely as possible before the intervention occurs.
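One way to solve this constrained least-squares problem is with a general-purpose optimizer. The sketch below uses `scipy.optimize.minimize` with the SLSQP method. The function name `fit_weights` and the simulated data are assumptions for illustration, and dedicated packages may solve the problem differently (for example, by also matching on covariates).

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(y_treated_pre, Y_donors_pre):
    """Least-squares donor weights with w >= 0 and sum(w) = 1.

    y_treated_pre: shape (T0,)   pre-intervention outcomes of the treated unit
    Y_donors_pre:  shape (T0, J) pre-intervention outcomes of the J donors
    """
    n_donors = Y_donors_pre.shape[1]

    def loss(w):
        # Sum of squared pre-intervention gaps.
        return float(np.sum((y_treated_pre - Y_donors_pre @ w) ** 2))

    def grad(w):
        # Analytic gradient of the loss with respect to w.
        return -2.0 * Y_donors_pre.T @ (y_treated_pre - Y_donors_pre @ w)

    result = minimize(
        loss,
        x0=np.full(n_donors, 1.0 / n_donors),  # start from equal weights
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-10, "maxiter": 500},
    )
    return result.x


# Simulated example (hypothetical data): 24 pre-intervention months, 4 donors.
rng = np.random.default_rng(0)
T0 = 24
Y_donors_pre = rng.normal(loc=[80.0, 50.0, 45.0, 30.0], scale=2.0, size=(T0, 4))

# The treated series is built as an exact weighted average of the donors,
# so the optimizer should recover these weights.
true_w = np.array([0.4, 0.3, 0.2, 0.1])
y_treated_pre = Y_donors_pre @ true_w

w_hat = fit_weights(y_treated_pre, Y_donors_pre)
print(np.round(w_hat, 3))
```

With real data the treated series is never an exact combination of the donors, so the fitted weights only approximate it and the remaining pre-intervention gap should be inspected before trusting the result.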

Synthetic Control: Step-by-Step Process

1. Data Preparation: Collect pre- and post-intervention data for treated unit and potential controls

2. Weight Optimization: Find optimal weights that minimize pre-intervention differences

3. Synthetic Construction: Create synthetic control using optimized weights

4. Pre-treatment Validation: Verify synthetic control fits treated unit well before intervention

5. Treatment Effect: Calculate difference between treated and synthetic units post-intervention

6. Inference & Testing: Conduct placebo tests and assess statistical significance

Business Example: E-commerce Pricing Strategy

Scenario: Dynamic Pricing Implementation

Company: RetailTech Australia

Intervention: Implemented AI-driven dynamic pricing in Melbourne market (January 2024)

Question: Did dynamic pricing increase monthly revenue?

Data: Monthly revenue data from January 2022 to June 2024

Available Control Markets

Sydney
Size: Large
Demographics: Similar

Brisbane
Size: Medium
Demographics: Similar

Perth
Size: Medium
Demographics: Different

Adelaide
Size: Small
Demographics: Similar

Synthetic Control Results: Revenue Impact

[Figure: Monthly Revenue: Melbourne vs Synthetic Control, with the dynamic pricing implementation date marked]
Key Finding: After implementation, Melbourne's monthly revenue ran approximately $15M (about 25%) above synthetic Melbourne, which is the estimated effect of dynamic pricing.

Interpreting Synthetic Control Results

Treatment Effect Calculation

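Treatment Effect at time t (post-intervention):
$\hat{\tau}_t = Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt}$

Average Treatment Effect (Post-intervention):
$\text{ATE} = \frac{1}{T - T_0} \sum_{t=T_0+1}^{T} \left( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \right)$

The weighted sum is the synthetic control's outcome at time t. T is the total number of periods and $T_0$ the number of pre-intervention periods, so the average runs over the $T - T_0$ post-intervention periods.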
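The calculation above comes down to subtracting the synthetic series from the observed series and averaging over the post-intervention months. In this minimal sketch all figures are made up, chosen only so the arithmetic is easy to follow; they are not the case study's data.

```python
# Hypothetical post-intervention monthly revenue ($M), six months.
melbourne = [74.0, 75.5, 76.0, 75.0, 76.5, 77.0]
synthetic = [60.0, 60.5, 61.0, 60.0, 61.0, 61.5]

# Period-by-period effect: observed outcome minus synthetic control.
gaps = [y - s for y, s in zip(melbourne, synthetic)]

# Average effect over the post-intervention periods only.
average_effect = sum(gaps) / len(gaps)

# Effect relative to the average counterfactual level, in percent.
counterfactual_mean = sum(synthetic) / len(synthetic)
percent_effect = 100.0 * average_effect / counterfactual_mean

print(gaps)
print(average_effect)  # 15.0
print(round(percent_effect, 1))
```

The divisor is the number of post-intervention months, not the length of the whole series. Dividing by the full series length would shrink the estimate.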

What to Look For

Pre-treatment fit: Synthetic closely tracks treated unit
Post-treatment divergence: Clear separation after intervention
Persistent effect: Difference is maintained over time
Magnitude: Economically meaningful impact

Red Flags

Poor pre-fit: Synthetic doesn't match before intervention
Sudden jumps: Effects appear before intervention date
Unstable weights: Weights change substantially when a donor or a few pre-period months are dropped
Limited controls: Insufficient donor pool
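A common way to quantify pre-treatment fit is the root mean squared prediction error (RMSPE) over the pre-intervention period. The sketch below uses made-up numbers. The helper name `rmspe` is an assumption, and the slides do not prescribe a threshold for what counts as a good fit.

```python
import math


def rmspe(actual, synthetic):
    """Root mean squared gap between the treated and synthetic series."""
    squared_gaps = [(a - s) ** 2 for a, s in zip(actual, synthetic)]
    return math.sqrt(sum(squared_gaps) / len(squared_gaps))


# Hypothetical pre-intervention monthly revenue ($M).
melbourne_pre = [59.0, 60.5, 60.0, 61.0]
synthetic_pre = [59.5, 60.0, 60.5, 60.5]

pre_rmspe = rmspe(melbourne_pre, synthetic_pre)

# Express the error relative to the typical revenue level for easier reading.
relative_error = pre_rmspe / (sum(melbourne_pre) / len(melbourne_pre))

print(pre_rmspe)  # 0.5
print(round(100 * relative_error, 2), "% of average pre-period revenue")
```

A pre-period RMSPE that is large relative to the post-intervention gap suggests the estimated effect may reflect poor fit rather than the intervention.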

Validation: Placebo Tests

Ensuring Results Are Not Due to Chance

Placebo Test Logic: Apply synthetic control to units that did NOT receive treatment. If we find similar "effects," our results may be spurious.

1. In-Time Placebo: Test intervention at different time points before actual treatment

2. In-Space Placebo: Apply same analysis to control units that never received treatment

3. Statistical Inference: Compare actual effect size to distribution of placebo effects
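The in-space placebo procedure can be sketched as follows: refit the synthetic control with each unit in turn treated as if it had received the intervention, then compare post/pre RMSPE ratios. Everything here is an illustrative assumption: the simulated panel, the helper names, the 15-unit effect injected into Melbourne, and the choice to exclude the truly treated unit from placebo donor pools.

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(y_pre, Y_pre):
    """Least-squares donor weights with w >= 0 and sum(w) = 1."""
    n = Y_pre.shape[1]
    result = minimize(
        lambda w: float(np.sum((y_pre - Y_pre @ w) ** 2)),
        x0=np.full(n, 1.0 / n),
        jac=lambda w: -2.0 * Y_pre.T @ (y_pre - Y_pre @ w),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def rmspe_ratio(y, Y, T0):
    """Post-period RMSPE divided by pre-period RMSPE for one unit."""
    w = fit_weights(y[:T0], Y[:T0])
    gap = y - Y @ w
    pre = np.sqrt(np.mean(gap[:T0] ** 2))
    post = np.sqrt(np.mean(gap[T0:] ** 2))
    return float(post / pre)


# Simulated panel (hypothetical): 30 months, intervention after month 24.
rng = np.random.default_rng(42)
T, T0 = 30, 24
names = ["Melbourne", "Sydney", "Brisbane", "Perth", "Adelaide"]
levels = np.array([60.0, 80.0, 50.0, 45.0, 30.0])
common_trend = np.cumsum(rng.normal(0.0, 1.0, T))
panel = levels + common_trend[:, None] + rng.normal(0.0, 0.5, size=(T, 5))
panel[T0:, 0] += 15.0  # injected effect on Melbourne (column 0)

ratios = {}
for i, name in enumerate(names):
    # Donors: every unit except the placebo unit and the truly treated unit.
    donor_idx = [k for k in range(len(names)) if k != i and k != 0]
    ratios[name] = rmspe_ratio(panel[:, i], panel[:, donor_idx], T0)

# Share of units whose ratio is at least as extreme as Melbourne's.
p_value = float(np.mean([r >= ratios["Melbourne"] for r in ratios.values()]))
print({k: round(v, 2) for k, v in ratios.items()}, p_value)
```

With one treated unit and four controls, the smallest p-value this procedure can produce is 1/5 = 0.2, so a small donor pool limits how strong the statistical evidence can be, however large the estimated effect.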

Business Example: Placebo Test Results

Finding: When we apply the same analysis to Sydney, Brisbane, Perth, and Adelaide (control cities), none show revenue increases of similar magnitude.

Conclusion: Melbourne's revenue increase is likely due to dynamic pricing, not random market fluctuations.

Advantages and Limitations

✓ Advantages

Transparent: Clear methodology and assumptions
No parametric model: Avoids functional form assumptions
Optimal matching: Data-driven control selection
Visual interpretation: Easy to understand results
Placebo-based inference: Placebo tests provide a reference distribution for judging the estimated effect

⚠ Limitations

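Single treated unit: Designed for one treated unit; several treated units require extensions of the method
Long pre-period needed: Requires substantial historical data
Convex hull restriction: With non-negative weights that sum to one, the synthetic unit cannot take values outside the range of the controls
Time-varying confounders: Assumes no other shock hits only the treated unit around the intervention date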
Spillover effects: Controls must be unaffected by treatment
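The convex hull restriction can be screened with a simple range check: in every pre-intervention period, the treated unit's outcome should fall between the smallest and largest donor outcome. The sketch below uses made-up data and a hypothetical helper name. Passing this check is necessary but not sufficient for a good fit.

```python
def periods_outside_donor_range(treated, donors):
    """Return the period indices where the treated value is outside
    the [min, max] range of the donor values for that period."""
    flagged = []
    for t, y in enumerate(treated):
        donor_values = [series[t] for series in donors.values()]
        if not (min(donor_values) <= y <= max(donor_values)):
            flagged.append(t)
    return flagged


# Hypothetical monthly revenue ($M); the third month is deliberately extreme.
treated = [60.0, 61.0, 95.0]
donors = {
    "Sydney": [80.0, 82.0, 81.0],
    "Adelaide": [30.0, 31.0, 30.0],
}

flagged = periods_outside_donor_range(treated, donors)
print(flagged)  # [2]
```

Periods flagged this way cannot be matched by any set of non-negative weights that sum to one, so they point to a donor pool that is too narrow for the treated unit.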

When to Use Synthetic Control Method

Ideal Scenarios

Single unit receives treatment
Multiple potential control units available
Long pre-intervention time series
Clear intervention timing
No spillover between units
Stable relationships pre-treatment

Business Applications

Store-specific interventions
Regional policy changes
Market entry strategies
Product launch evaluations
Regulatory impact assessment
Marketing campaign effectiveness

Implementation in Python

Libraries: Use SparseSC or SyntheticControlMethods packages

Key Steps: Data preparation → Weight optimization → Effect estimation → Placebo testing

Next Week: Hands-on implementation with real business data

Remember: Synthetic Control is most powerful when combined with domain expertise and theoretical understanding of the intervention mechanism.