Teaching Counterfactual Analysis & Synthetic Control Method

A practical guide for making complex causal inference concepts accessible to students

1. Start with Relatable Analogies

Begin with concrete, everyday examples to build intuition before introducing formal concepts.

The Road Trip Analogy

"Imagine you're planning a road trip and have two route options: Highway A or Highway B. You choose Highway A and arrive in 3 hours. But how do you know if that was the best choice? You can't go back in time and try Highway B."

This illustrates the fundamental problem of causal inference - we never observe both outcomes for the same unit at the same time.
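
For instructors who want to put the road trip in symbols, the following is a minimal sketch in standard potential-outcomes notation. The symbols are assumptions introduced here for illustration and do not appear elsewhere in this guide: $Y_i(1)$ is the outcome for unit $i$ with treatment (travel time on Highway A), $Y_i(0)$ is the outcome without it (travel time on Highway B), and $D_i \in \{0, 1\}$ records which one was actually chosen.

```latex
\[
\tau_i = Y_i(1) - Y_i(0),
\qquad
Y_i^{\text{obs}} = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0)
\]
```

Only $Y_i^{\text{obs}}$ is ever recorded, so one of the two terms in $\tau_i$ is always missing and has to be estimated.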

The Recipe Analogy for SCM

"Imagine you love a restaurant's secret sauce but can't access their recipe. To recreate it, you mix different proportions of ketchup, mayo, spices, etc. until you match the taste."

That's what the Synthetic Control Method does - we "recreate" California as a weighted blend of other states (say, 30% Nevada + 45% Washington + 25% Oregon) to see what would have happened without the policy intervention.

Classroom Demonstration: Secret Recipe Game

Set up a demonstration with colored water in clear cups:

  1. Show students a "target" purple liquid (represents California)
  2. Give them 3 other colored liquids: red (Nevada), blue (Washington), yellow (Oregon)
  3. Challenge them to mix these in different proportions to match the target purple
  4. Explain: "This is how SCM works - finding the weights that make the blend match the target as closely as possible"

2. Visual Storytelling of Counterfactuals

A counterfactual is what would have happened to a treated unit if it had not received the treatment. It's the alternative reality we never observe but need to estimate.

[Figure: line chart marking the Treatment point, the Actual and Counterfactual paths, and the Effect between them]

Key Teaching Point:

The treatment effect is the gap between what actually happened and what would have happened without the intervention. This difference represents the causal impact of the treatment.

Alternative Visual: Bar Chart Comparison

[Figure: bar chart. Actual House Price: $300K. Counterfactual (No Fireplace): $285K. Difference: +$15K.]

3. Interactive Exploration of Synthetic Control Method

The Synthetic Control Method (SCM) creates a weighted combination of control units that closely resembles the treated unit before intervention, then uses these weights to estimate what would have happened without treatment.
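
The definition above can be written compactly as a small optimization problem. This is a simplified sketch that matches on pre-treatment outcomes only (the full method can also match on covariates), and the notation is an assumption introduced here: unit $1$ is the treated unit, units $2, \dots, J+1$ are the controls, $Y_{jt}$ is the outcome of unit $j$ in period $t$, and $T_0$ is the first treated period.

```latex
\[
\hat{w} = \arg\min_{w} \sum_{t < T_0}
  \Bigl( Y_{1t} - \sum_{j=2}^{J+1} w_j \, Y_{jt} \Bigr)^{2}
\quad \text{subject to} \quad
w_j \ge 0, \;\; \sum_{j=2}^{J+1} w_j = 1
\]
\[
\hat{\tau}_t = Y_{1t} - \sum_{j=2}^{J+1} \hat{w}_j \, Y_{jt},
\qquad t \ge T_0
\]
```

The first line picks the weights that best reproduce the treated unit before the intervention; the second line applies those same weights afterwards and reads off the gap as the estimated effect.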

Build Your Own Synthetic California

Adjust the weights of control states to create a synthetic California that matches the real California's tobacco consumption before the policy:

[Interactive chart: weight sliders for Nevada (30%), Oregon (25%), and Washington (45%). The plot shows Real California and Synthetic California from 1980 to 2000 on a vertical axis running from 0 to 250, with a marker at Policy Implemented (1988).]

Estimated Treatment Effect: 80 fewer cigarette packs per capita

4. Classroom Teaching Plan

Day 1: Introduction to Counterfactual Thinking

  1. Start class with a thought experiment: "What if you had chosen a different major?" Ask students to discuss how they would determine whether they made the right choice.
  2. Introduce the "fundamental problem of causal inference" using the road trip analogy.
  3. Discuss good and bad counterfactuals using examples (naive before/after, simple comparisons).
  4. Activity: Give students news headlines and ask them to identify the implicit counterfactual claims.
  5. Homework: Find a "What if?" question in their own lives and brainstorm ways to estimate the answer.

Sample Headlines for Activity:

"New Tax Policy Boosts Economy by 3%" (Implicit counterfactual: Without the tax policy, the economy would have grown less)
"Study Shows Education Program Increases Test Scores" (Implicit counterfactual: Without the program, scores would be lower)
"Adding a Fireplace Increases Home Value by $15,000" (Implicit counterfactual: Without a fireplace, the home would be worth $15,000 less)

Day 2: Building to Synthetic Control

  1. Review the California tobacco control program case study.
  2. Ask: "How could we estimate what would have happened without the program?"
  3. Walk through problems with simple before/after or single comparison state approaches.
  4. Introduce SCM as "building a better comparison" using the recipe analogy.
  5. Demonstrate the interactive tool to show how SCM works.
  6. Class exercise: In small groups, use the provided dataset to create synthetic controls manually.

Case Study: California Tobacco Control Program (Data-Focused)

In 1988, California implemented a tobacco control program with increased taxes and anti-smoking initiatives. Looking at the data:

Year   California   Nevada   Oregon   Washington
1985   120          140      125      115
1990   100          135      120      110
1995   80           130      115      105
2000   60           125      110      100

The challenge: Which state is the best comparison? Or should we use a combination?
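
To make the "combination" option concrete, here is a minimal Python sketch that blends the three comparison states from the table and compares the blend with California. It rests on one loud assumption: it reuses the illustrative 30% / 25% / 45% weights from the recipe analogy, which were not fitted to this table. The variable names are hypothetical.

```python
# Values copied from the case-study table (the table does not state units).
years = [1985, 1990, 1995, 2000]
california = [120, 100, 80, 60]
donors = {
    "Nevada": [140, 135, 130, 125],
    "Oregon": [125, 120, 115, 110],
    "Washington": [115, 110, 105, 100],
}

# Assumption: illustrative weights borrowed from the recipe analogy,
# NOT estimated from this table.
weights = {"Nevada": 0.30, "Oregon": 0.25, "Washington": 0.45}

# Synthetic California = weighted average of the donor states, year by year.
synthetic = [
    sum(weights[state] * donors[state][i] for state in donors)
    for i in range(len(years))
]

# Gap = actual minus synthetic (negative means California is lower).
gaps = [round(actual - synth, 2) for actual, synth in zip(california, synthetic)]

for year, actual, synth, gap in zip(years, california, synthetic, gaps):
    print(f"{year}: actual={actual}, synthetic={synth:.2f}, gap={gap:+.2f}")
```

With these weights the blend sits at 125 in 1985 against California's 120, so the pre-1988 match is already off by 5, and the gap widens to -50 by 2000. That differs from the 80-pack figure quoted with the interactive chart. The table contains only one pre-1988 year, so it cannot pin down weights on its own, which is a useful discussion point for the class.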

Day 3: Hands-on SCM Application

  1. Provide this GDP dataset for 5 states (State A implemented a policy in 2020):

     Year   State A (Treated)   State B   State C   State D   State E
     2018   100                 90        110       105       95
     2019   105                 92        114       108       98
     2020   115                 94        117       110       100
     2021   130                 97        120       112       103

  2. Walk students through creating a synthetic control in a spreadsheet.

     Step-by-Step SCM Activity:

    1. Try different weights for States B, C, D, and E
    2. Calculate weighted average for pre-treatment years (2018-2019)
    3. Compare with State A's pre-treatment values
    4. Find weights that minimize pre-treatment differences
    5. Use these weights to calculate synthetic values for 2020-2021
    6. Calculate treatment effect as actual minus synthetic
  3. Discuss and compare results across student groups.
  4. Explain data-driven assumption checks and limitations.
  5. Final discussion: When would SCM be appropriate vs. other methods?
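
The step-by-step activity above can also be sketched in a few lines of Python, which is handy for checking student spreadsheets. This is a minimal sketch rather than a full SCM implementation: it assumes weights restricted to a 5-percentage-point grid (the `STEP` constant, a hypothetical choice) and fits on pre-treatment outcomes only, using a brute-force search in place of a proper optimizer.

```python
from itertools import product

years = [2018, 2019, 2020, 2021]
treated = [100, 105, 115, 130]  # State A
donors = {
    "State B": [90, 92, 94, 97],
    "State C": [110, 114, 117, 120],
    "State D": [105, 108, 110, 112],
    "State E": [95, 98, 100, 103],
}
PRE = [0, 1]   # positions of the pre-treatment years (2018, 2019)
POST = [2, 3]  # positions of the post-treatment years (2020, 2021)
STEP = 5       # assumption: weights move in 5-percentage-point steps


def synthetic(weights, i):
    """Weighted average of the donor states in year position i."""
    return sum(w * donors[name][i] for name, w in weights.items())


def pre_sse(weights):
    """Sum of squared pre-treatment differences (steps 2 to 4)."""
    return sum((treated[i] - synthetic(weights, i)) ** 2 for i in PRE)


names = list(donors)
best = None
# Step 1: try every weight combination on the grid that sums to 100%.
for b, c, d in product(range(0, 101, STEP), repeat=3):
    e = 100 - b - c - d
    if e < 0:
        continue
    w = dict(zip(names, (b / 100, c / 100, d / 100, e / 100)))
    score = pre_sse(w)
    if best is None or score < best[0]:
        best = (score, w)

best_sse, best_w = best

# Steps 5 and 6: apply the weights after 2020; effect = actual - synthetic.
effects = {years[i]: treated[i] - synthetic(best_w, i) for i in POST}

print("weights:", best_w)
print("pre-treatment SSE:", round(best_sse, 2))
print("effects:", {year: round(value, 1) for year, value in effects.items()})
```

No set of weights matches State A exactly here: State A grows by 5 between 2018 and 2019, while no donor grows by more than 4, so any weighted average falls short in at least one year. That imperfect fit is a natural lead-in to the discussion of limitations.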

5. Simplifying Technical Language

Technical Term | Student-Friendly Language | Visual/Data Example
Counterfactual | "What would have happened otherwise" | The dashed line showing predicted cigarette sales without the policy
Synthetic Control Weights | "Recipe proportions" or "Mixing ingredients" | Nevada: 30%, Oregon: 25%, Washington: 45%
Pre-treatment fit | "How well our copy matches before the change" | How closely synthetic California tracks real California before 1988
Treatment effect | "The difference our change made" | 80 fewer cigarette packs per capita by 2000
Donor pool | "Available comparison ingredients" | The states we can use to build our synthetic version (Nevada, Oregon, etc.)
Covariates | "Important matching characteristics" | Demographics, economy, smoking regulations before treatment

6. Assessment Ideas

Visual Interpretation Quiz

Show students this SCM graph and ask:

[Figure: SCM graph, not reproduced here]

Questions:

  1. When did the intervention occur?
  2. What is the estimated treatment effect?
  3. Was the intervention successful?

Data-Driven SCM Assignment

Provide this mini-dataset and have students:

Year    Treated   Control 1   Control 2   Control 3
2018    10        12          8           9
2019    11        13          9           10
2020*   15        14          10          11

*Treatment implemented in 2020

  1. Find the optimal weights for Control units
  2. Calculate the synthetic counterfactual
  3. Determine the treatment effect
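
For grading, the following minimal Python sketch checks candidate answers against the mini-dataset. The two weight vectors are example answers chosen here for illustration, not the only valid ones, and exact fractions are used so that a perfect pre-treatment fit shows up as exactly zero.

```python
from fractions import Fraction as F

treated = {2018: 10, 2019: 11, 2020: 15}
controls = {
    "Control 1": {2018: 12, 2019: 13, 2020: 14},
    "Control 2": {2018: 8, 2019: 9, 2020: 10},
    "Control 3": {2018: 9, 2019: 10, 2020: 11},
}


def synthetic(weights, year):
    """Weighted average of the control units in a given year."""
    return sum(w * controls[name][year] for name, w in weights.items())


# Assumption: two example answers a student might submit.
candidates = [
    {"Control 1": F(1, 2), "Control 2": F(1, 2), "Control 3": F(0)},
    {"Control 1": F(1, 3), "Control 2": F(0), "Control 3": F(2, 3)},
]

results = []
for w in candidates:
    assert sum(w.values()) == 1  # weights must sum to one
    pre_gaps = [treated[y] - synthetic(w, y) for y in (2018, 2019)]
    effect = treated[2020] - synthetic(w, 2020)
    results.append((pre_gaps, effect))
    print(w, "pre-treatment gaps:", pre_gaps, "effect in 2020:", effect)
```

Both candidates reproduce the treated unit exactly in 2018 and 2019 and both give a synthetic value of 12 in 2020, so the estimated effect is 3 either way. The weights are not unique in this dataset, which is worth pointing out when students arrive at different answers.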

Sample Quiz Question: Fireplace Counterfactual

Question: The chart below shows a counterfactual analysis of houses with and without fireplaces:

[Figure: House Price Distribution With and Without Fireplaces. Price axis from $200k to $350k. With Fireplaces, mean: $300k. Without Fireplaces (Counterfactual), mean: $285k.]

Based on this data visualization, what is the estimated causal effect of adding a fireplace to a house?

  1. No effect; fireplaces are purely decorative
  2. An increase of approximately $15,000 in home value
  3. A decrease in home value due to maintenance costs
  4. Cannot be determined from this data

7. Common Student Challenges & Solutions

Common Challenge | Data-Driven Teaching Solution
Confusing counterfactuals with predictions | Show side-by-side visuals: counterfactual (alternative present) vs. forecast (future prediction)
Struggling with weight optimization | Use interactive tools where students can adjust weights and see pre-treatment fit improve
Difficulty identifying good control units | Show pre-treatment trend charts for multiple potential controls and discuss similarities/differences
Overconfidence in results | Use placebo tests where students apply SCM to units that didn't receive treatment
Poor understanding of when to use SCM | Present datasets with varying characteristics and ask which would be suited for SCM vs. other methods
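
The placebo-test idea in the table can be demonstrated with the Day 3 GDP dataset. The sketch below is a simplified illustration under stated assumptions: it uses a brute-force search over weights on a 5-percentage-point grid, compares only the 2021 gap, and leaves State A out of the donor pool when a control state plays the "treated" role. The function names are hypothetical.

```python
gdp = {
    "State A": [100, 105, 115, 130],  # actually treated in 2020
    "State B": [90, 92, 94, 97],
    "State C": [110, 114, 117, 120],
    "State D": [105, 108, 110, 112],
    "State E": [95, 98, 100, 103],
}
PRE = [0, 1]  # positions of 2018 and 2019
LAST = 3      # position of 2021
STEP = 5      # assumption: weights move in 5-percentage-point steps


def compositions(n, total=100, step=STEP):
    """Yield every n-tuple of non-negative multiples of step summing to total."""
    if n == 1:
        yield (total,)
        return
    for first in range(0, total + 1, step):
        for rest in compositions(n - 1, total - first, step):
            yield (first,) + rest


def fit_gap(target, donor_names):
    """Fit weights on the pre-treatment years; return the 2021 gap
    (actual minus synthetic) for the target unit."""
    def synth(w, i):
        return sum(p / 100 * gdp[d][i] for p, d in zip(w, donor_names))

    best = min(
        compositions(len(donor_names)),
        key=lambda w: sum((gdp[target][i] - synth(w, i)) ** 2 for i in PRE),
    )
    return gdp[target][LAST] - synth(best, LAST)


controls = ["State B", "State C", "State D", "State E"]

# Real estimate: State A against all four controls.
gaps = {"State A": fit_gap("State A", controls)}

# Placebo runs: pretend each control was treated, using the other controls.
for placebo in controls:
    others = [c for c in controls if c != placebo]
    gaps[placebo] = fit_gap(placebo, others)

for state, gap in gaps.items():
    print(f"{state}: 2021 gap = {gap:+.2f}")
```

If the gap for State A stands well apart from the placebo gaps, the estimated effect is harder to dismiss as noise; if the placebo gaps are just as large, students have a concrete reason to be cautious. In this small dataset States B and C sit at the edges of the donor range and cannot be matched well, which is itself a useful illustration of when a placebo comparison is uninformative.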