Understanding cause and effect in artificial intelligence
| Type | Question | Example |
|---|---|---|
| Descriptive | What happened? | "Our sales dropped 20% last month" |
| Diagnostic | Why did it happen? | "Sales dropped because our competitor launched a sale" |
| Predictive | What will happen? | "Sales will likely drop another 15% next month" |
| Prescriptive | What should we do? | "Offer free shipping and extend hours to recover 80% of lost sales" |
Today we focus on moving from prediction to prescription using causal analysis
Predictive Analytics says:
"This user will likely watch more shows"
Prescriptive Analytics says:
"Show this user Season 2 of Stranger Things NOW because that specific recommendation will cause them to stay subscribed"
Prediction tells you WHAT will happen
Prescription tells you HOW to change it
Business value comes from changing outcomes, not just predicting them
Observation: Ice cream sales and drowning deaths both increase in summer
Wrong Conclusion: "Ice cream causes drowning! Ban ice cream!"
Right Conclusion: Hot weather causes BOTH higher ice cream sales and more swimming (and therefore more drownings).
Just because two things happen together doesn't mean one causes the other. There might be a third factor causing both.
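A minimal simulation of this idea, assuming made-up numbers: temperature drives both series, and neither one affects the other. The variable names and coefficients are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

random.seed(0)

# Hypothetical daily data: temperature is the only driver of both series.
temperature = [random.uniform(10, 35) for _ in range(2000)]
ice_cream = [20 * t + random.gauss(0, 60) for t in temperature]
drownings = [2 + 0.3 * t + random.gauss(0, 1.5) for t in temperature]

overall = pearson(ice_cream, drownings)

# Hold the third factor roughly fixed: keep only days between 24 and 26 degrees.
similar_days = [(i, d) for t, i, d in zip(temperature, ice_cream, drownings)
                if 24 <= t <= 26]
within = pearson([i for i, _ in similar_days], [d for _, d in similar_days])

print(f"Correlation across all days:         {overall:.2f}")
print(f"Correlation on similar-weather days: {within:.2f}")
```

Across all days the two series move together strongly. Once the weather is held roughly constant, the correlation should fall to near zero, because there was never a direct link between them.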
Observation: Customers who buy muffins also buy coffee
Correlation: Muffin buyers tend to buy coffee
Causation Question: Does offering a muffin discount CAUSE more coffee sales?
Better Question: What if we offered a "coffee + muffin" combo? Would that CAUSE higher total revenue?
If you just see correlation, you might discount muffins expecting coffee sales to rise. But if morning customers naturally buy both, the discount just loses you money.
Look at this gym membership data. Notice the pattern:
| Person | Age | Exercise (hrs/week) | Health Score (0-100) |
|---|---|---|---|
| John | 25 | 5.0 | 85 |
| Mary | 28 | 4.5 | 82 |
| Alex | 32 | 4.0 | 80 |
| Lisa | 45 | 3.0 | 72 |
| Tom | 48 | 2.5 | 70 |
| Sarah | 52 | 2.0 | 68 |
| Bob | 61 | 1.5 | 62 |
| Carol | 65 | 1.0 | 58 |
| Dan | 68 | 0.5 | 55 |
🟢 Young people (20s-30s): Exercise MORE (4-5 hrs) → Health Score HIGHER (80-85)
🟡 Middle age (40s-50s): Exercise LESS (2-3 hrs) → Health Score MEDIUM (68-72)
🔴 Older people (60s+): Exercise LEAST (0.5-1.5 hrs) → Health Score LOWER (55-62)
The Question: Does exercise cause better health? Or does AGE affect BOTH exercise habits AND health?
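One way to check the pattern is to compute the pairwise correlations directly from the nine rows in the table. This is a small sketch using only the Python standard library; the `pearson` helper is written out here rather than taken from a package.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# The nine people from the gym membership table.
age = [25, 28, 32, 45, 48, 52, 61, 65, 68]
exercise = [5.0, 4.5, 4.0, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5]
health = [85, 82, 80, 72, 70, 68, 62, 58, 55]

r_exercise_health = pearson(exercise, health)
r_age_exercise = pearson(age, exercise)
r_age_health = pearson(age, health)

print(f"exercise vs health: {r_exercise_health:+.2f}")
print(f"age vs exercise:    {r_age_exercise:+.2f}")
print(f"age vs health:      {r_age_health:+.2f}")
```

Exercise and health are strongly positively correlated, but age is just as strongly (negatively) correlated with both. From this table alone there is no way to tell how much of the health difference is due to exercise and how much is due to age.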
A confounder is a hidden influencer that affects both your action AND your result, making one look like it causes the other when it doesn't, or making a real effect look bigger than it is.
Young people exercise more AND are healthier
Conclusion: "Exercise causes better health!" (Partially true, but age is hiding part of the story)
Compare 30-year-olds who exercise vs. don't exercise
Compare 60-year-olds who exercise vs. don't exercise
Now we see the TRUE effect of exercise at each age level
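The comparison within age groups can be sketched on simulated data. Everything here is an assumption made for illustration: the true effect of exercising is set to 5 health points at every age, and younger people are made more likely to exercise.

```python
import random
from statistics import mean

random.seed(1)
TRUE_EFFECT = 5.0  # assumed health-score gain from exercising, same at every age

people = []
for _ in range(4000):
    age = random.choice([30, 60])
    # Assumption: 30-year-olds exercise far more often than 60-year-olds.
    exercises = int(random.random() < (0.8 if age == 30 else 0.2))
    health = 95 - 0.5 * age + TRUE_EFFECT * exercises + random.gauss(0, 4)
    people.append((age, exercises, health))

def effect(rows):
    """Mean health of exercisers minus mean health of non-exercisers."""
    treated = [h for _, e, h in rows if e]
    control = [h for _, e, h in rows if not e]
    return mean(treated) - mean(control)

naive = effect(people)  # mixes the age groups together
by_age = {a: effect([p for p in people if p[0] == a]) for a in (30, 60)}

print(f"Naive comparison (all ages mixed): {naive:.1f}")
for a, e in by_age.items():
    print(f"Within {a}-year-olds:              {e:.1f}")
```

The naive comparison overstates the benefit because exercisers are mostly young. Comparing like with like inside each age group should land close to the assumed effect of 5.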
| Business Scenario | Apparent Relationship | Hidden Confounder |
|---|---|---|
| Premium members buy more | Premium status → Higher spending | High income causes BOTH premium membership AND more spending |
| Email opens lead to purchases | Opening emails → Buying | Brand loyalty causes BOTH email engagement AND purchases |
| Training increases productivity | Training → Performance | Motivated employees seek training AND perform better |
| Ads drive sales | Ad clicks → Purchases | Purchase intent causes BOTH ad clicking AND buying |
An e-commerce company finds that customers who view product reviews spend 40% more.
Should they force ALL customers to view reviews?
Discuss with your group for 3 minutes
What's really happening:
Conclusion: Simply showing reviews to uninterested customers won't cause the same spending increase. They're not engaged enough to care.
Identify what makes customers engaged, then work on increasing engagement rather than just forcing review views.
Treatment: Any action, policy, or intervention you might take
Effect: The actual impact that action has on your outcome (sales, satisfaction, retention, etc.)
The term comes from medical research (does this treatment cure the disease?), but in business:
You're asking the right question! To estimate treatment effects, you need to observe BOTH:
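Scenario 1 (observational data): The treatment already happened naturally
Scenario 2 (randomized experiment): You randomly assign treatment
Scenario 3 (natural experiment): External event creates treatment/control groups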
EconML works with Scenario 1 (observational data) by statistically controlling for the confounders you have measured, letting you estimate causal effects without running experiments. Confounders you did not measure can still bias the estimate.
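To give a feel for what "controlling for confounders statistically" can mean, here is a rough sketch of the partialling-out idea that double machine learning estimators build on. This is not EconML's API: it uses plain straight-line regressions written by hand, skips cross-fitting, and runs on simulated data in which the true effect of exercise is assumed to be 2 health points per hour.

```python
import random
from statistics import mean

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def residuals(xs, ys):
    """What is left of ys after removing the part explained by xs."""
    b = slope(xs, ys)
    a = mean(ys) - b * mean(xs)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

random.seed(2)
TRUE_EFFECT = 2.0  # assumed health points gained per weekly hour of exercise

# Simulated observational data: age (the confounder) drives exercise and health.
age = [random.uniform(20, 70) for _ in range(5000)]
exercise = [6 - 0.08 * a + random.gauss(0, 1) for a in age]
health = [90 - 0.5 * a + TRUE_EFFECT * e + random.gauss(0, 3)
          for a, e in zip(age, exercise)]

naive = slope(exercise, health)

# Partial out age from both treatment and outcome, then relate the leftovers.
adjusted = slope(residuals(age, exercise), residuals(age, health))

print(f"Naive slope (ignores age):  {naive:.2f}")
print(f"Adjusted slope (age removed): {adjusted:.2f}")
```

The adjusted estimate should sit close to the assumed effect of 2, while the naive one is inflated. The adjustment only works here because the confounder (age) is recorded in the data.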
Question: "Does it work overall?"
Measures: Impact across EVERYONE
Example: "Free shipping increases average order value by $12"
Question: "Does it work differently for different groups?"
Measures: How impact varies by group
Example: "Free shipping increases orders by 30% for students but decreases orders by 5% for seniors"
Question: "Does it work for those actually affected?"
Measures: Impact only on those who changed behavior
Example: "For customers who used the free shipping offer, orders increased by $25"
Question: "Does it work for people like YOU?"
Measures: Impact for specific individual characteristics
Example: "For 25-year-old female customers in Sydney who shop on weekends, free shipping increases orders by $18"
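The difference between an overall effect and group-level effects can be sketched with a simulated randomized experiment. The segments, dollar amounts, and effect sizes are hypothetical and do not come from the examples above.

```python
import random
from statistics import mean

random.seed(3)

# Assumed effect of free shipping on order value (dollars), by segment.
ASSUMED_EFFECT = {"student": 20.0, "senior": -5.0}

customers = []
for _ in range(6000):
    segment = random.choice(["student", "senior"])
    free_shipping = int(random.random() < 0.5)  # randomized assignment
    order_value = (50 + ASSUMED_EFFECT[segment] * free_shipping
                   + random.gauss(0, 10))
    customers.append((segment, free_shipping, order_value))

def diff_in_means(rows):
    """Average outcome with treatment minus average outcome without it."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)

ate = diff_in_means(customers)  # one number for everyone
cate = {s: diff_in_means([c for c in customers if c[0] == s])
        for s in ASSUMED_EFFECT}  # one number per segment

print(f"Overall effect: ${ate:.2f}")
for s, e in cate.items():
    print(f"Effect for {s}s: ${e:.2f}")
```

The overall number averages a gain for one segment and a loss for the other, so acting on it alone would mean giving free shipping to a group it hurts.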
Problem: Should we recommend action movies or comedies to a new subscriber?
Instead of one recommendation for everyone (ATE), personalize based on characteristics (CATE)
Result: Higher engagement, lower churn, more subscription renewals
Neighborhoods with more Black residents had lower home prices.
Correlation-Based Model Would Say:
"Ethnicity predicts lower home prices"
Predictive models can perpetuate bias and historical injustice
Causal models help us understand what we can actually change
"Use ethnicity as a feature because it predicts prices well"
Result: Perpetuates discrimination
"Identify what actually causes price differences: infrastructure, pollution, crime"
Result: Focus on factors we can actually change
We'll use this dataset to practice identifying true causal factors while avoiding bias
Data shows:
Discuss in your group for 5 minutes
CATE (Conditional Average Treatment Effect): Personalize interventions based on customer profile
Example: "For high-value customers with service issues, offering a dedicated support line reduces churn by 25%"
HTE (Heterogeneous Treatment Effect): Different strategies for different customer segments
Example: "Contract incentives work for price-sensitive customers but not for quality-focused customers"
"Customers who call service 3+ times will likely churn"
You can predict, but what do you DO about it?
"Reducing service calls by 50% through proactive support will decrease churn by 15%"
You have an actionable strategy
Job: Explain predictions
Answers: "Which features contributed most to this prediction?"
Type: Correlation-based
Example: "This customer will churn because they have a high bill, many service calls, and month-to-month contract"
Job: Identify causes
Answers: "Which features, if changed, will change the outcome?"
Type: Causation-based
Example: "Reducing this customer's bill by $10 will decrease their churn probability by 12%"
SHAP (SHapley Additive exPlanations) explains ML predictions by showing how much each feature contributed to a specific prediction.
Based on game theory: distributes "credit" for a prediction fairly among all features.
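The "fair credit" idea can be made concrete with a tiny exact Shapley calculation. This sketch assumes a hand-written scoring function and a single baseline customer; the SHAP library itself works with trained models and a background dataset, so treat the names and numbers here as hypothetical.

```python
from itertools import permutations

def model(x):
    """Hypothetical churn score with an interaction term (not a trained model)."""
    return (0.10
            + 0.04 * x["service_calls"]
            + 0.20 * x["month_to_month"]
            + 0.02 * x["service_calls"] * x["month_to_month"]
            + 0.001 * x["monthly_charges"])

baseline = {"service_calls": 1, "month_to_month": 0, "monthly_charges": 50}
customer = {"service_calls": 8, "month_to_month": 1, "monthly_charges": 89}

def shapley_values(model, baseline, instance):
    """Average each feature's marginal contribution over every feature ordering."""
    features = list(instance)
    orderings = list(permutations(features))
    totals = {f: 0.0 for f in features}
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = instance[f]  # switch this feature to the customer's value
            score = model(current)
            totals[f] += score - previous
            previous = score
    return {f: total / len(orderings) for f, total in totals.items()}

phi = shapley_values(model, baseline, customer)
for feature, value in phi.items():
    print(f"{feature:<16} {value:+.3f}")
print(f"baseline + contributions = {model(baseline) + sum(phi.values()):.3f}")
print(f"model prediction         = {model(customer):.3f}")
```

The contributions always add up to the gap between the baseline score and the customer's score, which is the "additive" part of the name. The interaction between service calls and contract type is split between those two features.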
SHAP says: "Customer service calls strongly predict churn" (correlation)
Does NOT mean: "Reducing service calls will reduce churn" (causation)
Why? Service calls might be a symptom of underlying product issues (confounder)
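A small simulation of the "symptom" story, under the assumption that product issues cause both the calls and the churn, and that calls themselves do nothing. All probabilities are invented for illustration.

```python
import random
from statistics import mean

random.seed(4)

def simulate(n, cap_calls=False):
    """Return (service_calls, churned) pairs for n hypothetical customers."""
    rows = []
    for _ in range(n):
        product_issue = random.random() < 0.3
        calls = random.randint(4, 9) if product_issue else random.randint(0, 3)
        if cap_calls:
            calls = min(calls, 1)  # intervention: fewer calls, issue left unfixed
        # Assumption: churn depends only on the product issue, never on calls.
        churned = int(random.random() < (0.6 if product_issue else 0.1))
        rows.append((calls, churned))
    return rows

observed = simulate(5000)
high_call_churn = mean(c for calls, c in observed if calls >= 4)
low_call_churn = mean(c for calls, c in observed if calls < 4)

churn_before = mean(c for _, c in observed)
churn_after = mean(c for _, c in simulate(5000, cap_calls=True))

print(f"Churn among high-call customers: {high_call_churn:.0%}")
print(f"Churn among low-call customers:  {low_call_churn:.0%}")
print(f"Overall churn before capping calls: {churn_before:.0%}")
print(f"Overall churn after capping calls:  {churn_after:.0%}")
```

Service calls separate churners from non-churners very well, so any predictive model would lean on them. Capping the calls leaves overall churn essentially unchanged, because the underlying product issue is still there.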
What it shows: Overall ranking of features by average absolute impact
Business use: "Which 5 factors matter most for customer churn?"
Contract Status  ████████████████ 0.42
Service Calls    ███████████ 0.31
Monthly Charges  █████████ 0.25
Account Age      ██████ 0.18
Data Usage       ████ 0.12
Longer bar = larger average contribution to the model's predictions (not proof of a causal effect)
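The ranking in a bar chart like this is typically the mean of the absolute SHAP values per feature. A minimal sketch, assuming three customers with made-up SHAP values (chosen so the averages match the first three bars):

```python
from statistics import mean

# Hypothetical per-customer SHAP values, one dict per customer.
shap_values = [
    {"contract_status": 0.50, "service_calls": -0.20, "monthly_charges": 0.10},
    {"contract_status": -0.40, "service_calls": 0.45, "monthly_charges": -0.30},
    {"contract_status": 0.36, "service_calls": 0.28, "monthly_charges": 0.35},
]

# Global importance: average size of each feature's contribution, ignoring sign.
importance = {
    feature: mean(abs(row[feature]) for row in shap_values)
    for feature in shap_values[0]
}
ranking = sorted(importance, key=importance.get, reverse=True)

for feature in ranking:
    print(f"{feature:<16} {importance[feature]:.2f}")
```

Taking the absolute value matters: a feature that pushes some predictions up and others down would otherwise average out to zero and look unimportant.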
What it shows: Distribution of SHAP values across all data points
Colors: Red = high feature value, Blue = low feature value
Use case: "Do high service calls increase or decrease churn probability?"
What it shows: Step-by-step breakdown of ONE prediction
Use case: "Why did the model predict THIS customer will churn?"
Shows: Base value → Feature impacts → Final prediction
Final Prediction: 78% churn probability (High Risk!)
| Step | Contribution to churn probability |
|---|---|
| Starting Point (Base Value) | 32% (average churn) |
| + Month-to-month contract | +25% |
| + 8 service calls (very high) | +18% |
| - Has device protection | -5% |
| + High monthly charges ($89) | +8% |
| Final Prediction | 78% churn risk |
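The waterfall is plain addition: start at the base value and apply each feature's contribution in turn. A short sketch using the numbers from this example:

```python
base_value = 32  # average churn, in percentage points

# Feature contributions from the waterfall, in percentage points.
contributions = [
    ("Month-to-month contract", 25),
    ("8 service calls", 18),
    ("Has device protection", -5),
    ("High monthly charges ($89)", 8),
]

running = base_value
print(f"{'Base value':<28}      {running}%")
for name, delta in contributions:
    running += delta
    print(f"{name:<28} {delta:+d} -> {running}%")

prediction = running
```

The final running total is the model's prediction for this one customer, so every percentage point of the 78% is attributed to either the base rate or a specific feature.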
These features predict churn, but we cannot conclude that changing them will reduce churn.
For causal effects, we need EconML.
| Aspect | SHAP (Correlation) | EconML (Causation) |
|---|---|---|
| Question | "Which features predict the outcome?" | "Which features CAUSE the outcome?" |
| Output | Feature importance scores | Treatment effect estimates |
| Handles Confounders? | ❌ No - shows all correlations | ✅ Yes - adjusts for measured confounders |
| Business Use | Understanding patterns, debugging models | Making decisions, taking actions |
| Example Insight | "Service calls have a mean absolute SHAP value of 0.31, second only to contract status" | "Reducing service calls by 1 causes 5% churn reduction" |
Wrong: "SHAP says service calls are important → reduce service calls"
Why wrong: Calls might just correlate with product issues. Reducing calls without fixing root causes won't help.
Right: Use EconML to find the causal effect of improving service quality
You'll apply SHAP and EconML to the TeleConnect churn dataset and see:
Your assessment will require:
Prediction tells you what will happen.
Causation tells you how to change it.
Real business value comes from taking the RIGHT actions, not just predicting outcomes.