Week 11: Model Interpretability & Explainability

Understanding the "Why" Behind AI Decisions

DATA4800: Artificial Intelligence and Machine Learning

Kaplan Business School

Today's Learning Outcomes

By the end of this workshop, you will be able to:

Distinguish between model interpretability and explainability
Identify when white-box vs. black-box models are appropriate for business problems
Apply techniques like SHAP, LIME, and Partial Dependence Plots to explain model predictions
Evaluate model performance using appropriate metrics for business contexts
Make informed decisions about model complexity vs. transparency trade-offs

The Black Box Problem

Scenario: The Rejected Loan Application

Sarah applies for a $50,000 business loan at Global Bank. She has:

A credit score of 720 (good)
3 years running her bakery business
Consistent revenue of $180,000/year
No previous loan defaults

The bank's AI system rejects her application.

When Sarah asks why, the loan officer says: "Our AI model determined you're high risk. That's all I can tell you."

What's wrong with this picture?

Why We Need Explainable AI

Three Critical Questions from Different Stakeholders:

1. Customer Perspective: "Why was I rejected?"

Sarah needs to know what she can improve. Is it her debt-to-income ratio? Her business age? Without explanation, she can't take action to improve her chances next time.

2. Manager Perspective: "Can I trust this decision?"

The bank manager sees profitable applicants being rejected. Without understanding the model's logic, she can't verify if it's making sound business decisions or if it needs adjustment.

3. Regulator Perspective: "Is this fair and legal?"

Financial regulators must ensure the AI isn't discriminating based on protected characteristics. A "black box" makes compliance verification impossible.

When Explainability Matters Most

Some AI decisions have serious consequences that demand transparency:

Healthcare 🏥

Example: AI recommends surgery vs. medication

Why explanation matters: Doctors need to understand the reasoning to make informed recommendations to patients

Hiring Decisions 💼

Example: AI screens job applications

Why explanation matters: Must ensure decisions aren't based on biased factors like age, gender, or ethnicity

Credit Approval 💳

Example: AI approves/denies loans

Why explanation matters: Legal requirement in many jurisdictions; customers have a right to know

Criminal Justice ⚖️

Example: AI predicts recidivism risk

Why explanation matters: Decisions affect people's freedom and must be justifiable in court

Two Key Concepts: Not the Same Thing

Interpretability

Understanding HOW the model works overall

Analogy: Seeing the Recipe

You can see all the ingredients and steps needed to bake a cake. You understand the entire process from start to finish.

Example: A simple scoring formula where you can see exactly how credit score, income, and debt combine to produce a risk score.

Explainability

Understanding WHY a specific decision was made

Analogy: Explaining the Outcome

You might not know the full recipe, but you can explain why THIS cake came out dense: "We used whole wheat flour instead of all-purpose flour."

Example: For Sarah's rejected loan, explaining "Your debt-to-income ratio of 45% was the primary factor" even if we can't show the full model.

Quick Check: Test Your Understanding

A hospital uses an AI system to predict which patients are at high risk of readmission. The doctor asks: "For this specific patient, what factors made the AI flag them as high risk?"

Is the doctor asking for interpretability or explainability?

A) Interpretability - understanding how the model works overall

B) Explainability - understanding why this specific decision was made

C) Neither - the doctor is asking about model accuracy

White-Box vs. Black-Box Models

White-Box Models: The Glass Box 🔍

You can see inside and understand exactly how inputs become outputs.

Examples:

Linear Regression
Logistic Regression
Decision Trees
Simple Rule-Based Systems

Advantage: Complete transparency - you can explain both HOW and WHY

Limitation: May not capture complex patterns

Black-Box Models: The Locked Box 🔒

Internal workings are complex and opaque. You see inputs and outputs but not the process.

Examples:

Neural Networks (Deep Learning)
Random Forests
Gradient Boosting Machines
Support Vector Machines

Advantage: Can capture very complex patterns and is often more accurate

Limitation: Difficult to understand and explain decisions

White-Box Example: Linear Regression

The Restaurant Bill Analogy

Imagine calculating a restaurant bill. The formula is simple and transparent:

Total Bill = $15 (base) + ($25 × number of people) + ($8 × appetizers) + ($6 × desserts)

Example Calculation:

• 4 people having dinner
• 2 orders of appetizers
• 3 desserts

Total = $15 + (4 × $25) + (2 × $8) + (3 × $6) = $149

Why is this interpretable?

You can explain EXACTLY why the bill is $149. Each person adds $25, each appetizer adds $8, etc. There's no mystery!

Linear Regression: Real Business Example

Predicting House Prices in Iowa

Price = $50,000 + ($120 × Square Feet) + ($15,000 × Bedrooms) + ($8,000 × Garage Spaces)

Example: Predict price for a house with:

1,500 square feet
3 bedrooms
2 garage spaces

Price = $50,000 + (1,500 × $120) + (3 × $15,000) + (2 × $8,000) = $291,000

Feature Contributions (Impacts)

• Base value: $50,000
• Square footage: +$180,000 ↑ (largest impact)
• Bedrooms: +$45,000 ↑
• Garage: +$16,000 ↑

We can see EXACTLY how each feature contributes to the final price!
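
The contribution breakdown can be reproduced in a few lines of Python. This is a minimal sketch using the coefficients from this slide; the names `COEFFICIENTS` and `explain_price` are illustrative and do not come from any library.

```python
# Coefficients copied from the house-price formula on this slide.
BASE_PRICE = 50_000
COEFFICIENTS = {"square_feet": 120, "bedrooms": 15_000, "garage_spaces": 8_000}

def explain_price(house):
    """Return each feature's dollar contribution and the predicted price."""
    contributions = {name: coef * house[name] for name, coef in COEFFICIENTS.items()}
    price = BASE_PRICE + sum(contributions.values())
    return contributions, price

house = {"square_feet": 1_500, "bedrooms": 3, "garage_spaces": 2}
contributions, price = explain_price(house)

print(f"Base value: ${BASE_PRICE:,}")
# List the features from largest to smallest contribution.
for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: +${value:,}")
print(f"Predicted price: ${price:,}")
```

Because the model is a plain sum, the explanation is the model itself: no extra explanation tool is needed.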

Logistic Regression: The Scoring System

From Points to Probability

Logistic regression is like a scoring system that converts points into a 0-100% probability.

Loan Approval Example:

Calculate a score based on applicant features:

Score = -5 + (0.01 × Credit Score) + (0.00002 × Annual Income) - (2 × Debt-to-Income Ratio)

Then convert the score to a probability with the logistic (sigmoid) function:

If Score = 2.5, then:
Probability of Approval = 1 / (1 + e^(-2.5)) ≈ 92%

Still Interpretable!

We can see how each feature (credit score, income, debt ratio) contributes to the final probability.
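
The score-to-probability step can be sketched in Python as follows. The scoring formula is the one from this slide. Two things are assumptions made only for illustration: the debt-to-income ratio is entered as a fraction (0.30 for 30%), and the sample applicant is hypothetical.

```python
import math

def approval_score(credit_score, annual_income, debt_to_income):
    """Linear score from the slide; debt_to_income is assumed to be a fraction."""
    return -5 + 0.01 * credit_score + 0.00002 * annual_income - 2 * debt_to_income

def sigmoid(score):
    """Convert a score into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-score))

# The slide's example: a score of 2.5.
print(f"Score 2.5 -> {sigmoid(2.5):.0%} approval probability")

# Hypothetical applicant, chosen only to illustrate the calculation.
score = approval_score(credit_score=700, annual_income=60_000, debt_to_income=0.30)
print(f"Score {score:.2f} -> {sigmoid(score):.0%} approval probability")
```

A score of 0 maps to exactly 50%; positive scores push the probability toward 100% and negative scores toward 0%.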

The White-Box Limitation

Problem: Feature Dependencies

Linear models assume each feature contributes independently. But real-world features often interact!

House Price Example:

The model says:

Each sq ft of basement adds $100
Each sq ft of first floor adds $120

But in reality: A house with a huge basement (2,000 sq ft) but tiny first floor (800 sq ft) is oddly shaped and less valuable than these numbers suggest!

The features interact - you can't just add them up independently!

This is where more complex (black-box) models can capture patterns that simple linear models miss.

Quick Check: White-Box vs Black-Box

A bank wants to predict loan default risk. They need to:

Explain every decision to regulators
Handle only 5 features (income, credit score, employment years, debt ratio, loan amount)
Provide clear reasoning to customers

Which type of model is most appropriate?

A) White-box model (e.g., Logistic Regression) - transparency is required and the problem is simple

B) Black-box model (e.g., Neural Network) - need maximum accuracy

C) Either would work equally well

The Fundamental Trade-off

Key Insight

More complex models (black-box) can capture intricate patterns and achieve higher accuracy, but become harder to interpret. Simpler models (white-box) are easy to understand but may miss complex relationships.

The choice depends on your business priorities: accuracy vs. transparency.

Choosing White-Box vs. Black-Box

Situation	Recommended Approach	Reasoning
Simple problem (2-3 features)	White-Box	Linear relationships are likely sufficient; transparency is valuable
High regulatory requirements (banking, healthcare)	White-Box	Must be able to explain and justify every decision
Complex problem (100+ features with interactions)	Black-Box + Explanation Tools	Need complex model for accuracy, use SHAP/LIME for explanations
Image/text data (computer vision, NLP)	Black-Box (Deep Learning)	Linear models can't handle these data types effectively
Internal analytics (no external stakeholders)	Either	Choose based on accuracy needs vs. debugging convenience

Explaining Black-Box Models: Partial Dependence Plots

The "What If" Tool

Business Question: "How does credit score affect loan approval probability, holding everything else constant?"

How Partial Dependence Works:

Step 1: Take 1,000 loan applications from your dataset

Step 2: Change ONLY the credit score for all 1,000 (try 600, 650, 700, 750, etc.)

Step 3: Run the model on each modified dataset

Step 4: Average the approval probabilities at each credit score

Step 5: Plot the results to see the relationship

What This Tells You:

"On average, increasing credit score from 650 to 750 increases approval probability by 25 percentage points."

You've isolated the effect of ONE feature in your black-box model!
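
The five steps can be written out by hand in Python. This is a rough sketch, not a production implementation: `black_box_model` is a hypothetical stand-in for a trained model, and the 1,000 applications are synthetic data generated only for this illustration.

```python
import math
import random

def black_box_model(credit_score, debt_to_income):
    """Hypothetical stand-in for a trained model: returns approval probability."""
    z = 0.02 * (credit_score - 680) - 6.0 * (debt_to_income - 0.35)
    return 1 / (1 + math.exp(-z))

# Step 1: take 1,000 applications (synthetic here, for illustration only).
rng = random.Random(42)
applications = [
    {"credit_score": rng.uniform(580, 820), "debt_to_income": rng.uniform(0.10, 0.60)}
    for _ in range(1_000)
]

def partial_dependence(rows, grid):
    """Average prediction when credit_score is forced to each grid value."""
    curve = []
    for value in grid:
        # Steps 2-3: overwrite ONLY the credit score, keep every other feature.
        predictions = [black_box_model(value, row["debt_to_income"]) for row in rows]
        # Step 4: average over all applications.
        curve.append(sum(predictions) / len(predictions))
    return curve

grid = [600, 650, 700, 750, 800]
curve = partial_dependence(applications, grid)

# Step 5: a text version of the plot.
for value, avg in zip(grid, curve):
    print(f"Credit score {value}: average approval probability {avg:.1%}")
```

In practice a library routine would normally do this work; the manual loop is shown only to make the averaging step visible.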

Partial Dependence Plot: Visual Example

Reading This Plot:

X-axis: Credit score values (from 600 to 800)
Y-axis: Average predicted approval probability
Interpretation: The upward slope shows that higher credit scores strongly increase approval chances
Business Insight: Credit score is an important factor in the model's decisions

SHAP Values: Fair Credit Attribution

The Team Project Analogy

Imagine a group project where three students work together:

Working alone, each student would score 60/100
Working together, they score 85/100
Question: How much did each student contribute to the improvement?

SHAP Calculates Fair Contribution:

Try all possible team combinations:

Student A alone: 60 (the same for B alone and C alone)
Students A + B: 72
Students A + C: 70
Students B + C: 68
All three A + B + C: 85

SHAP averages across ALL these combinations to determine each student's fair share of the +25 point improvement.
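
The averaging can be made concrete with a small Shapley-value calculation in Python. This sketch uses the team scores from this slide and makes one assumption: a team's "value" is its improvement over the solo score of 60, so an empty team or a single student adds 0 points.

```python
from itertools import permutations

SOLO_SCORE = 60
TEAM_SCORES = {
    frozenset("AB"): 72,
    frozenset("AC"): 70,
    frozenset("BC"): 68,
    frozenset("ABC"): 85,
}

def improvement(team):
    """Points gained over the solo score; 0 for an empty or one-person team."""
    return TEAM_SCORES.get(frozenset(team), SOLO_SCORE) - SOLO_SCORE

def shapley_values(players):
    """Average each player's marginal gain over every possible joining order."""
    orders = list(permutations(players))
    totals = {p: 0.0 for p in players}
    for order in orders:
        team = []
        for player in order:
            before = improvement(team)
            team.append(player)
            totals[player] += improvement(team) - before
    return {p: total / len(orders) for p, total in totals.items()}

shares = shapley_values("ABC")
for student, share in shares.items():
    print(f"Student {student}: {share:+.2f} points")
print(f"Total: {sum(shares.values()):+.2f} points")
```

The three shares always add up to the full +25 improvement, which is the property that makes the attribution "fair". With three students there are only 6 joining orders; with many features the number of orders grows very quickly, which is why exact SHAP is slow.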

SHAP Values in Loan Prediction

Explaining Sarah's Rejected Loan

The model's baseline approval rate (average across all applicants): 65%
Sarah's predicted approval probability: 32%

Change from baseline = 32% - 65% = -33 percentage points

SHAP Shows Feature Contributions:

• Credit Score (720): +8% (helps approval)
• Annual Income ($45K): -15% (hurts approval)
• Debt-to-Income Ratio (45%): -20% (hurts approval)
• Business Age (3 years): -6% (hurts approval)

Total: +8% - 15% - 20% - 6% = -33% ✓

Primary issue: Debt-to-income ratio (-20%). This is what Sarah should focus on improving!
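
The additivity check above is easy to verify in Python. The numbers are copied from this slide; in a real project they would come from an explanation tool rather than being typed in by hand.

```python
BASELINE = 0.65  # average approval probability across all applicants

# SHAP contributions for Sarah, copied from the slide (in probability points).
contributions = {
    "Credit Score (720)": +0.08,
    "Annual Income ($45K)": -0.15,
    "Debt-to-Income Ratio (45%)": -0.20,
    "Business Age (3 years)": -0.06,
}

# Baseline plus all contributions should reproduce the model's prediction.
prediction = BASELINE + sum(contributions.values())

# Rank features by how strongly they pushed the prediction up or down.
ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
for feature, value in ranked:
    print(f"{feature}: {value:+.0%}")
print(f"Baseline {BASELINE:.0%} + contributions = prediction {prediction:.0%}")
```

Sorting by absolute size puts the debt-to-income ratio first, which matches the "primary issue" identified for Sarah.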

LIME: Local Interpretable Model-Agnostic Explanations

The Restaurant Menu Analogy

Imagine a restaurant with 200 menu items. Understanding how the chef prices EVERY dish is complex. But you just want to understand why YOUR specific order of pasta costs $18.

LIME's Approach: "Zoom in" on just the pasta dish and nearby similar dishes (other pasta, similar ingredients). Build a simple model for JUST THAT area of the menu.

LIME in 5 Steps:

1. Take Sarah's loan application (the instance we want to explain)

2. Create 1,000 "similar" applications by slightly changing Sarah's values

3. Run the complex model on all 1,000 variations

4. Build a simple linear model that mimics the complex model's behavior for these similar cases

5. Use the simple model to explain why Sarah was rejected
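
The five steps can be sketched in plain Python. This is a simplified illustration of the idea, not the `lime` library itself: `black_box` is a hypothetical stand-in for the complex model, only two features are used, and the step sizes (20 credit score points, 5 points of debt-to-income) are assumptions chosen for readability.

```python
import math
import random

def black_box(credit_score, debt_to_income):
    """Hypothetical stand-in for the complex model (approval probability)."""
    z = 0.015 * (credit_score - 700) - 10.0 * (debt_to_income - 0.35)
    return 1 / (1 + math.exp(-z))

def solve(a, b):
    """Solve the small linear system a·x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def lime_explain(instance, step_sizes, n_samples=1_000, seed=0):
    """Fit a proximity-weighted linear model around one instance."""
    rng = random.Random(seed)
    k = len(instance)
    xtwx = [[0.0] * (k + 1) for _ in range(k + 1)]
    xtwy = [0.0] * (k + 1)
    for _ in range(n_samples):
        # Step 2: perturb each feature slightly (offsets in units of step size).
        offsets = [rng.gauss(0, 1) for _ in range(k)]
        sample = [v + o * s for v, o, s in zip(instance, offsets, step_sizes)]
        # Step 3: ask the complex model for its prediction.
        y = black_box(*sample)
        # Variations closer to the original instance get more weight.
        weight = math.exp(-sum(o * o for o in offsets))
        row = [1.0] + offsets
        for i in range(k + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(k + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    # Step 4: weighted least squares gives the simple local model.
    intercept, *weights = solve(xtwx, xtwy)
    return intercept, weights

# Step 1: Sarah's application (credit score 720, debt-to-income 45%).
sarah = [720, 0.45]
intercept, weights = lime_explain(sarah, step_sizes=[20, 0.05])

# Step 5: read the local weights as the explanation.
print(f"Local effect of +20 credit score points: {weights[0]:+.3f}")
print(f"Local effect of +5 points of debt-to-income: {weights[1]:+.3f}")
```

The weights describe the model's behavior only near Sarah's application. A different applicant could get a different local model, and a different random seed can shift the weights slightly.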

SHAP vs. LIME: When to Use Which?

Aspect	SHAP	LIME
What it explains	Fair contribution of each feature	Local approximation with simple model
Computation time	Slower (tries all combinations)	Faster (approximates locally)
Consistency	Same explanation every time (when computed exactly)	Can vary slightly between runs (random sampling)
Theoretical foundation	Game theory (Shapley values)	Local approximation
Best for	High-stakes decisions needing precise attribution	Quick explanations, exploring many instances
Example use case	Explaining denied loan to regulator	Internal model debugging and validation

In practice: Both are valuable tools! SHAP for precision, LIME for speed and ease of use.

Quick Check: Explanation Techniques

You've built a complex neural network for fraud detection. A transaction is flagged as fraudulent and you need to explain why to the merchant (quickly, for one specific transaction). Which technique is most appropriate?

A) Partial Dependence Plot - shows overall feature effects

B) LIME - fast local explanation for this specific instance

C) Rebuild the model as logistic regression for interpretability

Case Study: LendingClub Loan Prediction

The Business Problem

LendingClub processes 10,000+ loan applications per month. They need to:

Approve good borrowers who will repay (capture revenue)
Reject bad borrowers who will default (avoid losses)
Explain decisions to applicants and regulators

The Data

Dataset Statistics

Total loans: 12,290
Good loans: 9,704 (79%)
Defaulted loans: 2,586 (21%)

Key Features (Simplified)

Home ownership status
Annual income
Debt-to-income ratio
Credit score

LendingClub: What the Model Tells Us

Business Insights from Logistic Regression

What increases loan approval chances?

🏠 Home Ownership:
Owning a home → about 3× the approval odds (+200%)
Interpretation: Homeowners are seen as more stable and creditworthy

💰 Annual Income:
Each additional $10,000 → about +5% approval odds
Interpretation: Higher income provides greater repayment capacity

📊 Debt-to-Income Ratio:
Each 10-point increase → approval odds fall by about 75% (multiplied by 0.25)
Interpretation: Higher existing debt obligations increase risk

⭐ Credit Score:
100-point increase → about +116% better approval odds
Interpretation: Credit score is the strongest predictor

The Mathematical Model (For Reference)

Now that we understand the business meaning, here's the actual logistic regression formula:

log(odds) = β₀ + β₁(Home_Own) + β₂(Income) + β₃(DTI) + β₄(Credit_Score)

Fitted Coefficients:

β₀ (intercept) = -8.2
β₁ (home ownership) = 1.1 → exp(1.1) = 3.0 times better odds
β₂ (income per $10K) = 0.05 → exp(0.05) = 1.05 times better odds
β₃ (debt-to-income per 10%) = -1.4 → exp(-1.4) = 0.25 times the odds
β₄ (credit score per 100 pts) = 0.77 → exp(0.77) ≈ 2.16 times better odds
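
The conversion from coefficient to odds multiplier is just an exponential, as this short Python sketch shows. The coefficients are the ones listed above; the dictionary keys are illustrative labels.

```python
import math

# Fitted coefficients from the slide (log-odds scale).
coefficients = {
    "home ownership": 1.1,
    "income (per $10K)": 0.05,
    "debt-to-income (per 10%)": -1.4,
    "credit score (per 100 pts)": 0.77,
}

# exp(coefficient) is the factor by which the approval odds are multiplied.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    change = (ratio - 1) * 100  # percentage change in the odds
    print(f"{name}: odds x {ratio:.2f} ({change:+.0f}%)")
```

Note that these are changes in odds, not in probability. Tripling the odds moves a 50% probability to 75%, not to 150%.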

Key Point: Notice how we presented business meaning FIRST, then the formula. This aids understanding!

Evaluating Model Performance: The Confusion Matrix

A Medical Analogy: Cancer Detection

A doctor tests 100 patients for cancer. 30 actually have cancer, 70 don't.

Actual Reality	Predicted: Cancer	Predicted: No Cancer
Actually has cancer (30)	25 True Positive (TP) ✅ Correctly detected	5 False Negative (FN) ❌ Missed cancer!
Actually healthy (70)	8 False Positive (FP) ❌ False alarm	62 True Negative (TN) ✅ Correctly cleared

Confusion Matrix: Loan Approval

LendingClub Model Performance

Actual Outcome	Predicted: Good Loan	Predicted: Will Default
Good loan (9,704)	8,200 True Positive ✅ Approved good customer	1,504 False Negative ❌ Rejected good customer 💰 Lost revenue!
Default (2,586)	415 False Positive ❌ Approved bad customer 💰 Will lose money!	2,171 True Negative ✅ Correctly rejected

Business Question: Which error is more costly?

False Negative (FN): Rejecting good customers → Lost interest revenue (~$3,000 per loan)
False Positive (FP): Approving bad customers → Lose principal (~$15,000 per default)

FP is ~5x more costly than FN! Our model should err on the side of caution.
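
Multiplying the error counts by the per-error costs shows what each type of mistake costs in total. This Python sketch uses only the figures from this slide; the per-loan costs are the slide's approximations.

```python
# Counts from the LendingClub confusion matrix on this slide.
FALSE_NEGATIVES = 1_504  # good customers rejected
FALSE_POSITIVES = 415    # bad customers approved

# Approximate cost per error, as quoted on the slide.
COST_PER_FN = 3_000   # lost interest revenue
COST_PER_FP = 15_000  # lost principal

fn_cost = FALSE_NEGATIVES * COST_PER_FN
fp_cost = FALSE_POSITIVES * COST_PER_FP

print(f"Cost of rejecting good customers: ${fn_cost:,}")
print(f"Cost of approving bad customers:  ${fp_cost:,}")
print(f"Total cost of errors:             ${fn_cost + fp_cost:,}")
```

Even though the model makes far fewer false positives than false negatives, the false positives account for the larger share of the total cost.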

Performance Metrics: Beyond Accuracy

Understanding Key Metrics Through Business Questions

1. Accuracy: "How often is the model right overall?"

Accuracy = (TP + TN) / Total = (8,200 + 2,171) / 12,290 = 84.4%

⚠️ Warning: Can be misleading with imbalanced data! If 95% of loans are good, predicting "all good" gives 95% accuracy but is useless.

2. Recall (True Positive Rate): "Of the good loans, how many did we approve?"

Recall = TP / (TP + FN) = 8,200 / 9,704 = 84.5%

Business meaning: We're capturing 84.5% of revenue opportunities. Missing 15.5%.

3. Specificity (True Negative Rate): "Of the bad loans, how many did we correctly reject?"

Specificity = TN / (TN + FP) = 2,171 / 2,586 = 84.0%

Business meaning: We're avoiding 84% of potential defaults. 16% slip through.

More Performance Metrics

4. Precision: "Of the loans we approved, how many are actually good?"

Precision = TP / (TP + FP) = 8,200 / (8,200 + 415) = 95.2%

Business meaning: When we approve a loan, we're right 95.2% of the time. High confidence in approvals!

5. F-Score: "Balance between catching good loans and avoiding bad ones"

F-Score = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.952 × 0.845) / (0.952 + 0.845) = 89.5%

Business meaning: Harmonic mean of precision and recall. Useful when you want a single score that balances both.
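
All five metrics can be computed directly from the four cells of the LendingClub confusion matrix. A minimal Python sketch, treating "good loan" as the positive class as the slides do:

```python
# Confusion matrix cells from the LendingClub slide.
TP, FN = 8_200, 1_504  # actual good loans: approved / rejected
FP, TN = 415, 2_171    # actual defaults: approved / rejected

accuracy = (TP + TN) / (TP + TN + FP + FN)
recall = TP / (TP + FN)
specificity = TN / (TN + FP)
precision = TP / (TP + FP)
f_score = 2 * precision * recall / (precision + recall)

metrics = {
    "Accuracy": accuracy,
    "Recall": recall,
    "Specificity": specificity,
    "Precision": precision,
    "F-Score": f_score,
}
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Changing any one cell and re-running is a quick way to see which metrics react to which kind of error.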

Which Metric Matters Most?

It depends on business priorities:

Growth phase: Maximize Recall (approve more good customers)
Risk-averse phase: Maximize Precision (avoid defaults at all costs)
Balanced approach: Optimize F-Score

Adjusting the Decision Threshold

The Probability Decision Point

Logistic regression gives us probabilities (0-100%). We need to choose a cutoff threshold (Z) to make binary approve/reject decisions.

Scenario 1: Conservative (Z = 0.85)

Approve only if probability > 85%

Result: Very few defaults (high precision) but miss many good customers (low recall)

Risk: Business stagnation from lost customers

Scenario 2: Balanced (Z = 0.75)

Approve if probability > 75%

Result: Moderate defaults, good customer capture

Sweet spot: Sustainable growth with acceptable risk

Scenario 3: Aggressive (Z = 0.50)

Approve if probability > 50%

Result: Capture most good customers but many defaults too

Risk: Bankruptcy from excessive defaults
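
The effect of moving the threshold can be seen on a toy example in Python. The 14 applicants below are hypothetical and exist only to illustrate the pattern; real results depend on the actual model and data.

```python
# Hypothetical applicants: (predicted probability of a good loan, actually good?).
applicants = [
    (0.95, True), (0.92, True), (0.88, True), (0.86, False), (0.82, True),
    (0.78, True), (0.76, False), (0.70, True), (0.65, True), (0.60, False),
    (0.55, True), (0.52, False), (0.45, False), (0.30, False),
]

def evaluate(threshold):
    """Approve when probability > threshold; return (precision, recall)."""
    approved = [is_good for prob, is_good in applicants if prob > threshold]
    true_positives = sum(approved)
    total_good = sum(is_good for _, is_good in applicants)
    precision = true_positives / len(approved) if approved else 0.0
    recall = true_positives / total_good
    return precision, recall

for label, z in [("Conservative", 0.85), ("Balanced", 0.75), ("Aggressive", 0.50)]:
    precision, recall = evaluate(z)
    print(f"{label} (Z = {z:.2f}): precision {precision:.0%}, recall {recall:.0%}")
```

Lowering the threshold approves more applicants, so recall rises while precision tends to fall. The business decides where on that trade-off to sit.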

ROC Curve: Visualizing the Trade-off

What is ROC?

The Receiver Operating Characteristic (ROC) curve plots True Positive Rate (Recall) against False Positive Rate at different thresholds.

X-axis: False Positive Rate (share of bad loans approved)

Y-axis: True Positive Rate (share of good loans approved)

AUC Score

Area Under the Curve = 0.66

• 0.5 = Random guessing (diagonal line)
• 1.0 = Perfect prediction
• 0.66 = Better than random, but with considerable room for improvement
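
One way to build intuition for AUC: it equals the chance that a randomly chosen good loan receives a higher score than a randomly chosen defaulted loan. The Python sketch below counts those pairs directly. The scores are hypothetical and are not the LendingClub results.

```python
# Hypothetical model scores (predicted probability of being a good loan).
good_loan_scores = [0.95, 0.92, 0.88, 0.82, 0.78, 0.70, 0.65, 0.55]
default_scores = [0.86, 0.76, 0.60, 0.52, 0.45, 0.30]

def auc(positive_scores, negative_scores):
    """Share of (good, bad) pairs where the good loan scores higher; ties count half."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

print(f"AUC = {auc(good_loan_scores, default_scores):.2f}")
```

A model that ranks every good loan above every default scores 1.0; a model whose scores carry no information lands near 0.5.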

Quick Check: Metrics Understanding

A bank's loan model has:
• Precision = 98%
• Recall = 45%

What does this tell you about the bank's strategy?

A) Very conservative - they approve few loans, but almost all are good (low default risk but missing revenue)

B) Very aggressive - they approve many loans, including risky ones

C) Balanced approach - good trade-off between risk and revenue

Activity 1: Concept Check (5 minutes)

Reflect and Discuss

Take a moment to answer these questions (write notes, discuss with a neighbor):

Question 1: In your own words, explain the difference between interpretability and explainability. Give a business example of each.

Question 2: Your company wants to predict customer churn. You have 50 features. Would you start with a white-box or black-box model? Why?

Question 3: For the LendingClub case, which is worse from a business perspective: rejecting good customers (FN) or approving bad ones (FP)? Justify your answer.

Question 4: Can you think of a domain where explainability is legally required? What about a domain where accuracy matters more than explainability?

Activity 2: Orange Data Mining Demo (15 minutes)

Exploring Model Explanations with Orange

Setup (Before We Start):

Open Orange Data Mining
Load the LendingClub dataset (will be provided)
Have the "Explain Model" widget ready

Demonstration Workflow:

Step 1: Build a Logistic Regression model for loan prediction

Step 2: Use "Explain Model" widget to generate feature importance

Step 3: Select specific instances and view SHAP/LIME explanations

Step 4: Create Partial Dependence Plots for key features

Step 5: Compare explanations across different model types

Activity 2: Questions to Consider During Demo

1. Feature Importance: Which features have the strongest impact on loan approval? Does this match our expectations from the business context?

2. Individual Explanations: Pick a rejected loan. Can you identify the top 2 reasons for rejection using SHAP values?

3. Partial Dependence: How does the approval probability change as credit score increases? Is the relationship linear or non-linear?

4. Model Comparison: Compare logistic regression explanations with decision tree explanations. Which is easier to interpret?

5. Actionable Insights: If you were a loan officer, what advice would you give to rejected applicants based on the model explanations?

📝 Resource: Orange Explain Model widget documentation at
orangedatamining.com/widget-catalog/explain/explain-model/

Activity 3: Python Exploration (Optional, 10 minutes)

For Those Comfortable with Python

Try this Google Colab notebook for hands-on SHAP/LIME practice:

📓 Colab Notebook:
colab.research.google.com

(Notebook link will be provided via Canvas)

What You'll Learn:

Install and use the SHAP library in Python
Generate waterfall plots showing feature contributions
Create LIME explanations with visualizations
Build interactive Partial Dependence Plots

Note: Focus on interpreting the outputs rather than the code syntax. The goal is understanding explanations, not becoming a Python expert!

Preparing for Assessment 3

Assessment Overview

Build a predictive model that:

Solves a real business problem from your domain
Demonstrates appropriate evaluation metrics
Includes interpretability/explainability techniques
Explains business value and ROI

4-Step Readiness Checklist:

✅ Step 1: Topic Selection
Choose a prediction problem from your work/interest area. Must involve a decision that can be explained (credit, hiring, diagnosis, pricing, etc.)

✅ Step 2: Data Availability
Need 500-1,000 examples minimum with features (inputs) and target (output). Can use public datasets if no proprietary data is available.

✅ Step 3: Tool Selection
Decide between Orange (easier, visual) or Python (more flexible). Both can handle interpretability techniques.

✅ Step 4: Interpretability Plan
How will you explain your model? Feature importance? SHAP? LIME? Plan which technique fits your problem best.

Interpretability Checklist for Your Project

1. Business Problem First

Define the decision being made and why it matters. Who needs to understand the model's decisions (managers, customers, regulators)?

2. Model Transparency Level

Did you choose a white-box or black-box model? Justify this choice based on your business context. If black-box, which explanation techniques will you use?

3. Feature Importance

Show which features are most important overall. Use feature importance scores, SHAP summary plots, or coefficients (for linear models).

4. Individual Decision Explanations

Pick 2-3 specific examples and explain WHY the model made those predictions. Show how features contributed to each decision.

5. Evaluation Metrics Choice

Justify your metric selection based on business costs. Is FP or FN more costly? Did you adjust the decision threshold accordingly?

6. Ethical Considerations

Address potential biases. Could your model discriminate unfairly? How did you check for this?

Week 11 Key Takeaways

1. Interpretability ≠ Explainability

Interpretability is understanding HOW a model works overall. Explainability is understanding WHY a specific decision was made. Both are valuable in different contexts.

2. White-Box vs. Black-Box Trade-off

Simple, transparent models (white-box) are easy to explain but may miss complex patterns. Complex models (black-box) can be more accurate but require explanation techniques like SHAP, LIME, or Partial Dependence Plots.

3. Business Context Drives Metric Choice

Accuracy isn't everything! Choose metrics (precision, recall, F-score) based on the relative costs of different error types in your business problem.

4. Explanation Techniques Are Tools, Not Solutions

SHAP, LIME, and PDPs help explain black-box models but don't make biased models fair or bad models good. They're diagnostic tools that reveal how models make decisions.

5. Explainability Enables Trust and Action

In high-stakes domains, being able to explain WHY a decision was made is not just nice to have—it's essential for regulatory compliance, customer trust, and model debugging.

Resources & Next Steps

📚 Further Learning

Tools:

Orange Data Mining: Visual, no-code tool for model explanation
orangedatamining.com
SHAP Python Library: Industry-standard for model explanations
github.com/slundberg/shap
LIME: Local model explanations
github.com/marcotcr/lime

Practice Datasets:

LendingClub Loan Data (on Canvas)
UCI Machine Learning Repository (many business datasets)
Kaggle (competitions with real-world problems)

Next Week Preview:

We'll explore Time Series Forecasting - predicting future values based on historical patterns. Topics include seasonality, trends, and ARIMA models.

Questions? Office hours or post on the discussion forum!

Thank You!

Week 11: Model Interpretability & Explainability

Remember: The best model is one you can explain and trust.

Next Steps:

1. Complete the Orange hands-on exercises
2. Start planning your Assessment 3 interpretability approach
3. Review key concepts using the practice quiz on Canvas