Week 11: Model Interpretability & Explainability

Understanding the "Why" Behind AI Decisions

DATA4800: Artificial Intelligence and Machine Learning

Kaplan Business School

Today's Learning Outcomes

By the end of this workshop, you will be able to:

  • Distinguish between model interpretability and explainability
  • Identify when white-box vs. black-box models are appropriate for business problems
  • Apply techniques like SHAP, LIME, and Partial Dependence Plots to explain model predictions
  • Evaluate model performance using appropriate metrics for business contexts
  • Make informed decisions about model complexity vs. transparency trade-offs

The Black Box Problem

Scenario: The Rejected Loan Application

Sarah applies for a $50,000 business loan at Global Bank. She has:

  • A credit score of 720 (good)
  • 3 years running her bakery business
  • Consistent revenue of $180,000/year
  • No previous loan defaults

The bank's AI system rejects her application.

When Sarah asks why, the loan officer says: "Our AI model determined you're high risk. That's all I can tell you."

What's wrong with this picture?

Why We Need Explainable AI

Three Critical Questions from Different Stakeholders:

1. Customer Perspective: "Why was I rejected?"

Sarah needs to know what she can improve. Is it her debt-to-income ratio? Her business age? Without explanation, she can't take action to improve her chances next time.

2. Manager Perspective: "Can I trust this decision?"

The bank manager sees profitable applicants being rejected. Without understanding the model's logic, she can't verify if it's making sound business decisions or if it needs adjustment.

3. Regulator Perspective: "Is this fair and legal?"

Financial regulators must ensure the AI isn't discriminating based on protected characteristics. A "black box" makes compliance verification extremely difficult.

When Explainability Matters Most

Some AI decisions have serious consequences that demand transparency:

Healthcare 🏥

Example: AI recommends surgery vs. medication

Why explanation matters: Doctors need to understand the reasoning to make informed recommendations to patients

Hiring Decisions 💼

Example: AI screens job applications

Why explanation matters: Must ensure decisions aren't based on biased factors like age, gender, or ethnicity

Credit Approval 💳

Example: AI approves/denies loans

Why explanation matters: Legal requirement in many jurisdictions; customers have a right to know

Criminal Justice ⚖️

Example: AI predicts recidivism risk

Why explanation matters: Decisions affect people's freedom and must be justifiable in court

Two Key Concepts: Not the Same Thing

Interpretability

Understanding HOW the model works overall

Analogy: Seeing the Recipe

You can see all the ingredients and steps needed to bake a cake. You understand the entire process from start to finish.

Example: A simple scoring formula where you can see exactly how credit score, income, and debt combine to produce a risk score.

Explainability

Understanding WHY a specific decision was made

Analogy: Explaining the Outcome

You might not know the full recipe, but you can explain why THIS cake came out dense: "We used whole wheat flour instead of all-purpose flour."

Example: For Sarah's rejected loan, explaining "Your debt-to-income ratio of 45% was the primary factor" even if we can't show the full model.

Quick Check: Test Your Understanding

A hospital uses an AI system to predict which patients are at high risk of readmission. The doctor asks: "For this specific patient, what factors made the AI flag them as high risk?"

Is the doctor asking for interpretability or explainability?
A) Interpretability - understanding how the model works overall
B) Explainability - understanding why this specific decision was made
C) Neither - the doctor is asking about model accuracy

White-Box vs. Black-Box Models

White-Box Models: The Glass Box 🔍

You can see inside and understand exactly how inputs become outputs.

Examples:

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Simple Rule-Based Systems

Advantage: Complete transparency - you can explain both HOW and WHY

Limitation: May not capture complex patterns

Black-Box Models: The Locked Box 🔒

Internal workings are complex and opaque. You see inputs and outputs but not the process.

Examples:

  • Neural Networks (Deep Learning)
  • Random Forests
  • Gradient Boosting Machines
  • Support Vector Machines

Advantage: Can capture very complex patterns and are often more accurate

Limitation: Difficult to understand and explain decisions

White-Box Example: Linear Regression

The Restaurant Bill Analogy

Imagine calculating a restaurant bill. The formula is simple and transparent:

Total Bill = $15 (base) + ($25 × number of people) + ($8 × appetizers) + ($6 × desserts)

Example Calculation:

  • 4 people having dinner
  • 2 orders of appetizers
  • 3 desserts

Total = $15 + (4 × $25) + (2 × $8) + (3 × $6) = $149

Why is this interpretable?

You can explain EXACTLY why the bill is $149. Each person adds $25, each appetizer adds $8, etc. There's no mystery!

Linear Regression: Real Business Example

Predicting House Prices in Iowa

Price = $50,000 + ($120 × Square Feet) + ($15,000 × Bedrooms) + ($8,000 × Garage Spaces)

Example: Predict price for a house with:

  • 1,500 square feet
  • 3 bedrooms
  • 2 garage spaces

Price = $50,000 + (1,500 × $120) + (3 × $15,000) + (2 × $8,000) = $291,000

Feature Contributions (Impacts)

  • Base value: $50,000
  • Square footage: +$180,000 ↑ (largest impact)
  • Bedrooms: +$45,000 ↑
  • Garage: +$16,000 ↑

We can see EXACTLY how each feature contributes to the final price!
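
The same breakdown can be reproduced in a few lines of Python. This is a minimal sketch that hard-codes the coefficients from the formula above; the function and variable names are illustrative, not part of any library.

```python
# Coefficients copied from the house-price formula above (dollars per unit).
INTERCEPT = 50_000
COEFFICIENTS = {"square_feet": 120, "bedrooms": 15_000, "garage_spaces": 8_000}


def explain_price(house):
    """Return (predicted price, per-feature dollar contributions)."""
    contributions = {name: COEFFICIENTS[name] * house[name] for name in COEFFICIENTS}
    price = INTERCEPT + sum(contributions.values())
    return price, contributions


house = {"square_feet": 1_500, "bedrooms": 3, "garage_spaces": 2}
price, contributions = explain_price(house)

print(f"Base value: ${INTERCEPT:,}")
for name, value in contributions.items():
    print(f"{name}: +${value:,}")
print(f"Predicted price: ${price:,}")
```

Because the model is a plain sum, the contributions always add up to the prediction. That is what makes a linear model a white box.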

Logistic Regression: The Scoring System

From Points to Probability

Logistic regression is like a scoring system that converts points into a 0-100% probability.

Loan Approval Example:

Calculate a score based on applicant features:

Score = -5 + (0.01 × Credit Score) + (0.00002 × Annual Income) - (2 × Debt-to-Income Ratio)

Then convert score to probability:

If Score = 2.5, then:
Probability of Approval = 1 / (1 + e^(-2.5)) ≈ 92%

Still Interpretable!

We can see how each feature (credit score, income, debt ratio) contributes to the final probability.
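
The two steps (score, then probability) can be sketched in Python. The coefficients come from the formula above. Treating the debt-to-income ratio as a fraction (0.30 = 30%) and the example applicant are assumptions made only for illustration.

```python
import math


def approval_score(credit_score, annual_income, debt_to_income):
    """Linear score from the formula above; debt_to_income is a fraction (assumed)."""
    return -5 + 0.01 * credit_score + 0.00002 * annual_income - 2 * debt_to_income


def score_to_probability(score):
    """Logistic (sigmoid) function: maps any score to a value between 0 and 1."""
    return 1 / (1 + math.exp(-score))


probability = score_to_probability(2.5)
print(f"Score 2.5 -> approval probability {probability:.0%}")

# Hypothetical applicant, used only to show the two steps chained together.
example_score = approval_score(credit_score=700, annual_income=60_000, debt_to_income=0.30)
example_probability = score_to_probability(example_score)
print(f"Example score {example_score:.2f} -> {example_probability:.0%}")
```

The score can be any number, positive or negative. The sigmoid squeezes it into the 0-100% range.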

The White-Box Limitation

Problem: Feature Dependencies

Linear models assume each feature contributes independently. But real-world features often interact!

House Price Example:

The model says:

  • Each sq ft of basement adds $100
  • Each sq ft of first floor adds $120

But in reality: A house with a huge basement (2,000 sq ft) but tiny first floor (800 sq ft) is oddly shaped and less valuable than these numbers suggest!

The features interact - you can't just add them up independently!

This is where more complex (black-box) models can capture patterns that simple linear models miss.

Quick Check: White-Box vs Black-Box

A bank wants to predict loan default risk. They need to:
  • Explain every decision to regulators
  • Handle only 5 features (income, credit score, employment years, debt ratio, loan amount)
  • Provide clear reasoning to customers
Which type of model is most appropriate?
A) White-box model (e.g., Logistic Regression) - transparency is required and the problem is simple
B) Black-box model (e.g., Neural Network) - need maximum accuracy
C) Either would work equally well

The Fundamental Trade-off

Key Insight

More complex models (black-box) can capture intricate patterns and achieve higher accuracy, but become harder to interpret. Simpler models (white-box) are easy to understand but may miss complex relationships.

The choice depends on your business priorities: accuracy vs. transparency.

Choosing White-Box vs. Black-Box

Situation | Recommended Approach | Reasoning
Simple problem (2-3 features) | White-Box | Linear relationships are likely sufficient; transparency is valuable
High regulatory requirements (banking, healthcare) | White-Box | Must be able to explain and justify every decision
Complex problem (100+ features with interactions) | Black-Box + Explanation Tools | Need complex model for accuracy, use SHAP/LIME for explanations
Image/text data (computer vision, NLP) | Black-Box (Deep Learning) | Linear models can't handle these data types effectively
Internal analytics (no external stakeholders) | Either | Choose based on accuracy needs vs. debugging convenience

Explaining Black-Box Models: Partial Dependence Plots

The "What If" Tool

Business Question: "How does credit score affect loan approval probability, holding everything else constant?"

How Partial Dependence Works:

Step 1: Take 1,000 loan applications from your dataset
Step 2: Change ONLY the credit score for all 1,000 (try 600, 650, 700, 750, etc.)
Step 3: Run the model on each modified dataset
Step 4: Average the approval probabilities at each credit score
Step 5: Plot the results to see the relationship

What This Tells You:

"On average, increasing credit score from 650 to 750 increases approval probability by 25 percentage points."

You've isolated the effect of ONE feature in your black-box model!
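
The five steps above can be sketched in plain Python. Everything here is an assumption made for illustration: `black_box_model` is a hypothetical stand-in for a trained model, and the 1,000 applications are randomly generated rather than real data.

```python
import math
import random


def black_box_model(applicant):
    """Hypothetical stand-in for a trained model; returns approval probability."""
    score = (
        -9.0
        + 0.012 * applicant["credit_score"]
        + 0.00002 * applicant["income"]
        - 3.0 * applicant["dti"]
    )
    return 1 / (1 + math.exp(-score))


def partial_dependence(model, dataset, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    averages = []
    for value in grid:
        modified = [{**row, feature: value} for row in dataset]  # Step 2
        predictions = [model(row) for row in modified]           # Step 3
        averages.append(sum(predictions) / len(predictions))     # Step 4
    return averages


# Step 1: 1,000 synthetic applications (randomly generated, not real data).
rng = random.Random(0)
applications = [
    {
        "credit_score": rng.randint(550, 820),
        "income": rng.randint(25_000, 150_000),
        "dti": rng.uniform(0.05, 0.60),
    }
    for _ in range(1_000)
]

grid = [600, 650, 700, 750, 800]
pd_values = partial_dependence(black_box_model, applications, "credit_score", grid)

# Step 5: print the points that would be plotted.
for score, avg in zip(grid, pd_values):
    print(f"credit score {score}: average approval probability {avg:.1%}")
```

Note that the model is only ever called, never opened up. That is why the technique works on any black box.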

Partial Dependence Plot: Visual Example

Reading This Plot:

  • X-axis: Credit score values (from 600 to 800)
  • Y-axis: Average predicted approval probability
  • Interpretation: The upward slope shows that higher credit scores strongly increase approval chances
  • Business Insight: Credit score is an important factor in the model's decisions

SHAP Values: Fair Credit Attribution

The Team Project Analogy

Imagine a group project where three students work together:

  • Working alone, each student would score 60/100
  • Working together, they score 85/100
  • Question: How much did each student contribute to the improvement?

SHAP Calculates Fair Contribution:

Try all possible team combinations:

  • Student A alone: 60
  • Students A + B: 72
  • Students A + C: 70
  • Students B + C: 68
  • All three A + B + C: 85

SHAP averages across ALL these combinations to determine each student's fair share of the +25 point improvement.
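
The averaging step can be sketched as an exact Shapley value calculation. This is plain Python, not the `shap` library. It assumes an empty team scores 0 and that B and C alone each score 60 (from "each student would score 60/100"). Under those assumptions the shares add up to the full 85-point team score, and their ranking shows who contributed most.

```python
from itertools import permutations

# Team scores from the analogy. The empty team scoring 0 is an assumption
# added here; the solo scores for B and C follow "each student would score 60".
TEAM_SCORE = {
    frozenset(): 0,
    frozenset("A"): 60,
    frozenset("B"): 60,
    frozenset("C"): 60,
    frozenset("AB"): 72,
    frozenset("AC"): 70,
    frozenset("BC"): 68,
    frozenset("ABC"): 85,
}


def shapley_values(players, value):
    """Average each player's marginal contribution over every joining order."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        team = frozenset()
        for player in order:
            with_player = team | {player}
            totals[player] += value[with_player] - value[team]
            team = with_player
    return {p: total / len(orderings) for p, total in totals.items()}


shares = shapley_values("ABC", TEAM_SCORE)
for student, share in shares.items():
    print(f"Student {student}: {share:.2f} points")
print(f"Sum of shares: {sum(shares.values()):.2f}")
```

With three students there are only 6 joining orders. With dozens of features the number of orders explodes, which is why SHAP implementations rely on approximations or model-specific shortcuts.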

SHAP Values in Loan Prediction

Explaining Sarah's Rejected Loan

The model's baseline approval rate (average across all applicants): 65%
Sarah's predicted approval probability: 32%

Change from baseline = 32% - 65% = -33%

SHAP Shows Feature Contributions:

  • Credit Score (720): +8% (helps approval)
  • Annual Income ($45K): -15% (hurts approval)
  • Debt-to-Income Ratio (45%): -20% (hurts approval)
  • Business Age (3 years): -6% (hurts approval)

Total: +8% - 15% - 20% - 6% = -33% ✓

Primary issue: Debt-to-income ratio (-20%). This is what Sarah should focus on improving!
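
The key property here is additivity: baseline plus contributions equals the prediction. A short check in Python, using only the numbers from this slide (in percentage points), might look like this. The variable names are illustrative.

```python
# Values copied from the slide, in percentage points.
baseline = 65  # average approval probability across all applicants
shap_contributions = {
    "Credit Score (720)": +8,
    "Annual Income ($45K)": -15,
    "Debt-to-Income Ratio (45%)": -20,
    "Business Age (3 years)": -6,
}

# Additivity: baseline plus all contributions reproduces the prediction.
prediction = baseline + sum(shap_contributions.values())

# Rank features from most harmful to most helpful.
ranked = sorted(shap_contributions.items(), key=lambda item: item[1])

print(f"Baseline: {baseline}%")
for feature, contribution in ranked:
    print(f"  {feature}: {contribution:+d} points")
print(f"Prediction: {prediction}%")
print(f"Primary issue: {ranked[0][0]}")
```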

LIME: Local Interpretable Model-Agnostic Explanations

The Restaurant Menu Analogy

Imagine a restaurant with 200 menu items. Understanding how the chef prices EVERY dish is complex. But you just want to understand why YOUR specific order of pasta costs $18.

LIME's Approach: "Zoom in" on just the pasta dish and nearby similar dishes (other pasta, similar ingredients). Build a simple model for JUST THAT area of the menu.

LIME in 5 Steps:

1. Take Sarah's loan application (the instance we want to explain)
2. Create 1,000 "similar" applications by slightly changing Sarah's values
3. Run the complex model on all 1,000 variations
4. Build a simple linear model that mimics the complex model's behavior for these similar cases, giving the most weight to the variations closest to Sarah's
5. Use the simple model to explain why Sarah was rejected
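
The five steps can be sketched from scratch in plain Python. This illustrates the idea and is not the `lime` package, which adds refinements such as feature discretization. `black_box_model`, the `scales` values, and the Gaussian proximity weights are all assumptions made for this example. Sarah's feature values are taken from the SHAP slide.

```python
import math
import random


def black_box_model(x):
    """Hypothetical complex model: approval probability for
    x = [credit score, income in $K, debt-to-income in %]."""
    credit, income_k, dti_pct = x
    score = -5.2 + 0.01 * credit + 0.02 * income_k - 0.08 * dti_pct
    score -= 0.002 * max(dti_pct - 40, 0) ** 2  # extra penalty for high debt
    return 1 / (1 + math.exp(-score))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lime_explain(model, instance, scales, n_samples=1000, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [rng.gauss(0, 1) for _ in instance]  # perturbation in "scale" units
        neighbor = [v + s * d for v, s, d in zip(instance, scales, z)]  # step 2
        rows.append([1.0] + z)  # leading 1.0 is the intercept column
        targets.append(model(neighbor))  # step 3
        weights.append(math.exp(-sum(d * d for d in z) / 2))  # closer = heavier
    k = len(instance) + 1
    # Step 4: weighted least squares via the normal equations.
    xtwx = [
        [sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
        for i in range(k)
    ]
    xtwy = [
        sum(w * r[i] * t for r, w, t in zip(rows, weights, targets))
        for i in range(k)
    ]
    return solve(xtwx, xtwy)[1:]  # drop the intercept, keep feature slopes


# Step 1: the instance to explain (values from Sarah's SHAP slide).
sarah = [720, 45, 45]
scales = [30, 10, 5]  # assumed size of a "slight change" per feature
names = ["credit_score", "income_k", "dti_pct"]

effects = lime_explain(black_box_model, sarah, scales)

# Step 5: read the explanation off the simple model's slopes.
for name, effect in sorted(zip(names, effects), key=lambda pair: pair[1]):
    print(f"{name}: {effect:+.3f} change in probability per scale step")
```

The slopes describe the model only near Sarah's application. A different applicant would get a different local model, which is the "zoom in" idea from the menu analogy.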

SHAP vs. LIME: When to Use Which?

Aspect | SHAP | LIME
What it explains | Fair contribution of each feature | Local approximation with simple model
Computation time | Slower (tries all combinations) | Faster (approximates locally)
Consistency | Same explanation every time | Can vary slightly between runs
Theoretical foundation | Game theory (Shapley values) | Local approximation
Best for | High-stakes decisions needing precise attribution | Quick explanations, exploring many instances
Example use case | Explaining denied loan to regulator | Internal model debugging and validation

In practice: Both are valuable tools! SHAP for precision, LIME for speed and ease of use.

Quick Check: Explanation Techniques

You've built a complex neural network for fraud detection. A transaction is flagged as fraudulent and you need to explain why to the merchant (quickly, for one specific transaction). Which technique is most appropriate?
A) Partial Dependence Plot - shows overall feature effects
B) LIME - fast local explanation for this specific instance
C) Rebuild the model as logistic regression for interpretability

Case Study: LendingClub Loan Prediction

The Business Problem

LendingClub processes 10,000+ loan applications per month. They need to:

  • Approve good borrowers who will repay (capture revenue)
  • Reject bad borrowers who will default (avoid losses)
  • Explain decisions to applicants and regulators

The Data

Dataset Statistics

  • Total loans: 12,290
  • Good loans: 9,704 (79%)
  • Defaulted loans: 2,586 (21%)

Key Features (Simplified)

  • Home ownership status
  • Annual income
  • Debt-to-income ratio
  • Credit score

LendingClub: What the Model Tells Us

Business Insights from Logistic Regression

What increases loan approval chances?

🏠 Home Ownership:
Owning a home → about 3× the approval odds
Interpretation: Homeowners are seen as more stable and creditworthy

💰 Annual Income:
Each additional $10,000 → about 5% higher approval odds
Interpretation: Higher income provides greater repayment capacity

📊 Debt-to-Income Ratio:
Each 10-point increase → approval odds cut by about 75%
Interpretation: Higher existing debt obligations increase risk

⭐ Credit Score:
100-point increase → more than doubles the approval odds
Interpretation: Credit score is a strong positive predictor

The Mathematical Model (For Reference)

Now that we understand the business meaning, here's the actual logistic regression formula:

log(odds) = β₀ + β₁(Home_Own) + β₂(Income) + β₃(DTI) + β₄(Credit_Score)

Fitted Coefficients:

β₀ (intercept) = -8.2
β₁ (home ownership) = 1.1 → exp(1.1) = 3.0 times better odds
β₂ (income per $10K) = 0.05 → exp(0.05) = 1.05 times better odds
β₃ (debt-to-income per 10%) = -1.4 → exp(-1.4) = 0.25 times the odds
β₄ (credit score per 100 pts) = 0.77 → exp(0.77) = 2.16 times better odds

Key Point: Notice how we presented business meaning FIRST, then the formula. This aids understanding!
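
The step from coefficient to business statement is a single exponentiation. The sketch below uses the fitted coefficients listed above; the dictionary labels are illustrative. The percent change applies to the odds, which is not the same as a percent change in approval probability.

```python
import math

# Fitted coefficients copied from the slide (log-odds per stated unit).
coefficients = {
    "home ownership (yes vs. no)": 1.1,
    "income (per $10K)": 0.05,
    "debt-to-income (per 10 points)": -1.4,
    "credit score (per 100 points)": 0.77,
}

# exp(coefficient) is the odds ratio: the factor the odds are multiplied by.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    change = (ratio - 1) * 100  # percent change in the odds, not the probability
    print(f"{name}: odds x {ratio:.2f} ({change:+.0f}%)")
```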

Evaluating Model Performance: The Confusion Matrix

A Medical Analogy: Cancer Detection

A doctor tests 100 patients for cancer. 30 actually have cancer, 70 don't.

Actual Reality | Doctor's Prediction: Cancer | Doctor's Prediction: No Cancer
Actually has cancer (30) | 25 - True Positive (TP) ✅ Correctly detected | 5 - False Negative (FN) ❌ Missed cancer!
Actually healthy (70) | 8 - False Positive (FP) ❌ False alarm | 62 - True Negative (TN) ✅ Correctly cleared

Confusion Matrix: Loan Approval

LendingClub Model Performance

Actual Outcome | Model's Prediction: Good Loan | Model's Prediction: Will Default
Good loan (9,704) | 8,200 - True Positive ✅ Approved good customer | 1,504 - False Negative ❌ Rejected good customer 💰 Lost revenue!
Default (2,586) | 415 - False Positive ❌ Approved bad customer 💰 Will lose money! | 2,171 - True Negative ✅ Correctly rejected

Business Question: Which error is more costly?

  • False Negative (FN): Rejecting good customers → Lost interest revenue (~$3,000 per loan)
  • False Positive (FP): Approving bad customers → Lose principal (~$15,000 per default)

FP is ~5x more costly than FN! Our model should err on the side of caution.
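
To turn the per-error costs into a total, multiply each by its count from the confusion matrix. This is a rough sketch using the slide's approximate figures. Real costs vary by loan, so treat the output as an order-of-magnitude estimate.

```python
# Error counts from the confusion matrix and per-error costs from the slide.
false_negatives = 1_504   # good customers rejected
false_positives = 415     # bad customers approved
cost_per_fn = 3_000       # lost interest revenue per rejected good loan
cost_per_fp = 15_000      # lost principal per approved default

fn_cost = false_negatives * cost_per_fn
fp_cost = false_positives * cost_per_fp
total_cost = fn_cost + fp_cost

print(f"Cost of rejected good customers: ${fn_cost:,}")
print(f"Cost of approved defaults:       ${fp_cost:,}")
print(f"Total cost of errors:            ${total_cost:,}")
print(f"Cost ratio per error (FP / FN):  {cost_per_fp / cost_per_fn:.0f}x")
```

Comparing the two subtotals shows whether the rarer but more expensive error, or the more frequent but cheaper one, dominates the overall loss.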

Performance Metrics: Beyond Accuracy

Understanding Key Metrics Through Business Questions

1. Accuracy: "How often is the model right overall?"

Accuracy = (TP + TN) / Total = (8,200 + 2,171) / 12,290 = 84.4%

⚠️ Warning: Can be misleading with imbalanced data! If 95% of loans are good, predicting "all good" gives 95% accuracy but is useless.

2. Recall (True Positive Rate): "Of the good loans, how many did we approve?"

Recall = TP / (TP + FN) = 8,200 / 9,704 = 84.5%

Business meaning: We're capturing 84.5% of revenue opportunities. Missing 15.5%.

3. Specificity (True Negative Rate): "Of the bad loans, how many did we correctly reject?"

Specificity = TN / (TN + FP) = 2,171 / 2,586 = 84.0%

Business meaning: We're avoiding 84% of potential defaults. 16% slip through.

More Performance Metrics

4. Precision: "Of the loans we approved, how many are actually good?"

Precision = TP / (TP + FP) = 8,200 / (8,200 + 415) = 95.2%

Business meaning: When we approve a loan, we're right 95.2% of the time. High confidence in approvals!

5. F-Score: "Balance between catching good loans and avoiding bad ones"

F-Score = 2 × (Precision × Recall) / (Precision + Recall) = 2 × (0.952 × 0.845) / (0.952 + 0.845) = 89.5%

Business meaning: Harmonic mean of precision and recall. Useful when you want a single score that balances both.
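
All five metrics follow from the four cells of the LendingClub confusion matrix. A minimal Python sketch, with "good loan" as the positive class as in the slides:

```python
# Counts from the LendingClub confusion matrix (positive class = good loan).
TP, FN = 8_200, 1_504   # good loans: approved / rejected
FP, TN = 415, 2_171     # defaults: approved / rejected

total = TP + FN + FP + TN
accuracy = (TP + TN) / total
recall = TP / (TP + FN)          # share of good loans we approved
specificity = TN / (TN + FP)     # share of bad loans we rejected
precision = TP / (TP + FP)       # share of approvals that were good
f_score = 2 * precision * recall / (precision + recall)

metrics = {
    "Accuracy": accuracy,
    "Recall": recall,
    "Specificity": specificity,
    "Precision": precision,
    "F-Score": f_score,
}
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```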

Which Metric Matters Most?

It depends on business priorities:

  • Growth phase: Maximize Recall (approve more good customers)
  • Risk-averse phase: Maximize Precision (avoid defaults at all costs)
  • Balanced approach: Optimize F-Score

Adjusting the Decision Threshold

The Probability Decision Point

Logistic regression gives us probabilities (0-100%). We need to choose a cutoff threshold (Z) to make binary approve/reject decisions.

Scenario 1: Conservative (Z = 0.85)

Approve only if probability > 85%

Result: Very few defaults (high precision) but miss many good customers (low recall)

Risk: Business stagnation from lost customers

Scenario 2: Balanced (Z = 0.75)

Approve if probability > 75%

Result: Moderate defaults, good customer capture

Sweet spot: Sustainable growth with acceptable risk

Scenario 3: Aggressive (Z = 0.50)

Approve if probability > 50%

Result: Capture most good customers but many defaults too

Risk: Bankruptcy from excessive defaults
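
The effect of moving the threshold can be seen on a small example. The 15 scored applications below are made up for illustration (they are not LendingClub data). The three values of Z match the scenarios above.

```python
# Hypothetical scored applications:
# (predicted probability of a good loan, whether the loan was actually good).
scored = [
    (0.95, True), (0.91, True), (0.88, True), (0.86, True), (0.82, True),
    (0.79, False), (0.77, True), (0.72, True), (0.68, False), (0.61, True),
    (0.58, False), (0.55, True), (0.47, False), (0.35, False), (0.20, False),
]


def precision_recall(scored, threshold):
    """Approve when probability > threshold; positive class = good loan."""
    tp = sum(1 for p, good in scored if p > threshold and good)
    fp = sum(1 for p, good in scored if p > threshold and not good)
    fn = sum(1 for p, good in scored if p <= threshold and good)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


results = {z: precision_recall(scored, z) for z in (0.85, 0.75, 0.50)}
for z, (precision, recall) in results.items():
    print(f"Z = {z:.2f}: precision {precision:.0%}, recall {recall:.0%}")
```

Lowering Z trades precision for recall. The model itself never changes, only the cutoff applied to its probabilities.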

ROC Curve: Visualizing the Trade-off

What is ROC?

The Receiver Operating Characteristic (ROC) curve shows the True Positive Rate (Recall) vs. the False Positive Rate at different thresholds.

X-axis: False Positive Rate (bad loans approved)

Y-axis: True Positive Rate (good loans approved)

AUC Score

Area Under the Curve = 0.66

  • 0.5 = Random guessing (diagonal line)
  • 1.0 = Perfect prediction
  • 0.66 = Modest: better than random, with plenty of room for improvement
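
An ROC curve is built by sweeping the threshold from high to low and recording one (FPR, TPR) point per step. The sketch below uses ten made-up scored loans and assumes no two loans share the same score. The AUC it prints belongs to this toy data, not to the LendingClub model.

```python
def roc_points(scored):
    """Return (FPR, TPR) points obtained by lowering the threshold one score at a time.

    Assumes all scores are distinct; tied scores would need to be grouped.
    """
    positives = sum(1 for _, good in scored if good)
    negatives = len(scored) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, good in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if good:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical data: (predicted probability of a good loan, actually good?).
scored = [
    (0.90, True), (0.80, True), (0.70, False), (0.60, True), (0.55, True),
    (0.50, False), (0.40, True), (0.30, False), (0.20, False), (0.10, False),
]

points = roc_points(scored)
area = auc(points)
print(f"AUC on the toy data: {area:.2f}")
```

Another way to read AUC: it is the chance that a randomly chosen good loan gets a higher score than a randomly chosen bad one.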

Quick Check: Metrics Understanding

A bank's loan model has:
  • Precision = 98%
  • Recall = 45%

What does this tell you about the bank's strategy?
A) Very conservative - they approve few loans, but almost all are good (low default risk but missing revenue)
B) Very aggressive - they approve many loans, including risky ones
C) Balanced approach - good trade-off between risk and revenue

Activity 1: Concept Check (5 minutes)

Reflect and Discuss

Take a moment to answer these questions (write notes, discuss with a neighbor):

Question 1: In your own words, explain the difference between interpretability and explainability. Give a business example of each.
Question 2: Your company wants to predict customer churn. You have 50 features. Would you start with a white-box or black-box model? Why?
Question 3: For the LendingClub case, which is worse from a business perspective: rejecting good customers (FN) or approving bad ones (FP)? Justify your answer.
Question 4: Can you think of a domain where explainability is legally required? What about a domain where accuracy matters more than explainability?

Activity 2: Orange Data Mining Demo (15 minutes)

Exploring Model Explanations with Orange

Setup (Before We Start):

  1. Open Orange Data Mining
  2. Load the LendingClub dataset (will be provided)
  3. Have the "Explain Model" widget ready

Demonstration Workflow:

Step 1: Build a Logistic Regression model for loan prediction
Step 2: Use "Explain Model" widget to generate feature importance
Step 3: Select specific instances and view SHAP/LIME explanations
Step 4: Create Partial Dependence Plots for key features
Step 5: Compare explanations across different model types

Activity 2: Questions to Consider During Demo

1. Feature Importance: Which features have the strongest impact on loan approval? Does this match our expectations from the business context?
2. Individual Explanations: Pick a rejected loan. Can you identify the top 2 reasons for rejection using SHAP values?
3. Partial Dependence: How does the approval probability change as credit score increases? Is the relationship linear or non-linear?
4. Model Comparison: Compare logistic regression explanations with decision tree explanations. Which is easier to interpret?
5. Actionable Insights: If you were a loan officer, what advice would you give to rejected applicants based on the model explanations?

Resource: Orange Explain Model widget documentation at
orangedatamining.com/widget-catalog/explain/explain-model/

Activity 3: Python Exploration (Optional, 10 minutes)

For Those Comfortable with Python

Try this Google Colab notebook for hands-on SHAP/LIME practice:

📓 Colab Notebook:
colab.research.google.com

(Notebook link will be provided via Canvas)

What You'll Learn:

  • Install and use the SHAP library in Python
  • Generate waterfall plots showing feature contributions
  • Create LIME explanations with visualizations
  • Build interactive Partial Dependence Plots

Note: Focus on interpreting the outputs rather than the code syntax. The goal is understanding explanations, not becoming a Python expert!

Preparing for Assessment 3

Assessment Overview

Build a predictive model that:

  • Solves a real business problem from your domain
  • Demonstrates appropriate evaluation metrics
  • Includes interpretability/explainability techniques
  • Explains business value and ROI

4-Step Readiness Checklist:

✅ Step 1: Topic Selection
Choose a prediction problem from your work/interest area. Must involve a decision that can be explained (credit, hiring, diagnosis, pricing, etc.)
✅ Step 2: Data Availability
Need 500-1,000 examples minimum with features (inputs) and target (output). Can use public datasets if no proprietary data available.
✅ Step 3: Tool Selection
Decide between Orange (easier, visual) or Python (more flexible). Both can handle interpretability techniques.
✅ Step 4: Interpretability Plan
How will you explain your model? Feature importance? SHAP? LIME? Plan which technique fits your problem best.

Interpretability Checklist for Your Project

1. Business Problem First

Define the decision being made and why it matters. Who needs to understand the model's decisions (managers, customers, regulators)?

2. Model Transparency Level

Did you choose a white-box or black-box model? Justify this choice based on your business context. If black-box, which explanation techniques will you use?

3. Feature Importance

Show which features are most important overall. Use feature importance scores, SHAP summary plots, or coefficients (for linear models).

4. Individual Decision Explanations

Pick 2-3 specific examples and explain WHY the model made those predictions. Show how features contributed to each decision.

5. Evaluation Metrics Choice

Justify your metric selection based on business costs. Is FP or FN more costly? Did you adjust the decision threshold accordingly?

6. Ethical Considerations

Address potential biases. Could your model discriminate unfairly? How did you check for this?

Week 11 Key Takeaways

1. Interpretability ≠ Explainability

Interpretability is understanding HOW a model works overall. Explainability is understanding WHY a specific decision was made. Both are valuable in different contexts.

2. White-Box vs. Black-Box Trade-off

Simple, transparent models (white-box) are easy to explain but may miss complex patterns. Complex models (black-box) can be more accurate but require explanation techniques like SHAP, LIME, or Partial Dependence Plots.

3. Business Context Drives Metric Choice

Accuracy isn't everything! Choose metrics (precision, recall, F-score) based on the relative costs of different error types in your business problem.

4. Explanation Techniques Are Tools, Not Solutions

SHAP, LIME, and PDPs help explain black-box models but don't make biased models fair or bad models good. They're diagnostic tools that reveal how models make decisions.

5. Explainability Enables Trust and Action

In high-stakes domains, being able to explain WHY a decision was made is not just nice to have: it's essential for regulatory compliance, customer trust, and model debugging.

Resources & Next Steps

📚 Further Learning

Practice Datasets:

  • LendingClub Loan Data (on Canvas)
  • UCI Machine Learning Repository (many business datasets)
  • Kaggle (competitions with real-world problems)

Next Week Preview:

We'll explore Time Series Forecasting - predicting future values based on historical patterns. Topics include seasonality, trends, and ARIMA models.

Questions? Office hours or post on the discussion forum!

Thank You!

Remember: The best model is one you can explain and trust.

Next Steps:

1. Complete the Orange hands-on exercises
2. Start planning your Assessment 3 interpretability approach
3. Review key concepts using the practice quiz on Canvas