By the end of this workshop, you will be able to:
Naïve Bayes is a probabilistic classifier that predicts the most likely class from the observed features, treating each feature as independent of the others given the class.
Naïve Bayes is built on a simple question: What is the probability of event A given that event B has occurred?
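Written as a formula, that question is Bayes' theorem. The second line is a sketch of how the "naïve" independence assumption extends it to several features; the symbols $C$ (the class) and $x_1, \dots, x_n$ (the features) are notation introduced here for illustration and do not come from the workshop material.

$$
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
$$

$$
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C)
$$

The classifier evaluates the right-hand side of the second line for every class and picks the class with the largest value.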
Customer Purchase Prediction Example: Will a customer buy based on their browsing behavior?
Dataset: 1,000 emails (700 legitimate, 300 spam)
| Feature | Spam Emails (300) | Legitimate Emails (700) |
|---|---|---|
| Contains "FREE" | 240 (80%) | 70 (10%) |
| Contains "Meeting" | 30 (10%) | 490 (70%) |
| Unknown Sender | 270 (90%) | 140 (20%) |
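The counts in this table are enough to work a Naïve Bayes prediction by hand. The following is a minimal sketch, assuming the three features are conditionally independent given the class; the function name `posterior_spam` and the dictionary layout are illustrative choices, not part of the workshop.

```python
# Class totals and per-feature counts, copied from the table above.
TOTAL = {"spam": 300, "legit": 700}
COUNTS = {
    "FREE": {"spam": 240, "legit": 70},
    "Meeting": {"spam": 30, "legit": 490},
    "Unknown Sender": {"spam": 270, "legit": 140},
}


def posterior_spam(present):
    """Return P(spam | features) for a dict mapping feature name -> bool.

    Assumes the features are independent given the class (the "naive" part).
    """
    n_emails = sum(TOTAL.values())
    scores = {}
    for cls, cls_total in TOTAL.items():
        score = cls_total / n_emails  # prior P(class)
        for feature, is_present in present.items():
            p = COUNTS[feature][cls] / cls_total  # P(feature | class)
            score *= p if is_present else (1 - p)
        scores[cls] = score
    # Normalise so the two class scores sum to 1.
    return scores["spam"] / (scores["spam"] + scores["legit"])


suspicious = {"FREE": True, "Meeting": False, "Unknown Sender": True}
routine = {"FREE": False, "Meeting": True, "Unknown Sender": False}
print(f"P(spam | suspicious email) = {posterior_spam(suspicious):.3f}")
print(f"P(spam | routine email)    = {posterior_spam(routine):.3f}")
```

An email containing "FREE" from an unknown sender with no mention of "Meeting" comes out at roughly 0.98 probability of spam, while the opposite pattern comes out well below 0.01.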
A Support Vector Machine (SVM) is a classification method that finds the optimal boundary (hyperplane) between classes, defined as the one with the maximum margin of separation.
Customer Segmentation: Spending vs Visit Frequency
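To make "maximum margin" concrete, the sketch below measures the margin of two candidate boundaries on a small customer dataset. The six data points and both candidate lines are hypothetical values invented for illustration, and the code only compares given boundaries; it does not train an SVM.

```python
import math

# Hypothetical customers as (spending, visit frequency), for illustration only.
LOW_VALUE = [(1, 1), (2, 1), (1, 2)]    # label -1
HIGH_VALUE = [(4, 4), (5, 4), (4, 5)]   # label +1


def margin(w, b, negatives, positives):
    """Smallest signed distance from any point to the line w.x + b = 0.

    A negative result means the line misclassifies at least one point.
    """
    norm = math.hypot(w[0], w[1])
    labelled = [(p, -1) for p in negatives] + [(p, 1) for p in positives]
    return min(y * (w[0] * x1 + w[1] * x2 + b) / norm
               for (x1, x2), y in labelled)


# Two candidate boundaries that both separate the classes perfectly.
margin_diagonal = margin((1, 1), -5.5, LOW_VALUE, HIGH_VALUE)  # x1 + x2 = 5.5
margin_vertical = margin((1, 0), -3.0, LOW_VALUE, HIGH_VALUE)  # x1 = 3

print(f"diagonal boundary margin: {margin_diagonal:.3f}")
print(f"vertical boundary margin: {margin_vertical:.3f}")
```

Both lines classify every point correctly, but the diagonal one keeps the nearest customers farther away, which is the property an SVM optimises for.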
Sometimes data cannot be separated by a straight line in its original form.
Data that cannot be separated linearly in its original space can become separable once it is mapped into a higher-dimensional space.
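A one-dimensional toy case shows the idea. In the sketch below, the data values and the explicit feature map x → x² are assumptions chosen for illustration; a real kernel SVM obtains the same effect implicitly through a kernel function rather than by computing the new feature.

```python
def separable_by_threshold(values, labels):
    """True if a single threshold on `values` splits the two classes perfectly."""
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    for t in thresholds:
        for sign in (1, -1):  # try both orientations of the split
            predicted = [1 if sign * (v - t) > 0 else -1 for v in values]
            if predicted == list(labels):
                return True
    return False


# Hypothetical data: the positive class sits between two negative groups.
xs = [-3, -2, -1, 0, 1, 2, 3]
labels = [-1, -1, 1, 1, 1, -1, -1]

# In the original 1-D space no single threshold works.
print("separable on x:  ", separable_by_threshold(xs, labels))

# After adding the feature x**2, one threshold on that axis is enough.
squared = [x * x for x in xs]
print("separable on x^2:", separable_by_threshold(squared, labels))
```

In the lifted space the points lie on a parabola, and a horizontal line through it separates the inner group from the outer ones.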
Gradient Boosting is an ensemble method that builds multiple weak learners sequentially, where each new model focuses on correcting the errors of the previous models.
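The sequential error-correcting loop can be sketched in a few lines. This is a minimal illustration for regression with squared error, using one-split "stumps" as the weak learners; the data, the function names, and the default settings are all assumptions made for the example, not a production implementation.

```python
def fit_stump(xs, residuals):
    """Fit the best single-split regression stump on 1-D inputs."""
    best = None
    distinct = sorted(set(xs))
    for a, b in zip(distinct, distinct[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return a predictor and the training MSE after each round."""
    base = sum(ys) / len(ys)           # start from the mean prediction
    preds = [base] * len(xs)
    stumps, history = [], []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict, history


# Hypothetical step-shaped data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
predict, history = gradient_boost(xs, ys)
print(f"MSE after round 1:  {history[0]:.4f}")
print(f"MSE after round 20: {history[-1]:.4f}")
```

Each stump on its own is a poor model, but because every round fits what the ensemble still gets wrong, the training error shrinks from round to round.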
| Aspect | Random Forest (Bagging) | Gradient Boosting |
|---|---|---|
| Training | Parallel (independent trees) | Sequential (each tree learns from previous errors) |
| Focus | Reduce variance through averaging | Reduce bias by correcting errors |
| Speed | Fast (can parallelize) | Slower (sequential process) |
| Accuracy | Good | Typically higher |
| Overfitting Risk | Lower | Higher (needs careful tuning) |
| Interpretability | Moderate | Lower |
| Criteria | Naïve Bayes | SVM | Gradient Boosting |
|---|---|---|---|
| Training Speed | Very Fast | Slow | Moderate |
| Prediction Speed | Very Fast | Fast | Moderate |
| Typical Accuracy | Good | Very Good | Excellent |
| Interpretability | High | Low | Moderate |
| Handles Non-linearity | No | Yes (with kernels) | Yes |
| Dataset Size | Small to Medium | Small to Medium | Medium to Large |
| Metric | Formula | Example Value |
|---|---|---|
| Accuracy | (TP + TN) / Total | (85 + 10) / 120 = 79.2% |
| Precision | TP / (TP + FP) | 85 / (85 + 15) = 85.0% |
| Recall | TP / (TP + FN) | 85 / (85 + 10) = 89.5% |
| F1-Score | 2 × (Precision × Recall) / (Precision + Recall) | 87.2% |
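The example values in this table can be reproduced from a confusion matrix. The sketch below takes TP = 85, FP = 15, and FN = 10 from the table's formulas; TN = 10 is inferred here from the stated total of 120 and is an assumption. The function name `classification_metrics` is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# TN = 10 is inferred from the total of 120 (120 - 85 - 15 - 10).
metrics = classification_metrics(tp=85, fp=15, fn=10, tn=10)
for name, value in metrics.items():
    print(f"{name:>9}: {value:.1%}")
```

Note that accuracy is the lowest of the four figures here because the assumed negative class is small and mostly misclassified, which is one reason to report precision and recall alongside it.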
A company wants to predict which employees are likely to leave so that it can implement proactive retention strategies.
| Method | Accuracy | Precision | Recall | Training Time |
|---|---|---|---|---|
| Naïve Bayes | 76% | 71% | 68% | 2 seconds |
| SVM (RBF kernel) | 84% | 82% | 79% | 45 seconds |
| Gradient Boosting | 89% | 87% | 86% | 28 seconds |