Workshop: Hands-On Python & Colab Practice

1 Hour Practical Session

What We'll Do

  1. Set up Google Colab
  2. Practice Python fundamentals
  3. Analyze a simple business dataset
  4. Prepare for Week 2's ML content

Workshop Structure

  • Part 1: Colab Setup (10 min)
  • Part 2: Python Practice (25 min)
  • Part 3: Business Data Analysis (20 min)
  • Part 4: Challenge & Wrap-Up (5 min)

Learning Objectives

By the end of this workshop, you'll be able to write Python code, work with data structures, analyze business data, and create visualizations.

Part 1: Colab Setup (10 minutes)

Exercise 1.1: Create Your First Notebook

Instructions

  1. Go to: https://colab.research.google.com
  2. Click: File → New Notebook
  3. Rename: Week_1_Workshop.ipynb
  4. Run your first code cell

First Code Cell

# Cell 1: Introduction
print("=" * 50)
print("DATA5000 - Week 1 Workshop")
print("Student: [Your Name Here]")
print("Date: [Today's Date]")
print("=" * 50)

Run: Press Shift+Enter

Expected Output

==================================================
DATA5000 - Week 1 Workshop
Student: [Your Name Here]
Date: [Today's Date]
==================================================

Exercise 1.2 & 1.3: Add Text Cell & Test Libraries

Exercise 1.2: Add Text Cell

Instructions:

  1. Click: + Text button
  2. Type markdown content
  3. Press Shift+Enter to render

Markdown Content

# Week 1 Workshop: Python Fundamentals
## Learning Objectives
- Practice Python basics
- Work with data structures
- Analyze business data
- Prepare for Week 2
**Status**: In Progress 🚀

Exercise 1.3: Test Python Libraries

# Test that key libraries are available
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
print("✓ NumPy version:", np.__version__)
print("✓ Pandas version:", pd.__version__)
print("✓ Matplotlib version:", matplotlib.__version__)
print("\nAll libraries loaded!")

Checkpoint: Everyone should have a working Colab notebook with these 3 cells

Part 2: Python Practice - Exercise 2.1

Variables & Basic Operations

Task: Customer Order Analysis

# Create variables for a customer order
customer_name = "Alice Johnson"
customer_id = "CUST001"
order_total = 149.99
items_ordered = 3
is_member = True
# Display customer info
print("Customer Information:")
print(f" Name: {customer_name}")
print(f" ID: {customer_id}")
print(f" Order Total: ${order_total}")
print(f" Items: {items_ordered}")
print(f" Member: {is_member}")
# Calculate average item price
avg_item_price = order_total / items_ordered
print(f"\nAvg Item: ${avg_item_price:.2f}")

Apply Member Discount

# Apply member discount (10% for members)
if is_member:
    discount = order_total * 0.10
    final_total = order_total - discount
    print(f"Discount: ${discount:.2f}")
    print(f"Final: ${final_total:.2f}")
else:
    print("No discount applied")

Your Turn: Modify for a non-member with order total of $250.00 and 5 items
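
One possible way to approach this task is sketched below. It wraps the discount logic in a small helper so the same code can be re-run for any customer. The function name `apply_member_discount` and its `rate` parameter are hypothetical and do not come from the workshop materials.

```python
def apply_member_discount(order_total, is_member, rate=0.10):
    """Return (discount, final_total); only members receive the discount."""
    discount = order_total * rate if is_member else 0.0
    return discount, order_total - discount

# Values from the "Your Turn" task: non-member, $250.00, 5 items
order_total = 250.00
items_ordered = 5
is_member = False

discount, final_total = apply_member_discount(order_total, is_member)
avg_item_price = order_total / items_ordered

print(f"Avg Item: ${avg_item_price:.2f}")
if is_member:
    print(f"Discount: ${discount:.2f}")
else:
    print("No discount applied")
print(f"Final: ${final_total:.2f}")
```

Setting `is_member = True` and re-running the cell is a quick check that both branches behave as expected.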

Exercise 2.2: Working with Lists

Sales Analysis

Code: Weekly Sales Data

# Daily sales for the week (in $1000s)
sales = [45.2, 52.3, 48.7, 61.5,
         58.9, 72.1, 69.8]
days = ['Mon', 'Tue', 'Wed', 'Thu',
        'Fri', 'Sat', 'Sun']
# Basic statistics
total_sales = sum(sales)
avg_daily = total_sales / len(sales)
max_sales = max(sales)
min_sales = min(sales)
print("Weekly Sales Report")
print("=" * 40)
print(f"Total: ${total_sales:.2f}k")
print(f"Average: ${avg_daily:.2f}k")
print(f"Best Day: ${max_sales:.2f}k")
print(f"Worst Day: ${min_sales:.2f}k")

Analysis & Insights

# Find best performing day
best_day_index = sales.index(max_sales)
best_day = days[best_day_index]
print(f"\nBest day: {best_day}")

# Calculate Mon-Fri growth
weekday_growth = ((sales[4] - sales[0])
        / sales[0]) * 100
print(f"Mon-Fri: {weekday_growth:.1f}%")

# Find high-performing days (> $60k)
high_days = [days[i] for i, sale
        in enumerate(sales)
        if sale > 60]
print(f"\nHigh days: {', '.join(high_days)}")

Your Turn: Calculate weekend vs. weekday average sales
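
A minimal sketch of one way to do this, assuming the list is ordered Mon to Sun as in the code above, so the first five entries are weekdays and the last two are the weekend. The variable names are illustrative only.

```python
# Daily sales for the week (in $1000s), ordered Mon..Sun
sales = [45.2, 52.3, 48.7, 61.5, 58.9, 72.1, 69.8]

# Slice the list: first five entries are Mon-Fri, last two are Sat-Sun
weekday_sales = sales[:5]
weekend_sales = sales[5:]

weekday_avg = sum(weekday_sales) / len(weekday_sales)
weekend_avg = sum(weekend_sales) / len(weekend_sales)

# Relative difference of the weekend average over the weekday average
weekend_uplift = (weekend_avg - weekday_avg) / weekday_avg * 100

print(f"Weekday avg: ${weekday_avg:.2f}k")
print(f"Weekend avg: ${weekend_avg:.2f}k")
print(f"Weekend uplift: {weekend_uplift:.1f}%")
```

Slicing keeps the code short, but it depends on the list order. Pairing each value with its day name via `zip(days, sales)` would be a more defensive alternative.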

Exercise 2.3: Dictionaries & Functions (Part 1)

Customer Database Structure

# Customer records
customers = [
    {"id": "C001", "name": "Alice",
     "purchases": 5, "total_spent": 1250.00},
    {"id": "C002", "name": "Bob",
     "purchases": 2, "total_spent": 340.00},
    {"id": "C003", "name": "Carol",
     "purchases": 8, "total_spent": 2100.00},
    {"id": "C004", "name": "David",
     "purchases": 3, "total_spent": 780.00},
]

Helper Functions

def categorize_customer(total_spent):
    """Categorize based on spending"""
    if total_spent >= 2000:
        return "Premium"
    elif total_spent >= 1000:
        return "Standard"
    else:
        return "Basic"

def avg_order_value(total_spent,
    num_purchases):
    """Calculate average per purchase"""
    if num_purchases == 0:
        return 0
    return total_spent / num_purchases

Exercise 2.3: Dictionaries & Functions (Part 2)

Analysis Code & Output

print("Customer Analysis Report")
print("=" * 70)
print(f"{'ID':<6} {'Name':<10} "
    f"{'Purchases':<12} {'Total':<12} "
    f"{'Avg Order':<12} {'Tier':<10}")
print("-" * 70)

for customer in customers:
    cust_id = customer['id']
    name = customer['name']
    purchases = customer['purchases']
    total = customer['total_spent']
    avg_order = avg_order_value(total,
        purchases)
    tier = categorize_customer(total)

    # Include the computed average order value and tier in each row
    print(f"{cust_id:<6} {name:<10} "
        f"{purchases:<12} ${total:<11.2f} "
        f"${avg_order:<11.2f} {tier:<10}")

# Summary statistics
total_customers = len(customers)
total_revenue = sum(c['total_spent'] for c in customers)
avg_value = total_revenue / total_customers
print(f"\nTotal Customers: {total_customers}")
print(f"Total Revenue: ${total_revenue:.2f}")
print(f"Avg Value: ${avg_value:.2f}")
# Count by tier
premium = sum(1 for c in customers
              if categorize_customer(c['total_spent']) == "Premium")
# ... similar for standard and basic
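
As an alternative to writing one `sum(...)` per tier, all three counts can be collected in a single pass. The sketch below uses `collections.Counter` from the standard library, which is a technique swapped in here and not the approach shown in the workshop code. It repeats the customer records and `categorize_customer` so the cell runs on its own.

```python
from collections import Counter

customers = [
    {"id": "C001", "name": "Alice", "purchases": 5, "total_spent": 1250.00},
    {"id": "C002", "name": "Bob", "purchases": 2, "total_spent": 340.00},
    {"id": "C003", "name": "Carol", "purchases": 8, "total_spent": 2100.00},
    {"id": "C004", "name": "David", "purchases": 3, "total_spent": 780.00},
]

def categorize_customer(total_spent):
    """Categorize based on spending (same thresholds as above)."""
    if total_spent >= 2000:
        return "Premium"
    elif total_spent >= 1000:
        return "Standard"
    else:
        return "Basic"

# Count how many customers fall into each tier in one pass
tier_counts = Counter(categorize_customer(c["total_spent"]) for c in customers)

for tier in ("Premium", "Standard", "Basic"):
    # A Counter returns 0 for tiers that have no customers
    print(f"{tier}: {tier_counts[tier]}")
```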

Your Turn: Add a new customer | Modify tier thresholds | Add CLV function
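
The workshop does not define a CLV (customer lifetime value) formula, so the sketch below is only one simple possibility: average order value × purchases per year × expected years as a customer. It assumes the recorded purchases cover one year and uses a hypothetical default lifetime of 3 years. The function name `estimate_clv` and the parameter `expected_years` are illustrative.

```python
def avg_order_value(total_spent, num_purchases):
    """Calculate average per purchase (same helper as above)."""
    if num_purchases == 0:
        return 0
    return total_spent / num_purchases

def estimate_clv(total_spent, num_purchases, expected_years=3):
    """Hypothetical CLV: avg order value x purchases per year x years.

    Assumption: the recorded purchases span exactly one year.
    """
    purchases_per_year = num_purchases
    return (avg_order_value(total_spent, num_purchases)
            * purchases_per_year * expected_years)

# Example using Alice's record from the customer list
alice_clv = estimate_clv(total_spent=1250.00, num_purchases=5)
print(f"Estimated CLV: ${alice_clv:,.2f}")
```

Changing `expected_years` shows how sensitive the estimate is to the assumed customer lifetime.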

Part 3: Business Data Analysis

Exercise 3.1: Load and Explore Dataset

Create Sample Dataset

import pandas as pd
import numpy as np
# Create sample data
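data = {
    'Date': ['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04', '2024-01-05'],
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Laptop'],
    'Category': ['Electronics', 'Accessories', 'Accessories', 'Electronics', 'Electronics'],
    'Units_Sold': [5, 25, 15, 8, 3],
    'Unit_Price': [999.99, 29.99, 79.99, 399.99, 999.99],
    # Customer_Type is needed by the customer-type exercises later on;
    # the values here are illustrative
    'Customer_Type': ['Business', 'Individual', 'Individual', 'Business', 'Individual'],
}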
df = pd.DataFrame(data)

Explore the Data

# Display dataset info
print("Dataset Preview:")
print(df)
print("\nDataset Information:")
print(f" Rows: {len(df)}")
print(f" Columns: {len(df.columns)}")
print(f"\nColumn Names: {list(df.columns)}")
print(f"\nData Types:")
print(df.dtypes)
# Calculate total revenue
df['Total_Revenue'] = (df['Units_Sold'] * df['Unit_Price'])
print("\nDataset with Revenue:")
print(df)
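
The `dtypes` output above lists `Date` as `object`, because the dates were entered as plain strings. If date-aware operations are needed, one option is to convert the column with `pd.to_datetime`. The sketch below uses a cut-down copy of the dataset, and the `Weekday` column is a hypothetical addition for illustration.

```python
import pandas as pd

# Cut-down copy of the sample data, with dates stored as strings
df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'Units_Sold': [5, 25, 15],
})

# Convert the string column to a proper datetime column
df['Date'] = pd.to_datetime(df['Date'])

# Datetime columns expose the .dt accessor, e.g. for weekday names
df['Weekday'] = df['Date'].dt.day_name()

print(df.dtypes)
print(df)
```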

Exercise 3.2: Calculate Business Metrics

Overall Metrics

# Total metrics
total_revenue = df['Total_Revenue'].sum()
total_units = df['Units_Sold'].sum()
avg_transaction = total_revenue / len(df)
print("Overall Business Metrics")
print("=" * 50)
print(f"Total Revenue: ${total_revenue:,.2f}")
print(f"Total Units: {total_units}")
print(f"Avg Transaction: ${avg_transaction:,.2f}")

Product Performance

# Product performance
print("\nProduct Performance:")
product_revenue = (df.groupby('Product')['Total_Revenue']
                   .sum()
                   .sort_values(ascending=False))
print(product_revenue)
# Best-selling product
best_product = product_revenue.idxmax()
best_revenue = product_revenue.max()
print(f"\nBest: {best_product}")
print(f"Revenue: ${best_revenue:,.2f}")

Additional Analysis

# Category analysis
category_analysis = df.groupby('Category').agg({
    'Total_Revenue': 'sum',
    'Units_Sold': 'sum',
}).round(2)
# Customer type analysis
customer_analysis = df.groupby('Customer_Type')[
    'Total_Revenue'].agg(['sum', 'mean', 'count'])

Exercise 3.3: Create Visualizations

Visualization Code

import matplotlib.pyplot as plt
# Create figure with subplots
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Sales Analysis Dashboard', fontsize=16, fontweight='bold')
# 1. Revenue by Product (Bar Chart)
product_revenue.plot(kind='bar', ax=axes[0, 0], color='skyblue')
axes[0, 0].set_title('Revenue by Product')
axes[0, 0].set_xlabel('Product')
axes[0, 0].set_ylabel('Revenue ($)')
# 2. Units Sold by Category (Pie)
category_units = df.groupby('Category')['Units_Sold'].sum()
axes[0, 1].pie(category_units, labels=category_units.index, autopct='%1.1f%%')
axes[0, 1].set_title('Units by Category')

More Charts

# 3. Revenue Trend (Line Chart)
axes[1, 0].plot(df['Date'], df['Total_Revenue'], marker='o', linewidth=2)
axes[1, 0].set_title('Daily Revenue')
axes[1, 0].set_xlabel('Date')
axes[1, 0].set_ylabel('Revenue ($)')
axes[1, 0].grid(True, alpha=0.3)
# 4. Customer Type (Bar Chart)
customer_revenue = df.groupby('Customer_Type')['Total_Revenue'].sum()
axes[1, 1].bar(customer_revenue.index, customer_revenue.values, color=['green', 'orange'])
axes[1, 1].set_title('Revenue by Type')
plt.tight_layout()
plt.show()
print("Visualizations created!")

Part 4: Challenge Exercise

Calculate Profitability

# Task 1: Calculate profit margin
# Assume cost is 60% of unit price
df['Cost'] = (df['Unit_Price'] * df['Units_Sold'] * 0.60)
df['Profit'] = (df['Total_Revenue'] - df['Cost'])
df['Profit_Margin'] = ((df['Profit'] / df['Total_Revenue'] * 100).round(2))
print("Profitability Analysis:")
print(df[['Product', 'Total_Revenue', 'Profit', 'Profit_Margin']])
# Task 2: High-value transactions
high_value = df[df['Total_Revenue'] > 3000]
print(f"\nHigh-Value: {len(high_value)}")
print(high_value[['Date', 'Product', 'Total_Revenue']])

Summary Report Function

# Task 3: Cumulative revenue
df['Cumulative_Revenue'] = df['Total_Revenue'].cumsum()
# Task 4: Summary report function
def generate_summary_report(dataframe):
    """Generate business summary"""
    report = {
        'Total Revenue':
            dataframe['Total_Revenue'].sum(),
        'Total Profit':
            dataframe['Profit'].sum(),
        'Avg Profit Margin':
            dataframe['Profit_Margin'].mean(),
        'Best Product':
            dataframe.groupby('Product')[
                'Profit'].sum().idxmax(),
    }
    return report

summary = generate_summary_report(df)
print("\nExecutive Summary:")
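for key, value in summary.items():
    if isinstance(value, str):
        # Text values such as the best product name
        print(f" {key}: {value}")
    elif 'Margin' in key:
        # Profit margin is a percentage, not a dollar amount
        print(f" {key}: {value:.2f}%")
    else:
        print(f" {key}: ${value:,.2f}")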

Challenge Questions: Which product has the highest profit margin? | What % of revenue comes from Business customers? | What if costs increased 10%?
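
For the cost question, a rough sketch of a what-if comparison follows. It reads "costs increased 10%" as the cost share rising from 60% to 66% of the unit price, which is an assumption. The helper name `profit_margin_pct` is hypothetical, and the dataset is rebuilt here so the cell is self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Laptop'],
    'Units_Sold': [5, 25, 15, 8, 3],
    'Unit_Price': [999.99, 29.99, 79.99, 399.99, 999.99],
})
df['Total_Revenue'] = df['Units_Sold'] * df['Unit_Price']

def profit_margin_pct(dataframe, cost_ratio):
    """Overall profit margin (%) when cost is a fixed share of revenue."""
    cost = dataframe['Total_Revenue'] * cost_ratio
    profit = dataframe['Total_Revenue'] - cost
    return profit.sum() / dataframe['Total_Revenue'].sum() * 100

base_margin = profit_margin_pct(df, 0.60)
# Assumed reading of the scenario: the 60% cost share grows by 10%
higher_cost_margin = profit_margin_pct(df, 0.60 * 1.10)

print(f"Base margin: {base_margin:.1f}%")
print(f"Margin after cost increase: {higher_cost_margin:.1f}%")
```

Because cost is modeled as a flat share of price, every product ends up with the same margin under this assumption. Giving each product its own cost ratio would make the first challenge question more informative.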

Workshop Wrap-Up

What We Accomplished Today

  • Set up Google Colab
  • Practiced Python fundamentals
  • Loaded and analyzed business data
  • Created visualizations
  • Built reusable analysis functions

Save Your Work

print("=" * 50)
print("Workshop Complete!")
print("=" * 50)
print("\nKey Skills Learned:")
print(" ✓ Python basics")
print(" ✓ Data structures")
print(" ✓ Pandas DataFrames")
print(" ✓ Data visualization")
print("\nNext Week: First ML Model!")

Homework

  • Complete the notebook if you didn't finish
  • Add comments explaining each code section
  • Try the challenge exercises
  • Save to Google Drive: DATA5000/Week_1/

Preparation for Week 2

  1. Review today's notebook
  2. Practice Python basics (30 min)
  3. Read Week 2 pre-reading (on MyKBS)
  4. Download customer_churn.csv dataset

Weekly Quiz

Question 1

What is the main difference between AI and ML?

  • A) AI is newer than ML
  • B) ML is a subset of AI focused on learning from data
  • C) AI requires more computing power
  • D) There is no difference

Question 2

Which type of ML learns from labeled examples?

  • A) Unsupervised Learning
  • B) Reinforcement Learning
  • C) Supervised Learning
  • D) Deep Learning

Question 3

What data type is customer_name = "Alice"?

  • A) int
  • B) float
  • C) bool
  • D) str

Question 4

Why do we split data into training and testing sets?

  • A) To make the model train faster
  • B) To evaluate model on unseen data
  • C) To reduce dataset size
  • D) To eliminate outliers

Question 5

Which of the following is an ethical concern in AI?

  • A) Bias in training data leading to unfair outcomes
  • B) Using too much memory
  • C) Slow execution time
  • D) Complex syntax

Bonus Question

Name one business application of AI discussed today and explain how it creates value.

Resources for Continued Learning

Python Practice:

  • Codecademy: Learn Python 3
  • LeetCode: Python problems (Easy)
  • HackerRank: Python basics track

AI Fundamentals:

  • 3Blue1Brown: Neural Networks (YouTube)
  • Google AI: ML Crash Course
  • Fast.ai: Practical Deep Learning

Support:

  • Office Hours | Discussion Forum: MyKBS
  • Academic Success Centre

Week 1 Complete!

Today's Achievements

Python Fundamentals

Variables, lists, dictionaries, functions, control flow

Google Colab

Setup, notebooks, libraries, code execution

Data Analysis

Pandas, metrics, visualizations, business insights

Next Week: Introduction to Predictive Analytics

What You'll Learn:

  • Your first machine learning model
  • Customer churn prediction
  • Model evaluation techniques
  • Business application of ML

What to Bring:

  • Completed Week 1 notebook
  • Questions from homework
  • customer_churn.csv dataset
  • Enthusiasm for ML!

Great work today! See you next week!