LLM Business Problem Exercises

Practice solving business problems using transformers and the Google Generative AI API

Exercise 1: Customer Review Sentiment Analysis

Business Problem

Your e-commerce company has collected thousands of product reviews. You need to analyze sentiment patterns to identify product issues and improvement opportunities.

Sample Dataset

[
  {"product_id": "P001", "review": "This smartphone has amazing battery life but the camera quality is disappointing."},
  {"product_id": "P001", "review": "Love the speed and design, worth every penny!"},
  {"product_id": "P001", "review": "The phone overheats when playing games for more than 20 minutes."},
  {"product_id": "P002", "review": "These headphones have excellent sound quality but the ear cushions wore out quickly."},
  {"product_id": "P002", "review": "Comfortable to wear but the Bluetooth connection keeps dropping."},
  {"product_id": "P002", "review": "Great value for money, been using them daily for months with no issues."}
]

Implementation Code

from google import genai
import json

client = genai.Client(api_key="YOUR_API_KEY")  # Replace with your own key; never commit a real key

def analyze_product_reviews(reviews_data):
    # Group reviews by product
    products = {}
    for item in reviews_data:
        product_id = item["product_id"]
        if product_id not in products:
            products[product_id] = []
        products[product_id].append(item["review"])
    
    results = {}
    
    # Analyze each product
    for product_id, reviews in products.items():
        reviews_text = "\n".join([f"- {review}" for review in reviews])
        
        prompt = f"""
        Analyze these product reviews:
        
        {reviews_text}
        
        Provide a structured analysis in JSON format with these fields:
        1. "overall_sentiment": (positive, negative, or mixed)
        2. "positive_aspects": [list of praised features]
        3. "negative_aspects": [list of criticized features]
        4. "improvement_suggestions": [specific improvements based on reviews]
        
        Only return the JSON, no additional text.
        """
        
        response = client.models.generate_content(
            model="gemini-2.0-flash", 
            contents=prompt
        )
        
        try:
            # Parse the JSON response
            analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
            results[product_id] = analysis
        except json.JSONDecodeError:
            results[product_id] = {"error": "Could not parse response as JSON"}
    
    return results

# Run the analysis (assumes sample_reviews_data holds the list from "Sample Dataset" above)
results = analyze_product_reviews(sample_reviews_data)
print(json.dumps(results, indent=2))
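
Once the per-product analyses are parsed, they can be compared locally without another model call. The following is a minimal sketch, assuming `results` has the shape requested in the prompt. `rank_products_by_issues` is a hypothetical helper, and counting negative aspects is only a rough proxy for priority, not a validated metric.

```python
def rank_products_by_issues(results):
    """Rank product IDs by number of negative aspects, most issues first.

    Hypothetical helper: assumes each value in `results` follows the JSON
    structure requested in the prompt. Entries that failed to parse
    (those with an "error" key) are skipped.
    """
    ranked = []
    for product_id, analysis in results.items():
        if "error" in analysis:
            continue
        negatives = analysis.get("negative_aspects", [])
        positives = analysis.get("positive_aspects", [])
        ranked.append((product_id, len(negatives), len(positives)))
    # Most negatives first; fewer positives breaks ties
    ranked.sort(key=lambda row: (-row[1], row[2]))
    return ranked


# Illustrative input only; real values come from analyze_product_reviews()
example_results = {
    "P001": {"positive_aspects": ["battery life", "speed", "design"],
             "negative_aspects": ["camera quality", "overheating"]},
    "P002": {"positive_aspects": ["sound quality", "comfort", "value"],
             "negative_aspects": ["ear cushion durability", "Bluetooth drops"]},
    "P003": {"error": "Could not parse response as JSON"},
}
print(rank_products_by_issues(example_results))
```

A production version would likely weight issues by how many reviews mention them, which requires asking the model for counts or analyzing reviews one at a time.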

Quiz Questions

Q1: What transformer capability is most crucial for extracting product features from the reviews?

Q2: Why is the prompt instructing the model to return only JSON?

Q3: If you wanted to enhance this code to prioritize which product to improve first, what would be the most effective addition?

Q4: What business insight would be MOST difficult to extract using this approach?

Q5: How could transformer models' understanding of negation help in this analysis?

Exercise 2: Market Research Topic Clustering

Business Problem

Your marketing team has conducted interviews with potential customers. You need to identify common themes and priorities to inform your product development strategy.

Sample Dataset

[
  {"respondent_id": "R001", "response": "I need software that integrates well with my existing tools. Too many platforms don't connect with each other."},
  {"respondent_id": "R002", "response": "Price is my main concern. The features are great but I can't justify the expense for my small business."},
  {"respondent_id": "R003", "response": "Customer support is crucial. I've had bad experiences with other vendors who take days to respond."},
  {"respondent_id": "R004", "response": "Integration capabilities are a deal-breaker. If it doesn't work with my CRM, it's useless to me."},
  {"respondent_id": "R005", "response": "The learning curve should be minimal. I don't have time to train my team on complicated software."},
  {"respondent_id": "R006", "response": "Budget constraints are real for us. We need something affordable with flexible payment options."}
]

Implementation Code

from google import genai
import json

client = genai.Client(api_key="YOUR_API_KEY")

def identify_market_themes(interview_responses):
    # Combine all responses, keeping each respondent ID so the model can cite it
    all_responses = "\n".join([f"- {item['respondent_id']}: {item['response']}" for item in interview_responses])
    
    prompt = f"""
    These are responses from potential customers during market research interviews:
    
    {all_responses}
    
    Identify the main themes mentioned by these respondents. For each theme:
    1. Provide a clear theme name
    2. List the respondent IDs that mentioned this theme
    3. Provide a brief summary of their concerns or priorities
    4. Rate the priority level (High, Medium, Low) based on frequency and emphasis
    
    Return your analysis as JSON with this structure:
    {{
        "themes": [
            {{
                "theme_name": string,
                "mentioned_by": [respondent_ids],
                "summary": string,
                "priority": string
            }}
        ],
        "overall_recommendation": string
    }}
    
    Only return the JSON, no additional text.
    """
    
    response = client.models.generate_content(
        model="gemini-2.0-flash", 
        contents=prompt
    )
    
    try:
        # Parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (assumes interview_data holds the list from "Sample Dataset" above)
themes_analysis = identify_market_themes(interview_data)
print(json.dumps(themes_analysis, indent=2))
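
Because the model writes the `mentioned_by` lists itself, it can cite respondent IDs that never appeared in the input. The sketch below shows one way to check for that, assuming the response follows the JSON structure requested in the prompt. `find_unknown_respondents` is a hypothetical helper, not part of the Google SDK.

```python
def find_unknown_respondents(analysis, interview_responses):
    """Return {theme_name: [ids]} for cited IDs that are not in the input data."""
    known_ids = {item["respondent_id"] for item in interview_responses}
    unknown = {}
    for theme in analysis.get("themes", []):
        bad_ids = [rid for rid in theme.get("mentioned_by", []) if rid not in known_ids]
        if bad_ids:
            unknown[theme.get("theme_name", "<unnamed>")] = bad_ids
    return unknown


# Illustrative values only; a real `analysis` comes from identify_market_themes()
example_responses = [
    {"respondent_id": "R001", "response": "..."},
    {"respondent_id": "R004", "response": "..."},
]
example_analysis = {"themes": [
    {"theme_name": "Integration", "mentioned_by": ["R001", "R004"]},
    {"theme_name": "Pricing", "mentioned_by": ["R002", "R099"]},
]}
print(find_unknown_respondents(example_analysis, example_responses))
```

An empty dictionary means every cited ID exists in the input; it does not confirm that the respondent actually raised that theme.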

Quiz Questions

Q1: Which transformer characteristic is most valuable for identifying themes across different responses?

Q2: How does this implementation leverage the LLM's capabilities differently than traditional clustering algorithms?

Q3: What would be a valid business decision based on the results of this analysis?

Q4: Which addition to the code would make the results more reliable?

Q5: The model is asked to rate priorities as High, Medium, or Low. What is this an example of?

Exercise 3: Sales Forecast Analysis

Business Problem

Your sales team has historical sales data and needs to understand patterns and forecast future performance to set realistic targets.

Sample Dataset

[
  {"quarter": "2023-Q1", "region": "North", "product_line": "Enterprise", "revenue": 1250000, "units_sold": 125},
  {"quarter": "2023-Q1", "region": "South", "product_line": "Enterprise", "revenue": 980000, "units_sold": 98},
  {"quarter": "2023-Q1", "region": "East", "product_line": "SMB", "revenue": 560000, "units_sold": 112},
  {"quarter": "2023-Q1", "region": "West", "product_line": "SMB", "revenue": 640000, "units_sold": 128},
  {"quarter": "2023-Q2", "region": "North", "product_line": "Enterprise", "revenue": 1340000, "units_sold": 134},
  {"quarter": "2023-Q2", "region": "South", "product_line": "Enterprise", "revenue": 1050000, "units_sold": 105},
  {"quarter": "2023-Q2", "region": "East", "product_line": "SMB", "revenue": 580000, "units_sold": 116},
  {"quarter": "2023-Q2", "region": "West", "product_line": "SMB", "revenue": 690000, "units_sold": 138},
  {"quarter": "2023-Q3", "region": "North", "product_line": "Enterprise", "revenue": 1420000, "units_sold": 142},
  {"quarter": "2023-Q3", "region": "South", "product_line": "Enterprise", "revenue": 1100000, "units_sold": 110},
  {"quarter": "2023-Q3", "region": "East", "product_line": "SMB", "revenue": 620000, "units_sold": 124},
  {"quarter": "2023-Q3", "region": "West", "product_line": "SMB", "revenue": 710000, "units_sold": 142},
  {"quarter": "2023-Q4", "region": "North", "product_line": "Enterprise", "revenue": 1560000, "units_sold": 156},
  {"quarter": "2023-Q4", "region": "South", "product_line": "Enterprise", "revenue": 1240000, "units_sold": 124},
  {"quarter": "2023-Q4", "region": "East", "product_line": "SMB", "revenue": 680000, "units_sold": 136},
  {"quarter": "2023-Q4", "region": "West", "product_line": "SMB", "revenue": 770000, "units_sold": 154}
]

Implementation Code

from google import genai
import json

client = genai.Client(api_key="YOUR_API_KEY")

def analyze_sales_data(sales_data):
    # Convert data to string representation
    sales_str = json.dumps(sales_data, indent=2)
    
    prompt = f"""
    Analyze this quarterly sales data:
    
    {sales_str}
    
    Provide a detailed analysis including:
    1. Overall revenue trends by quarter
    2. Performance comparison between regions
    3. Performance comparison between product lines
    4. Key insights and patterns
    5. Forecast for 2024-Q1 and 2024-Q2 based on trends
    
    Return your analysis as JSON with this structure:
    {{
        "quarterly_trends": {{
            "summary": string,
            "growth_rates": {{ quarter: percentage }}
        }},
        "regional_performance": [
            {{ "region": string, "highlights": string, "areas_of_concern": string }}
        ],
        "product_line_performance": [
            {{ "product_line": string, "highlights": string, "areas_of_concern": string }}
        ],
        "key_insights": [string],
        "forecast": {{
            "2024-Q1": {{ region: {{ product_line: {{ "revenue": number, "units": number }} }} }},
            "2024-Q2": {{ region: {{ product_line: {{ "revenue": number, "units": number }} }} }}
        }}
    }}
    
    Only return the JSON, no additional text.
    """
    
    response = client.models.generate_content(
        model="gemini-2.0-flash", 
        contents=prompt
    )
    
    try:
        # Parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (assumes sales_data holds the list from "Sample Dataset" above)
sales_analysis = analyze_sales_data(sales_data)
print(json.dumps(sales_analysis, indent=2))
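
Language models can make arithmetic mistakes, so the growth rates they report are worth cross-checking against a direct calculation. The following is a minimal sketch, assuming records shaped like the sample dataset. `quarterly_growth` is a hypothetical helper, and the rows below are a four-row subset of the sample data used only for illustration.

```python
from collections import defaultdict


def quarterly_growth(sales_data):
    """Return (totals, growth): total revenue per quarter, and each quarter's
    percentage change in total revenue versus the previous quarter."""
    totals = defaultdict(int)
    for row in sales_data:
        totals[row["quarter"]] += row["revenue"]
    # Labels such as "2023-Q1" sort chronologically as plain strings
    quarters = sorted(totals)
    growth = {}
    for prev, curr in zip(quarters, quarters[1:]):
        growth[curr] = round((totals[curr] - totals[prev]) / totals[prev] * 100, 2)
    return dict(totals), growth


# Subset of the sample dataset (Enterprise line, first two quarters)
example_sales = [
    {"quarter": "2023-Q1", "region": "North", "product_line": "Enterprise", "revenue": 1250000, "units_sold": 125},
    {"quarter": "2023-Q1", "region": "South", "product_line": "Enterprise", "revenue": 980000, "units_sold": 98},
    {"quarter": "2023-Q2", "region": "North", "product_line": "Enterprise", "revenue": 1340000, "units_sold": 134},
    {"quarter": "2023-Q2", "region": "South", "product_line": "Enterprise", "revenue": 1050000, "units_sold": 105},
]
totals, growth = quarterly_growth(example_sales)
print(totals)
print(growth)
```

Comparing these figures with the `growth_rates` field in the model's response gives a quick signal of whether the rest of the analysis rests on correct numbers.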

Quiz Questions

Q1: What type of prediction is the model performing when forecasting 2024-Q1 and Q2 revenue?

Q2: Why is JSON specified as the return format in this exercise?

Q3: Which of the following capabilities does this implementation NOT leverage from transformer models?

Q4: What business advantage does using an LLM for this analysis provide compared to traditional BI tools?

Q5: How could you enhance this code to provide better context for future forecasting?

Exercise 4: Customer Segmentation Analysis

Business Problem

Your marketing team needs to segment customers to create targeted campaigns. You have customer purchase data and need to identify meaningful segments.

Sample Dataset

[
  {"customer_id": "C001", "age": 28, "gender": "F", "location": "urban", "total_purchases": 12, "avg_order_value": 85, "preferred_category": "apparel", "shopping_frequency": "weekly"},
  {"customer_id": "C002", "age": 45, "gender": "M", "location": "suburban", "total_purchases": 8, "avg_order_value": 120, "preferred_category": "electronics", "shopping_frequency": "monthly"},
  {"customer_id": "C003", "age": 62, "gender": "F", "location": "rural", "total_purchases": 4, "avg_order_value": 65, "preferred_category": "home goods", "shopping_frequency": "quarterly"},
  {"customer_id": "C004", "age": 35, "gender": "M", "location": "urban", "total_purchases": 18, "avg_order_value": 75, "preferred_category": "apparel", "shopping_frequency": "weekly"},
  {"customer_id": "C005", "age": 50, "gender": "F", "location": "suburban", "total_purchases": 6, "avg_order_value": 200, "preferred_category": "electronics", "shopping_frequency": "monthly"},
  {"customer_id": "C006", "age": 25, "gender": "M", "location": "urban", "total_purchases": 24, "avg_order_value": 50, "preferred_category": "apparel", "shopping_frequency": "weekly"},
  {"customer_id": "C007", "age": 58, "gender": "M", "location": "rural", "total_purchases": 5, "avg_order_value": 180, "preferred_category": "home goods", "shopping_frequency": "quarterly"},
  {"customer_id": "C008", "age": 31, "gender": "F", "location": "urban", "total_purchases": 15, "avg_order_value": 90, "preferred_category": "beauty", "shopping_frequency": "bi-weekly"},
  {"customer_id": "C009", "age": 42, "gender": "M", "location": "suburban", "total_purchases": 10, "avg_order_value": 150, "preferred_category": "electronics", "shopping_frequency": "monthly"},
  {"customer_id": "C010", "age": 67, "gender": "F", "location": "rural", "total_purchases": 3, "avg_order_value": 95, "preferred_category": "home goods", "shopping_frequency": "quarterly"}
]

Implementation Code

from google import genai
import json

client = genai.Client(api_key="YOUR_API_KEY")

def segment_customers(customer_data):
    # Convert data to string representation
    customer_str = json.dumps(customer_data, indent=2)
    
    prompt = f"""
    Analyze this customer data and identify meaningful customer segments:
    
    {customer_str}
    
    Create a customer segmentation analysis with:
    1. 3-5 distinct customer segments
    2. Detailed description of each segment
    3. Marketing recommendations for each segment
    4. Assignment of each customer_id to a segment
    
    Return your analysis as JSON with this structure:
    {{
        "segments": [
            {{
                "segment_name": string,
                "description": string,
                "characteristics": [string],
                "customer_ids": [string],
                "marketing_recommendations": [string]
            }}
        ],
        "marketing_strategy": {{
            "overall_approach": string,
            "resource_allocation": string
        }}
    }}
    
    Only return the JSON, no additional text.
    """
    
    response = client.models.generate_content(
        model="gemini-2.0-flash", 
        contents=prompt
    )
    
    try:
        # Parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (assumes customer_data holds the list from "Sample Dataset" above)
segmentation = segment_customers(customer_data)
print(json.dumps(segmentation, indent=2))
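
The prompt asks the model to assign each customer_id to a segment, but nothing in the code confirms that it did. The sketch below is one possible structural check, assuming the response follows the requested JSON structure. `check_segment_coverage` is a hypothetical helper, and the example values are made up for illustration.

```python
from collections import Counter


def check_segment_coverage(segmentation, customer_data):
    """Report customers the model skipped, assigned more than once, or invented."""
    expected = {customer["customer_id"] for customer in customer_data}
    counts = Counter(
        cid
        for segment in segmentation.get("segments", [])
        for cid in segment.get("customer_ids", [])
    )
    assigned = set(counts)
    return {
        "missing": sorted(expected - assigned),
        "duplicated": sorted(cid for cid, n in counts.items() if n > 1),
        "unknown": sorted(assigned - expected),
    }


# Illustrative values only; a real `segmentation` comes from segment_customers()
example_customers = [{"customer_id": cid} for cid in ("C001", "C002", "C003")]
example_segmentation = {"segments": [
    {"segment_name": "Frequent shoppers", "customer_ids": ["C001", "C002"]},
    {"segment_name": "Occasional shoppers", "customer_ids": ["C002", "C011"]},
]}
print(check_segment_coverage(example_segmentation, example_customers))
```

This only verifies that the assignment is complete and consistent; whether the segments are useful still has to be judged against campaign results or domain knowledge.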

Quiz Questions

Q1: What advantage does using an LLM for segmentation provide over traditional clustering algorithms?

Q2: Which part of the transformer architecture is most crucial for identifying patterns across different customer attributes?

Q3: What business function is BEST supported by this customer segmentation approach?

Q4: If you wanted to evaluate the quality of these segments, what would be the BEST approach?

Q5: How does the prompt in this example guide the model to produce business-relevant outputs?

Exercise 5: Product Description Generation

Business Problem

Your e-commerce team needs to create compelling product descriptions for a new inventory of products. You have basic product specifications but need to generate marketing copy.

Sample Dataset

[
  {
    "product_id": "WH001",
    "category": "Wireless Headphones",
    "brand": "SoundWave",
    "features": ["40-hour battery life", "Active noise cancellation", "Bluetooth 5.2", "Foldable design"],
    "technical_specs": {"weight": "220g", "frequency_response": "20Hz-20kHz", "color_options": ["Black", "Silver", "Blue"]},
    "price_tier": "premium",
    "target_audience": "professionals, travelers"
  },
  {
    "product_id": "LP001",
    "category": "Laptop",
    "brand": "TechPro",
    "features": ["13.3-inch Retina display", "16GB RAM", "512GB SSD", "10-hour battery life"],
    "technical_specs": {"processor": "Intel Core i7", "graphics": "Integrated Intel Iris", "weight": "1.4kg"},
    "price_tier": "premium",
    "target_audience": "professionals, students"
  },
  {
    "product_id": "SP001",
    "category": "Smart Speaker",
    "brand": "HomeHub",
    "features": ["360-degree sound", "Voice assistant", "Smart home controls", "Multi-room audio"],
    "technical_specs": {"power": "24W", "connectivity": "WiFi, Bluetooth", "dimensions": "15cm x 15cm x 18cm"},
    "price_tier": "mid-range",
    "target_audience": "home users, tech enthusiasts"
  }
]

Implementation Code

from google import genai
import json

client = genai.Client(api_key="YOUR_API_KEY")

def generate_product_descriptions(products_data):
    results = {}
    
    for product in products_data:
        product_str = json.dumps(product, indent=2)
        
        prompt = f"""
        Create a compelling product description based on these specifications:
        
        {product_str}
        
        Generate:
        1. An attention-grabbing headline (max 10 words)
        2. A short description for listings (50-60 words)
        3. A full product description (150-200 words)
        4. 5 key selling points formatted as bullet points
        5. SEO keywords (5-7 keywords)
        
        Return as JSON with this structure:
        {{
            "headline": string,
            "short_description": string,
            "full_description": string,
            "selling_points": [string],
            "seo_keywords": [string]
        }}
        
        Ensure the tone matches the price tier (premium, mid-range, budget) and appeals to the target audience.
        Only return the JSON, no additional text.
        """
        
        response = client.models.generate_content(
            model="gemini-2.0-flash", 
            contents=prompt
        )
        
        try:
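            # Parse the JSON response, stripping any Markdown code fences first
            description = json.loads(response.text.replace('```json','').replace('```','').strip())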
            results[product["product_id"]] = description
        except json.JSONDecodeError:
            results[product["product_id"]] = {"error": "Could not parse response as JSON"}
    
    return results

# Run the generation (assumes products_data holds the list from "Sample Dataset" above)
product_descriptions = generate_product_descriptions(products_data)
print(json.dumps(product_descriptions, indent=2))
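
The prompt sets word and item limits, but the model is not guaranteed to respect them. The following is a minimal sketch of a local check, assuming each description follows the requested JSON structure. `check_description_limits` is a hypothetical helper, and splitting on whitespace is only an approximate word count.

```python
def check_description_limits(description):
    """Return a list of limit violations; an empty list means all limits are met."""
    problems = []

    headline_words = len(description.get("headline", "").split())
    if headline_words > 10:
        problems.append(f"headline has {headline_words} words (max 10)")

    short_words = len(description.get("short_description", "").split())
    if not 50 <= short_words <= 60:
        problems.append(f"short_description has {short_words} words (expected 50-60)")

    full_words = len(description.get("full_description", "").split())
    if not 150 <= full_words <= 200:
        problems.append(f"full_description has {full_words} words (expected 150-200)")

    selling_points = len(description.get("selling_points", []))
    if selling_points != 5:
        problems.append(f"{selling_points} selling points (expected 5)")

    keywords = len(description.get("seo_keywords", []))
    if not 5 <= keywords <= 7:
        problems.append(f"{keywords} SEO keywords (expected 5-7)")

    return problems


# Illustrative input only; real values come from generate_product_descriptions()
example_description = {
    "headline": "Sound That Travels With You",
    "short_description": "Too short to pass the check.",
    "full_description": "Also too short.",
    "selling_points": ["Long battery life", "Noise cancellation"],
    "seo_keywords": ["wireless headphones", "noise cancelling"],
}
for problem in check_description_limits(example_description):
    print(problem)
```

Descriptions that fail the check could be regenerated or sent for manual review before publishing.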

Quiz Questions

Q1: Which transformer capability is most responsible for the model's ability to adjust its writing tone for different price tiers?

Q2: What advantage does this approach have over template-based product description generation?

Q3: How is the model interpreting the "target_audience" field in the data?

Q4: What business metric would BEST evaluate the effectiveness of these generated descriptions?

Q5: How could you improve this code to create more effective product descriptions?
