Practice solving business problems using transformers and the Google Generative AI API
Your e-commerce company has collected thousands of product reviews. You need to analyze sentiment patterns to identify product issues and improvement opportunities.
[
{"product_id": "P001", "review": "This smartphone has amazing battery life but the camera quality is disappointing."},
{"product_id": "P001", "review": "Love the speed and design, worth every penny!"},
{"product_id": "P001", "review": "The phone overheats when playing games for more than 20 minutes."},
{"product_id": "P002", "review": "These headphones have excellent sound quality but the ear cushions wore out quickly."},
{"product_id": "P002", "review": "Comfortable to wear but the Bluetooth connection keeps dropping."},
{"product_id": "P002", "review": "Great value for money, been using them daily for months with no issues."}
]
from google import genai
import json
client = genai.Client(api_key="YOUR_API_KEY")
def analyze_product_reviews(reviews_data):
    # Group reviews by product
    products = {}
    for item in reviews_data:
        product_id = item["product_id"]
        if product_id not in products:
            products[product_id] = []
        products[product_id].append(item["review"])
    results = {}
    # Analyze each product
    for product_id, reviews in products.items():
        reviews_text = "\n".join([f"- {review}" for review in reviews])
        prompt = f"""
        Analyze these product reviews:
        {reviews_text}
        Provide a structured analysis in JSON format with these fields:
        1. "overall_sentiment": (positive, negative, or mixed)
        2. "positive_aspects": [list of praised features]
        3. "negative_aspects": [list of criticized features]
        4. "improvement_suggestions": [specific improvements based on reviews]
        Only return the JSON, no additional text.
        """
        response = client.models.generate_content(
            model="gemini-2.0-flash",
            contents=prompt
        )
        try:
            # Strip any Markdown code fence, then parse the JSON response
            analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
            results[product_id] = analysis
        except json.JSONDecodeError:
            results[product_id] = {"error": "Could not parse response as JSON"}
    return results

# Run the analysis (sample_reviews_data is the list of review records shown above)
results = analyze_product_reviews(sample_reviews_data)
print(json.dumps(results, indent=2))
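The chained replace calls in the code above assume the model wraps its reply in a Markdown code fence at most once. A slightly more defensive parser can be sketched as follows. The helper name extract_json is hypothetical (it is not part of the exercise or of the google-genai SDK), and the sample replies are hand-written strings, not real model output.

```python
import json
import re

# Build the fence marker from single backticks to keep this example easy to embed.
FENCE = "`" * 3

# Matches an optional "json" language tag and captures whatever sits inside the fence.
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json(text):
    """Hypothetical helper: return the parsed JSON payload, or None if parsing fails."""
    match = _FENCE_RE.search(text)
    candidate = match.group(1) if match else text.strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


# Hand-written replies used only to exercise the helper.
fenced_reply = FENCE + 'json\n{"overall_sentiment": "mixed"}\n' + FENCE
plain_reply = '{"overall_sentiment": "positive"}'

print(extract_json(fenced_reply))
print(extract_json(plain_reply))
print(extract_json("Sorry, I cannot help."))
```

Returning None instead of an error dictionary is a design choice for this sketch: it lets the caller decide whether to retry the request or record a failure.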
Q1: What transformer capability is most crucial for extracting product features from the reviews?
Q2: Why is the prompt instructing the model to return only JSON?
Q3: If you wanted to enhance this code to prioritize which product to improve first, what would be the most effective addition?
Q4: What business insight would be MOST difficult to extract using this approach?
Q5: How could transformer models' understanding of negation help in this analysis?
Your marketing team has conducted interviews with potential customers. You need to identify common themes and priorities to inform your product development strategy.
[
{"respondent_id": "R001", "response": "I need software that integrates well with my existing tools. Too many platforms don't connect with each other."},
{"respondent_id": "R002", "response": "Price is my main concern. The features are great but I can't justify the expense for my small business."},
{"respondent_id": "R003", "response": "Customer support is crucial. I've had bad experiences with other vendors who take days to respond."},
{"respondent_id": "R004", "response": "Integration capabilities are a deal-breaker. If it doesn't work with my CRM, it's useless to me."},
{"respondent_id": "R005", "response": "The learning curve should be minimal. I don't have time to train my team on complicated software."},
{"respondent_id": "R006", "response": "Budget constraints are real for us. We need something affordable with flexible payment options."}
]
from google import genai
import json
client = genai.Client(api_key="YOUR_API_KEY")
def identify_market_themes(interview_responses):
    # Combine all responses, tagging each with its respondent ID so the model can cite it
    all_responses = "\n".join([f"- [{item['respondent_id']}] {item['response']}" for item in interview_responses])
    prompt = f"""
    These are responses from potential customers during market research interviews:
    {all_responses}
    Identify the main themes mentioned by these respondents. For each theme:
    1. Provide a clear theme name
    2. List the respondent IDs that mentioned this theme
    3. Provide a brief summary of their concerns or priorities
    4. Rate the priority level (High, Medium, Low) based on frequency and emphasis
    Return your analysis as JSON with this structure:
    {{
        "themes": [
            {{
                "theme_name": string,
                "mentioned_by": [respondent_ids],
                "summary": string,
                "priority": string
            }}
        ],
        "overall_recommendation": string
    }}
    Only return the JSON, no additional text.
    """
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )
    try:
        # Strip any Markdown code fence, then parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (interview_data is the list of interview records shown above)
themes_analysis = identify_market_themes(interview_data)
print(json.dumps(themes_analysis, indent=2))
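Because the prompt asks the model to list respondent IDs per theme, it can be worth checking those IDs against the input before acting on the result. The following is a minimal sketch of such a check. The function name audit_theme_ids is hypothetical, and the analysis dictionary is a hand-written illustration of the requested structure, not actual model output.

```python
def audit_theme_ids(analysis, interview_responses):
    """Hypothetical check: compare the IDs cited in themes with the IDs in the input."""
    known = {item["respondent_id"] for item in interview_responses}
    cited = set()
    for theme in analysis.get("themes", []):
        cited.update(theme.get("mentioned_by", []))
    return {
        "unknown_ids": sorted(cited - known),    # cited by the model but absent from the input
        "uncovered_ids": sorted(known - cited),  # present in the input but assigned to no theme
    }


# Three of the interview records from the exercise.
interviews = [
    {"respondent_id": "R001", "response": "I need software that integrates well with my existing tools."},
    {"respondent_id": "R002", "response": "Price is my main concern."},
    {"respondent_id": "R003", "response": "Customer support is crucial."},
]

# Hand-written example with one deliberately wrong ID (R009).
example_analysis = {
    "themes": [
        {"theme_name": "Integration", "mentioned_by": ["R001"], "summary": "", "priority": "High"},
        {"theme_name": "Pricing", "mentioned_by": ["R002", "R009"], "summary": "", "priority": "High"},
    ],
    "overall_recommendation": "",
}

print(audit_theme_ids(example_analysis, interviews))
```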
Q1: Which transformer characteristic is most valuable for identifying themes across different responses?
Q2: How does this implementation leverage the LLM's capabilities differently than traditional clustering algorithms?
Q3: What would be a valid business decision based on the results of this analysis?
Q4: Which addition to the code would make the results more reliable?
Q5: The model is asked to rate priorities as High, Medium, or Low. What is this an example of?
Your sales team has historical sales data and needs to understand patterns and forecast future performance to set realistic targets.
[
{"quarter": "2023-Q1", "region": "North", "product_line": "Enterprise", "revenue": 1250000, "units_sold": 125},
{"quarter": "2023-Q1", "region": "South", "product_line": "Enterprise", "revenue": 980000, "units_sold": 98},
{"quarter": "2023-Q1", "region": "East", "product_line": "SMB", "revenue": 560000, "units_sold": 112},
{"quarter": "2023-Q1", "region": "West", "product_line": "SMB", "revenue": 640000, "units_sold": 128},
{"quarter": "2023-Q2", "region": "North", "product_line": "Enterprise", "revenue": 1340000, "units_sold": 134},
{"quarter": "2023-Q2", "region": "South", "product_line": "Enterprise", "revenue": 1050000, "units_sold": 105},
{"quarter": "2023-Q2", "region": "East", "product_line": "SMB", "revenue": 580000, "units_sold": 116},
{"quarter": "2023-Q2", "region": "West", "product_line": "SMB", "revenue": 690000, "units_sold": 138},
{"quarter": "2023-Q3", "region": "North", "product_line": "Enterprise", "revenue": 1420000, "units_sold": 142},
{"quarter": "2023-Q3", "region": "South", "product_line": "Enterprise", "revenue": 1100000, "units_sold": 110},
{"quarter": "2023-Q3", "region": "East", "product_line": "SMB", "revenue": 620000, "units_sold": 124},
{"quarter": "2023-Q3", "region": "West", "product_line": "SMB", "revenue": 710000, "units_sold": 142},
{"quarter": "2023-Q4", "region": "North", "product_line": "Enterprise", "revenue": 1560000, "units_sold": 156},
{"quarter": "2023-Q4", "region": "South", "product_line": "Enterprise", "revenue": 1240000, "units_sold": 124},
{"quarter": "2023-Q4", "region": "East", "product_line": "SMB", "revenue": 680000, "units_sold": 136},
{"quarter": "2023-Q4", "region": "West", "product_line": "SMB", "revenue": 770000, "units_sold": 154}
]
from google import genai
import json
client = genai.Client(api_key="YOUR_API_KEY")
def analyze_sales_data(sales_data):
    # Convert data to string representation
    sales_str = json.dumps(sales_data, indent=2)
    prompt = f"""
    Analyze this quarterly sales data:
    {sales_str}
    Provide a detailed analysis including:
    1. Overall revenue trends by quarter
    2. Performance comparison between regions
    3. Performance comparison between product lines
    4. Key insights and patterns
    5. Forecast for 2024-Q1 and 2024-Q2 based on trends
    Return your analysis as JSON with this structure:
    {{
        "quarterly_trends": {{
            "summary": string,
            "growth_rates": {{ quarter: percentage }}
        }},
        "regional_performance": [
            {{ "region": string, "highlights": string, "areas_of_concern": string }}
        ],
        "product_line_performance": [
            {{ "product_line": string, "highlights": string, "areas_of_concern": string }}
        ],
        "key_insights": [string],
        "forecast": {{
            "2024-Q1": {{ region: {{ product_line: {{ "revenue": number, "units": number }} }} }},
            "2024-Q2": {{ region: {{ product_line: {{ "revenue": number, "units": number }} }} }}
        }}
    }}
    Only return the JSON, no additional text.
    """
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )
    try:
        # Strip any Markdown code fence, then parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (sales_data is the list of quarterly records shown above)
sales_analysis = analyze_sales_data(sales_data)
print(json.dumps(sales_analysis, indent=2))
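The prompt asks the model for quarter-over-quarter growth rates, which are plain arithmetic and can be computed directly to cross-check whatever the model reports. The sketch below does this for the 2023-Q1 and 2023-Q2 rows of the sample data. The function name quarterly_growth is hypothetical, and it assumes quarter labels of the form "YYYY-Qn", which sort correctly as strings.

```python
from collections import defaultdict


def quarterly_growth(sales_records):
    """Hypothetical helper: percentage revenue growth of each quarter over the previous one."""
    totals = defaultdict(int)
    for record in sales_records:
        totals[record["quarter"]] += record["revenue"]
    quarters = sorted(totals)  # "2023-Q1" < "2023-Q2" holds for string comparison
    growth = {}
    for prev, curr in zip(quarters, quarters[1:]):
        growth[curr] = round((totals[curr] - totals[prev]) / totals[prev] * 100, 2)
    return growth


# Revenue figures for 2023-Q1 and 2023-Q2, copied from the sample data in the exercise.
sample_sales = [
    {"quarter": "2023-Q1", "region": "North", "revenue": 1250000},
    {"quarter": "2023-Q1", "region": "South", "revenue": 980000},
    {"quarter": "2023-Q1", "region": "East", "revenue": 560000},
    {"quarter": "2023-Q1", "region": "West", "revenue": 640000},
    {"quarter": "2023-Q2", "region": "North", "revenue": 1340000},
    {"quarter": "2023-Q2", "region": "South", "revenue": 1050000},
    {"quarter": "2023-Q2", "region": "East", "revenue": 580000},
    {"quarter": "2023-Q2", "region": "West", "revenue": 690000},
]

# Totals are 3,430,000 and 3,660,000, so growth is 230,000 / 3,430,000.
print(quarterly_growth(sample_sales))  # {'2023-Q2': 6.71}
```

Comparing these values with the "growth_rates" field in the model's reply gives a quick signal of whether its arithmetic can be trusted.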
Q1: What type of prediction is the model performing when forecasting 2024-Q1 and Q2 revenue?
Q2: Why is JSON specified as the return format in this exercise?
Q3: Which of the following capabilities does this implementation NOT leverage from transformer models?
Q4: What business advantage does using an LLM for this analysis provide compared to traditional BI tools?
Q5: How could you enhance this code to provide better context for future forecasting?
Your marketing team needs to segment customers to create targeted campaigns. You have customer purchase data and need to identify meaningful segments.
[
{"customer_id": "C001", "age": 28, "gender": "F", "location": "urban", "total_purchases": 12, "avg_order_value": 85, "preferred_category": "apparel", "shopping_frequency": "weekly"},
{"customer_id": "C002", "age": 45, "gender": "M", "location": "suburban", "total_purchases": 8, "avg_order_value": 120, "preferred_category": "electronics", "shopping_frequency": "monthly"},
{"customer_id": "C003", "age": 62, "gender": "F", "location": "rural", "total_purchases": 4, "avg_order_value": 65, "preferred_category": "home goods", "shopping_frequency": "quarterly"},
{"customer_id": "C004", "age": 35, "gender": "M", "location": "urban", "total_purchases": 18, "avg_order_value": 75, "preferred_category": "apparel", "shopping_frequency": "weekly"},
{"customer_id": "C005", "age": 50, "gender": "F", "location": "suburban", "total_purchases": 6, "avg_order_value": 200, "preferred_category": "electronics", "shopping_frequency": "monthly"},
{"customer_id": "C006", "age": 25, "gender": "M", "location": "urban", "total_purchases": 24, "avg_order_value": 50, "preferred_category": "apparel", "shopping_frequency": "weekly"},
{"customer_id": "C007", "age": 58, "gender": "M", "location": "rural", "total_purchases": 5, "avg_order_value": 180, "preferred_category": "home goods", "shopping_frequency": "quarterly"},
{"customer_id": "C008", "age": 31, "gender": "F", "location": "urban", "total_purchases": 15, "avg_order_value": 90, "preferred_category": "beauty", "shopping_frequency": "bi-weekly"},
{"customer_id": "C009", "age": 42, "gender": "M", "location": "suburban", "total_purchases": 10, "avg_order_value": 150, "preferred_category": "electronics", "shopping_frequency": "monthly"},
{"customer_id": "C010", "age": 67, "gender": "F", "location": "rural", "total_purchases": 3, "avg_order_value": 95, "preferred_category": "home goods", "shopping_frequency": "quarterly"}
]
from google import genai
import json
client = genai.Client(api_key="YOUR_API_KEY")
def segment_customers(customer_data):
    # Convert data to string representation
    customer_str = json.dumps(customer_data, indent=2)
    prompt = f"""
    Analyze this customer data and identify meaningful customer segments:
    {customer_str}
    Create a customer segmentation analysis with:
    1. 3-5 distinct customer segments
    2. Detailed description of each segment
    3. Marketing recommendations for each segment
    4. Assignment of each customer_id to a segment
    Return your analysis as JSON with this structure:
    {{
        "segments": [
            {{
                "segment_name": string,
                "description": string,
                "characteristics": [string],
                "customer_ids": [string],
                "marketing_recommendations": [string]
            }}
        ],
        "marketing_strategy": {{
            "overall_approach": string,
            "resource_allocation": string
        }}
    }}
    Only return the JSON, no additional text.
    """
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt
    )
    try:
        # Strip any Markdown code fence, then parse the JSON response
        analysis = json.loads(response.text.replace('```json','').replace('```','').strip())
        return analysis
    except json.JSONDecodeError:
        return {"error": "Could not parse response as JSON"}

# Run the analysis (customer_data is the list of customer records shown above)
segmentation = segment_customers(customer_data)
print(json.dumps(segmentation, indent=2))
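The prompt requests that every customer_id be assigned to a segment, but nothing in the code verifies that the model actually did so. A minimal sketch of such a check follows. The function name check_segment_assignments is hypothetical, and the segmentation dictionary is a hand-written example with deliberate mistakes, not real model output.

```python
from collections import Counter


def check_segment_assignments(segmentation, customer_data):
    """Hypothetical check: report customers that are missing, duplicated, or unknown."""
    expected = {customer["customer_id"] for customer in customer_data}
    counts = Counter(
        customer_id
        for segment in segmentation.get("segments", [])
        for customer_id in segment.get("customer_ids", [])
    )
    assigned = set(counts)
    return {
        "missing": sorted(expected - assigned),
        "duplicated": sorted(cid for cid, n in counts.items() if n > 1),
        "unknown": sorted(assigned - expected),
    }


# Only the IDs matter for this check, so the other customer fields are omitted.
customers = [{"customer_id": cid} for cid in ("C001", "C002", "C003", "C004")]

# Hand-written example: C003 is missing, C004 appears twice, C011 does not exist.
example_segmentation = {
    "segments": [
        {"segment_name": "Segment A", "customer_ids": ["C001", "C004"]},
        {"segment_name": "Segment B", "customer_ids": ["C002", "C004", "C011"]},
    ]
}

print(check_segment_assignments(example_segmentation, customers))
```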
Q1: What advantage does using an LLM for segmentation provide over traditional clustering algorithms?
Q2: Which part of the transformer architecture is most crucial for identifying patterns across different customer attributes?
Q3: What business function is BEST supported by this customer segmentation approach?
Q4: If you wanted to evaluate the quality of these segments, what would be the BEST approach?
Q5: How does the prompt in this example guide the model to produce business-relevant outputs?
Your e-commerce team needs to create compelling product descriptions for a new inventory of products. You have basic product specifications but need to generate marketing copy.
[
{
"product_id": "WH001",
"category": "Wireless Headphones",
"brand": "SoundWave",
"features": ["40-hour battery life", "Active noise cancellation", "Bluetooth 5.2", "Foldable design"],
"technical_specs": {"weight": "220g", "frequency_response": "20Hz-20kHz", "color_options": ["Black", "Silver", "Blue"]},
"price_tier": "premium",
"target_audience": "professionals, travelers"
},
{
"product_id": "LP001",
"category": "Laptop",
"brand": "TechPro",
"features": ["13.3-inch Retina display", "16GB RAM", "512GB SSD", "10-hour battery life"],
"technical_specs": {"processor": "Intel Core i7", "graphics": "Integrated Intel Iris", "weight": "1.4kg"},
"price_tier": "premium",
"target_audience": "professionals, students"
},
{
"product_id": "SP001",
"category": "Smart Speaker",
"brand": "HomeHub",
"features": ["360-degree sound", "Voice assistant", "Smart home controls", "Multi-room audio"],
"technical_specs": {"power": "24W", "connectivity": "WiFi, Bluetooth", "dimensions": "15cm x 15cm x 18cm"},
"price_tier": "mid-range",
"target_audience": "home users, tech enthusiasts"
}
]
from google import genai
import json
client = genai.Client(api_key="YOUR_API_KEY")
def generate_product_descriptions(products_data):
    results = {}
    for product in products_data:
        product_str = json.dumps(product, indent=2)
        prompt = f"""
        Create a compelling product description based on these specifications:
        {product_str}
        Generate:
        1. An attention-grabbing headline (max 10 words)
        2. A short description for listings (50-60 words)
        3. A full product description (150-200 words)
        4. 5 key selling points formatted as bullet points
        5. SEO keywords (5-7 keywords)
        Return as JSON with this structure:
        {{
            "headline": string,
            "short_description": string,
            "full_description": string,
            "selling_points": [string],
            "seo_keywords": [string]
        }}
        Ensure the tone matches the price tier (premium, mid-range, budget) and appeals to the target audience.
        Only return the JSON, no additional text.
        """
        response = client.models.generate_content(
            model="gemini-2.0-flash",
            contents=prompt
        )
        try:
            # Strip any Markdown code fence (as in the earlier exercises), then parse the JSON
            description = json.loads(response.text.replace('```json','').replace('```','').strip())
            results[product["product_id"]] = description
        except json.JSONDecodeError:
            results[product["product_id"]] = {"error": "Could not parse response as JSON"}
    return results

# Run the generation (products_data is the list of product records shown above)
product_descriptions = generate_product_descriptions(products_data)
print(json.dumps(product_descriptions, indent=2))
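The prompt states concrete limits (headline of at most 10 words, 50-60 and 150-200 word descriptions, 5 selling points, 5-7 keywords), and language models do not always respect such counts. The sketch below checks one generated description against those limits. The function name check_description_limits is hypothetical, and the sample descriptions are synthetic filler built only to exercise the checks.

```python
def check_description_limits(description):
    """Hypothetical check: list violations of the length rules stated in the prompt."""
    def word_count(key):
        return len(description.get(key, "").split())

    problems = []
    if word_count("headline") > 10:
        problems.append("headline exceeds 10 words")
    if not 50 <= word_count("short_description") <= 60:
        problems.append("short_description is not 50-60 words")
    if not 150 <= word_count("full_description") <= 200:
        problems.append("full_description is not 150-200 words")
    if len(description.get("selling_points", [])) != 5:
        problems.append("selling_points does not contain 5 items")
    if not 5 <= len(description.get("seo_keywords", [])) <= 7:
        problems.append("seo_keywords does not contain 5-7 items")
    return problems


# Synthetic filler text; the content is meaningless, only the counts matter.
compliant = {
    "headline": "A five word placeholder headline",
    "short_description": " ".join(["word"] * 55),
    "full_description": " ".join(["word"] * 170),
    "selling_points": ["point"] * 5,
    "seo_keywords": ["keyword"] * 6,
}
too_short = dict(
    compliant,
    full_description=" ".join(["word"] * 40),
    seo_keywords=["keyword"] * 3,
)

print(check_description_limits(compliant))
print(check_description_limits(too_short))
```

A non-empty result could trigger a retry with the violations appended to the prompt, though whether that improves the output would need to be tested.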
Q1: Which transformer capability is most responsible for the model's ability to adjust its writing tone for different price tiers?
Q2: What advantage does this approach have over template-based product description generation?
Q3: How is the model interpreting the "target_audience" field in the data?
Q4: What business metric would BEST evaluate the effectiveness of these generated descriptions?
Q5: How could you improve this code to create more effective product descriptions?