Guided activities — from raw numbers to meaningful charts
✏️ By hand
📊 Excel
7 activities
60 min
2 / 28
Today's Tutorial Roadmap
✏️ By Hand
Stem-and-Leaf
Activity 2 · PP2.9 · PP2.14
✏️ By Hand
Frequency Distribution
Activity 1 · Activity 3 · RP2.8 · PP2.4
📊 Excel
Histogram & Ogive
Activity 4 · RP2.9
📊 Excel
Scatter Plot
PP2.16
📊 Excel
Bar & Pie Charts
PP2.11 · PP2.10
✏️ Both
Pareto Chart
Activity 6
Running Thread
Factory Workers
Same dataset → 3 different charts
Key Skill
Chart Selection
Right chart for right question
Running dataset: Activities 2, 3 and 4 all use the same 29 factory workers' commute times — so you'll see how the same data looks through three different lenses.
Section 1 of 5
Stem-and-Leaf Plots
See every individual value and the distribution shape at the same time.
Activity 2 — Factory Workers
PP2.9 — Two-digit stems
PP2.14 — Decimal stems
4 / 28
✏️ Activity 2 — By Hand
Factory Workers: Commute Times (min)
A sample of 29 factory workers were asked how long they commute from home to work on average.
50
30
73
79
5
9
42
10
15
35
51
68
69
32
46
54
65
44
23
36
77
61
59
56
48
58
21
11
46
Task: Construct a stem-and-leaf plot of this data.
Step 1 — Sort
Order the values from smallest to largest before you begin.
Step 2 — Choose stems
Values range from 5 to 79. Use the tens digit (0, 1, 2, … 7) as stems.
Step 3 — Add leaves
The units digit of each value becomes the leaf next to its stem.
A manufacturer samples 50 metal rings (target weight ~1.4 kg). Construct a frequency distribution using FIVE classes.
1.51
1.59
1.42
1.25
1.34
1.51
1.51
1.19
1.62
1.31
1.56
1.16
1.25
1.48
1.59
1.42
1.62
1.25
1.31
1.16
1.48
1.96
1.51
1.62
1.45
1.53
1.79
1.19
1.34
1.34
1.48
1.51
1.31
1.02
1.65
1.45
1.08
1.39
1.42
1.76
1.11
1.25
1.56
1.22
1.48
1.22
1.19
1.62
1.39
1.45
Your working
Min = 1.02, Max = 1.96
Range = 1.96 − 1.02 = 0.94
Width = 0.94 ÷ 5 ≈ 0.19 → round to 0.20
Start point: 1.00 (round down)
Weight (kg)
Count
Observation
1.00–<1.20
6
Underweight
1.20–<1.40
14
Below target
1.40–<1.60
22
Near target ✓
1.60–<1.80
6
Above target
1.80–<2.00
2
Overweight
Total
50
9 / 28
✏️ PP2.4 — By Hand
Absenteeism: Midpoints, Relative & Cumulative Frequency
500 employee records examined. For each class: determine the class midpoint, relative frequency, and cumulative frequency.
Class interval
Frequency
Midpoint
Rel. Freq.
Cum. Freq.
(0, 2]
218
?
?
?
(2, 4]
207
?
?
?
(4, 6]
56
?
?
?
(6, 8]
11
?
?
?
(8, 10]
8
?
?
?
Total
500
—
1.000
500
Formulas to use
Midpoint = (Lower + Upper) ÷ 2
Relative Freq = Class Freq ÷ Total
Cumulative Freq = Running total
Class
Freq
Midpt
Rel. Freq.
Cum. Freq.
(0, 2]
218
1
0.436
218
(2, 4]
207
3
0.414
425
(4, 6]
56
5
0.112
481
(6, 8]
11
7
0.022
492
(8, 10]
8
9
0.016
500
500
1.000
500
Key insight
85% of employees (425 out of 500) took 4 or fewer days off. The distribution is strongly right-skewed — most people have low absenteeism, but a few have much higher.
Section 3 of 5
Histogram, Frequency Polygon & Ogive
Turn your frequency table into a chart — by hand first, then in Excel.
Activity 4 — Factory workers histogram
RP2.9 — NZ house prices
11 / 28
✏️ + 📊 Activity 4 — By Hand, then Excel
Factory Workers: Draw a Histogram
Use your frequency distribution from Activity 3 (same 29 workers' commute times). First class: 0 to less than 10.
By Hand — Rules to Remember
Bars must touch — no gaps (continuous data)
Bar height = frequency (or relative frequency)
All bars must be equal width
Label both axes with units
Give the chart a title
📊 Excel Steps
1
Enter the raw data in Column A
2
Insert → Charts → Histogram
3
Right-click x-axis → Format Axis → Set bin width to 10
4
Add axis labels and a title
Expected Shape — Commute Times
Discussion: Compare your by-hand histogram with the stem-and-leaf from Activity 2. What do both show about the commute time distribution? What does one show that the other doesn't?
12 / 28
📊 RP2.9 — Excel Recommended
NZ House Prices: Frequency Polygon & Ogive
90 houses for sale in a NZ city, each ~180m². Use the frequency distribution provided to construct both charts.
Frequency Polygon — Key Rules
Plot frequency against class midpoint (not boundary)
Connect points with straight lines
Close the polygon at both ends by adding hypothetical classes with zero frequency
Ogive — Key Rules
Plot cumulative % against the upper class boundary
Starts at 0% (at lower boundary of first class)
Ends at 100% (at upper boundary of last class)
Use for answering "what % are below $X?"
📊 Excel Tip — Ogive
1
Build frequency table; add Cumulative % column
2
Select upper boundaries + Cum % columns
3
Insert → Charts → Line with Markers
4
Format axes, add labels, title
Discussion question: Using your ogive, estimate what percentage of houses are priced below $300,000. Then estimate the median house price.
Section 4 of 5
Scatter Plot
Investigating relationships between two numerical variables.
PP2.16 — Advertising vs Sales Revenue
14 / 28
📊 PP2.16 — Excel Recommended
Advertising Spend vs Sales Revenue
Is total sales revenue related to advertising spend? Construct a scatter plot and discuss the relationship.
Advertising ($M)
Sales Revenue ($M)
4.2
155.7
1.6
87.3
6.3
135.6
2.7
99.0
10.4
168.2
7.1
136.9
5.5
101.4
8.3
158.2
📊 Excel Steps
1
Put Advertising in Col A, Sales in Col B
2
Select both columns → Insert → Scatter (X,Y)
3
Add axis titles and a chart title
4
Right-click a point → Add Trendline → Linear
Advertising vs Sales — Expected Pattern
Relationship
There is a positive relationship between advertising spend and sales — higher advertising tends to produce higher revenue. However, the relationship is not perfectly linear (e.g. $6.3M spend produces less sales than $5.5M). We'll quantify this relationship in Week 10 (correlation & regression).
Section 5 of 5
Bar, Pie & Pareto Charts
Visualising categorical data — and using the 80/20 rule to find what matters most.
PP2.11 — Australia Imports (Bar)
PP2.10 — Australia Exports (Pie)
Activity 6 — Restaurant Complaints (Pareto)
16 / 28
📊 PP2.11 — Excel
Australia's Top 10 Import Sources
Import Source
A$ million
China
49,329
USA
39,181
Japan
21,221
Singapore
17,878
Thailand
13,832
Germany
13,099
UK
12,044
Malaysia
10,944
South Korea
10,813
New Zealand
10,532
📊 Excel Steps
1
Enter Country in Col A, Value in Col B
2
Sort by value descending (Data → Sort)
3
Select both columns → Insert → Bar Chart (horizontal)
4
Label axes and add a title
Horizontal or Vertical?
With long country names, a horizontal bar chart (country names on y-axis) is easier to read than a vertical one. Excel calls this a "Bar Chart" rather than "Column Chart".
Discussion: After building this chart alongside the exports chart (PP2.10), which countries are Australia's most important two-way trading partners?
17 / 28
📊 PP2.10 — Excel
Australia's Top 10 Export Destinations
Export Market
A$ million
% (approx)
China
94,655
45.1%
Japan
47,501
22.6%
South Korea
19,610
9.3%
USA
9,580
4.6%
India
9,517
4.5%
New Zealand
7,399
3.5%
Taiwan
7,356
3.5%
Singapore
5,659
2.7%
Malaysia
5,561
2.6%
UK
3,859
1.8%
Total
210,697
100%
📊 Excel Steps
1
Insert → Charts → Pie → 2-D Pie
2
Right-click slices → Add Data Labels → Show %
3
Give the chart a meaningful title
Is a pie chart effective here?
China dominates at 45% — this is obvious in any chart
But comparing the 8 small slices (2–9% each) is very difficult in a pie
A bar chart sorted by value would make comparisons clearer
Pie charts work best with 3–5 categories, not 10
18 / 28
✏️ + 📊 Activity 6 — By Hand + Excel
Restaurant Complaints: Build a Pareto Chart
A restaurant owner reviews complaint forms from the past year. Construct a Pareto chart to identify the most important complaints to fix first.
Complaint Type
Percentage (%)
Long waiting time
9%
Food too salty
6%
Overpriced
26%
Small portions
21%
Food not fresh
16%
No atmosphere
12%
Unfriendly staff
7%
Other
3%
Step 1 — Sort first!
Before drawing anything, re-order categories from largest to smallest percentage. This is the defining feature of a Pareto chart.
By Hand — What to Draw
1
Sort categories largest → smallest
2
Draw bars (sorted) on left y-axis (%)
3
Calculate running cumulative %
4
Plot cumulative % line on right y-axis
5
Draw a reference line at 80%
Complaint
%
Cum %
Overpriced
26
26
Small portions
21
47
Food not fresh
16
63
No atmosphere
12
75
Long waiting time
9
84
Unfriendly staff
7
91
Food too salty
6
97
Other
3
100
19 / 28
Activity 6: Pareto Chart & Interpretation
Restaurant Complaints — Pareto Chart
The 80/20 Reading
The cumulative line crosses 80% after the 5th bar. So 5 complaint types account for 84% of all complaints:
Overpriced (26%)
Small portions (21%)
Food not fresh (16%)
No atmosphere (12%)
Long waiting time (9%)
Management implication: The owner should focus on pricing and portion size first — together they account for 47% of all complaints. Fixing salty food or rudeness won't move the needle much.
20 / 28
📊 Activity 6 — Excel Version
Building the Pareto Chart in Excel
Method 1: Excel Built-in (Office 2016+)
1
Enter complaint names in Col A, percentages in Col B (unsorted is fine)
2
Select the data → Insert → Statistical Chart → Pareto
3
Excel automatically sorts and adds the cumulative line
4
Format: add title, label axes
Method 2: Manual (any version)
1
Sort data descending by frequency
2
Add a Cumulative % column
3
Create a Column Chart from sorted frequencies
4
Add Cum% as a second series → Change type to Line → Secondary axis
Common Excel Mistakes
Forgetting to set secondary y-axis for the cumulative line
Not matching the scale: 100% on left must align with 100% on right
Leaving gaps between bars (change series options: Gap Width = 0)
Forgetting to sort the categories first (in Method 2)
Exam tip: In a written exam you may be asked to interpret a Pareto chart rather than draw one. Always read: (1) what are the top 1–2 categories? (2) Where does the 80% line intersect?
21 / 28
Discussion: Bar vs Pie vs Pareto
Use the restaurant data as a test case. Which chart best answers each question?
Question
Best Chart
Why?
"What % of complaints are about price?"
Bar or Pareto
Easy to read one bar's value
"Which 3 complaints should we fix first?"
Pareto
Already sorted, cumulative line shows priority
"What fraction of ALL complaints is food-related?"
Pie
Part-to-whole with grouped categories works
"How do complaints compare across 3 restaurant branches?"
Grouped bar
Side-by-side comparison across groups
Key principle: The Pareto chart outperforms a plain bar chart whenever the goal is to prioritise — it does the sorting for you and shows the cumulative impact.
The 80/20 Rule (Pareto Principle)
Roughly 80% of effects come from 20% of causes. In auditing: 80% of errors come from 20% of account types. In quality control: 80% of defects come from 20% of processes.
22 / 28
Reflection: Same Data, Three Different Lenses
Activities 2, 3 and 4 all used the same 29 workers' commute times. Each chart gave you different information.
Chart
Activity
What it shows
What it hides
Stem-and-leaf
Activity 2
Every individual value, distribution shape, exact duplicates
The histogram is best for the report — it's visual and immediately communicates shape and spread. The stem-and-leaf is best for field work where you need to inspect individual values quickly. The frequency table is the source of truth behind both.
23 / 28
📊 Excel Quick Reference
Summary of all Excel chart types covered today — use this as a checklist when completing exercises at home.
Chart Needed
Excel Insert Path
Key Setting to Change
Histogram
Insert → Statistical Charts → Histogram
Right-click x-axis → Format Axis → Bin width
Frequency Polygon
Insert → Charts → Line with Markers
x-axis values = class midpoints
Ogive
Insert → Charts → Line with Markers
x-axis = upper boundaries, y = cumulative %
Scatter plot
Insert → Charts → Scatter (X,Y)
Right-click → Add Trendline → Linear
Bar chart
Insert → Charts → Bar (horizontal)
Sort data descending before charting
Pie chart
Insert → Charts → Pie → 2D Pie
Add Data Labels → Show Percentage
Pareto chart
Insert → Statistical Charts → Pareto
Auto-sorted; check secondary axis alignment
Always: Add a descriptive title, label both axes with variable names and units, and check that your chart answers the question you're trying to answer before submitting.
24 / 28
Common Errors to Avoid
Stem-and-Leaf Errors
Not sorting the leaves within each row
Using more than one digit as a leaf
Forgetting stems that have no leaves (must still appear)
Using commas between leaves (not allowed)
Frequency Distribution Errors
Overlapping class boundaries (e.g. 0–10 and 10–20 both include 10)
Unequal class widths (unless explicitly required)
Frequencies that don't sum to n
Choosing too many or too few classes
Histogram Errors
Gaps between bars (not allowed — continuous data)
Y-axis not starting at zero
Missing axis labels or chart title
Pareto Chart Errors
Bars not sorted largest → smallest
Cumulative line not matching 100% at the right
Treating a Pareto as just a bar chart (forgetting the cumulative line)
25 / 28
Brain Teasers
Think through these before clicking reveal — they test deeper understanding beyond procedure.
1. A student builds a stem-and-leaf display and then a histogram from the same data. They notice the histogram looks symmetric but the stem-and-leaf shows a cluster of repeated values at the low end. How is this possible?
The histogram grouped values into classes, which can mask patterns within a class. Repeated values at the low end might all fall in the same class, making that bar's height look normal. The stem-and-leaf preserves individual values and reveals the clustering. This is why we always check the stem-and-leaf before finalising a histogram.
2. An accountant's Pareto chart of expense exceptions shows the top 2 categories hitting 79% — just under 80%. Their manager says "the 80/20 rule doesn't apply here." Are they right?
No. The "80/20 rule" is a rough guideline, not an exact threshold. The point is that a small number of categories drive most of the impact. At 79%, the principle still clearly applies — focusing on those 2 categories addresses the vast majority of exceptions. The exact percentage matters less than the insight.
3. Why might an ogive be more useful than a histogram for answering "what percentage of employees take more than 5 days off?"
A histogram shows frequency per class — to answer a threshold question, you'd need to add up multiple bars and estimate within-class proportions. An ogive directly shows the cumulative percentage at any boundary point. You simply read off "% below 5 days" directly, then subtract from 100% to get "more than 5 days". It's purpose-built for this type of question.
26 / 28
Quick Challenge: Pick the Right Chart
For each accounting scenario below, identify the best chart type and briefly explain why. Discuss with your group.
#
Scenario
Best Chart
1
You have 25 sampled invoice amounts and want to check for outliers while preserving every value.
Your answer →
2
You want to show what % of a client's costs are below $10,000.
Your answer →
3
You want to show the monthly pattern of late payments over a full year.
Your answer →
4
You need to identify which 3 audit exception types account for most of the problem.
Your answer →
5
You want to know if advertising spend predicts sales for 8 industry companies.
Your answer →
6
You have 4 business units and want to show their share of total revenue.
Your answer →
1
Stem-and-leaf — preserves individual values
2
Ogive — designed for cumulative % threshold questions
3
Time series — time on x-axis, shows trend
4
Pareto — sorted with cumulative line, 80/20 rule
5
Scatter plot — relationship between two numerical variables
6
Pie chart — 4 categories, part-to-whole, manageable
27 / 28
Assessment Tips for Week 2 Topics
In Written Exams
Always show your class width calculation
Leave no stems out of a stem-and-leaf (even empty ones)
Add axis labels, units, and a title to every chart
For pie charts: state why it is or isn't effective for that data
For Pareto: identify how many categories reach 80%
When Interpreting Charts
Always state the shape (symmetric / skewed?)
Identify any outliers or unusual values
Give an accounting implication (not just a description)
In Excel Assignments
Set histogram bin width to match your class width exactly
For scatter: include a trendline to show direction of relationship
For Pareto: check secondary y-axis aligns 100% left = 100% right
Never leave default chart titles like "Chart 1"
The key question for any chart: Does this graph answer the question I'm trying to answer? If not, choose a different chart.
Week 2 Tutorial Complete
Before Next Week
Review
Re-read lecture slides for any chart type that wasn't clear today
Practice
Complete RP2.8 and PP2.16 in Excel if you didn't finish in class
Excel
Build a Pareto chart from any dataset of your choice — try it both ways (built-in and manual)
Think
Next week: measures of centre and spread — mean, median, standard deviation