WEEK 3 EXERCISES · STATISTICS FOR ACCOUNTING
From Numbers
to Decisions
5 applied problems — compute the statistics, then make the call
HOW THESE EXERCISES WORK
Each question starts with a real business decision to make
Your Excel output is the evidence — not the final answer
Solutions show both the numbers AND a written recommendation
Always ask: What does this number tell a manager?
Q1 — CEO Salary Disclosure Q2 — Pay Equity Review Q3 — Payroll Budget Forecast Q4 — Training Budget Approval Q5 — Audit Risk Assessment
Use ← → arrow keys to navigate · Work each problem before advancing to the solution
◆ Question 1 — Salary Disclosure Decision
Should the CEO Salary Be Included in the "Average" Reported to Staff?
👤
HR Director — Meridian Solutions
Ahead of the annual compensation review, staff have requested a communication about "average salary at Meridian." The CEO's salary ($135K) is 85% above the next highest earner. You must decide which figure to publish — and how to defend it if challenged.
Your decision: Which measure of centre do you report? How do you justify it with evidence?
Dataset — n = 25 annual salaries ($000), sorted
42 44 45 47 48 50 51 52 53 54
55 56 56 57 58 60 62 64 65 68
70 72 75 78 135

Red = CEO salary

Excel tasks:

  1. Compute mean, median, mode and trimmed mean (excluding top and bottom 1 value)
  2. Compute Q₁, Q₃, IQR using QUARTILE.INC — then apply the IQR fence rule to formally test whether $135K is an outlier
  3. Compute standard deviation and CV for the full dataset and again excluding the CEO
  4. Based on (a)–(c): write a 2-sentence recommendation to the HR Director advising which figure to report and why

IQR fence rule

Lower fence = Q₁ − 1.5 × IQR
Upper fence = Q₃ + 1.5 × IQR
Any value outside these bounds is a statistical outlier. This is the method Excel's boxplot uses internally.

Why this decision is harder than it looks

Reporting the mean is technically correct — but it may mislead 24 out of 25 staff about what they can expect to earn. Reporting the median avoids this — but some stakeholders will question why you're not using "the average." You need evidence-based justification.

Think ahead

What happens to the mean, SD, and CV when you remove the CEO? The difference between the full-dataset and trimmed statistics is your quantitative evidence for the recommendation.

✔ Solution 1A — Statistics & Outlier Test
Q1: The Numbers That Drive the Decision
EXCEL LAYOUT — Salaries in A2:A26
C2
Mean
=AVERAGE(A2:A26)
C3
Median
=MEDIAN(A2:A26)
C4
Trimmed Mean
=TRIMMEAN(A2:A26,2/25)
C5
Q1
=QUARTILE.INC(A2:A26,1)
C6
Q3
=QUARTILE.INC(A2:A26,3)
C7
IQR
=C6-C5
C8
Upper Fence
=C6+1.5*C7
C9
Outlier?
=IF(MAX(A2:A26)>C8,"YES","NO")
C10
SD (full)
=STDEV.S(A2:A26)
C11
SD (excl. CEO)
=STDEV.S(A2:A25)
Q1=50.50 Q3=66.50 IQR=16.00 Upper fence = 66.50 + 24.00 = 90.50 135 > 90.50 → OUTLIER confirmed ✓ Mean (full) = 60.68 Mean (excl.CEO) = 57.13 SD (full) = 20.36 SD (excl.CEO) = 10.89
1
Compute
Mean 60.68
Median 56.00
Trim'd 58.65
2
Test outlier
$135K exceeds
upper fence
$90.5K ✓
3
Quantify impact
CEO inflates
mean by $3.55K
SD by $9.47K
4
Decide
Report
median — see
below
📋 HR Recommendation
Report the median ($56,000) as the representative salary
The CEO salary ($135,000) is a formal statistical outlier — it sits $44,500 above the IQR upper fence. Including it inflates the reported mean by $3,550 and the standard deviation by $9,470, misrepresenting the experience of 24 out of 25 staff. The median of $56,000 is the appropriate measure of centre for a right-skewed distribution. Both figures should be disclosed in full communications, with a note explaining why the median is the more representative measure.
Caveat: if the goal is total payroll cost modelling, use the mean — all values matter. The choice of measure depends on the purpose of the communication.
✔ Solution 1B — Draft Communication to Staff
Q1: What Does the Written Recommendation Look Like?
Three Measures — How They Differ and Why 50 60 70 Median $56,000 Trimmed $58,650 Mean $60,680 CEO pulls mean up $4,680 Median is most representative for right-skewed distributions Mean is appropriate only for total payroll cost modelling

Key principle: match the measure to the purpose

There is no single "correct" measure. The mean is correct for summing to a total (payroll budget). The median is correct for representing a typical individual. Always state which you're using and why.

STAFF COMMUNICATION — DRAFTPrepared by: HR Director
TO:
All Meridian Solutions staff
RE:
Compensation Review — Salary Statistics

As part of this year's compensation review, we are sharing the following salary information for transparency.

The median annual salary at Meridian is $56,000. This means half of all employees earn below this figure and half earn above it. We are reporting the median rather than the arithmetic mean ($60,680) because one salary sits significantly above the rest and would give a misleading impression of what a typical Meridian employee earns.

The middle 50% of salaries fall between $50,500 and $66,500, giving an interquartile range of $16,000.

What makes this memo defensible?

It discloses both the median and the mean, explains why the median is preferred, and provides the IQR range — so no individual figure can be accused of being cherry-picked. This is the standard expected of a professional accountant preparing financial communications.

◆ Question 2 — Pay Equity Investigation
Which Department Has a Pay Equity Problem?
⚖️
Compensation Review Committee — Meridian Solutions
A staff survey flagged concerns about pay inconsistency. The Committee must identify which department shows greater relative salary dispersion — a potential indicator of ad-hoc negotiated pay, role overlap, or undisclosed pay gaps. The department with higher relative dispersion will undergo a full pay equity audit.
Your decision: Which department do you flag for investigation — Finance or Operations? Justify your recommendation using CV, not just standard deviation.

Finance Team — n = 8 ($000)

55  62  65  70  72  78  85  92

Operations Team — n = 10 ($000)

42  44  45  47  48  50  51  52  53  54

Excel tasks:

  1. Compute mean and standard deviation for each team using AVERAGE and STDEV.S
  2. Compute CV (%) = (SD / Mean) × 100 for each team
  3. A colleague argues: "Finance is clearly more variable — the standard deviation is more than three times larger." Is this reasoning valid? Calculate what the colleague is relying on and identify the flaw
  4. Based on CV: write a one-paragraph recommendation to the Committee identifying which team to audit and why

The trap: absolute vs relative spread

Finance salaries average $72K — Operations average $49K. Of course Finance has a larger absolute SD — its values are higher. CV asks a different question: relative to what each team earns, which team has more internal inconsistency?

What high CV signals in pay equity

A high CV within a team suggests salaries are not consistently banded by role or seniority. This can indicate: ad-hoc salary negotiations, unexplained pay gaps (potentially by gender or tenure), or blurred role boundaries where junior and senior staff receive very different pay for similar work.

Think ahead to the recommendation

Your recommendation needs to state: which team, which metric you're using to justify it, and what specific action the Committee should take (e.g., commission a salary banding review, request justification for the $37K spread).

✔ Solution 2 — CV Analysis & Investigation Recommendation
Q2: Relative Dispersion Tells a Different Story
EXCEL — Finance in B2:B9, Operations in D2:D11
Measure
Finance
Operations
13
Mean
=AVERAGE(B2:B9)
=AVERAGE(D2:D11)
14
Std Dev
=STDEV.S(B2:B9)
=STDEV.S(D2:D11)
15
CV (%)
=B14/B13*100
=D14/D13*100
Finance: Mean=72.38 SD=13.18 CV=18.2% Operations: Mean=48.60 SD= 4.22 CV= 8.7% Colleague's claim: Finance SD (13.18) > Ops SD (4.22) ✓ BUT: Finance mean (72.38) > Ops mean (48.60) Finance CV (18.2%) > Ops CV (8.7%) — Finance is MORE dispersed The colleague's reasoning ignores scale — it is flawed.
SD vs CV — Two Different Stories Std Dev ($000) CV (%) 13.18 Finance 4.22 Operations → Finance looks "more variable" 18.2% Finance 8.7% Operations → Finance CV still > Ops (same conclusion)
📋 Committee Recommendation
Flag the Finance team for a pay equity audit
Finance has a CV of 18.2% compared to Operations' 8.7% — more than double the relative salary dispersion. The Finance salary range spans $55K–$92K, a $37K spread within a team of only 8. This degree of variation is inconsistent with structured salary banding and warrants review of individual salary justifications, particularly for the two highest-paid Finance staff.
Note: the colleague's SD comparison is factually correct but misleading — it does not control for the different pay scales of the two teams.
◆ Question 3 — Payroll Budget Forecast
Can Meridian Afford Its Payroll in Three Years' Time?
📊
CFO — Meridian Solutions
The board is considering a 3-year strategic plan and needs a reliable payroll cost projection for 2027. The current total payroll is $1,400,000. Growth has been uneven — strong in some years, negative in one. The CFO must choose the right growth rate to use, present the projection, and flag the financial risk if the trend continues.
Your decision: What is the projected 2027 payroll? Is the current growth trajectory financially sustainable?
YearPayroll changeGrowth factor (1 + R)
2020+8.0%1.0800
2021+12.0%1.1200
2022−5.0%0.9500
2023+15.0%1.1500
2024+6.0%1.0600

Excel tasks:

  1. Use GEOMEAN on the growth factors to find the CAGR. Show why you cannot apply GEOMEAN directly to the percentage rates
  2. Compute the arithmetic mean rate and show that it overstates the CAGR
  3. Project 2027 payroll: =1400000*(1+CAGR)^3
  4. The board's budget cap is $1,750,000 in 2027. Does the projection stay within budget? By how much?
  5. Write a 3-sentence CFO briefing note summarising the projection and budget risk

Why you cannot use the arithmetic mean here

Payroll compounds. A 15% increase in 2023 applies to a base that was already reduced 5% in 2022. The arithmetic mean treats each year's rate as if it applies to the same base — it overestimates what actually happened. The geometric mean correctly accounts for compounding.

Projection formula

P₂₀₂₇ = P₂₀₂₄ × (1 + CAGR)³

In Excel: =1400000*(1+B9)^3 where B9 holds your CAGR

Budget risk framing

The question is not just "what is the number" — it's "is the number a problem?" Calculate the headroom (budget cap minus projection) and flag whether the trend is sustainable if growth accelerates.

✔ Solution 3 — CAGR, Projection & Budget Risk
Q3: From Compound Growth Rate to CFO Briefing Note
EXCEL — Growth factors in B2:B6, rates in C2:C6
9
CAGR (Geometric)
=GEOMEAN(B2:B6)-1
10
Arith. Mean Rate
=AVERAGE(C2:C6)
11
2027 Projection
=1400000*(1+B9)^3
12
Budget cap
1750000
13
Headroom
=B12-B11
14
Within budget?
=IF(B11<=B12,"YES","NO")
CAGR (geometric) = 6.97% Arith. mean rate = 7.20% ↑ overstates by 0.23% 2027 Projection = 1,400,000 × (1.0697)³ = $1,713,840 Budget headroom = 1,750,000 − 1,713,840 = $36,160 Within budget? = YES — but margin is thin
1
CAGR
6.97%
(geometric)
2
Project
$1,713,840
by 2027
3
Compare
Cap $1.75M
Headroom $36K
4
Advise
Within budget
but at risk
📋 CFO Briefing Note
Payroll within budget but headroom is critically thin
Applying the compound annual growth rate of 6.97% (2020–2024) to the current payroll of $1,400,000 yields a 2027 projection of $1,713,840 — within the board-approved $1,750,000 cap by a margin of only $36,160 (2.1% headroom). This margin would be eliminated by a single year of above-trend growth (for instance, a repeat of the 2023 +15% increase). If the CAGR rises to just 7.7% or above, the projection breaches the cap.
Recommendation: approve the strategic plan but establish a payroll freeze trigger at $1,700,000 — the point at which any additional hires or pay rises would require board sign-off.
◆ Question 4 — Training Investment Decision
Is the Training Program Worth the Investment?
🎓
Operations Manager — Meridian Solutions
The training team has proposed a $45,000 investment to increase average compliance training from 20 to 28 hours per staff member. They claim it will significantly reduce payroll processing errors. The Operations Manager needs statistical evidence to approve or reject the budget request — and must assess how robust the evidence actually is.
Your decision: Does the correlation evidence support approving the $45,000 training budget? What caveats must you disclose?
IDTraining hrs (X)Annual errors (Y)
A1215
B1812
C258
D305
E1514
F2210
G286
H1017
I2011
J353
K819
L324

Excel tasks:

  1. Calculate CORREL(X,Y) and COVARIANCE.S(X,Y). What does the sign of the covariance tell you?
  2. Create a scatterplot (Insert → Scatter). Add a trendline. Does the visual confirm r?
  3. Sensitivity test: Remove Employee J (35 hrs, 3 errors). Recompute r for the remaining 11 employees. By how much does r change? What does this tell you about whether the result is driven by one person?
  4. Each error costs Meridian an estimated $800 to correct. The current average is ~10.5 errors/person. If moving from 20→28 hours reduces errors proportionally with r, estimate the annual savings across 12 staff. Does this justify $45,000?
  5. Write the recommendation — approve or reject — with a clear statement of what the data does and does not prove

Part (d) hint — back-of-envelope ROI

Use the trend to estimate: at 28 hours, predicted errors ≈ ? Compare to 20 hours. Multiply the reduction by 12 staff × $800. Compare to $45,000 investment.

✔ Solution 4A — Correlation & ROI Calculation
Q4: Statistical Evidence + Financial Return
EXCEL — Training in A2:A13, Errors in B2:B13
16
Covariance
=COVARIANCE.S(A2:A13,B2:B13)
17
Pearson's r
=CORREL(A2:A13,B2:B13)
18
r (excl. J)
=CORREL(A2:A11,B2:B11) *
19
Errors @ 20 hrs
≈ 10.74 (from trendline)
20
Errors @ 28 hrs
≈ 6.14 (from trendline)
21
Reduction / staff
≈ 4.60 errors
22
Annual savings
=B21*12*800
Covariance = −78.52 (negative = inverse relationship) r (all 12) = −0.9953 r² = 99.1% r (excl. J) = −0.9938 Δr = −0.0015 → J has minimal influence Trendline approx: Errors ≈ 22.5 − 0.575 × hours @ 20 hrs: 22.5 − 0.575×20 = 10.75 @ 28 hrs: 22.5 − 0.575×28 = 6.40 Δ = 4.35 Savings: 4.35 × 12 staff × $800 = $41,760 / yr Investment: $45,000 → Payback ≈ 13 months
1
Correlation
r = −0.995
Very strong
2
Robust?
Removing J
changes r by
0.0015 only
3
ROI
$41,760
saved/yr vs
$45K cost
4
Decision
Approve
with caveats
📋 Operations Manager Recommendation
Approve the training investment — with two conditions
The correlation between training hours and errors is −0.995 (r² = 99.1%), one of the strongest relationships observable in business data. The result is robust — removing the highest-trained employee changes r by only 0.0015. Projecting from the trendline, increasing average training from 20 to 28 hours is expected to reduce errors by ~4.35 per employee, generating estimated annual savings of $41,760 across 12 staff — a payback period of approximately 13 months on a $45,000 investment.
Conditions: (1) Correlation ≠ causation — other factors (experience, role complexity) may explain part of the relationship. Track actual error rates post-training to validate. (2) Conduct a follow-up study at n ≥ 30 before scaling the program firm-wide.
✔ Solution 4B — What the Scatterplot Shows
Q4: Why "Eyeballing" the Graph Matters Before Running r
Training Hours vs Payroll Errors — Always graph first Training Hours Errors 8 20 32 35 0 10 20 J r = −0.995 Near-perfect negative Every point is close to the trendline — the pattern is real and consistent, not driven by one outlier

Why graph before computing r?

A correlation coefficient summarises the linear relationship. A scatterplot reveals whether the relationship is actually linear, whether outliers exist, and whether there are distinct clusters. If the data showed a curved pattern, r would understate the true relationship — and no amount of Excel formula output would reveal this without the graph.

Sensitivity result: J has minimal influence

Removing Employee J changes r from −0.9953 to −0.9938 — a difference of only 0.0015. The conclusion is not driven by a single high-leverage observation. Every point follows the trend closely. This strengthens the business case for the training investment.

Professional standard: always disclose what the data cannot prove

A rigorous professional recommendation separates what the data shows (strong association) from what it proves (causation). Approving $45K on correlational evidence alone is defensible — but only if you state the assumption being made and commit to a follow-up evaluation to confirm the effect.

◆ Question 5 — Audit Risk Assessment from Grouped Data
Is the Invoice Population Unusual Enough to Justify Expanded Testing?
🔍
Junior Auditor — Meridian Solutions (client engagement)
Your audit client has provided a summary frequency table of 57 accounts payable invoices. Individual invoice records are not available at this stage. Your audit manager wants a preliminary risk assessment: is the average invoice value unusual, and is the spread high enough to suggest that the population may contain material misstatements? You must work from the grouped data only.
Your decision: Does the invoice data show sufficient dispersion and skew to warrant expanding your audit sample beyond the standard 15%?
Invoice amount ($)Freq (f)Midpoint (m)
0 – <50012250
500 – <1,00018750
1,000 – <2,000141,500
2,000 – <5,00083,500
5,000 – <10,00057,500
Total57

Excel tasks:

  1. Compute ∑(m × f) and the approximate mean using SUMPRODUCT or helper columns
  2. Compute the approximate standard deviation using ∑f(m − x̄)²
  3. The actual mean from raw data is $1,847. What is your approximation error (%)? Is it acceptable for preliminary risk assessment?
  4. Compare CV with a "benchmark" invoice population where mean = $1,200 and SD = $400. Is this client's invoice population more or less dispersed than the benchmark?
  5. Write a brief audit risk note (3–4 sentences) recommending whether to expand sample testing, with quantitative justification

Audit risk and statistical dispersion

High standard deviation in invoice amounts signals a heterogeneous population — large and small invoices coexist, making it more likely that a random sample misses high-value anomalies. Auditors use CV to compare dispersion across client populations and benchmark years.

Why grouped data is common in audit

Clients often provide trial balance summaries, ageing schedules, or purchase reports in banded form. Auditors routinely extract estimates from these tables before requesting raw data — particularly in preliminary analytical procedures.

Benchmark comparison hint

Benchmark CV = 400/1200 × 100 = 33.3%. Calculate your client's CV and compare. If client CV is materially higher, this increases audit risk because proportionally more value sits in unpredictable, high-variance transactions.

✔ Solution 5 — Grouped Data Statistics & Audit Risk Note
Q5: Approximate Statistics → Audit Decision
EXCEL — Freq in B2:B6, Midpoints in C2:C6
1
B: Freq
C: Mid
D: m×f
E: f(m−x̄)²
2
12
250
=B2*C2
=B2*(C2-$G$2)^2
8
57
=SUM(D2:D6)
=SUM(E2:E6)
G2
Approx. Mean
=D8/57
G3
Approx. Variance
=E8/(57-1)
G4
Approx. Std Dev
=SQRT(G3)
G5
CV (%)
=G4/G2*100
Approx. Mean = 103,000/57 = $1,807 Actual Mean = $1,847 Error = (1847−1807)/1847 = 2.2% ✓ Approx. SD$1,953 Client CV = 1953/1807 × 100 = 108.1% Benchmark CV = 400/1200 × 100 = 33.3% Client CV is 3.2× the benchmark → high dispersion
1
Estimate mean
$1,807
2.2% error
acceptable
2
Estimate spread
SD ≈ $1,953
CV = 108%
very high
3
Benchmark
Client CV
3.2× above
benchmark
4
Decision
Expand
sample ✓
📋 Audit Risk Note
Recommend expanding audit sample to 25% — elevated dispersion risk
Preliminary analytical procedures on the 57-invoice accounts payable population (grouped data) produce an estimated mean of $1,807 (2.2% from the actual $1,847) and an approximate standard deviation of $1,953 — yielding a CV of 108%. This is 3.2 times the benchmark CV of 33.3%, indicating a highly heterogeneous invoice population where large transactions are underrepresented in a standard random sample. Five invoices exceed $5,000, representing only 8.8% of count but a disproportionate share of total value.
Action: expand sample from 15% (9 invoices) to 25% (15 invoices), ensuring all 5 invoices in the $5,000–$10,000 band are tested. Request raw data from client to refine the estimate prior to final reporting.
Excel Reference Card
Every function used across the five exercises — plus guidance on when to use each one
📋 Reference — Functions, Choices & Decision Rules
Excel & Decision Quick-Reference

Central Tendency Functions

=AVERAGE(range)
=MEDIAN(range)
=MODE.SNGL(range)
=TRIMMEAN(range, 2k/n)
=GEOMEAN(factors)−1

When to use which

  • Mean — summing to a total (budget)
  • Median — typical individual (skewed data)
  • Geomean — compounding growth (CAGR)

Position

=QUARTILE.INC(r,1/2/3)
=PERCENTILE.INC(r, k)

Spread Functions

=STDEV.S(range) ← sample
=STDEV.P(range) ← population
=VAR.S / VAR.P
=MAX(r)−MIN(r) ← range
=STDEV.S/AVERAGE*100 ← CV

Outlier Fences

=Q1−1.5*IQR ← lower
=Q3+1.5*IQR ← upper
=IF(MAX(r)>fence,"YES","NO")

Relationships

=CORREL(x_range, y_range)
=COVARIANCE.S(x, y)

Decision Rules Summary

StatDecision signal
Mean > MedianRight skew — report median
Value > Upper fenceFormal outlier — investigate or remove
CV > 30%High relative spread — check for banding issues
r < −0.8Strong negative link — consider causal study
Geomean < Arith meanVolatile growth — geomean is the correct CAGR
Approx. error < 5%Grouped data estimate acceptable for preliminary use

The three-step professional output

1. State the result: "The CV is 18.2%."
2. Interpret it: "This indicates above-average relative dispersion."
3. Make the call: "Recommend a pay equity audit for the Finance team."

A statistical result that doesn't end in a decision or recommendation is incomplete. Every number in professional practice is evidence for or against an action.

Sample (.S) vs Population (.P)

Use .S when your data is a sample from a larger group (default in most business settings). Use .P only when you have data on every member of the population — rare in practice.