Statistics for Accounting — Week 4

Introduction
to Probability

Sample spaces, conditional probability, Bayes’ theorem, and counting rules

3 Excel exercises with step-by-step solutions included

By the end of this week, you will be able to…

Demonstrate understanding of basic probability concepts including sample spaces, events, contingency tables, marginal and joint probability

Use and compute conditional probabilities from contingency tables and probability matrices

Apply the addition and multiplication rules to find union and intersection probabilities

Explain and demonstrate statistical independence and mutual exclusivity

Apply Bayes’ theorem to update probabilities based on new evidence

Use counting rules—permutations and combinations—to count sample space outcomes

Section 01

Probability Foundations

Sample spaces, events, and measuring uncertainty with relative frequencies

What is Probability?

Probability measures the likelihood of an outcome occurring. It always lies between 0 and 1.

Core Definition

P(Event) = Number of favourable outcomes / Total outcomes in sample space

Key Vocabulary

Experiment — any process producing an outcome
Sample Space (S) — set of all possible outcomes
Event — a subset of the sample space
Elementary event — exactly one outcome

Accounting context: When an auditor randomly selects a transaction to test, the sample space is all transactions in the population. The event “transaction contains an error” is a subset of that space.

Relative Frequency (PP4.1)

If A occurs 1050 times out of 3000 total:

P(A) = 1050 / 3000 = 0.35

The total of all mutually exclusive probabilities must sum to 1.0.

Complement Rule

The complement of event A (written A′ or Â) contains all outcomes not in A.

Complement Rule

P(A′) = 1 − P(A)

The event and its complement together exhaust the sample space:

P(A) + P(A′) = 1

PP4.8: If German parts arrive late with P = 0.08, then the probability of not arriving late = 1 − 0.08 = 0.92.

Why it matters for accounting: If you know the probability of a transaction being correct, the complement gives you the error probability—useful for audit risk assessment.

Section 02

Contingency Tables & Probability Matrices

Organising two-variable data to find joint and marginal probabilities

Contingency Tables

A contingency table cross-tabulates counts for two categorical variables. Divide every cell by the grand total to produce a probability matrix.

PP4.12: n = 50 cases

	D	E	Total
A	16	8	24
B	10	6	16
C	8	2	10
Total	34	16	50

Divide by 50 →

	D	E	Total
A	0.32	0.16	0.48
B	0.20	0.12	0.32
C	0.16	0.04	0.20
Total	0.68	0.32	1.00

Reading the matrix

Marginal probability — row or column totals
e.g. P(A) = 0.48
Joint probability — interior cells
e.g. P(A ∩ E) = 0.16 (highlighted)
Grand total — always 1.00

Marginal & Joint

P(A) = 24/50 = 0.48 [marginal]
P(A ∩ E) = 8/50 = 0.16 [joint]

Think of it like this: marginal probabilities live in the margins; joint probabilities live in the interior cells.

Section 03

The Addition Rule

Finding the probability of A or B occurring (union)

Addition Rule — P(A ∪ B)

General Addition Rule

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

We subtract the intersection to avoid double-counting outcomes in both events.

Mutually Exclusive Events (special case)

If A ∩ B = ∅, then:
P(A ∪ B) = P(A) + P(B)

PP4.20 Health Survey:
P(cancer) = 0.60, P(heart) = 0.29, P(both) = 0.25
P(cancer or heart) = 0.60 + 0.29 − 0.25 = 0.64

Mutually exclusive means events cannot happen at the same time. E.g., a company cannot simultaneously be “profitable” and “at a loss” in the same period.

Section 04

Conditional Probability & the Multiplication Rule

Updating probabilities when we know something has already occurred

Conditional Probability

P(A | B) reads: “the probability of A given that B has already occurred.”

Conditional Probability

P(A | B) = P(A ∩ B) / P(B)

Multiplication Rule (rearranged)

P(A ∩ B) = P(A | B) × P(B)
= P(B | A) × P(A)

PP4.22: Given P(A) = 0.40, P(B | A) = 0.25:

P(A ∩ B) = 0.25 × 0.40 = 0.10

Activity 4.1 — Office Temperature

120 workers; 70 start early, 50 start late. Of early starters, 30 prefer 24°C (40 prefer 20°C). Of late starters, 23 prefer 20°C (27 prefer 24°C).

	24°C	20°C	Total
Early	30	40	70
Late	27	23	50
Total	57	63	120

P(Early | 20°C) = P(Early ∩ 20°C) / P(20°C)
= (40/120) / (63/120) = 40/63 ≈ 0.635

Section 05

Statistical Independence

When knowing one event tells us nothing about another

Statistical Independence

Events A and B are independent if knowing B occurred gives no information about A.

Test for Independence

P(A | B) = P(A) or equivalently:
P(A ∩ B) = P(A) × P(B)

PP4.30 — Museum Visits

	Visited	Not visited	Total
Children <10	160	80	240
No children	40	120	160
Total	200	200	400

Independence test:

P(Visited) = 200/400 = 0.50
P(Visited | Children) = 160/240 = 0.667

0.667 ≠ 0.50 → NOT independent

Having young children increases the probability of visiting a museum — the variables are associated.

Independence ≠ Mutually exclusive. Independent events can both occur. Mutually exclusive events cannot. A company having both a tax issue and an OH&S breach in the same year is possible (not mutually exclusive), but the two events may or may not be statistically independent.

Section 06

Bayes’ Theorem

Reversing the direction of conditioning to update prior beliefs

Bayes’ Theorem

P(A | B) = P(B | A) × P(A)
——————————
P(B)

Where P(B) = P(B|A)P(A) + P(B|A′)P(A′)

Activity 4.3 — Traffic Violations

P(age ≥30) = 0.60 | P(age <30) = 0.40
P(violation | ≥30) = 0.04
P(violation | <30) = 0.10
Given a violation → P(under 30)?

Solution

P(B) = 0.04×0.60 + 0.10×0.40
      = 0.024 + 0.040 = 0.064

P(<30 | violation) = (0.10 × 0.40) / 0.064
                     = 0.040 / 0.064
                     = 0.625

Although drivers under 30 are only 40% of all drivers, they account for 62.5% of all violations — a powerful result from updating our prior belief with new evidence.

Accounting use: Bayes’ theorem underpins audit sampling—updating the probability of material misstatement after each audit procedure.

Section 07

Counting Rules

Permutations, combinations, and the fundamental counting principle

Counting Rules

Fundamental Counting Principle

If there are m ways to do one thing and n ways to do another, there are m × n ways total.

Act 4.6 (Licence Plates):
26³ × 10³ = 17,576,000 plates

Permutations (order matters)

nPr Formula

ⁿP_r = n! / (n−r)!

Act 4.5 (Race): 9P3 = 9!/(9−3)! = 504 ways to award 1st, 2nd, 3rd

Combinations (order does NOT matter)

nCr Formula

ⁿC_r = n! / [r! × (n−r)!]

Act 4.7 (Lottery): 45C6 = 45! / (6! × 39!) = 8,145,060 combinations

Memory tip:
Permutation = Position matters (podium places).
Combination = Choosing a committee (no ranking).

Act 4.8 (High Court):
7C3 × 4C2 = 35 × 6 = 210 ways

Excel Exercises

Three Applied Exercises

Build probability skills in Excel using realistic accounting and audit scenarios. Complete each exercise before checking the step-by-step solutions at the end of this deck.

Exercise 1

Excel

Audit Compliance Probability Matrix

Scenario: You are an audit manager at a mid-tier accounting firm. Your team reviewed 200 client files, classified by industry and compliance status after the most recent audit.

	Compliant	Minor Issues	Major Issues	Total
Financial Services	48	12	4	64
Retail	52	18	6	76
Construction	32	20	8	60
Total	132	50	18	200

Your Tasks

A
Build this table in Excel and create a probability matrix (divide all cells by the grand total)
B
Compute marginal probabilities for each industry and each compliance status
C
Find P(Major Issues) and P(Construction ∩ Major Issues)
D
Using the addition rule, find P(Retail ∪ Major Issues)
E
Find P(Major Issues | Construction) and compare it to P(Major Issues). Are compliance issues and being in Construction independent?

Exercise 2

Excel

Credit Card Ownership & Conditional Probability

Scenario: A financial services firm surveys 500 clients about credit card ownership and whether they have ever missed a payment. Use conditional probability to assess default risk by card type.

	Never missed	Missed ≥1	Total
Platinum Card	140	35	175
Standard Card	220	105	325
Total	360	140	500

Your Tasks

A
Convert the table to a probability matrix in Excel
B
Compute P(Missed ≥1 payment) and P(Platinum Card)
C
Calculate P(Missed | Platinum) and P(Missed | Standard) using the conditional probability formula
D
Test whether card type and payment behaviour are statistically independent. Show your working.
E
Use the multiplication rule to verify P(Platinum ∩ Missed) directly from your conditional probabilities

Exercise 3

Excel

Fraud Detection with Bayes’ Theorem

Scenario: A forensic accounting team uses an automated AI-based transaction screening tool to flag potentially fraudulent transactions. The following performance data is known from prior experience:

Prior Information
P(Fraud)	3% of all transactions are fraudulent
P(Flag \| Fraud)	System correctly flags 92% of fraudulent transactions
P(Flag \| No Fraud)	System incorrectly flags 6% of legitimate transactions (false positive rate)

Key question: An invoice has been flagged. What is the actual probability that it is fraudulent?

Your Tasks

A
Set up a Bayes table in Excel with columns: Hypothesis, Prior P(H), Likelihood P(Flag|H), Joint P(Flag ∩ H), Posterior P(H|Flag)
B
Calculate the total probability of being flagged P(Flag) using the law of total probability
C
Apply Bayes’ theorem to find P(Fraud | Flagged)
D
Interpret the result. Why is the posterior probability so much lower than you might expect given the system’s 92% detection rate?
E
Sensitivity analysis: In Excel, change the base rate P(Fraud) from 3% to 10%. How does P(Fraud | Flagged) change? What does this tell you about audit risk?

Solutions

Step-by-Step Excel Solutions

Detailed instructions for all three exercises, including every formula to type

Exercise 1 — Solution Part A–C

Excel

Audit Compliance Matrix — Setup & Marginal Probabilities

Step 1: Enter the Raw Data

In cell A1 type Industry. In B1: Compliant, C1: Minor Issues, D1: Major Issues, E1: Total.

Enter industry names in A2:A4 (Financial Services, Retail, Construction) and row totals in E2:E4. Enter all count data in B2:D4.

In row 5, compute column totals: =SUM(B2:B4) in B5, then copy across to D5. Grand total in E5: =SUM(E2:E4). Verify it equals 200.

Step 2: Build the Probability Matrix

In cell G1, start a duplicate table. In H2: =B2/$E$5. Note the $ signs lock the grand total. Copy this formula across H2:J4.

Format all probability cells: select range, right-click → Format Cells → Number → 4 decimal places.

Step 3: Marginal & Joint Probabilities

Row marginals (industry totals): in K2: =SUM(H2:J2). Copy to K3:K4.

Column marginals (compliance status): in H5: =SUM(H2:H4). Copy to I5:J5.

Answers for Part C

P(Major Issues)18/200 = 0.0900

P(Construction ∩ Major Issues)8/200 = 0.0400

Answer for Part D — Addition Rule

P(Retail ∪ Major) = P(Retail) + P(Major) − P(Retail ∩ Major)
= 0.38 + 0.09 − 0.03 = 0.44

In a new cell: =K3+J5-J3 (Retail marginal + Major Issues marginal − intersection cell). Should return 0.44.

Exercise 1 — Solution Part E

Excel

Audit Compliance Matrix — Independence Test

Step 4: Conditional Probability

P(Major Issues | Construction) = P(Construction ∩ Major Issues) / P(Construction)
In Excel: =J4/K4

P(Major | Construction)0.04 / 0.30 = 0.1333

P(Major Issues) [marginal]0.0900

Independence check:
If independent: P(Major | Construction) should equal P(Major Issues)
0.1333 ≠ 0.0900 → NOT independent

In a labelled cell, type: =IF(ABS(J4/K4-J5)<0.001,"Independent","NOT Independent") to automate the test.

Interpretation

Construction clients are 48% more likely to have Major Issues than the overall population suggests (13.3% vs 9.0%). This is an actionable audit insight: Construction should receive higher-intensity audit procedures.

Alternative independence check

If independent: P(A ∩ B) = P(A) × P(B)
Expected P(Construction ∩ Major) = 0.30 × 0.09 = 0.027
Actual P(Construction ∩ Major) = 0.04
0.04 ≠ 0.027 → NOT independent

In Excel: =K4*J5 gives 0.0270; compare to =J4 which gives 0.0400. Use =IF(ABS(J4-(K4*J5))<0.001,"Independent","NOT Independent").

Exercise 2 — Solution

Excel

Credit Card Ownership — Full Solution

Setup

Enter the raw counts in B2:C3 with row and column headers. Grand total 500 in D4.

Probability matrix: =B2/$D$4 in F2, copy to F2:G3. Row/column marginals via =SUM().

Part B — Marginal Probabilities

P(Missed ≥1 payment)140/500 = 0.2800

P(Platinum Card)175/500 = 0.3500

Part C — Conditional Probabilities

P(Missed | Platinum) = P(Platinum ∩ Missed) / P(Platinum)
In Excel: =G2/H2 where G2 = joint, H2 = Platinum marginal.

P(Missed | Platinum)35/175 = 0.2000

P(Missed | Standard)105/325 = 0.3231

Part D — Independence Test

P(Missed) = 0.2800
P(Missed | Platinum) = 0.2000
P(Missed | Standard) = 0.3231

0.2000 ≠ 0.2800 → NOT independent

Platinum card holders miss payments at a lower rate (20%) than the overall population (28%). Card type and payment behaviour are associated. Standard cardholders carry more default risk.

Part E — Multiplication Rule Verification

P(Platinum ∩ Missed) via multiplication rule:
= P(Missed|Platinum) × P(Platinum)
= 0.20 × 0.35 = 0.0700

Verify directly: 35/500 = 0.0700. Both methods agree.

In Excel: =G2/H2*H2 should equal =G2. Use this to confirm your formulas are consistent.

Exercise 3 — Solution Parts A–C

Excel

Fraud Detection — Bayes Table Setup

Step 1: Set Up the Bayes Table

Create the following column headers in row 1:
A1: Hypothesis
B1: Prior P(H)
C1: Likelihood P(Flag|H)
D1: Joint P(Flag ∩ H)
E1: Posterior P(H|Flag)

Enter hypotheses in A2:A3: “Fraud” and “Not Fraud”.

Priors in B2:B3: 0.03 and 0.97. Confirm they sum to 1: =SUM(B2:B3) should return 1.

Likelihoods in C2:C3: 0.92 (true positive rate) and 0.06 (false positive rate).

Step 2: Compute Joints & Posterior

Joint probabilities in D2: =B2*C2. In D3: =B3*C3.
These represent P(Flagged AND Fraud) and P(Flagged AND Not Fraud).

Total probability of being flagged in D4: =SUM(D2:D3). This is P(Flagged).

Posterior probabilities in E2: =D2/$D$4. In E3: =D3/$D$4. Verify =SUM(E2:E3) equals 1.

Answers

P(Flag) = total probability0.0276 + 0.0582 = 0.0858

P(Fraud | Flagged)0.0276 / 0.0858 = 0.3217

P(Not Fraud | Flagged)0.0582 / 0.0858 = 0.6783

Exercise 3 — Solution Parts D–E

Excel

Fraud Detection — Interpretation & Sensitivity

Part D — Why only 32%?

Even though the system detects 92% of fraud, a flagged transaction is only fraudulent 32% of the time. This is because fraud is very rare (only 3% base rate), so the absolute number of false positives overwhelms the true positives.

Out of 10,000 transactions:
• 300 fraudulent → 276 correctly flagged (TP)
• 9,700 legitimate → 582 incorrectly flagged (FP)

Of 858 total flags: only 276 are real fraud
276/858 = 0.32 → 32% precision

This is the base rate fallacy — a high detection rate does not guarantee a high posterior probability when the prior is low. Critical lesson for audit analytics.

Part E — Sensitivity Analysis in Excel

Change B2 from 0.03 to 0.10 (and B3 to 0.90). All formulas update automatically because you used cell references.

P(Fraud) base rate	P(Fraud \| Flagged)
1%	0.1341
3%	0.3217
10%	0.6311
20%	0.7931

Insight for auditors: When reviewing a high-risk client (higher P(Fraud) prior), the same flagging tool becomes far more reliable. Audit resources should be concentrated where fraud base rates are known to be elevated — exactly the rationale for risk-based auditing.

To automate a sensitivity table: use Excel’s Data → What-If Analysis → Data Table with B2 as the row input cell and E2 as the output formula.

Week 4 Complete

Summary

This week connected probability theory to real accounting practice: from reading contingency tables, applying the addition and multiplication rules, testing independence, through to updating beliefs using Bayes’ theorem. The fraud detection exercise demonstrates why base rates matter — a lesson central to audit risk assessment and forensic accounting.

Exercise 1

Audit Compliance
Probability Matrix

Exercise 2

Credit Card
Conditional Probability

Exercise 3

Fraud Detection
Bayes’ Theorem

Introductionto Probability

By the end of this week, you will be able to…

Probability Foundations

What is Probability?

Key Vocabulary

Relative Frequency (PP4.1)

Complement Rule

Contingency Tables & Probability Matrices

Contingency Tables

Reading the matrix

The Addition Rule

Addition Rule — P(A ∪ B)

Conditional Probability & the Multiplication Rule

Conditional Probability

Activity 4.1 — Office Temperature

Statistical Independence

Statistical Independence

PP4.30 — Museum Visits

Independence test:

Bayes’ Theorem

Bayes’ Theorem

Activity 4.3 — Traffic Violations

Counting Rules

Counting Rules

Fundamental Counting Principle

Permutations (order matters)

Combinations (order does NOT matter)

Three Applied Exercises

Audit Compliance Probability Matrix

Your Tasks

Credit Card Ownership & Conditional Probability

Your Tasks

Fraud Detection with Bayes’ Theorem

Your Tasks

Step-by-Step Excel Solutions

Audit Compliance Matrix — Setup & Marginal Probabilities

Step 1: Enter the Raw Data

Step 2: Build the Probability Matrix

Step 3: Marginal & Joint Probabilities

Answers for Part C

Answer for Part D — Addition Rule

Audit Compliance Matrix — Independence Test

Step 4: Conditional Probability

Interpretation

Alternative independence check

Credit Card Ownership — Full Solution

Setup

Part B — Marginal Probabilities

Part C — Conditional Probabilities

Part D — Independence Test

Part E — Multiplication Rule Verification

Fraud Detection — Bayes Table Setup

Step 1: Set Up the Bayes Table

Step 2: Compute Joints & Posterior

Answers

Fraud Detection — Interpretation & Sensitivity

Part D — Why only 32%?

Part E — Sensitivity Analysis in Excel

Summary

Introduction
to Probability