Sample spaces, events, and measuring uncertainty with relative frequencies
Probability measures the likelihood of an outcome occurring. It always lies between 0 and 1.
If A occurs 1050 times out of 3000 total: P(A) = 1050 / 3000 = 0.35
The complement of event A (written A′ or Ā) contains all outcomes not in A.
The event and its complement together exhaust the sample space: P(A) + P(A′) = 1
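A minimal Python sketch of the relative-frequency calculation and the complement rule, using the 1050-out-of-3000 figures from this slide. The function name `relative_frequency` is an illustrative choice, not part of the course material.

```python
from fractions import Fraction


def relative_frequency(occurrences: int, trials: int) -> Fraction:
    """Estimate P(A) as the share of trials in which A occurred."""
    if trials <= 0 or not 0 <= occurrences <= trials:
        raise ValueError("need 0 <= occurrences <= trials and trials > 0")
    return Fraction(occurrences, trials)


p_a = relative_frequency(1050, 3000)  # P(A)
p_not_a = 1 - p_a                     # P(A'), by the complement rule

print(float(p_a))      # 0.35
print(float(p_not_a))  # 0.65
```

Exact fractions are used so that P(A) + P(A′) equals 1 without floating-point rounding.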
Organising two-variable data to find joint and marginal probabilities
A contingency table cross-tabulates counts for two categorical variables. Divide every cell by the grand total to produce a probability matrix.
PP4.12: n = 50 cases
| | D | E | Total |
|---|---|---|---|
| A | 16 | 8 | 24 |
| B | 10 | 6 | 16 |
| C | 8 | 2 | 10 |
| Total | 34 | 16 | 50 |
Divide by 50 →
| | D | E | Total |
|---|---|---|---|
| A | 0.32 | 0.16 | 0.48 |
| B | 0.20 | 0.12 | 0.32 |
| C | 0.16 | 0.04 | 0.20 |
| Total | 0.68 | 0.32 | 1.00 |
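The divide-by-grand-total step can be sketched in Python as follows. The counts are the PP4.12 values above; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts from the PP4.12 contingency table (rows A-C, columns D-E).
counts = {
    "A": {"D": 16, "E": 8},
    "B": {"D": 10, "E": 6},
    "C": {"D": 8, "E": 2},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: every cell divided by the grand total.
joint = {
    r: {c: n / grand_total for c, n in row.items()}
    for r, row in counts.items()
}

# Marginal probabilities: row sums and column sums of the joint table.
row_marginal = {r: sum(row.values()) for r, row in joint.items()}
col_marginal = {c: sum(joint[r][c] for r in joint) for c in ("D", "E")}

for r in joint:
    print(r, {c: round(p, 2) for c, p in joint[r].items()}, round(row_marginal[r], 2))
print("Total", {c: round(p, 2) for c, p in col_marginal.items()})
```

The interior cells are the joint probabilities; the row and column sums are the marginals shown in the Total row and column.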
Finding the probability of A or B occurring (union)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B). We subtract the intersection to avoid double-counting outcomes in both events.
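As an illustration, the sketch below applies the addition rule to the PP4.12 table and checks it against a direct count of the cells. The choice of row A and column D as the two events is an assumption made for this example, not a question from the slides.

```python
from fractions import Fraction

# PP4.12 counts (rows A-C, columns D-E), n = 50.
counts = {
    "A": {"D": 16, "E": 8},
    "B": {"D": 10, "E": 6},
    "C": {"D": 8, "E": 2},
}
n = 50

p_a = Fraction(sum(counts["A"].values()), n)                 # marginal of row A
p_d = Fraction(sum(row["D"] for row in counts.values()), n)  # marginal of column D
p_a_and_d = Fraction(counts["A"]["D"], n)                    # intersection cell

# Addition rule: subtract the intersection so cell (A, D) is counted once.
p_a_or_d = p_a + p_d - p_a_and_d

# Cross-check: add up every cell that lies in row A or column D.
direct = Fraction(
    sum(cnt for r, row in counts.items() for c, cnt in row.items()
        if r == "A" or c == "D"),
    n,
)

print(float(p_a_or_d))  # 0.84
```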
Updating probabilities when we know something has already occurred
P(A | B) reads: “the probability of A given that B has already occurred.” It is computed as P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0.
PP4.22: Given P(A) = 0.40, P(B | A) = 0.25:
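The slide does not show what PP4.22 goes on to ask. One quantity that follows directly from the two given values is the joint probability, obtained by rearranging the conditional-probability definition into the multiplication rule P(A ∩ B) = P(A) × P(B | A). A minimal sketch:

```python
from fractions import Fraction

p_a = Fraction("0.40")          # P(A), as given in PP4.22
p_b_given_a = Fraction("0.25")  # P(B | A), as given in PP4.22

# Multiplication rule: P(A and B) = P(A) * P(B | A).
p_a_and_b = p_a * p_b_given_a

print(float(p_a_and_b))  # 0.1
```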
120 workers; 70 start early, 50 start late. Of early starters, 30 prefer 24°C (40 prefer 20°C). Of late starters, 23 prefer 20°C (27 prefer 24°C).
| | 24°C | 20°C | Total |
|---|---|---|---|
| Early | 30 | 40 | 70 |
| Late | 27 | 23 | 50 |
| Total | 57 | 63 | 120 |
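A sketch of how conditional probabilities can be read off the worker table: restrict attention to the row or column that is given, then take the share of that subtotal. The two probabilities computed here are illustrative choices, and the helper `conditional` is a hypothetical name.

```python
from fractions import Fraction

# Cell counts keyed by (start time, preferred temperature).
counts = {
    ("Early", "24C"): 30, ("Early", "20C"): 40,
    ("Late", "24C"): 27, ("Late", "20C"): 23,
}


def conditional(event, given) -> Fraction:
    """P(event | given); both arguments are predicates on a cell key."""
    given_total = sum(n for key, n in counts.items() if given(key))
    both = sum(n for key, n in counts.items() if given(key) and event(key))
    return Fraction(both, given_total)


# P(prefers 24C | starts early) = 30 / 70
p_24_given_early = conditional(lambda k: k[1] == "24C", lambda k: k[0] == "Early")

# P(starts early | prefers 24C) = 30 / 57
p_early_given_24 = conditional(lambda k: k[0] == "Early", lambda k: k[1] == "24C")

print(round(float(p_24_given_early), 4))  # 0.4286
print(round(float(p_early_given_24), 4))  # 0.5263
```

The two results differ because the conditioning event changes the denominator: 70 early starters in the first case, 57 workers preferring 24°C in the second.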
When knowing one event tells us nothing about another
Events A and B are independent if knowing B occurred gives no information about A: P(A | B) = P(A), or equivalently P(A ∩ B) = P(A) × P(B).
| | Visited | Not visited | Total |
|---|---|---|---|
| Children <10 | 160 | 80 | 240 |
| No children | 40 | 120 | 160 |
| Total | 200 | 200 | 400 |
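Both forms of the independence test can be sketched in Python using the counts in the table above. The variable names are illustrative.

```python
from fractions import Fraction

total = 400
visited_with_children = 160  # cell: Children <10 and Visited
children_total = 240         # row total: Children <10
visited_total = 200          # column total: Visited

p_visited = Fraction(visited_total, total)        # P(Visited) = 1/2
p_children = Fraction(children_total, total)      # P(Children <10) = 3/5
p_joint = Fraction(visited_with_children, total)  # P(Visited and Children <10) = 2/5

# Test 1: does conditioning on "Children <10" change P(Visited)?
p_visited_given_children = p_joint / p_children   # 2/3, not 1/2

# Test 2: does the joint probability equal the product of the marginals?
independent = p_joint == p_visited * p_children   # 2/5 vs 3/10

print(float(p_visited_given_children), float(p_visited))
print("Independent" if independent else "NOT independent")  # NOT independent
```

Households with children under 10 visited at a rate of 2/3 against 1/2 overall, so the two events are not independent in this table.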
Reversing the direction of conditioning to update prior beliefs
P(A|B) = P(B|A)P(A) / P(B), where P(B) = P(B|A)P(A) + P(B|A′)P(A′)
Permutations, combinations, and the fundamental counting principle
If there are m ways to do one thing and n ways to do another, there are m × n ways to do both.
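A short Python sketch of the three counting ideas named in this slide's title. All numbers and labels below are hypothetical, chosen only for illustration; `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math
from itertools import product

# Hypothetical example: 3 report templates and 4 sign-off options.
templates = ["T1", "T2", "T3"]
signoffs = ["S1", "S2", "S3", "S4"]

# Fundamental counting principle: m x n ways to choose one of each.
pairs = list(product(templates, signoffs))
assert len(pairs) == len(templates) * len(signoffs)

# Permutations: ordered selections of r items from n, n! / (n - r)!
ordered = math.perm(5, 3)

# Combinations: unordered selections of r items from n, n! / (r! (n - r)!)
unordered = math.comb(5, 3)

print(len(pairs), ordered, unordered)  # 12 60 10
```

Each combination of 3 items can be arranged in 3! = 6 orders, which is why the permutation count is six times the combination count.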
Build probability skills in Excel using realistic accounting and audit scenarios. Complete each exercise before checking the step-by-step solutions at the end of this deck.
| | Compliant | Minor Issues | Major Issues | Total |
|---|---|---|---|---|
| Financial Services | 48 | 12 | 4 | 64 |
| Retail | 52 | 18 | 6 | 76 |
| Construction | 32 | 20 | 8 | 60 |
| Total | 132 | 50 | 18 | 200 |
| | Never missed | Missed ≥1 | Total |
|---|---|---|---|
| Platinum Card | 140 | 35 | 175 |
| Standard Card | 220 | 105 | 325 |
| Total | 360 | 140 | 500 |
| Prior Information | |
|---|---|
| P(Fraud) | 3% of all transactions are fraudulent |
| P(Flag \| Fraud) | System correctly flags 92% of fraudulent transactions |
| P(Flag \| No Fraud) | System incorrectly flags 6% of legitimate transactions (false positive rate) |
Detailed instructions for all three exercises, including every formula to type
Exercise 1 (industry compliance table)
In A1 type Industry. In B1: Compliant, C1: Minor Issues, D1: Major Issues, E1: Total.
Enter the industry labels in A2:A4 (Financial Services, Retail, Construction) and row totals in E2:E4. Enter all count data in B2:D4.
Type =SUM(B2:B4) in B5, then copy across to D5. Grand total in E5: =SUM(E2:E4). Verify it equals 200.
In G1, start a duplicate table. In H2: =B2/$E$5. Note the $ signs lock the grand total. Copy this formula across H2:J4.
In K2: =SUM(H2:J2). Copy to K3:K4.
In H5: =SUM(H2:H4). Copy to I5:J5.
P(Retail ∪ Major Issues): =K3+J5-J3 (Retail marginal + Major Issues marginal − intersection cell). Should return 0.44.
P(Major Issues | Construction): =J4/K4
Use =IF(ABS(J4/K4-J5)<0.001,"Independent","NOT Independent") to automate the test.
Alternatively, =K4*J5 gives 0.0270; compare to =J4 which gives 0.0400. Use =IF(ABS(J4-(K4*J5))<0.001,"Independent","NOT Independent").
Exercise 2 (card type and missed payments)
Enter the counts in B2:C3 with row and column headers. Grand total 500 in D4.
Type =B2/$D$4 in F2, copy to F2:G3. Row/column marginals via =SUM().
P(Missed | Platinum): =G2/H2 where G2 = joint, H2 = Platinum marginal.
P(Missed ∩ Platinum) = P(Missed|Platinum) × P(Platinum) = 0.20 × 0.35 = 0.0700
=G2/H2*H2 should equal =G2. Use this to confirm your formulas are consistent.
Exercise 3 (fraud detection with Bayes’ theorem)
Enter the headers. A1: Hypothesis, B1: Prior P(H), C1: Likelihood P(Flag|H), D1: Joint P(Flag ∩ H), E1: Posterior P(H|Flag).
In A2:A3: “Fraud” and “Not Fraud”.
In B2:B3: 0.03 and 0.97. Confirm they sum to 1: =SUM(B2:B3) should return 1.
In C2:C3: 0.92 (true positive rate) and 0.06 (false positive rate).
In D2: =B2*C2. In D3: =B3*C3.
In D4: =SUM(D2:D3). This is P(Flagged).
In E2: =D2/$D$4. In E3: =D3/$D$4. Verify =SUM(E2:E3) equals 1.
Change B2 from 0.03 to 0.10 (and B3 to 0.90). All formulas update automatically because you used cell references.
| P(Fraud) base rate | P(Fraud \| Flagged) |
|---|---|
| 1% | 0.1341 |
| 3% | 0.3217 |
| 10% | 0.6301 |
| 20% | 0.7931 |
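The spreadsheet calculation can be cross-checked with a short Python sketch that recomputes the posterior for each base rate in the table. The 92% and 6% rates come from the exercise; the function name `posterior` and its parameter names are illustrative assumptions.

```python
def posterior(prior: float,
              true_positive_rate: float = 0.92,
              false_positive_rate: float = 0.06) -> float:
    """P(Fraud | Flagged) by Bayes' theorem for a given base rate P(Fraud)."""
    joint_fraud = prior * true_positive_rate         # P(Flag and Fraud)
    joint_legit = (1 - prior) * false_positive_rate  # P(Flag and No Fraud)
    p_flagged = joint_fraud + joint_legit            # P(Flagged), total probability
    return joint_fraud / p_flagged


base_rates = (0.01, 0.03, 0.10, 0.20)
results = {rate: posterior(rate) for rate in base_rates}

for rate, value in results.items():
    print(f"{rate:.0%}  {value:.4f}")
```

The function mirrors the spreadsheet columns: the two joint terms correspond to D2 and D3, their sum to D4, and the returned ratio to E2.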
This week connected probability theory to real accounting practice: reading contingency tables, applying the addition and multiplication rules, testing independence, and updating beliefs using Bayes’ theorem. The fraud detection exercise demonstrates why base rates matter, a lesson central to audit risk assessment and forensic accounting.