Week 10: Chi-Squared Tests

Are the proportions what we expected? Are two variables related?

Introductory Statistics for Accounting

Three applications of the \(\chi^2\) distribution:

  1. Goodness-of-Fit Test
  2. Test of Independence
  3. Test of Homogeneity

Section 1

The Chi-Squared Distribution

1.1 What is the Chi-Squared Test?

The chi-squared (\(\chi^2\)) test is a hypothesis test for categorical data. It compares what we observed in our sample to what we would expect if a particular hypothesis were true.

Core idea: If the observed frequencies are close to the expected frequencies, the \(\chi^2\) statistic will be small — no evidence against \(H_0\). If they differ substantially, the statistic will be large — evidence against \(H_0\).

The test statistic is always computed the same way:

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

where \(O_i\) = observed frequency in category \(i\), and \(E_i\) = expected frequency in category \(i\).

Key properties: The \(\chi^2\) statistic is always non-negative. Larger values mean bigger discrepancies between observed and expected. The test is always right-tailed (we only reject in the upper tail).
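
The formula translates directly into a few lines of Python. The sketch below is illustrative only: the function name `chi_squared_statistic` and the two sets of counts are hypothetical, chosen to show that a close fit gives a small statistic and a poor fit gives a large one.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts (not from the slides): 100 observations, 4 categories.
close_fit = chi_squared_statistic([26, 24, 25, 25], [25, 25, 25, 25])
poor_fit = chi_squared_statistic([40, 10, 30, 20], [25, 25, 25, 25])

print(round(close_fit, 2), round(poor_fit, 2))  # 0.08 20.0
```

Every term is a squared difference divided by a positive expected count, which is why the result can never be negative.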

1.2 The Chi-Squared Distribution

The \(\chi^2\) distribution is a family of curves, each defined by its degrees of freedom (df). As df increases, the distribution shifts right and becomes more symmetric.

Figure 1.2: Chi-squared density curves for df = 2, 5, and 10 (horizontal axis: χ² value, 0 to 20; vertical axis: density).
Mean = df and Variance = 2 × df. As df grows large, the \(\chi^2\) distribution approaches a normal distribution.
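
One way to check the mean and variance is by simulation. The sketch below relies on a standard fact that the slides do not state: a \(\chi^2\) variable with df degrees of freedom can be generated as the sum of df squared standard normal draws. The function name, seed, and sample size are arbitrary choices for illustration.

```python
import random
import statistics


def simulate_chi_squared(df, n_draws, seed=0):
    """Draw n_draws values, each the sum of df squared standard normals."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


draws = simulate_chi_squared(df=5, n_draws=20000)
sample_mean = statistics.fmean(draws)
sample_variance = statistics.variance(draws)

# With df = 5 we expect a mean near 5 and a variance near 10.
print(round(sample_mean, 2), round(sample_variance, 2))
```

The simulated values will not match 5 and 10 exactly, but they should land close to them, and every draw is non-negative.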

1.3 General Testing Framework

All three chi-squared tests follow the same five-step hypothesis testing procedure from Week 9:

Step 1: State the null and alternative hypotheses (\(H_0\) and \(H_1\)).
Step 2: Choose the significance level (\(\alpha\)), usually 0.05.
Step 3: Compute the test statistic: \(\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}\)
Step 4: Find the critical value from the \(\chi^2\) table using the appropriate df, or compute the p-value.
Step 5: Decision: reject \(H_0\) if \(\chi^2 > \chi^2_{\alpha}\), or if p-value \(< \alpha\).
Assumption: All expected frequencies should be at least 5. If any \(E_i < 5\), combine adjacent categories before testing.
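
Steps 3 to 5 and the expected-frequency check can be sketched as one small function. This is an illustration under stated assumptions, not a prescribed implementation: the name `run_chi_squared_test` and the sample counts are hypothetical, and the dictionary holds standard upper-tail table values for \(\alpha = 0.05\) only.

```python
# Upper-tail critical values at alpha = 0.05 from a standard chi-squared table.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def run_chi_squared_test(observed, expected, df):
    """Apply Steps 3 to 5 at alpha = 0.05; return (statistic, critical, reject)."""
    # Assumption check: every expected frequency should be at least 5.
    if any(e < 5 for e in expected):
        raise ValueError("an expected frequency is below 5; combine categories first")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # Step 3
    critical = CRITICAL_VALUES_05[df]                                      # Step 4
    return statistic, critical, statistic > critical                       # Step 5


# Hypothetical data: 120 observations, three categories, equal proportions under H0.
statistic, critical, reject = run_chi_squared_test([55, 35, 30], [40, 40, 40], df=2)
print(statistic, critical, reject)  # 8.75 5.991 True
```

The function raises an error rather than silently testing when an expected count is too small, so the analyst has to decide which categories to combine.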

Knowledge Check — Section 1

Q1: Which values can the chi-squared test statistic take?
Answer: Only non-negative values. Because we square each difference \((O_i - E_i)^2\) and divide by a positive expected frequency, the statistic is always \(\geq 0\).
Q2: We reject \(H_0\) when the \(\chi^2\) statistic is:
Answer: Large, that is, greater than the critical value. A large \(\chi^2\) means observed and expected frequencies differ substantially. The test is always right-tailed.

Section 2

The Goodness-of-Fit Test

2.1 Goodness-of-Fit: Purpose and Hypotheses

Purpose: Test whether the distribution of a single categorical variable matches a claimed or expected set of proportions.

We have one variable with \(k\) categories. We observe frequencies in each category and compare them to what we'd expect under \(H_0\).

Hypotheses:

\(H_0\): The population proportions are \(p_1, p_2, \ldots, p_k\) (as specified).

\(H_1\): At least one proportion differs from the specified value.

Degrees of freedom: \(\text{df} = k - 1\), where \(k\) = number of categories.

Accounting context:

An audit firm claims that invoice errors are equally distributed across four quarters. You sample 200 error reports and count how many fall in each quarter. Does the data support the claim?

2.2 Worked Example: Invoice Error Distribution

BrightPath Financial Services audited 200 invoice processing errors over the past year and recorded which quarter each error occurred in.

| Quarter | Q1 | Q2 | Q3 | Q4 | Total |
|---|---|---|---|---|---|
| Observed (\(O_i\)) | 62 | 48 | 40 | 50 | 200 |
| Expected (\(E_i\)) | 50 | 50 | 50 | 50 | 200 |

If errors are equally distributed (\(H_0\): \(p_1 = p_2 = p_3 = p_4 = 0.25\)), we expect \(200 \times 0.25 = 50\) per quarter.
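
The expected counts are the sample size multiplied by each hypothesised proportion, \(E_i = n \times p_i\). A minimal sketch, with the hypothetical helper name `expected_counts`:

```python
def expected_counts(sample_size, proportions):
    """Return E_i = n * p_i for each proportion claimed under H0."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions under H0 must sum to 1")
    return [sample_size * p for p in proportions]


expected = expected_counts(200, [0.25, 0.25, 0.25, 0.25])
df = len(expected) - 1  # k - 1 categories are free to vary

print(expected, df)  # [50.0, 50.0, 50.0, 50.0] 3
```

The same helper works for unequal claimed proportions, provided they sum to 1.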

Step 3 — Compute the test statistic:

$$\chi^2 = \frac{(62-50)^2}{50} + \frac{(48-50)^2}{50} + \frac{(40-50)^2}{50} + \frac{(50-50)^2}{50}$$

$$= \frac{144}{50} + \frac{4}{50} + \frac{100}{50} + \frac{0}{50} = 2.88 + 0.08 + 2.00 + 0.00 = 4.96$$
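
The same arithmetic can be checked in Python using the counts from the table. The variable names are illustrative.

```python
observed = [62, 48, 40, 50]   # Q1 to Q4, from the BrightPath table
expected = [50, 50, 50, 50]   # 200 * 0.25 per quarter under H0

# One term per quarter: (O_i - E_i)^2 / E_i
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

print([round(c, 2) for c in contributions], round(statistic, 2))
# [2.88, 0.08, 2.0, 0.0] 4.96
```

Keeping the per-quarter terms visible shows that Q1 and Q3 account for almost all of the statistic.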

2.3 Decision and Interpretation

Step 4 — Critical value: With \(\text{df} = 4 - 1 = 3\) and \(\alpha = 0.05\):

\(\chi^2_{\text{critical}} = 7.815\)

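[Figure: \(\chi^2\) distribution with the critical value 7.815 marked. The test statistic 4.96 lies in the "Do not reject" region, to the left of the "Reject" region.]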

Step 5 — Decision: Since \(4.96 < 7.815\), we do not reject \(H_0\).
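
The p-value route from Step 4 leads to the same decision. The sketch below computes the upper-tail probability using only the standard library, through the closed-form series for integer degrees of freedom. The function name `chi_squared_p_value` is hypothetical, and a statistics package would normally be used instead.

```python
import math


def chi_squared_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for integer df >= 1."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be a positive integer and the statistic non-negative")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j = 0 .. df/2 - 1 of (x/2)^j / j!
        terms = (half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum over j = 1 .. (df-1)/2
    # of (x/2)^(j - 1/2) / Gamma(j + 1/2)
    terms = (
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


p_value = chi_squared_p_value(4.96, df=3)
print(round(p_value, 3), p_value < 0.05)
```

For the invoice data the p-value comes out at roughly 0.17, well above \(\alpha = 0.05\), which agrees with the critical-value comparison.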

Business interpretation: At the 5% significance level, there is insufficient evidence to conclude that invoice errors are unevenly distributed across quarters. This does not prove the errors are evenly spread, but the data give the audit team no statistical reason to prioritise seasonal staffing as a cause of errors.

Knowledge Check — Section 2

Q1: In a goodness-of-fit test with 6 categories, how many degrees of freedom are there?
Answer: 5. For goodness-of-fit: df = k − 1 = 6 − 1 = 5. The degrees of freedom depend only on the number of categories, not the sample size.
Q2: If all observed frequencies exactly equal the expected frequencies, the test statistic is:
Answer: 0. When \(O_i = E_i\) for every category, each term \((O_i - E_i)^2 / E_i = 0\), so \(\chi^2 = 0\). Perfect agreement with \(H_0\).

Section 3

Test of Independence

3.1 Contingency Tables and Expected Frequencies

Purpose: Test whether two categorical variables are statistically independent, using data from a single sample.

Data is arranged in an \(r \times c\) contingency table (r rows, c columns). The hypotheses are:

\(H_0\): The two variables are independent.

\(H_1\): The two variables are not independent (i.e., they are associated).

Degrees of freedom: \(\text{df} = (r - 1)(c - 1)\)

The expected frequency for each cell is:

$$E_{ij} = \frac{(\text{Row } i \text{ total}) \times (\text{Column } j \text{ total})}{\text{Grand total}}$$

Why this formula? Under independence, \(P(A \cap B) = P(A) \times P(B)\). The expected frequency applies this rule to the sample totals.
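
Written out step by step, with \(n\) for the grand total, \(R_i\) for the row \(i\) total and \(C_j\) for the column \(j\) total (symbols introduced here only for this derivation), the estimated probabilities are \(R_i / n\) and \(C_j / n\), so:

$$E_{ij} = n \times \frac{R_i}{n} \times \frac{C_j}{n} = \frac{R_i \times C_j}{n}$$

This is the row total times the column total divided by the grand total, as in the formula above.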

3.2 Example: Client Satisfaction vs. Service Type

BrightPath surveyed 300 clients about their satisfaction (Satisfied / Neutral / Dissatisfied) across three service types (Tax, Audit, Advisory). Is satisfaction independent of service type?

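| | Tax | Audit | Advisory | Row Total |
|---|---|---|---|---|
| Satisfied | 60 | 40 | 55 | 155 |
| Neutral | 30 | 35 | 20 | 85 |
| Dissatisfied | 10 | 25 | 25 | 60 |
| Col Total | 100 | 100 | 100 | 300 |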

Computing expected frequencies (example cells):

\(E_{\text{Satisfied, Tax}} = \frac{155 \times 100}{300} = 51.67\)

\(E_{\text{Dissatisfied, Audit}} = \frac{60 \times 100}{300} = 20.00\)

3.3 Calculation: Expected Frequencies and Test Statistic

Complete expected frequency table:

| | Tax | Audit | Advisory |
|---|---|---|---|
| Satisfied | 51.67 | 51.67 | 51.67 |
| Neutral | 28.33 | 28.33 | 28.33 |
| Dissatisfied | 20.00 | 20.00 | 20.00 |

Since all column totals are equal (100), each row's expected values are simply the row total / 3.
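
The full expected table can be generated from the observed counts alone. A minimal sketch, with the hypothetical helper name `expected_frequencies`:

```python
def expected_frequencies(table):
    """Return E_ij = (row i total * column j total) / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


observed = [
    [60, 40, 55],  # Satisfied:    Tax, Audit, Advisory
    [30, 35, 20],  # Neutral
    [10, 25, 25],  # Dissatisfied
]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
# [51.67, 51.67, 51.67]
# [28.33, 28.33, 28.33]
# [20.0, 20.0, 20.0]
```

Because the helper works from the row and column totals, it also handles tables whose column totals are not equal.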

Test statistic:

$$\chi^2 = \frac{(60-51.67)^2}{51.67} + \frac{(40-51.67)^2}{51.67} + \cdots + \frac{(25-20)^2}{20}$$

$$= 1.34 + 2.63 + 0.22 + 0.10 + 1.57 + 2.45 + 5.00 + 1.25 + 1.25 = \mathbf{15.81}$$

\(\text{df} = (3-1)(3-1) = 4\). At \(\alpha = 0.05\): \(\chi^2_{\text{critical}} = 9.488\).

Decision: \(15.81 > 9.488\), so we reject \(H_0\). Client satisfaction is not independent of service type. Dissatisfaction is notably higher among audit and advisory clients (25 each, against an expected 20) than among tax clients (10).
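
The whole calculation for the BrightPath table can be sketched as follows. The function name `chi_squared_independence` is hypothetical. The code keeps full precision in the expected frequencies rather than rounding them to two decimals, which gives a statistic of about 15.81.

```python
def chi_squared_independence(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected_count = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed_count - expected_count) ** 2 / expected_count
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


observed = [
    [60, 40, 55],  # Satisfied:    Tax, Audit, Advisory
    [30, 35, 20],  # Neutral
    [10, 25, 25],  # Dissatisfied
]

statistic, df = chi_squared_independence(observed)
critical_value = 9.488  # table value for df = 4 at alpha = 0.05
print(round(statistic, 2), df, statistic > critical_value)  # 15.81 4 True
```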

3.4 What Does This Mean for BrightPath?

We found a statistically significant association between service type and client satisfaction. But the test tells us that a relationship exists, not where. How do we dig deeper?

Reading the residuals:

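Look at which cells contributed most to the test statistic. The two largest contributions, and a smaller one that points the same way, are:

  • Dissatisfied × Tax: observed (10) is half the expected count (20), contributing 5.00. Tax clients are dissatisfied less often than independence would predict.
  • Satisfied × Audit: observed (40) is below expected (51.67), contributing 2.63. Audit clients are satisfied less often than expected.
  • Dissatisfied × Audit: observed (25) is above expected (20), contributing 1.25. This is a smaller contribution, but it is consistent with an audit satisfaction problem.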
Management recommendation: BrightPath should investigate the audit client experience. The data suggests the audit division has a satisfaction problem that tax does not share. This could relate to communication, pricing, or turnaround times — the statistical test identifies where to look, not why.
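
Ranking the cells can be automated. The sketch below uses the Pearson residual \((O - E) / \sqrt{E}\), a standard quantity not defined on the slides: its square is the cell's contribution to \(\chi^2\), and its sign shows whether the cell is above or below expectation. Variable names are illustrative.

```python
import math

rows = ["Satisfied", "Neutral", "Dissatisfied"]
columns = ["Tax", "Audit", "Advisory"]
observed = [[60, 40, 55], [30, 35, 20], [10, 25, 25]]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

cells = []
for i, row_name in enumerate(rows):
    for j, column_name in enumerate(columns):
        expected = row_totals[i] * column_totals[j] / grand_total
        residual = (observed[i][j] - expected) / math.sqrt(expected)
        # residual ** 2 equals (O - E)^2 / E, the cell's share of chi-squared
        cells.append((row_name, column_name, residual, residual ** 2))

# Largest contributions first
ranked = sorted(cells, key=lambda cell: cell[3], reverse=True)
for row_name, column_name, residual, contribution in ranked[:3]:
    print(f"{row_name} x {column_name}: residual {residual:+.2f}, "
          f"contribution {contribution:.2f}")
```

On this table the top three cells are Dissatisfied × Tax, Satisfied × Audit, and Neutral × Advisory, so the advisory column may also deserve a look.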

Knowledge Check — Section 3

Q1: In a 4 × 3 contingency table, how many degrees of freedom are there for the test of independence?
Answer: 6. df = (r − 1)(c − 1) = (4 − 1)(3 − 1) = 3 × 2 = 6.
Q2: Rejecting \(H_0\) in a test of independence tells us:
Answer: That the two variables are associated. The test detects association (dependence), not causation. We need further investigation or experimental design to establish causal relationships.

Section 4

Test of Homogeneity

4.1 Homogeneity vs. Independence

Purpose: Test whether the distribution of a categorical variable is the same across two or more populations. The mechanics (formula, expected frequencies, df) are identical to the test of independence — the difference is in the study design.

Test of Independence

  • One sample drawn from a single population
  • Both variables are observed on each subject
  • Question: are the two variables related?
Example:

Survey 300 BrightPath clients. Record their service type and satisfaction level. Are they associated?

Test of Homogeneity

  • Separate samples drawn from different populations
  • One variable is the grouping variable (population)
  • Question: is the distribution the same across groups?
Example:

Sample 100 clients from each of three offices (Sydney, Melbourne, Brisbane). Is the distribution of satisfaction the same?

Same maths, different question. The formula, df, and decision rule are identical. The distinction matters for how you frame the hypotheses and interpret the result.

4.2 Homogeneity Example and Week Summary

Homogeneity example:

BrightPath sampled 100 clients from each of its Sydney and Melbourne offices and recorded satisfaction:

| | Satisfied | Neutral | Dissatisfied | Total |
|---|---|---|---|---|
| Sydney | 55 | 30 | 15 | 100 |
| Melbourne | 45 | 25 | 30 | 100 |
| Total | 100 | 55 | 45 | 200 |

\(H_0\): The distribution of satisfaction is the same in both offices.

\(H_1\): The distributions differ.

df = (2 − 1)(3 − 1) = 2. You would compute expected frequencies exactly as before and compare \(\chi^2\) to \(\chi^2_{0.05, 2} = 5.991\).
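
The slide leaves the arithmetic to the reader. One way to carry it out is sketched below, using the counts from the table. The result is computed here rather than taken from the slides, and the variable names are illustrative.

```python
observed = [
    [55, 30, 15],  # Sydney:    Satisfied, Neutral, Dissatisfied
    [45, 25, 30],  # Melbourne
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, observed_count in enumerate(row):
        # Same expected-frequency formula as the test of independence
        expected_count = row_totals[i] * column_totals[j] / grand_total
        statistic += (observed_count - expected_count) ** 2 / expected_count

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.991  # table value for df = 2 at alpha = 0.05
reject = statistic > critical_value

print(round(statistic, 2), df, reject)  # 6.45 2 True
```

With these counts the statistic is about 6.45, which exceeds 5.991, so \(H_0\) would be rejected at the 5% level: the data suggest the satisfaction distributions differ between the two offices.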

Week 10 Summary — Three Chi-Squared Tests

| Test | Question | Degrees of Freedom |
|---|---|---|
| Goodness-of-Fit | Does one variable's distribution match a specified set of proportions? | \(k - 1\) |
| Independence | Are two variables associated? (single sample) | \((r-1)(c-1)\) |
| Homogeneity | Is the distribution the same across populations? (separate samples) | \((r-1)(c-1)\) |

All three tests use the same formula: \(\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}\). They differ in study design, hypotheses, and how expected frequencies are determined.
