1 / 20

Week 10: Chi-Squared Tests

Are the proportions what we expected? Are two variables related?

Introductory Statistics for Accounting

Three applications of the $\chi^2$ distribution:

Goodness-of-Fit Test
Test of Independence
Test of Homogeneity

2 / 20

Section 1

The Chi-Squared Distribution

3 / 20

1.1 What is the Chi-Squared Test?

The chi-squared ($\chi^2$) test is a hypothesis test for categorical data. It compares what we observed in our sample to what we would expect if a particular hypothesis were true.

Core idea: If the observed frequencies are close to the expected frequencies, the $\chi^2$ statistic will be small — no evidence against $H_0$. If they differ substantially, the statistic will be large — evidence against $H_0$.

The test statistic is always computed the same way:

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ = observed frequency in category $i$, and $E_i$ = expected frequency in category $i$.

Key properties: The $\chi^2$ statistic is always non-negative. Larger values mean bigger discrepancies between observed and expected. The test is always right-tailed (we only reject in the upper tail).
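The formula can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `chi_squared_statistic` and both sets of counts are hypothetical, invented only to show a small versus a large statistic.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, invented for illustration only.
expected = [25, 25, 25, 25]
close_fit = chi_squared_statistic([26, 24, 25, 25], expected)  # observed near expected
poor_fit = chi_squared_statistic([40, 10, 30, 20], expected)   # observed far from expected

print(f"close fit: {close_fit:.2f}")
print(f"poor fit:  {poor_fit:.2f}")
```

Because every term is a square divided by a positive number, the function can never return a negative value, which is why only the upper tail is used for rejection.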

4 / 20

1.2 The Chi-Squared Distribution

The $\chi^2$ distribution is a family of curves, each defined by its degrees of freedom (df). As df increases, the distribution shifts right and becomes more symmetric.

Figure 1.2: Chi-squared distributions for df = 2, 5, and 10.

Mean = df and Variance = 2 × df. As df grows large, the $\chi^2$ distribution approaches a normal distribution.
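The mean and variance can be checked by simulation. The sketch below relies on a standard fact that is not stated on the slide: a $\chi^2$ variable with df degrees of freedom is the sum of df squared independent standard normal values. The helper name `simulate_chi_squared`, the seed, and the number of draws are arbitrary choices for illustration.

```python
import random
import statistics


def simulate_chi_squared(df, n_draws, rng):
    """Draw n_draws values, each the sum of df squared standard normals."""
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


rng = random.Random(10)  # fixed seed so the run is repeatable
results = {}
for df in (2, 5, 10):
    draws = simulate_chi_squared(df, 20_000, rng)
    results[df] = (statistics.fmean(draws), statistics.pvariance(draws))
    mean, variance = results[df]
    print(f"df={df}: sample mean {mean:.2f}, sample variance {variance:.2f}")
```

The sample means should land close to df and the sample variances close to 2 × df, with small differences due to random sampling.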

5 / 20

1.3 General Testing Framework

All three chi-squared tests follow the same five-step hypothesis testing procedure from Week 9:

Step 1	State the null and alternative hypotheses ($H_0$ and $H_1$).
Step 2	Choose the significance level ($\alpha$), usually 0.05.
Step 3	Compute the test statistic: $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
Step 4	Find the critical value from the $\chi^2$ table using the appropriate df, or compute the p-value.
Step 5	Decision: reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,\text{df}}$ (the critical value from Step 4), or if p-value $< \alpha$.

Assumption: All expected frequencies should be at least 5. If any $E_i < 5$, combine adjacent categories before testing.
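Steps 3 to 5, together with the expected-frequency check, might be sketched as follows. The function name `chi_squared_test` and the counts are hypothetical. The critical value is passed in by hand, as it would be when read from the $\chi^2$ table; 5.991 is the table value for df = 2 at $\alpha = 0.05$.

```python
def chi_squared_test(observed, expected, critical_value):
    """Compute the statistic and apply the right-tailed decision rule."""
    # Assumption check: every expected frequency should be at least 5.
    if any(e < 5 for e in expected):
        raise ValueError("some expected frequencies are below 5; combine categories first")
    # Step 3: test statistic.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 5: reject H0 only when the statistic exceeds the critical value.
    reject = statistic > critical_value
    return statistic, reject


# Hypothetical data: 3 categories, so df = 3 - 1 = 2 for a goodness-of-fit test.
statistic, reject = chi_squared_test([30, 50, 40], [40, 40, 40], critical_value=5.991)
print(f"chi-squared = {statistic:.2f}, reject H0: {reject}")
```

Raising an error when an expected frequency falls below 5 is one possible design choice; it forces the categories to be combined before the test is run rather than after.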

6 / 20

Knowledge Check — Section 1

Q1: The chi-squared test statistic can take which values?

Because we square each difference $(O_i - E_i)^2$ and divide by a positive number, the statistic is always $\geq 0$.

Q2: We reject $H_0$ when the $\chi^2$ statistic is:

A large $\chi^2$ means observed and expected frequencies differ substantially. The test is always right-tailed.

7 / 20

Section 2

The Goodness-of-Fit Test

8 / 20

2.1 Goodness-of-Fit: Purpose and Hypotheses

Purpose: Test whether the distribution of a single categorical variable matches a claimed or expected set of proportions.

We have one variable with $k$ categories. We observe frequencies in each category and compare them to what we'd expect under $H_0$.

Hypotheses:

$H_0$: The population proportions are $p_1, p_2, \ldots, p_k$ (as specified).

$H_1$: At least one proportion differs from the specified value.

Degrees of freedom: $\text{df} = k - 1$, where $k$ = number of categories.

Accounting context:

An audit firm claims that invoice errors are equally distributed across four quarters. You sample 200 error reports and count how many fall in each quarter. Does the data support the claim?

9 / 20

2.2 Worked Example: Invoice Error Distribution

BrightPath Financial Services audited 200 invoice processing errors over the past year and recorded which quarter each error occurred in.

Quarter	Q1	Q2	Q3	Q4	Total
Observed ($O_i$)	62	48	40	50	200
Expected ($E_i$)	50	50	50	50	200

If errors are equally distributed ($H_0$: $p_1 = p_2 = p_3 = p_4 = 0.25$), we expect $200 \times 0.25 = 50$ per quarter.

Step 3 — Compute the test statistic:

$$\chi^2 = \frac{(62-50)^2}{50} + \frac{(48-50)^2}{50} + \frac{(40-50)^2}{50} + \frac{(50-50)^2}{50}$$

$$= \frac{144}{50} + \frac{4}{50} + \frac{100}{50} + \frac{0}{50} = 2.88 + 0.08 + 2.00 + 0.00 = 4.96$$
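The arithmetic above can be reproduced with a short sketch that uses the BrightPath counts from the table. The variable names are illustrative only.

```python
# Observed invoice errors per quarter, taken from the table above.
observed = {"Q1": 62, "Q2": 48, "Q3": 40, "Q4": 50}

n = sum(observed.values())
expected_each = n * 0.25  # equal proportions under H0

# Contribution of each quarter: (O - E)^2 / E.
contributions = {
    quarter: (count - expected_each) ** 2 / expected_each
    for quarter, count in observed.items()
}
chi_sq = sum(contributions.values())

for quarter, value in contributions.items():
    print(f"{quarter}: {value:.2f}")
print(f"chi-squared = {chi_sq:.2f}")
```

Printing the contributions separately shows that Q1 and Q3 account for almost all of the statistic.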

10 / 20

2.3 Decision and Interpretation

Step 4 — Critical value: With $\text{df} = 4 - 1 = 3$ and $\alpha = 0.05$:

$\chi^2_{\text{critical}} = 7.815$

Step 5 — Decision: Since $4.96 < 7.815$, we do not reject $H_0$.
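The same decision can be reached through the p-value. The sketch below is one way to compute the upper-tail probability with only the Python standard library, assuming an integer df. It uses the standard recurrence for the upper incomplete gamma function and is not the method used in the lecture. The function name `chi_squared_p_value` is hypothetical.

```python
import math


def chi_squared_p_value(statistic, df):
    """Upper-tail probability P(X > statistic) for chi-squared with integer df."""
    if df < 1 or statistic < 0:
        raise ValueError("need df >= 1 and statistic >= 0")
    z = statistic / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)              # exact tail for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))   # exact tail for df = 1
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # where a = df / 2; each pass adds 2 degrees of freedom.
    while 2 * a < df:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return tail


p_value = chi_squared_p_value(4.96, 3)
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

For the invoice example the p-value comes out at roughly 0.17, well above 0.05, so the conclusion matches the critical-value approach. As a sanity check, the function returns approximately 0.05 at the table value 7.815 with df = 3.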

Business interpretation: At the 5% significance level, there is insufficient evidence to conclude that invoice errors are unevenly distributed across quarters. This does not prove the errors are evenly spread; it means this sample gives the audit team no statistical grounds to single out seasonal staffing as a cause of errors.

11 / 20

Knowledge Check — Section 2

Q1: In a goodness-of-fit test with 6 categories, how many degrees of freedom?

For goodness-of-fit: df = k − 1 = 6 − 1 = 5. The degrees of freedom depend only on the number of categories, not the sample size.

Q2: If all observed frequencies exactly equal the expected frequencies, the test statistic is:

When $O_i = E_i$ for every category, each term $(O_i - E_i)^2 / E_i = 0$, so $\chi^2 = 0$. Perfect agreement with $H_0$.

12 / 20

Section 3

Test of Independence

13 / 20

3.1 Contingency Tables and Expected Frequencies

Purpose: Test whether two categorical variables are statistically independent, using data from a single sample.

Data is arranged in an $r \times c$ contingency table (r rows, c columns). The hypotheses are:

$H_0$: The two variables are independent.

$H_1$: The two variables are not independent (i.e., they are associated).

Degrees of freedom: $\text{df} = (r - 1)(c - 1)$

The expected frequency for each cell is:

$$E_{ij} = \frac{(\text{Row } i \text{ total}) \times (\text{Column } j \text{ total})}{\text{Grand total}}$$

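Why this formula? Under independence, $P(A \cap B) = P(A) \times P(B)$. Estimating each probability from the sample totals and multiplying by the grand total $n$ gives $E_{ij} = n \times \frac{\text{Row } i \text{ total}}{n} \times \frac{\text{Column } j \text{ total}}{n}$, which simplifies to the formula above.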
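The expected-frequency formula might be coded as below. This is a sketch under simple assumptions: the table is a list of rows of counts, and the function name `expected_frequencies` and the 2 × 2 counts are hypothetical.

```python
def expected_frequencies(table):
    """Return row total x column total / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]  # zip(*table) iterates over columns
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]


# Hypothetical 2 x 2 table, invented for illustration only.
table = [
    [30, 20],
    [10, 40],
]
expected = expected_frequencies(table)
for row in expected:
    print(row)
```

The expected table keeps the same row and column totals as the observed table; only the way the counts are spread across cells changes.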

14 / 20

3.2 Example: Client Satisfaction vs. Service Type

BrightPath surveyed 300 clients about their satisfaction (Satisfied / Neutral / Dissatisfied) across three service types (Tax, Audit, Advisory). Is satisfaction independent of service type?

	Tax	Audit	Advisory	Row Total
Satisfied	60	40	55	155
Neutral	30	35	20	85
Dissatisfied	10	25	25	60
Col Total	100	100	100	300

Computing expected frequencies (example cells):

$E_{\text{Satisfied, Tax}} = \frac{155 \times 100}{300} = 51.67$

$E_{\text{Dissatisfied, Audit}} = \frac{60 \times 100}{300} = 20.00$

15 / 20

3.3 Calculation: Expected Frequencies and Test Statistic

Complete expected frequency table:

	Tax	Audit	Advisory
Satisfied	51.67	51.67	51.67
Neutral	28.33	28.33	28.33
Dissatisfied	20.00	20.00	20.00

Since all column totals are equal (100), each row's expected values are simply the row total / 3.

Test statistic:

$$\chi^2 = \frac{(60-51.67)^2}{51.67} + \frac{(40-51.67)^2}{51.67} + \cdots + \frac{(25-20)^2}{20}$$

$$= 1.34 + 2.63 + 0.21 + 0.10 + 1.57 + 2.45 + 5.00 + 1.25 + 1.25 = \mathbf{15.80}$$

$\text{df} = (3-1)(3-1) = 4$. At $\alpha = 0.05$: $\chi^2_{\text{critical}} = 9.488$.

Decision: $15.80 > 9.488$ — reject $H_0$. Client satisfaction is not independent of service type. The audit and advisory divisions show notably higher dissatisfaction.
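The whole calculation for the satisfaction table can be checked with a short sketch. The counts and the critical value 9.488 come from the slides; the variable names are illustrative. Without rounding the individual terms, the total comes to about 15.81, which agrees with the hand calculation up to rounding.

```python
# Observed counts: rows are satisfaction levels, columns are Tax, Audit, Advisory.
observed = [
    [60, 40, 55],  # Satisfied
    [30, 35, 20],  # Neutral
    [10, 25, 25],  # Dissatisfied
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 9.488  # table value for df = 4 at alpha = 0.05

print(f"chi-squared = {chi_sq:.2f}, df = {df}")
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```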

16 / 20

3.4 What Does This Mean for BrightPath?

We found a statistically significant association between service type and client satisfaction. But the test tells us that a relationship exists, not where. How do we dig deeper?

Reading the residuals:

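Look at which cells contributed most to the test statistic. The three largest contributions came from:

Dissatisfied × Tax (5.00): observed (10) much lower than expected (20). Tax clients are more satisfied than average.
Satisfied × Audit (2.63): observed (40) below expected (51.67). Audit clients are less satisfied.
Neutral × Advisory (2.45): observed (20) below expected (28.33). Advisory clients are less often neutral than expected.

Dissatisfied × Audit contributes less (1.25), but observed (25) is above expected (20), which is consistent with the audit shortfall in satisfied clients.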
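One way to read the cells systematically is to compute a signed residual for each one. The sketch below uses the Pearson residual, $(O - E)/\sqrt{E}$, whose square is that cell's contribution to $\chi^2$. The term and the code are additions for illustration and are not part of the lecture's method; the counts are those from the satisfaction table.

```python
import math

rows = ["Satisfied", "Neutral", "Dissatisfied"]
cols = ["Tax", "Audit", "Advisory"]
observed = [
    [60, 40, 55],
    [30, 35, 20],
    [10, 25, 25],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

cells = []
for i, row_name in enumerate(rows):
    for j, col_name in enumerate(cols):
        o = observed[i][j]
        e = row_totals[i] * col_totals[j] / n
        residual = (o - e) / math.sqrt(e)  # negative means fewer than expected
        cells.append((row_name, col_name, o, e, residual))

# Largest absolute residuals first.
cells.sort(key=lambda cell: abs(cell[4]), reverse=True)

for row_name, col_name, o, e, residual in cells:
    print(f"{row_name} x {col_name}: O={o}, E={e:.2f}, "
          f"residual={residual:+.2f}, contribution={residual ** 2:.2f}")
```

The sign shows the direction of each departure from independence, which the squared contributions alone do not.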

Management recommendation: BrightPath should investigate the audit client experience. The data suggests the audit division has a satisfaction problem that tax does not share. This could relate to communication, pricing, or turnaround times: the cell contributions show where to look, not why.

17 / 20

Knowledge Check — Section 3

Q1: In a 4 × 3 contingency table, how many degrees of freedom for the test of independence?

df = (r − 1)(c − 1) = (4 − 1)(3 − 1) = 3 × 2 = 6.

Q2: Rejecting $H_0$ in a test of independence tells us:

The test detects association (dependence), not causation. We need further investigation or experimental design to establish causal relationships.

18 / 20

Section 4

Test of Homogeneity

19 / 20

4.1 Homogeneity vs. Independence

Purpose: Test whether the distribution of a categorical variable is the same across two or more populations. The mechanics (formula, expected frequencies, df) are identical to the test of independence — the difference is in the study design.

Test of Independence

One sample drawn from a single population
Both variables are observed on each subject
Question: are the two variables related?

Example:

Survey 300 BrightPath clients. Record their service type and satisfaction level. Are they associated?

Test of Homogeneity

Separate samples drawn from different populations
One variable is the grouping variable (population)
Question: is the distribution the same across groups?

Example:

Sample 100 clients from each of three offices (Sydney, Melbourne, Brisbane). Is the distribution of satisfaction the same?

Same maths, different question. The formula, df, and decision rule are identical. The distinction matters for how you frame the hypotheses and interpret the result.

20 / 20

4.2 Homogeneity Example and Week Summary

Homogeneity example:

BrightPath sampled 100 clients from each of its Sydney and Melbourne offices and recorded satisfaction:

	Satisfied	Neutral	Dissatisfied	Total
Sydney	55	30	15	100
Melbourne	45	25	30	100
Total	100	55	45	200

$H_0$: The distribution of satisfaction is the same in both offices.

$H_1$: The distributions differ.

df = (2 − 1)(3 − 1) = 2. You would compute expected frequencies exactly as before and compare $\chi^2$ to $\chi^2_{0.05, 2} = 5.991$.
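The slide leaves the computation as an exercise. A sketch of it follows, using the Sydney and Melbourne counts from the table and the critical value 5.991 quoted above; the variable names are illustrative.

```python
# Observed counts: columns are Satisfied, Neutral, Dissatisfied.
observed = {
    "Sydney": [55, 30, 15],
    "Melbourne": [45, 25, 30],
}

row_totals = {office: sum(counts) for office, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

chi_sq = 0.0
for office, counts in observed.items():
    for j, o in enumerate(counts):
        e = row_totals[office] * col_totals[j] / n  # same formula as for independence
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)
critical_value = 5.991  # chi-squared table value for df = 2 at alpha = 0.05

print(f"chi-squared = {chi_sq:.2f}, df = {df}")
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```

On these counts the statistic comes to about 6.45, which exceeds 5.991, so at the 5% level $H_0$ would be rejected: the satisfaction distributions in the two offices appear to differ.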

Week 10 Summary — Three Chi-Squared Tests

Test	Question	Degrees of Freedom
Goodness-of-Fit	Does one variable's distribution match a specified set of proportions?	$k - 1$
Independence	Are two variables associated? (single sample)	$(r-1)(c-1)$
Homogeneity	Is the distribution the same across populations? (separate samples)	$(r-1)(c-1)$

All three tests use the same formula: $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$. They differ in study design, hypotheses, and how expected frequencies are determined.

