Chi-Square Goodness-of-Fit and Test of Independence
Chi-square (χ²) tests are used for categorical data: counts or frequencies rather than means.

The goodness-of-fit test assesses whether an observed frequency distribution matches an expected (theoretical) distribution. H₀: the observed counts follow the expected distribution. Example: a die is rolled 600 times, so the expected frequency for each face is 100. Observed: 85, 110, 97, 115, 88, 105. χ² = Σ [(O − E)² / E] = (85−100)²/100 + (110−100)²/100 + (97−100)²/100 + (115−100)²/100 + (88−100)²/100 + (105−100)²/100 = (225 + 100 + 9 + 225 + 144 + 25)/100 = 7.28. With df = k − 1 = 5 and α = 0.05, the critical value is 11.07. Since 7.28 < 11.07, we fail to reject H₀: the data give no significant evidence that the die deviates from a uniform distribution.

The chi-square test of independence tests whether two categorical variables are associated in a contingency table (cross-tabulation). H₀: the two variables are independent (no association). The formula is the same, χ² = Σ [(O − E)² / E], summed over all cells, where the expected value for each cell is E = (row total × column total) / grand total. df = (rows − 1)(columns − 1).

Assumptions: (1) observations are independent, (2) the expected frequency in each cell is ≥ 5 (if violated, use Fisher's exact test for 2×2 tables).

Effect size: Cramér's V = √(χ² / (n × min(r − 1, c − 1))), where n is the grand total, r the number of rows, and c the number of columns. Common benchmarks: V = 0.1 (small), 0.3 (medium), 0.5 (large).

The chi-square test is not appropriate for small samples (expected cell counts below 5), for ordinal data where rank order matters (the test ignores the ordering of categories), or for testing whether a correlation coefficient equals zero (use a t-test for r).
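
The calculations above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (chi_square_statistic, expected_counts, chi_square_independence, cramers_v) are made up for this example, and the 2×2 table is hypothetical data chosen so the arithmetic is easy to check by hand. The critical value 11.07 is taken from the text rather than computed.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected cell counts under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    expected = expected_counts(table)
    observed_flat = [o for row in table for o in row]
    expected_flat = [e for row in expected for e in row]
    chi2 = chi_square_statistic(observed_flat, expected_flat)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * min(r - 1, c - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Goodness of fit: the die example from the text (600 rolls, 100 expected per face).
die_observed = [85, 110, 97, 115, 88, 105]
die_expected = [100] * 6
die_chi2 = chi_square_statistic(die_observed, die_expected)
die_df = len(die_observed) - 1
CRITICAL_DF5_ALPHA05 = 11.07  # critical value quoted in the text for df = 5
decision = "reject H0" if die_chi2 > CRITICAL_DF5_ALPHA05 else "fail to reject H0"
print(f"goodness of fit: chi2 = {die_chi2:.2f}, df = {die_df}, {decision}")

# Test of independence: hypothetical 2x2 table (not from the text).
table = [[30, 20],
         [20, 30]]
chi2, df = chi_square_independence(table)
n = sum(sum(row) for row in table)
v = cramers_v(chi2, n, len(table), len(table[0]))
print(f"independence: chi2 = {chi2:.2f}, df = {df}, Cramér's V = {v:.2f}")
```

For the hypothetical table every expected count is 25, so χ² = 4 × (5² / 25) = 4.00 with df = 1 and V = √(4 / 100) = 0.20. At α = 0.05 the critical value for df = 1 is 3.84, so this table would lead to rejecting independence, with a small-to-medium effect size.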