Critical issues with the Pearson's chi-square test

DOI:

https://doi.org/10.64700/mmm.75

Keywords:

Pearson chi-square test, difference between two proportions, goodness of fit, contingency tables

Abstract

Pearson's chi-square tests are among the most commonly applied statistical tools across a wide range of scientific disciplines, including medicine, engineering, biology, sociology, marketing and business. However, their usage in some areas is not correct. For example, the chi-square test for homogeneity of proportions (that is, comparing proportions across groups in a contingency table) is frequently used to verify whether the rows of a given nonnegative \(m \times n\) (contingency) matrix \(A\) are proportional. The null hypothesis \(H_0\): ``the \(m\) rows are proportional'' (for the whole population) is rejected with confidence level \(1 - \alpha\) if and only if \(\chi^2_{stat} > \chi^2_{crit}\), where the first term is given by Pearson's formula, while the second one depends only on \(m\), \(n\), and \(\alpha\), but not on the entries of \(A\).
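For reference, the abstract does not spell out Pearson's formula; the standard form for an \(m \times n\) contingency matrix \(A = (a_{ij})\) is sketched below. The symbols \(r_i\), \(s_j\), \(N\) and \(e_{ij}\) (row sums, column sums, grand total and expected counts) are notation introduced here for illustration and may differ from the notation used in the paper itself.

```latex
\[
\chi^2_{stat}(A) = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(a_{ij} - e_{ij})^2}{e_{ij}},
\qquad e_{ij} = \frac{r_i \, s_j}{N},
\]
\[
r_i = \sum_{j=1}^{n} a_{ij}, \qquad
s_j = \sum_{i=1}^{m} a_{ij}, \qquad
N = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}.
\]
```

In the usual formulation, \(\chi^2_{crit}\) is the \((1 - \alpha)\)-quantile of the chi-square distribution with \((m - 1)(n - 1)\) degrees of freedom, which is consistent with its depending only on \(m\), \(n\), and \(\alpha\).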

It is immediate that Pearson's formula is not scale-invariant. More precisely, whenever we multiply all entries of \(A\) by a positive constant \(c\), the value \(\chi^2_{stat}(A)\) is multiplied by \(c\) too: \(\chi^2_{stat}(cA) = c \chi^2_{stat}(A)\). Thus, if all rows of \(A\) are exactly proportional, then \(\chi^2_{stat}(cA) = c \chi^2_{stat}(A) = 0\) for any \(c\), and \(H_0\) is not rejected at any \(\alpha\). Otherwise, \(\chi^2_{stat}(cA)\) becomes arbitrarily large or small as the positive \(c\) increases or decreases. Hence, at any fixed significance level \(\alpha\), the null hypothesis \(H_0\) will be rejected with confidence \(1 - \alpha\) when \(c\) is sufficiently large and not rejected when \(c\) is sufficiently small. Yet, obviously, the rows of \(cA\) are either proportional for all \(c\) simultaneously or for none. For this reason, Pearson's test certainly cannot be applied to ``physical data'', which are obtained by measurements. Indeed, in this case the matrix \(A\) depends on the unit of measurement.
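The scaling behaviour described above can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard form of Pearson's statistic with expected counts equal to (row sum × column sum) / total and assuming all row and column sums are positive. The table `A`, the helper names `pearson_chi2` and `scale`, and the choice \(\alpha = 0.05\) are hypothetical and not taken from the paper. Multiplying the table by \(c\) multiplies every deviation and every expected count by \(c\), so each term of the sum, and hence the statistic, is multiplied by \(c\).

```python
def pearson_chi2(table):
    """Pearson's chi-square statistic for a nonnegative m x n table.

    `table` is a list of rows; all row and column sums are assumed positive.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def scale(table, c):
    """Multiply every entry of the table by the constant c."""
    return [[c * x for x in row] for row in table]


# Hypothetical 2 x 2 table whose rows are not proportional.
A = [[12, 8], [9, 11]]

# Tabulated critical value for 1 degree of freedom at alpha = 0.05
# (a 2 x 2 table has (2 - 1) * (2 - 1) = 1 degree of freedom).
CHI2_CRIT = 3.841

for c in (0.1, 1, 10, 100):
    stat = pearson_chi2(scale(A, c))
    print(f"c = {c:>5}: chi2_stat = {stat:8.4f}, reject H0: {stat > CHI2_CRIT}")
```

With this table, the same proportions are not rejected at \(c = 1\) but are rejected at \(c = 10\), which illustrates why the entries must be genuine counts rather than measurements expressed in an arbitrary unit.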

The test can be applied only to categorical data, and even then some further restrictions are required, which we consider in this paper.

Published

26-08-2025

How to Cite

Gurvich, V., & Naumova, M. (2025). Critical issues with the Pearson’s chi-square test. Modern Mathematical Methods, 3(2), 101–109. https://doi.org/10.64700/mmm.75
