In large-scale household surveys, a complex probability sample of clusters, rather than of individuals, is often drawn from a large population. Typically, the clusters of such complex samples include a number of correlated members. The responses of these members are then weighted to obtain estimates for the population. Such weighted data are commonly published by the National Center for Health Statistics and other U.S. federal agencies. Problems frequently arise when such data are analyzed with the usual chi-square test statistics for goodness of fit or independence. Researchers have found that the usual chi-square tests yield spuriously inflated statistics when applied to cluster samples, and that new methods are required to correct this. This paper proposes a strategy for goodness-of-fit and independence tests based on the correlated and weighted data arising in cluster samples, and provides a factor that validly reduces the inflation of the usual chi-square statistics. The method is applied to chronic condition data collected from the St Paul-Minneapolis, Minnesota, primary sampling unit (PSU) during the 1975 National Health Interview Survey (NHIS). This analysis, together with simulation studies presented elsewhere, provides evidence that the usual chi-square statistics from such data can be corrected for the effects of clustering and weighting by use of the proposed reduction factor.
A Reduction Factor in Goodness-of-Fit and Independence Tests for Clustered and Weighted Observations
Choi, JW., & McHugh, RB. (1989). A Reduction Factor in Goodness-of-Fit and Independence Tests for Clustered and Weighted Observations. Biometrics, 45(3), 979-996.