Investigating alternative estimators for the prevalence of serious mental illness based on a two-phase sample

Phillip Samuel Kott; Dan Liao; W. Jeremy Aldworth; Sarra L. Hedden; JC Gfroerer; Jonaki Bose; Lisa Colpe

Kott, P. S., Liao, D., Aldworth, J., Hedden, S. L., Gfroerer, JC., Bose, J., & Colpe, L. (2018). Investigating alternative estimators for the prevalence of serious mental illness based on a two-phase sample. Survey Methodology, 44(1), 61-73.

Abstract

A two-phase process was used by the Substance Abuse and Mental Health Services Administration to estimate the proportion of US adults with serious mental illness (SMI). The first phase was the annual National Survey on Drug Use and Health (NSDUH), while the second phase was a random subsample of adult respondents to the NSDUH. Respondents to the second phase of sampling were clinically evaluated for serious mental illness. A logistic prediction model was fit to this subsample, with the SMI status (yes or no) determined by the second-phase instrument treated as the dependent variable and related variables collected on the NSDUH from all adults as the model's explanatory variables. Estimates were then computed for SMI prevalence among all adults and within adult subpopulations by assigning an SMI status to each NSDUH respondent, based on a comparison of his or her estimated probability of having SMI with a chosen cut point on the distribution of the predicted probabilities. We investigate alternatives to this standard cut point estimator, such as the probability estimator. The latter assigns an estimated probability of having SMI to each NSDUH respondent; the estimated prevalence of SMI is then the weighted mean of those estimated probabilities. Using data from NSDUH and its subsample, we show that, although the probability estimator has a smaller mean squared error when estimating SMI prevalence among all adults, it has a greater tendency to be biased at the subpopulation level than the standard cut point estimator.
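
The two estimators described in the abstract can be illustrated with a minimal Python sketch. It assumes the estimated probabilities have already been produced by the fitted logistic model. The function names, the toy probabilities and weights, the cut point of 0.30, and the choice to classify a respondent as having SMI when the probability is at or above the cut point are all hypothetical and are not taken from the paper.

```python
def probability_estimator(p_hat, weights):
    """Weighted mean of the estimated SMI probabilities."""
    total_weight = sum(weights)
    return sum(w * p for w, p in zip(weights, p_hat)) / total_weight


def cut_point_estimator(p_hat, weights, cut_point):
    """Weighted share of respondents classified as having SMI.

    Assumption: a respondent is classified as having SMI when the
    estimated probability is at or above the cut point.
    """
    total_weight = sum(weights)
    classified = sum(w for w, p in zip(weights, p_hat) if p >= cut_point)
    return classified / total_weight


# Hypothetical inputs: estimated probabilities and survey weights
# for five respondents (illustrative values only).
p_hat = [0.02, 0.10, 0.40, 0.75, 0.05]
weights = [1000.0, 1500.0, 800.0, 500.0, 1200.0]
cut_point = 0.30  # hypothetical cut point

prob_estimate = probability_estimator(p_hat, weights)
cut_estimate = cut_point_estimator(p_hat, weights, cut_point)
print(prob_estimate, cut_estimate)
```

Restricting `p_hat` and `weights` to the members of a subpopulation before calling either function would give the corresponding subpopulation estimate under the same assumptions.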