Estimating the prevalence of drug use from self-reports in a cohort for which biologic data are available for a subsample
Poole, W., Flynn, P., Rao, A. V., & Cooley, P. (1996). Estimating the prevalence of drug use from self-reports in a cohort for which biologic data are available for a subsample. American Journal of Epidemiology, 144(4), 413-420.
Diagnostic procedures, used singly or in combination, are crucial in determining the presence and prevalence of medical and other conditions. In the absence of a 'gold standard,' two or more measures or diagnostic tests are often available that may be used to estimate true prevalence. The authors have developed a statistical method for calculating more precise estimates of the prevalence of a condition when two diagnostic measures are available, one performed on the entire study sample and a second, more precise one performed on a random subsample. The method uses the well-known equations that express the probabilities of the four possible outcomes of the two measures in terms of the sensitivities and specificities of the measures and the prevalence of the condition. Combined with some properties of maximum likelihood estimates, these equations yield an expression for the estimated true prevalence and its precision. The method is illustrated by applying it to data collected by urinalysis and self-report in 1992-1993 in a national multisite study, the Cocaine Treatment Outcome Study. Through application of this methodology, a more precise estimate of the true prevalence of substance use can be obtained from two measures, one biologic and the other self-reported. Detailed equations and expressions are provided so that the method can be applied in other situations where diagnostic data from two different sources or procedures are available.
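
As a rough illustration of the equations the abstract refers to, the sketch below computes the probabilities of the four possible outcomes of two measures from their sensitivities, specificities, and the prevalence. It assumes that the two measures err independently given true status (conditional independence), which the abstract does not state. The function name `outcome_probabilities` and all numeric values are hypothetical and are not taken from the study; this is not the authors' estimator, only the forward relationship on which such estimators are typically built.

```python
def outcome_probabilities(prevalence, se1, sp1, se2, sp2):
    """Return P(measure 1 = a, measure 2 = b) for a, b in {1 (positive), 0 (negative)}.

    Assumption (ours, not from the abstract): the two measures are
    conditionally independent given the true status.
    """
    probs = {}
    for a in (1, 0):
        for b in (1, 0):
            # Probability of this pair of results among those with the condition.
            p_with = (se1 if a else 1 - se1) * (se2 if b else 1 - se2)
            # Probability of this pair of results among those without it.
            p_without = ((1 - sp1) if a else sp1) * ((1 - sp2) if b else sp2)
            probs[(a, b)] = prevalence * p_with + (1 - prevalence) * p_without
    return probs


# Hypothetical illustrative values: measure 1 stands in for self-report,
# measure 2 for a more accurate biologic test.
cells = outcome_probabilities(prevalence=0.30, se1=0.60, sp1=0.95, se2=0.90, sp2=0.98)
for (a, b), p in cells.items():
    print(f"measure1={a} measure2={b}: {p:.4f}")
```

The four probabilities always sum to 1. In a setting like the one described, these are the quantities that observed cell counts would be compared against when estimating prevalence by maximum likelihood.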