Predictive mean neighborhood imputation with application to the person-pair data of the National Household Survey on Drug Abuse
Singh, A. C., Grau, E. A., & Folsom, R. E. (2001). Predictive mean neighborhood imputation with application to the person-pair data of the National Household Survey on Drug Abuse. In Proceedings of the Survey Research Methods Section, American Statistical Association.
In 1999, the instrument used to administer the National Household Survey on Drug Abuse (NHSDA) was changed from a paper-and-pencil format (PAPI) to a computer-assisted format (CAI). In previous years, imputation of missing values for most of the drug use variables was accomplished with an unweighted sequential hot deck. For other variables, including person-pair data, no imputation was attempted at all. In the spirit of efforts to improve the quality of estimates from the redesigned NHSDA, and as a result of fundamental differences between PAPI and CAI, there was a need to change the way missing data were edited and imputed. Changes in the editing rules from PAPI to CAI placed more of the burden of resolving inconsistent values on statistical imputation. These rules are referred to as “flag and impute”: ambiguous or inconsistent responses are flagged and then replaced with consistent values during imputation. In addition, imputation was required for more variables in CAI. Finally, many of the variables in the NHSDA are closely related to each other, often in a hierarchical manner. These points all illustrate the need for a method that was rigorous, flexible, and preferably multivariate. This paper presents a new imputation method with these characteristics, termed Predictive Mean Neighborhoods (PMN), that was used to impute missing values in many variables in the NHSDA, including both drug use variables and variables derived from the person-pair data.