A Comparative Assessment of Disclosure Risk and Data Quality Between MASSC and Other Statistical Disclosure Limitation Methods
Yu, F., & Sathe, N. S. (2011, July). A Comparative Assessment of Disclosure Risk and Data Quality Between MASSC and Other Statistical Disclosure Limitation Methods. Presented at JSM 2011, Miami Beach, FL.
MASSC (an acronym for Micro-Agglomeration, optimal probabilistic Substitution, optimal probabilistic Subsampling, and optimal weight Calibration) is a statistical disclosure limitation (SDL) methodology developed at RTI International to protect confidentiality and preserve analytic utility simultaneously. In this paper, we will compare MASSC with two other SDL methods, the Post Randomization Method (PRAM) and random swapping, by examining how much each method affects data quality and lowers disclosure risk. We will take a sample from the 2006 and 2007 National Survey on Drug Use and Health public use files (PUFs) as the initial data set for treatment, with the original PUFs viewed as the “population,” and compare the three methods via simulations. For risk assessment, we will calculate the matching probability, that is, the probability that a record in the treated sample can be correctly linked to its corresponding record in the “population,” under each treatment method. For utility assessment, we will compare point estimates across methods and use regression models to examine the impact of each treatment on inference and model parameters.
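The risk assessment described above can be illustrated with a toy sketch. It is not the authors' procedure and does not implement MASSC. It assumes a simple form of random swapping (values of one variable exchanged between randomly paired records) and one plausible definition of matching probability: a treated record contributes 1/k when its key variables still agree with its true source record and k population records share that key, and 0 otherwise. The function names, the `id` field, and the example variables are all hypothetical.

```python
import random


def random_swap(records, var, rate, rng):
    """Toy random swapping: exchange `var` between randomly paired records.

    `rate` is the approximate share of records involved in a swap
    (an assumption for illustration, not a parameter from the paper).
    """
    treated = [dict(r) for r in records]          # copy so input is untouched
    n_swap = int(len(treated) * rate) // 2 * 2    # need an even number of records
    chosen = rng.sample(range(len(treated)), n_swap)
    for i, j in zip(chosen[0::2], chosen[1::2]):
        treated[i][var], treated[j][var] = treated[j][var], treated[i][var]
    return treated


def matching_probability(treated, population, keys):
    """Average probability of linking a treated record to its true source.

    An intruder is assumed to pick uniformly among population records that
    agree with the treated record on `keys`.
    """
    counts = {}
    for p in population:
        k = tuple(p[v] for v in keys)
        counts[k] = counts.get(k, 0) + 1
    pop_by_id = {p["id"]: p for p in population}

    total = 0.0
    for t in treated:
        k = tuple(t[v] for v in keys)
        truth = pop_by_id[t["id"]]
        # The true record is a candidate only if the keys survived treatment.
        if tuple(truth[v] for v in keys) == k:
            total += 1.0 / counts[k]
    return total / len(treated)


# Hypothetical four-record "population"; here the sample is the whole population.
population = [
    {"id": 0, "age": "a", "sex": "m"},
    {"id": 1, "age": "a", "sex": "m"},
    {"id": 2, "age": "b", "sex": "f"},
    {"id": 3, "age": "c", "sex": "f"},
]
keys = ("age", "sex")
rng = random.Random(2011)

risk_before = matching_probability(population, population, keys)
swapped = random_swap(population, "age", rate=1.0, rng=rng)
risk_after = matching_probability(swapped, population, keys)
print(risk_before, risk_after)
```

Under this definition, swapping can only leave a record's contribution unchanged or reduce it to zero, so the measured risk never increases. A utility check would then compare estimates from the treated and untreated samples.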