A method and system for ensuring statistical disclosure limitation (SDL) of categorical or continuous micro data while maintaining the analytical quality of the micro data. The new SDL methodology exploits the analogy between (1) taking a sample (instead of a census), along with some adjustments, including imputation, for missing information, and (2) releasing a subset, instead of the original data set, along with some adjustments for records still at disclosure risk. Survey sampling reduces monetary cost in comparison to a census, but entails some loss of information. Similarly, releasing a subset reduces disclosure cost in comparison to the full database, but entails some loss of information. Thus, optimal survey sampling methods can be used for statistical disclosure limitation. The method includes partitioning the database into risk strata, optimal probabilistic substitution, optimal probabilistic subsampling, and optimal sampling weight calibration.
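The four steps named in the abstract can be illustrated with a rough Python sketch. It is not the patented procedure: the abstract does not say how risk strata are defined or how the substitution and subsampling rates are optimized, so the function name `sdl_sketch`, the uniqueness-based stratum rule, the user-supplied `sub_rate` and `keep_rate` tables, and the simple ratio calibration are all assumptions made for illustration.

```python
import random
from collections import Counter


def sdl_sketch(records, key_vars, sub_rate, keep_rate, seed=0):
    """Illustrative SDL pipeline (hypothetical, not the patented method).

    records   : list of dicts, one per micro data record
    key_vars  : names of the identifying fields
    sub_rate  : stratum -> probability of substituting the key values
    keep_rate : stratum -> probability of retaining the record
    """
    rng = random.Random(seed)

    # Step 1: partition into risk strata. Assumed rule: the fewer records
    # share a combination of key values, the higher the disclosure risk.
    counts = Counter(tuple(r[k] for k in key_vars) for r in records)

    def stratum(r):
        n = counts[tuple(r[k] for k in key_vars)]
        return "high" if n == 1 else "medium" if n <= 3 else "low"

    released = []
    for r in records:
        s = stratum(r)
        out = dict(r)  # copy, so the original database is left untouched

        # Step 2: probabilistic substitution. With probability sub_rate[s],
        # overwrite the key values with those of a randomly chosen donor.
        if rng.random() < sub_rate[s]:
            donor = rng.choice(records)
            for k in key_vars:
                out[k] = donor[k]

        # Step 3: probabilistic subsampling. Keep the record with
        # probability keep_rate[s]; its weight is the inverse of that rate.
        if rng.random() < keep_rate[s]:
            out["weight"] = 1.0 / keep_rate[s]
            released.append(out)

    # Step 4: weight calibration. Assumed form: a single ratio adjustment
    # so the released weights sum to the original record count.
    total = sum(r["weight"] for r in released)
    if total > 0:
        factor = len(records) / total
        for r in released:
            r["weight"] *= factor
    return released
```

In this sketch, higher-risk strata would typically be given a larger `sub_rate` and a smaller `keep_rate`, mirroring the trade-off between disclosure cost and information loss that the abstract describes. The fixed rates stand in for the optimization step, which the abstract mentions but does not specify.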
Method for statistical disclosure limitation
Singh, A. (2006). Method for statistical disclosure limitation (U.S. Patent No. 7058638).