Sample Frame Deduplication in the World Trade Center Health Registry: Minimizing Overcoverage and Cost
Murphy, J., Pulliam, P. A., & Lucas, R. (2004, August). Sample Frame Deduplication in the World Trade Center Health Registry: Minimizing Overcoverage and Cost. Presented at , .
The World Trade Center (WTC) Health Registry is designed to assess the health effects of the WTC disaster of September 11, 2001. It will follow those exposed to dust and fumes on 9/11 and in the ensuing weeks as the fires burned. Persons who may enroll in the Registry include those who were in lower Manhattan on 9/11; residents and schoolchildren south of Canal Street; and persons involved in rescue, recovery, or clean-up at the WTC site or Staten Island Recovery Operations between September 11, 2001 and June 30, 2002. The sample frame will include people from more than 1,000 potential overlapping list sources of individuals. To avoid overcoverage, list entries are systematically deduplicated using an algorithm to identify likely duplicates. Indeterminates are manually reviewed to assure that the same individual was not included in the sample more than once. This paper describes the process of deduplication and assesses the resulting increase in quality and reduction in cost. Respondent demographics and list sources are evaluated to determine where overcoverage may have been most problematic, had deduplication not taken place.
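The abstract does not specify the matching algorithm, so the following is only a minimal sketch of the general three-way workflow it describes: pairs of list entries are scored, high-scoring pairs are treated as likely duplicates, and mid-scoring pairs are set aside as indeterminates for manual review. The record fields (`id`, `name`, `dob`), the field weights, and both thresholds are assumptions made for illustration and are not taken from the Registry's actual procedure.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical cutoffs; the abstract does not report the real ones.
DUPLICATE_THRESHOLD = 0.90   # at or above: treat as a likely duplicate
REVIEW_THRESHOLD = 0.70      # at or above (but below duplicate): manual review


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Score a pair of entries in [0, 1] from name similarity and date-of-birth agreement.

    The 0.6 / 0.4 weights are illustrative assumptions.
    """
    name_score = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    dob_score = 1.0 if a["dob"] == b["dob"] else 0.0
    return 0.6 * name_score + 0.4 * dob_score


def classify(a, b):
    """Return 'duplicate', 'indeterminate', or 'distinct' for one pair of entries."""
    score = similarity(a, b)
    if score >= DUPLICATE_THRESHOLD:
        return "duplicate"
    if score >= REVIEW_THRESHOLD:
        return "indeterminate"
    return "distinct"


def classify_pairs(records):
    """Classify every unordered pair of records, keyed by their ids."""
    return {
        (a["id"], b["id"]): classify(a, b)
        for a, b in combinations(records, 2)
    }


if __name__ == "__main__":
    # Made-up entries, as if drawn from different overlapping lists.
    entries = [
        {"id": 1, "name": "Maria Lopez", "dob": "1970-03-14"},
        {"id": 2, "name": "maria  lopez", "dob": "1970-03-14"},
        {"id": 3, "name": "M. Lopez", "dob": "1970-03-14"},
        {"id": 4, "name": "John Smith", "dob": "1982-11-02"},
    ]
    for pair, label in classify_pairs(entries).items():
        print(pair, label)
```

Comparing every pair is quadratic in the number of entries, so a frame assembled from more than 1,000 list sources would in practice need some way of restricting comparisons to plausible candidates; that step is omitted here for brevity.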