A method for accounting for classification error in a stratified cellphone sample

Marcus E. Berzofsky; Howard Speizer; Caroline Blanton Scruggs; K Peterson; B Lu; Timothy Sahr

Berzofsky, M., Scruggs, C. B., Speizer, H., Peterson, K., Lu, B., & Sahr, T. (2018). A method for accounting for classification error in a stratified cellphone sample. Journal of Survey Statistics and Methodology, 6(4), 539–563. https://doi.org/10.1093/jssam/smx033

Abstract

State-based telephone surveys are often designed to produce estimates at substate levels, such as the county or county group. Under a traditional random-digit-dial design, the telephone exchange of a landline number could be used to accurately identify the county in which the associated household resides. Initially, however, no comparable method existed for the cellphone frame. Survey methodologists therefore had to draw random samples of cellphone numbers from the entire state, which made it difficult to target areas within a state. To overcome this shortcoming, sample vendors have used a cellphone number’s rate center (where the number was activated) as a proxy for the county where the cellphone owner resides. Our paper shows that county assignments based on rate center data may have classification error rates as high as 30%. Such high error rates make it difficult to devise an accurate cellphone frame sample allocation using the rate center data. This paper proposes a new method, the Rate Center Plus method, which uses rate centers and an estimate of the classification probabilities to stratify and allocate the desired respondent sample to counties. The new method uses Bayes’ rule to distribute a desired county-level sample allocation across rate center counties. We demonstrate how the Rate Center Plus method was applied to the 2015 Ohio Medicaid Assessment Survey and report the resulting efficacy of the method. Finally, we evaluate whether the new approach is more efficient than the traditional statewide sample method, and we compare four approaches to estimating the necessary classification probabilities. We found that the Rate Center Plus method can be more cost efficient than the statewide sample method when the classification probabilities are reasonably estimated, reducing data collection costs by as much as 12.8%.
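
As a rough illustration of the Bayes' rule step mentioned in the abstract, the sketch below shows one way a desired county-level allocation could be spread across rate center counties. It is not taken from the paper: the function name `allocate_to_rate_centers`, the two-county setup, and every number are hypothetical, and the formula is only one plausible reading of the abstract. It assumes we know each rate center county's share of the cellphone frame, P(rate center), and the classification probabilities P(true county | rate center), and it inverts them to get P(rate center | true county).

```python
def allocate_to_rate_centers(desired, frame_share, p_true_given_rc):
    """Spread a desired county-level sample across rate center counties.

    desired:          {county: desired number of sampled cases}
    frame_share:      {rate_center: P(rate_center)} on the cellphone frame
    p_true_given_rc:  {rate_center: {county: P(true county | rate_center)}}

    All inputs are hypothetical illustrations, not values from the paper.
    """
    allocation = {rc: 0.0 for rc in frame_share}
    for county, n in desired.items():
        # Joint probability P(true county, rate center) for each rate center.
        joint = {
            rc: p_true_given_rc[rc].get(county, 0.0) * frame_share[rc]
            for rc in frame_share
        }
        marginal = sum(joint.values())  # P(true county)
        if marginal == 0:
            raise ValueError(f"no rate center can reach county {county!r}")
        for rc, j in joint.items():
            # Bayes' rule: P(rate center | true county) = joint / marginal.
            allocation[rc] += n * j / marginal
    return allocation


# Made-up example: two counties, with 30% and 20% misclassification.
desired = {"A": 100, "B": 100}
frame_share = {"A": 0.5, "B": 0.5}
p_true_given_rc = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.2, "B": 0.8},
}
allocation = allocate_to_rate_centers(desired, frame_share, p_true_given_rc)
```

In this made-up case the rate center county with the higher misclassification rate receives slightly more than 100 cases and the other slightly fewer, while the total stays at 200. The sketch ignores response rates and costs, which a real allocation would also need to consider.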