Predicting pharmacy costs and other medical costs using diagnoses and drug claims

Background: Predicting health care costs for individuals and populations is essential for managing care. However, the comparative power of diagnostic and drug data for predicting future costs has not been closely examined. Objective: We sought to compare the predictive performance of claims-based models using diagnoses, drug claims, and combined data to predict health care costs. Subjects: More than 1 million commercially insured, nonelderly individuals in a national (MEDSTAT MarketScan®) research database comprised our sample. Measures: We used 1997 and 1998 drug and diagnostic profiles to predict costs in 1998 and 1999, respectively. To assess model performance, we compared R² values and predictive ratios (predicted costs/actual costs) for important subgroups. Results: Models using both drug and diagnostic data best predicted subsequent-year total health care costs (highest R² = 0.168 versus 0.116 and 0.146 for models based on drug or diagnostic data alone, respectively), with highly accurate predictive ratios (0.95-1.05) for subgroups of patients with major medical conditions. Models predicting pharmacy costs had substantially higher R² values than models predicting other medical costs (highest R² = 0.493 versus 0.124). Drug-based models predicted future pharmacy costs better than diagnosis-based models (highest R² = 0.482 versus 0.243), whereas diagnosis-based models predicted total costs (highest R² = 0.146 versus 0.116) and nonpharmacy costs (highest R² = 0.116 versus 0.071) more effectively than drug-based models. Newer models had markedly higher R² values than older ones, largely because of richer data rather than model refinements. Conclusions: Combined drug and diagnostic data predict total health care costs better than either type of data alone. Pharmacy spending is particularly predictable from drug data, whereas diagnoses are more useful than drugs for predicting other medical costs and total costs. Using even slightly more recent data can substantially boost model performance measures; thus, model comparisons should be conducted on the same dataset.
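
The two performance measures named under Measures can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the four cost values, and the subgroup flags are hypothetical, and it assumes the conventional definitions, namely R² as one minus the ratio of residual to total sum of squares, and the predictive ratio as total predicted cost divided by total actual cost within a subgroup.

```python
def r_squared(actual, predicted):
    """Share of the variance in actual costs explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def predictive_ratio(actual, predicted, in_subgroup):
    """Predicted costs / actual costs, summed over members of one subgroup."""
    pred_total = sum(p for p, keep in zip(predicted, in_subgroup) if keep)
    actual_total = sum(a for a, keep in zip(actual, in_subgroup) if keep)
    return pred_total / actual_total


# Made-up annual costs for four people (illustration only, not study data).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [120.0, 180.0, 330.0, 440.0]
# Hypothetical flag marking people with a major medical condition.
has_condition = [False, False, True, True]

print(round(r_squared(actual, predicted), 3))
print(round(predictive_ratio(actual, predicted, has_condition), 3))
```

A predictive ratio near 1.0 means the model neither over- nor under-predicts costs for that subgroup on average; in this toy data the flagged subgroup is over-predicted by about 10%.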


Zhao, Y., Ash, AS., Ellis, RP., Ayanian, JZ., Pope, G., Bowen, B., & Weyuker, L. (2005). Predicting pharmacy costs and other medical costs using diagnoses and drug claims. Medical Care, 43(1), 34-43.