PURPOSE: Strategies to identify and validate acute myocardial infarction (AMI) and stroke in primary-care electronic records may affect effect measures, but to an unknown extent. Additionally, the validity of cardiovascular risk factors that could act as confounders in studies of those endpoints has not been thoroughly assessed in the GOLD database of the United Kingdom Clinical Practice Research Datalink (CPRD). We explored the validity of algorithms to identify cardiovascular outcomes and risk factors, and evaluated how different outcome-identification strategies based on these algorithms affect estimates of adjusted incidence rate ratios (IRRs).
METHODS: First, we identified AMI, stroke, smoking, obesity, and menopausal status in a cohort treated for overactive bladder by applying computerized algorithms to primary-care medical records (2004-2012). We validated these cardiovascular outcomes and risk factors with physician questionnaires (gold standard for this analysis). Second, we estimated IRRs for AMI and stroke using algorithm-identified and questionnaire-confirmed cases, comparing these with IRRs from cases identified through linkage with hospitalization/mortality data (best estimate).
RESULTS: For AMI, the algorithm's positive predictive value (PPV) was >90%. Initial algorithms for stroke performed less well because of inclusion of codes for prevalent stroke; algorithm refinement increased PPV to 80% but decreased sensitivity by 20%. Algorithms for smoking and obesity were considered valid. IRRs based on questionnaire-confirmed cases only were closer to IRRs estimated from hospitalization/mortality data than IRRs from algorithm-identified cases.
CONCLUSIONS: AMI, stroke, smoking, obesity, and postmenopausal status can be accurately identified in CPRD. Physician questionnaire-validated AMI and stroke cases yield IRRs closest to the best estimate.
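
ILLUSTRATION: The validation measures named above (PPV, sensitivity, IRR) can be sketched as follows. This is a minimal illustration, not the study's analysis code: all names (`ValidationCounts`, `ppv`, `sensitivity`, `crude_irr`) and all counts are hypothetical, and the rate ratio shown is crude, whereas the study reports adjusted IRRs.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """Algorithm results cross-classified against the gold standard
    (here assumed to be the physician questionnaire)."""
    true_pos: int   # algorithm-identified and questionnaire-confirmed
    false_pos: int  # algorithm-identified but not confirmed
    false_neg: int  # confirmed cases the algorithm missed


def ppv(c: ValidationCounts) -> float:
    """Positive predictive value: confirmed share of algorithm-identified cases."""
    return c.true_pos / (c.true_pos + c.false_pos)


def sensitivity(c: ValidationCounts) -> float:
    """Sensitivity: share of all confirmed cases that the algorithm found."""
    return c.true_pos / (c.true_pos + c.false_neg)


def crude_irr(events_exposed: int, py_exposed: float,
              events_ref: int, py_ref: float) -> float:
    """Crude incidence rate ratio: (events/person-years) in the exposed
    group divided by the same rate in the reference group.
    No confounder adjustment is applied in this sketch."""
    return (events_exposed * py_ref) / (events_ref * py_exposed)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ValidationCounts(true_pos=90, false_pos=10, false_neg=30)
example_ppv = ppv(example)
example_sensitivity = sensitivity(example)
example_irr = crude_irr(events_exposed=30, py_exposed=10_000,
                        events_ref=20, py_ref=10_000)
```

Tightening an algorithm (for example, dropping codes for prevalent stroke) removes false positives and so raises PPV, but it can also discard true cases and so lower sensitivity, which is the trade-off reported in RESULTS.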