The Agony and the Ecstasy: A Tale of Repository Data Analysts
Pugh, N., Tan, S., Turner, C., & Rogers, S. (2012, July). The Agony and the Ecstasy: A Tale of Repository Data Analysts. Presented at JSM 2012, San Diego, CA.
The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Central Repository makes data and biospecimens from NIDDK-funded research available to the broader scientific community. The repository facilitates the testing of new hypotheses without new data or biospecimen collection; informative genetic analyses using well-curated phenotypic data; and the pooling of data across several studies to increase statistical power and provide a rich source of high-quality data to the scientific community. The management team of the repository is multidisciplinary and includes consent specialists, database programmers, and web designers. The repository also includes a small team of data analysts, who face a unique set of challenges: performing statistical analyses on submitted clinical datasets to ensure dataset integrity; speaking for study data coordinating centers when responding to user inquiries about the data; and harmonizing datasets across protocols to facilitate larger-scale studies. Here, we discuss these challenges and the value we strive to provide as repository data analysts.