Mining Data to Find Genetic Markers of Type 2 Diabetes

Funding forward-thinking, collaborative, groundbreaking work to advance medical research

North Carolina State University, University of North Carolina at Chapel Hill

Game-changing scientific discoveries are often the product of relentless collaboration. The Game-Changing Research Incentive Program (GRIP) at North Carolina State University asks local university personnel and RTI researchers to propose ways to stimulate the growth of interdisciplinary research. GRIP provides large-scale seed funding for collaborative research efforts.

GRIP, a joint effort by the NCSU Office of Research, Innovation & Economic Development, RTI, and the University of North Carolina Kenan Institute of Private Enterprise, allocates funding over three years to advance visionary research ideas. The program committee, composed of faculty and administrators from NC State, the University of North Carolina, and RTI, received 59 pre-proposals from more than 300 faculty members at ten North Carolina universities. The committee awarded funding to four projects, three of which included RTI researchers as principal investigators.

Forging a New Analytical Pipeline for Genomic Research and Precision Medicine

One of the winning GRIP projects aims to mine disparate big data sets curated by the National Institutes of Health (NIH) and make valuable information available to people researching type 2 diabetes.

Under the NCSU/RTI Program in Genetic Discovery and Prediction (PGDP), our joint research team will spend the next three years seeking insights on the genetic associations of disease by mining data from publicly available sources. The results will have implications for precision medicine, an emerging field that holds great promise for the prevention and treatment of complex, difficult-to-treat diseases such as diabetes.

The ultimate goal of the project is to assemble and process clinically relevant data sets, then analyze them to uncover subtle genetic interactions and predict the genetic elements of human disease. Eventually, PGDP will develop a phenotyping infrastructure to help prioritize genome-wide association studies, such as those recorded in the NIH database of genotypes and phenotypes, known as dbGaP. The project will also develop new methods for rapidly testing associations at the gene and pathway levels, as well as for conducting gene-to-gene analysis.

NIH’s dbGaP was created as a repository for researchers performing human genome-wide association studies (GWAS). Because these studies come from a variety of sources, the results submitted to dbGaP were produced in different ways, with different technologies, and were analyzed on different platforms. In many cases, these differences mean the data cannot be easily compared.

Our bioinformaticians, in collaboration with NC State researchers, saw an opportunity to create an analytical pipeline that harmonizes the data stored in dbGaP and makes it useful to the research community.

Our approach will leverage RTI’s PhenX toolkit, which catalogs standard measures of phenotypes and exposures for use in genome studies, as well as in epidemiological and biomedical research. Using PhenX to map the data in dbGaP will allow researchers to see similarities across disparate data sets that may indicate precursors of type 2 diabetes.
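To give a rough sense of what mapping study data to a standard measure can involve, here is a minimal Python sketch. It is an illustration only, not the project's pipeline: the study names, variable names, and the `VARIABLE_MAP` table are hypothetical placeholders and do not come from PhenX or dbGaP. It assumes two studies that recorded fasting glucose under different variable names and in different units, and it brings both onto one measure and one unit so the values can be compared.

```python
# Hypothetical mapping from (study, study-specific variable name) to a shared
# measure name and the unit the study reported. These names are placeholders
# for illustration; they are not taken from PhenX or dbGaP.
VARIABLE_MAP = {
    ("study_a", "GLUC_FAST"): ("fasting_glucose", "mg/dL"),
    ("study_b", "fglu_mmol"): ("fasting_glucose", "mmol/L"),
}

# 1 mmol/L of glucose is about 18.016 mg/dL (molar mass 180.16 g/mol).
MGDL_PER_MMOLL_GLUCOSE = 18.016


def harmonize(study, variable, value):
    """Return (shared measure name, value in mg/dL) for one raw observation."""
    measure, unit = VARIABLE_MAP[(study, variable)]
    if unit == "mmol/L":
        value = value * MGDL_PER_MMOLL_GLUCOSE
    return measure, round(value, 1)


# Two raw observations that look unrelated until they are mapped.
records = [
    ("study_a", "GLUC_FAST", 99.0),
    ("study_b", "fglu_mmol", 5.5),
]
harmonized = [harmonize(*record) for record in records]
```

After mapping, both observations carry the same measure name and unit, which is the kind of comparability the harmonization effort is meant to provide. A real pipeline would also need to account for differences in collection protocols, not only names and units.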

This collaborative effort takes advantage of the strengths of both institutions: NCSU as a leader in quantitative and statistical genetics, and RTI as the developer of the PhenX toolkit. Together, the two groups of investigators aim to bring order to existing databases, setting the stage for new genetic discoveries.