Bioinformatics

Bioinformatics is the use of information technology to solve fundamental problems in life science research. An integral component of the Human Genome Project and core infrastructure for genomics and proteomics, it is a relatively new and rapidly evolving field.

As part of our multidisciplinary program in Molecular Epidemiology, Genomics, Environment, and Health (MEGEH), we are addressing a range of research challenges in bioinformatics. For example, program staff are creating tools for the collection and curation of high-quality data to advance the understanding of cancers. We are also collaborating with epidemiologists to create powerful tools for modeling the spread of infectious diseases such as avian flu or agents such as anthrax or smallpox that could be used in a bioterrorism attack.

Our projects include building databases that house links to biological data and samples. These databases give current and future researchers wider access to the data, which helps accelerate scientific discovery.

Capabilities

  • Construction of large-scale relational databases and data warehouses
  • Data integration across heterogeneous sources
  • Custom solutions for microarray and protein expression analysis
  • Quantitative genetic analysis, including gene mapping and genotype prediction
  • Guidance in the use of high-performance computing resources

Focus Areas

  • Cancer research
  • Data repositories
  • Data mining
  • Portal development
  • Model development
  • High-performance computing

Projects

  • Consensus Measures for Phenotypes and Exposures (PhenX) (NHGRI, 2007–2010). Contributes to the integration of genetic and epidemiologic research. Includes direct consensus-building toward a recommended minimal set of standard measures for use in genome-wide association studies (GWAS) and other large-scale genomic research efforts.
  • Breast and Colon Cancer Family Registries (NCI, 2005–2009). The Informatics Center provides assistance to investigators in experiment design, creation of custom data sets, development of analytic methodology, and the consolidation of epidemiological, family history, lifestyle, clinical, pathology, genomic, and proteomic data from 12 primary research centers.
  • National Institute of Diabetes and Digestive and Kidney Diseases, Central Data Repository (NIDDK, 2003–2013). Allows access to data and documentation from completed NIDDK-funded studies, and provides a data management system for the cataloging and retrieval of genetic samples collected during clinical trials. Features include a powerful tool to allow researchers to perform in-depth secondary analysis of the collected data.
  • Models of Infectious Disease Agents Study (MIDAS) (NIGMS, 2004–2009). A web-based portal, mathematical models, and a set of computational and analytical tools have been developed for researchers and public health officials to model emerging infectious diseases and influence rapid public health responses. The portal houses a central catalog of models and results from participating research groups.
  • Web-Genome (NCI, 2005–2008). Built upon the webCGH code base, this system adds an integration component that enables it to serve as a plotting service to client applications. It provides support for more complex microarray designs and co-visualization of CGH and gene expression data.
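
As a rough illustration of the kind of mathematical model the MIDAS entry above refers to, the sketch below implements a textbook SIR (susceptible, infected, recovered) compartmental model. It is not taken from MIDAS or any project listed here. The function name `simulate_sir`, the fixed-step Euler integration, and all parameter values are assumptions chosen only for the example.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the classic SIR equations with a fixed-step Euler scheme.

    beta  -- assumed transmission rate per day (hypothetical value supplied by caller)
    gamma -- assumed recovery rate per day (1 / mean infectious period)
    Returns a list of (day, susceptible, infected, recovered) tuples.
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # People moving from susceptible to infected in this time step
            new_infections = beta * s * i / population * dt
            # People moving from infected to recovered in this time step
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; they do not describe any real outbreak.
    results = simulate_sir(population=10_000, initial_infected=10,
                           beta=0.3, gamma=0.1, days=160)
    peak_day, _, peak_infected, _ = max(results, key=lambda row: row[2])
    print(f"Peak of about {peak_infected:.0f} infected on day {peak_day}")
```

Models used in practice are typically far richer (age structure, spatial spread, stochastic effects), so this is best read as a starting point for understanding the approach rather than a usable forecasting tool.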

More Information


Contact us for more information