Evaluating statistical methods using plasmode data sets in the age of massive public databases: An illustration using false discovery rates

GL Gadbury; Q Xiang; L Yang; S Barnes; Grier Page; DB Allison

Gadbury, GL., Xiang, Q., Yang, L., Barnes, S., Page, G., & Allison, DB. (2008). Evaluating statistical methods using plasmode data sets in the age of massive public databases: An illustration using false discovery rates. PLoS Genetics, 4(6), e1000098. https://doi.org/10.1371/journal.pgen.1000098

Abstract

Plasmode is a term coined several years ago to describe data sets that are derived from real data but for which some truth is known. Omic techniques, most especially microarray and genomewide association studies, have catalyzed a new zeitgeist of data sharing that is making data and data sets publicly available on an unprecedented scale. Coupling such data resources with a science of plasmode use would allow statistical methodologists to vet proposed techniques empirically (as opposed to only theoretically) and with data that are by definition realistic and representative. We illustrate the technique of empirical statistics by consideration of a common task when analyzing high dimensional data: the simultaneous testing of hundreds or thousands of hypotheses to determine which, if any, show statistical significance warranting follow-on research. The now-common practice of multiple testing in high dimensional experiment (HDE) settings has generated new methods for detecting statistically significant results. Although such methods have heretofore been subject to comparative performance analysis using simulated data, simulating data that realistically reflect data from an actual HDE remains a challenge. We describe a simulation procedure using actual data from an HDE where some truth regarding parameters of interest is known. We use the procedure to compare estimates for the proportion of true null hypotheses, the false discovery rate (FDR), and a local version of FDR obtained from 15 different statistical methods.
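To make the quantities compared in the abstract concrete, the following is a minimal sketch in Python. It is not the paper's procedure and does not claim to reproduce any of its 15 methods. It assumes a plasmode-style setting in which the true null status of each hypothesis is known, and it uses two widely known techniques as stand-ins: a Storey-type estimator of the proportion of true nulls at a fixed threshold, and the Benjamini-Hochberg step-up rule for FDR control. All function names, the threshold `lam`, and the toy p-values are hypothetical and chosen only for illustration.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Assumption: p-values from true nulls are roughly uniform on [0, 1],
    so the fraction above `lam`, rescaled by 1 / (1 - lam), approximates pi0.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("pvalues must not be empty")
    n_above = sum(1 for p in pvalues if p > lam)
    return min(1.0, n_above / (m * (1.0 - lam)))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the set of indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose sorted p-value falls under its threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k = rank
    return set(order[:k])


def false_discovery_proportion(rejected, is_true_null):
    """Realized share of rejections that are true nulls (needs known truth)."""
    if not rejected:
        return 0.0
    false_hits = sum(1 for i in rejected if is_true_null[i])
    return false_hits / len(rejected)


# Toy stand-in for a plasmode: made-up p-values with known truth labels.
pvalues = [0.001, 0.002, 0.003, 0.004, 0.2, 0.4, 0.6, 0.7, 0.8, 0.9]
is_true_null = [False, False, False, True, True, True, True, True, True, True]

pi0_hat = estimate_pi0(pvalues)            # estimated proportion of true nulls
rejected = benjamini_hochberg(pvalues)     # hypotheses declared significant
fdp = false_discovery_proportion(rejected, is_true_null)
true_pi0 = sum(is_true_null) / len(is_true_null)
print(pi0_hat, true_pi0, sorted(rejected), fdp)
```

Because the truth labels are known, the estimate `pi0_hat` can be compared directly with `true_pi0`, and the realized false discovery proportion can be compared with the nominal level `alpha`. That comparison against known truth is the kind of empirical check a plasmode makes possible with real rather than purely simulated data.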
