SNPMClust: Bivariate Gaussian Genotype Clustering and Calling for Illumina Microarrays
Erickson, S., & Callaway, J. C. (2016). SNPMClust: Bivariate Gaussian Genotype Clustering and Calling for Illumina Microarrays. Journal of Statistical Software, 71. DOI: 10.18637/jss.v071.c02
SNPMClust is an R package for genotype clustering and calling with Illumina microarrays. It was originally developed for studies using the GoldenGate custom genotyping platform but can be used with other Illumina platforms, including Infinium BeadChip. The algorithm first rescales the fluorescent signal intensity data, adds empirically derived pseudo-data to minor allele genotype clusters, then uses the package mclust for bivariate Gaussian model fitting. We compared the accuracy and sensitivity of SNPMClust to that of GenCall, Illumina's proprietary algorithm, on a data set of 94 whole-genome amplified buccal (cheek swab) DNA samples. These samples were genotyped on a custom panel which included 1064 SNPs for which the true genotype was known with high confidence. SNPMClust produced uniformly lower false call rates over a wide range of overall call rates.