Massively parallel resequencing of the isogenic Drosophila melanogaster strain w(1118); iso-2; iso-3 identifies hotspots for mutations in sensory perception genes
We used the Illumina reversible-short sequencing technology to obtain 17-fold average depth (s.d. ~8) of ~94% of the euchromatic genome and ~1-5% of the heterochromatin sequence of the Drosophila melanogaster isogenic strain w1118; iso-2; iso-3. We show that this strain has a ~9 kb deletion that uncovers the first exon of the white (w) gene, ~4 kb of downstream promoter sequences, and most of the first intron, thus demonstrating that whole-genome sequencing can be used for mutation characterization. We chose this strain because thousands of transposon insertion lines and hundreds of isogenic deficiency lines are available in this genetic background, such as the Exelixis, Inc., and DrosDEL collections. We compared our sequence to Release 5 of the finished reference genome sequence, which was made from the isogenic strain y1; cn1 bw1 sp1, and identified ~356,614 candidate SNPs in the ~117 Mb unique sequence genome, which represents a substitution rate of ~1/305 nucleotides (~0.30%). The distribution of SNPs is not uniform: there is a ~2-fold increase in SNPs on the autosome arms compared with the X chromosome and a ~7-fold increase compared with the small 4th chromosome. This is consistent with previous analyses that demonstrated a correlation between recombination frequency and SNP frequency. An unexpected finding was a SNP hotspot in a ~20 Mb central region of the 4th chromosome, which might indicate a higher than expected recombination frequency in this region of the chromosome. Interestingly, genes involved in sensory perception are enriched in SNP hotspots and developmental genes are enriched in SNP coldspots, which suggests that recombination frequencies might be proportional to the evolutionary selection coefficient. Twelve Drosophila species have currently been sequenced, and this work represents one of many isogenic Drosophila melanogaster genome sequences that are in progress.
Because of the dramatic increase in power in using isogenic lines rather than outbred individuals, the SNP information should be valuable as a test bed for understanding genotype-by-environment interactions in human population studies.
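The genome-wide substitution rate quoted above follows from dividing the candidate SNP count by the size of the unique sequence genome. The minimal sketch below reproduces that arithmetic from the rounded figures given in the abstract (~356,614 SNPs, ~117 Mb); the function name `snp_density` is hypothetical and is not part of the original analysis. Because the inputs are rounded, the result may differ slightly from the quoted ~1/305 figure.

```python
def snp_density(snp_count: int, genome_bp: int) -> tuple[float, float]:
    """Return (SNPs per nucleotide, nucleotides per SNP).

    Hypothetical helper for illustration only; not the authors' pipeline.
    """
    if snp_count <= 0 or genome_bp <= 0:
        raise ValueError("snp_count and genome_bp must be positive")
    rate = snp_count / genome_bp          # fraction of positions that differ
    nt_per_snp = genome_bp / snp_count    # average spacing between SNPs
    return rate, nt_per_snp


# Rounded values taken from the abstract (assumed, not exact).
CANDIDATE_SNPS = 356_614
UNIQUE_GENOME_BP = 117_000_000

rate, nt_per_snp = snp_density(CANDIDATE_SNPS, UNIQUE_GENOME_BP)
print(f"substitution rate: {rate * 100:.2f}%")
print(f"about one SNP per {nt_per_snp:.0f} nucleotides")
```

The same function could be applied per chromosome arm, given arm-level SNP counts and lengths, to compare autosomal, X-linked, and 4th-chromosome densities.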