Project description:Polymorphic inversions contribute to adaptation and phenotypic variation. However, large multi-centric association studies of inversions remain challenging. We present scoreInvHap, a method to genotype inversions from SNP data for genome-wide association studies (GWASs), overcoming important limitations of current methods and outperforming them in accuracy and applicability. scoreInvHap calls individual inversion-genotypes from a similarity score to the SNPs of experimentally validated references. It can be used on different sources of SNP data, including those with low SNP coverage such as exome sequencing, and is easily adaptable to genotype new inversions, either in humans or in other species. We present 20 human inversions that can be reliably and easily genotyped with scoreInvHap to discover their role in complex human traits, and illustrate a first genome-wide association study of experimentally-validated human inversions. scoreInvHap is implemented in R and it is freely available from Bioconductor.
Project description:Objective: Until now, the admixture of domestic pig into wild boar populations has been assessed by cytogenetic analysis. Although a first-generation hybrid animal is discernible by its 37-chromosome karyotype, the cytogenetic method is not applicable to advanced intercrosses. The aim of this study is therefore to evaluate single nucleotide polymorphism (SNP) markers as an alternative technology for characterizing recent or past hybridization between the two subspecies. The final goal would be to develop a molecular diagnostic tool. Data description: The Geneseek Genomic Profiler High-Density porcine beadchip (GGP70KHD, Illumina, USA), comprising 68,516 porcine SNPs, was used on a set of 362 wild boars with diverse chromosomal statuses collected from different areas and breeding environments in France. We generated approximately 62,192-64,046 genotypes per wild boar. The present dataset might be useful for the community (i) for developing molecular tools to evaluate the admixture of domestic pig into wild boar populations, and (ii) for genetic diversity studies including wild boar species or phylogeny analyses of Suidae populations. Raw data files and a processed matrix data file were deposited in ArrayExpress at the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) data portal under accession number E-MTAB-10591.
Project description:With the increasing demand for higher throughput single nucleotide polymorphism (SNP) genotyping, the quantity of genomic DNA often falls short of the number of assays required. We investigated the use of degenerate oligonucleotide primed polymerase chain reaction (DOP-PCR) to generate a template for our SNP genotyping methodology of fluorescence polarization template-directed dye-terminator incorporation detection. DOP-PCR employs a degenerate primer (5'-CCGACTCGAGNNNNNNATGTGG-3') to produce non-specific uniform amplification of DNA. This approach has been successfully applied to microsatellite genotyping. We compared genotyping of DOP-PCR-amplified genomic DNA to genomic DNA as a template. Results were analyzed with respect to feasibility, loss of alleles, genotyping accuracy and storage conditions in a high-throughput genotyping environment. DOP-PCR yielded overall satisfactory results, with a certain loss in accuracy and quality of the genotype assignments. Accuracy and quality of genotypes generated from the DOP-PCR template also depended on storage conditions. Adding carrier DNA to a final concentration of 10 ng/µl improved results. In conclusion, we have successfully used DOP-PCR to amplify our genomic DNA collection for subsequent SNP genotyping as a standard process.
Project description:Background: Many aspects of transfusion medicine are affected by genetics. Current single-nucleotide polymorphism (SNP) arrays are limited in the number of targets that can be interrogated and cannot detect all variation of interest. We designed a transfusion medicine array (TM-Array) for study of both common and rare transfusion-relevant variations in genetically diverse donor and recipient populations. Study design and methods: The array was designed by conducting extensive bioinformatics mining and consulting experts to identify genes and genetic variation related to a wide range of clinically relevant and research-related topics in transfusion medicine. Copy number polymorphisms were added in the alpha globin, beta globin, and Rh gene clusters. Results: The final array contains approximately 879,000 SNP and copy number polymorphism markers. Over 99% of SNPs were called reliably. Technical replication showed the array to be robust and reproducible, with an error rate less than 0.03%. The array also had a very low Mendelian error rate (average parent-child trio accuracy of 0.9997). Blood group results were in concordance with serology testing results, and the array accurately identifies rare variants (minor allele frequency of 0.5%). The array achieved high genome-wide imputation coverage for African-American (97.5%), Hispanic (96.1%), East Asian (94.6%), and white (96.1%) genomes at a minor allele frequency of 5%. Conclusions: A custom array for transfusion medicine research has been designed and evaluated. It gives wide coverage and accurate identification of rare SNPs in diverse populations. The TM-Array will be useful for future genetic studies in the diverse fields of transfusion medicine research.
Project description:Introduction: Heel pricks are performed on newborns for diagnostic screenings of various pre-symptomatic metabolic and genetic diseases. Excess blood is spotted on Guthrie cards and archived by many states in biobanks for follow-up diagnoses and public health research. However, storage environment may vary across biobanks and across time within biobanks. With increased applications of DNA extracted from spots for genetic studies, identifying factors associated with genotyping success is critical to maximize DNA quality for future studies. Method: We evaluated 399 blood spots, which were part of a genome-wide association study of childhood leukemia risk in children with Down syndrome, archived at the Michigan Neonatal Biobank between 1992 and 2008. High quality DNA was defined as having post-quality control call rate ≥ 99.0% based on the Illumina GenomeStudio 2.0 GenCall algorithm after processing the samples on the Illumina Infinium Global Screening Array. Bivariate analyses and multivariable logistic regression models were applied to evaluate effects of storage environment and storage duration on DNA genotyping quality. Results: Both storage environment and duration were associated with sample genotyping call rates (p-values < 0.001). Sample call rates were associated with storage duration independent of storage environment (p-trend = 0.006 for dried blood spots (DBS) archived in an uncontrolled environment and p-trend = 0.002 in a controlled environment). However, 95% of the total sample had high genotyping quality with a call rate ≥ 95.0%, a standard threshold for acceptable sample quality in many genetic studies. Conclusion: Blood spot DNA quality was lower in samples archived in uncontrolled storage environments and for samples archived for longer durations.
Still, regardless of storage environment or duration, neonatal biobanks including the Michigan Neonatal Biobank can provide access to large collections of spots with DNA quality acceptable for most genotyping studies.
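The two call-rate thresholds mentioned in this description (≥ 99.0% for high quality, ≥ 95.0% for acceptable quality) can be illustrated with a minimal sketch. It assumes a sample's genotypes arrive as a list in which a no-call is represented by `None`; the function names and the toy data are hypothetical, and the sketch does not reproduce the GenCall algorithm or the study's quality-control pipeline.

```python
from typing import Optional, Sequence


def call_rate(genotypes: Sequence[Optional[str]]) -> float:
    """Fraction of SNPs with a genotype call (None marks a no-call; an assumption)."""
    if not genotypes:
        raise ValueError("at least one SNP is required")
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def quality_label(rate: float, high: float = 0.99, acceptable: float = 0.95) -> str:
    """Classify a sample by call rate using the two thresholds named in the text."""
    if rate >= high:
        return "high"
    if rate >= acceptable:
        return "acceptable"
    return "low"


# Toy sample (illustrative only): 200 SNPs, 3 of them not called.
sample = ["AB"] * 197 + [None] * 3
rate = call_rate(sample)
label = quality_label(rate)
```

With these toy values the call rate is 197/200, which falls between the two thresholds.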
Project description:We examine the measurement properties of pooled DNA odds ratio estimates for 7,357 single nucleotide polymorphisms (SNPs) genotyped in a genome-wide association study of postmenopausal breast cancer. This study involved DNA pools formed from 125 cases or 125 matched controls. Individual genotyping for these SNPs subsequently became available for a substantial majority of women included in seven pool pairs, providing the opportunity for a comparison of pooled DNA and individual odds ratio estimates and their variances. We find that the "per minor allele" odds ratio estimates from the pooled DNA comparisons agree fairly well with those from individual genotyping. Furthermore, the log-odds ratio variance estimates support a pooled DNA measurement model that we previously described, although with somewhat greater extra-binomial variation than was hypothesized in the project design. Implications for the role of pooled DNA comparisons in the future genetic epidemiology research agenda are discussed.
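As a rough illustration of a "per minor allele" odds ratio computed from pool-level allele frequencies, the sketch below uses the standard allelic odds ratio and a purely binomial, delta-method variance for its logarithm. This is a baseline under stated assumptions, not the authors' measurement model: it contains no extra-binomial term, and the allele frequencies and function names are hypothetical. Only the pool size of 125 women (250 alleles) comes from the description.

```python
import math


def allelic_odds_ratio(p_case: float, p_control: float) -> float:
    """Odds of the minor allele in the case pool divided by the odds in the control pool."""
    for p in (p_case, p_control):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))


def binomial_log_or_variance(p_case: float, p_control: float,
                             n_case_alleles: int, n_control_alleles: int) -> float:
    """Delta-method variance of log(OR) under binomial sampling of alleles only."""
    return (1.0 / (n_case_alleles * p_case * (1.0 - p_case))
            + 1.0 / (n_control_alleles * p_control * (1.0 - p_control)))


ALLELES_PER_POOL = 2 * 125  # 125 women per pool, two alleles each

# Illustrative minor allele frequencies (not taken from the study).
odds_ratio = allelic_odds_ratio(0.30, 0.25)
log_or = math.log(odds_ratio)
log_or_var = binomial_log_or_variance(0.30, 0.25, ALLELES_PER_POOL, ALLELES_PER_POOL)
```

Any extra-binomial variation from pooling and measurement would add to `log_or_var`, which is why the binomial figure is best read as a lower bound.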
Project description:During the last several years, high-density genotyping SNP arrays have facilitated genome-wide association studies (GWAS) that successfully identified common genetic variants associated with a variety of phenotypes. However, each of the identified genetic variants only explains a very small fraction of the underlying genetic contribution to the studied phenotypic trait. Moreover, discordance observed in results between independent GWAS indicates the potential for Type I and II errors. High reliability of genotyping technology is needed to have confidence in using SNP data and interpreting GWAS results. Therefore, reproducibility of two widely used genotyping technology platforms from Affymetrix and Illumina was assessed by analyzing four technical replicates from each of six individuals in five laboratories. Genotype concordance of 99.40% to 99.87% within a laboratory for the same platform, 98.59% to 99.86% across laboratories for the same platform, and 98.80% across genotyping platforms was observed. Moreover, arrays with low quality data were detected when comparing genotyping data from technical replicates, but they could not be detected according to vendors' quality control (QC) suggestions. Our results demonstrated the technical reliability of currently available genotyping platforms but also indicated the importance of incorporating some technical replicates for genotyping QC in order to improve the reliability of GWAS results. The impact of discordant genotypes on association analysis results was simulated and could explain, at least in part, the irreproducibility of some GWAS findings when the effect size (i.e. the odds ratio) and the minor allele frequencies are low.
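Genotype concordance between two technical replicates can be sketched as the share of identical calls among SNPs called in both replicates. The description does not state the exact definition used in the study, so restricting the comparison to jointly called SNPs is an assumption, as are the function name and the toy genotypes.

```python
from typing import Optional, Sequence


def genotype_concordance(rep_a: Sequence[Optional[str]],
                         rep_b: Sequence[Optional[str]]) -> float:
    """Fraction of identical calls among SNPs called in both replicates.

    None marks a no-call; SNPs missing in either replicate are skipped (an assumption).
    """
    if len(rep_a) != len(rep_b):
        raise ValueError("replicates must cover the same SNPs in the same order")
    both_called = [(a, b) for a, b in zip(rep_a, rep_b)
                   if a is not None and b is not None]
    if not both_called:
        raise ValueError("no SNP was called in both replicates")
    matches = sum(1 for a, b in both_called if a == b)
    return matches / len(both_called)


# Toy replicates (illustrative only): one discordant call, one no-call.
rep1 = ["AA", "AB", "BB", None, "AB"]
rep2 = ["AA", "AB", "AB", "BB", "AB"]
concordance = genotype_concordance(rep1, rep2)
```

Here four SNPs are called in both replicates and three of them agree. Comparing replicates this way is what allows a low-quality array to stand out even when it passes per-array QC metrics.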