Comparison of PrASE and Pyrosequencing for SNP Genotyping.
ABSTRACT: BACKGROUND: There is an imperative need for SNP genotyping technologies that are cost-effective per sample with retained high accuracy, throughput and flexibility. We have developed a microarray-based technique and compared it to Pyrosequencing. In the protease-mediated allele-specific extension (PrASE), the protease constrains the elongation reaction and thus prevents incorrect nucleotide incorporation at primers with mismatched 3'-termini. RESULTS: The assay is automated for 48 genotyping reactions in parallel followed by a tag-microarray detection system. A script automatically visualizes the results in cluster diagrams and assigns the genotypes. Ten polymorphic positions suggested as prothrombotic genetic variations were analyzed with Pyrosequencing and PrASE technologies in 442 samples and 99.8% concordance was achieved. In addition to accuracy, the robustness and reproducibility of the technique have been investigated. CONCLUSION: The results of this study strongly indicate that the PrASE technology can offer significant improvements in terms of accuracy and robustness, and thereby an increased number of typeable SNPs.
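The automated genotype assignment mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each SNP yields two allele-specific signal intensities and that a genotype is called from the allelic fraction, which is one common way to read a cluster diagram. The function name, the thresholds and the minimum signal are hypothetical and are not taken from the study's script.

```python
def call_genotype(signal_a, signal_b, hom_cutoff=0.8, het_band=(0.4, 0.6),
                  min_total=100.0):
    """Assign a genotype from two allele-specific signal intensities.

    All thresholds are illustrative assumptions, not values from the study.
    """
    total = signal_a + signal_b
    if total < min_total:
        return "no call"               # signal too weak to score
    fraction_a = signal_a / total      # allelic fraction of allele A
    if fraction_a >= hom_cutoff:
        return "AA"                    # homozygous for allele A
    if fraction_a <= 1.0 - hom_cutoff:
        return "BB"                    # homozygous for allele B
    if het_band[0] <= fraction_a <= het_band[1]:
        return "AB"                    # heterozygous
    return "no call"                   # falls between clusters


if __name__ == "__main__":
    for pair in [(950, 50), (40, 960), (500, 500), (700, 300)]:
        print(pair, call_genotype(*pair))
```

Leaving a gap between the homozygous and heterozygous bands means that ambiguous samples are reported as uncalled instead of being forced into a cluster.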
Project description:PURPOSE: The gene coding cytochrome P4501B1 (CYP1B1) has been shown to be a major cause of primary congenital glaucoma in the Iranian population. More recently it was shown to also be important in juvenile-onset open angle glaucoma (JOAG). We aimed to further investigate the role of CYP1B1 in a larger cohort of primary open angle glaucoma (POAG) patients which included late-onset patients. We also aimed to set up a microarray-based protocol for mutation screening with the intent of using the protocol in a future population-level screening program. METHODS: Sixty-three POAG patients, nine affected family members, and thirty-three previously genotyped primary congenital glaucoma (PCG) patients were included in the study. Clinical examination included slit lamp biomicroscopy, IOP measurement, gonioscopic evaluation, fundus examination, and measurement of perimetry. G61E, R368H, R390H, and R469W were screened by a protocol that included multiplexed allele-specific amplification in the presence of a protease (PrASE), use of sequence-tagged primers, and hybridization to generic arrays on microarray slides. The entire coding sequences of CYP1B1 and myocilin (MYOC) genes were sequenced in all individuals assessed by the microarray assay to carry a mutation. Intragenic single nucleotide polymorphism (SNP) haplotypes were determined for mutated alleles. RESULTS: Genotypes assessed by the array-based PrASE methodology were in 100% concordance with sequencing results. Seven mutation-carrying POAG patients (11.1%) were identified, and their distribution was quite skewed between the juvenile-onset individuals (5/21) and the late-onset cases (2/42). Four of the seven mutation-carrying Iranian patients harbored two mutated alleles. CYP1B1 mutated alleles in Iranian PCG and POAG patients shared common haplotypes. MYOC mutations were not observed in any of the patients.
CONCLUSIONS: The PrASE approach allowed reliable simultaneous genotyping of many individuals. It can be an appropriate tool for screening common mutations in large sample sizes. The results suggest that CYP1B1 is implicated in POAG among Iranians, notably in the juvenile-onset form. Contrary to POAG patients studied in other populations, many mutation-harboring Iranian patients carry two mutated alleles. We propose an explanation for this observation.
Project description:Here, we present a novel method for SNP genotyping based on protease-mediated allele-specific primer extension (PrASE), where the two allele-specific extension primers only differ in their 3'-positions. As reported previously [Ahmadian,A., Gharizadeh,B., O'Meara,D., Odeberg,J. and Lundeberg,J. (2001), Nucleic Acids Res., 29, e121], the kinetics of perfectly matched primer extension is faster than mismatched primer extension. In this study, we have utilized this difference in kinetics by adding protease, a protein-degrading enzyme, to discriminate between the extension reactions. The competition between the polymerase activity and the enzymatic degradation yields extension of the perfectly matched primer, while the slower extension of mismatched primer is eliminated. To allow multiplex and simultaneous detection of the investigated single nucleotide polymorphisms (SNPs), each extension primer was given a unique signature tag sequence on its 5' end, complementary to a tag on a generic array. A multiplex nested PCR with 13 SNPs was performed in a total of 36 individuals and their alleles were scored. To demonstrate the improvements in scoring SNPs by PrASE, we also genotyped the individuals without inclusion of protease in the extension. We conclude that the developed assay is highly allele-specific, with excellent multiplex SNP capabilities.
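The kinetic competition described in this abstract can be illustrated with a toy model. The sketch assumes first-order primer extension at rate `k_ext` and exponential loss of polymerase activity at rate `k_deg` caused by the protease. Both the model and all rate values are hypothetical simplifications chosen for illustration and are not measurements or methods from the study.

```python
import math


def extended_fraction(k_ext, k_deg, duration):
    """Fraction of primers extended after `duration` in a toy model.

    Assumptions (not from the study): extension is first order in
    unextended primer, and active polymerase decays as exp(-k_deg * t).
    """
    if k_deg == 0:
        exposure = duration  # no protease: polymerase stays fully active
    else:
        # Integral of exp(-k_deg * t) from 0 to duration
        exposure = (1.0 - math.exp(-k_deg * duration)) / k_deg
    return 1.0 - math.exp(-k_ext * exposure)


if __name__ == "__main__":
    # Hypothetical rates in arbitrary units: matched primers extend fast,
    # mismatched primers extend slowly.
    for label, k_ext in [("match", 10.0), ("mismatch", 0.1)]:
        with_protease = extended_fraction(k_ext, k_deg=1.0, duration=60.0)
        without = extended_fraction(k_ext, k_deg=0.0, duration=60.0)
        print(f"{label}: with protease {with_protease:.3f}, "
              f"without {without:.3f}")
```

Under these assumed rates the mismatched primer is extended almost completely when no protease is present, but only to a small extent when the polymerase is degraded during the reaction, which is the qualitative behaviour the abstract describes.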
Project description:We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method.
Project description:To ensure accuracy of UGT1A1 (TA)n (rs3064744) genotyping for use in pharmacogenomics-based irinotecan dosing, we tested the concordance of several commonly used genotyping technologies. Heuristic genotype groupings and principal component analysis demonstrated concordance for Illumina sequencing, fragment analysis, and fluorescent PCR. However, Illumina sequencing and fragment analysis returned a range of fragment sizes, likely arising due to PCR "slippage". Direct sequencing was accurate, but this method led to ambiguous electropherograms, hampering interpretation of heterozygotes. Gel sizing, pyrosequencing, and array-based technologies were less concordant. Pharmacoscan genotyping was concordant, but it does not ascertain (TA)8 genotypes that are common in African populations. Method-based genotyping differences were also observed in the publication record (p < 0.0046), although fragment analysis and direct sequencing were concordant (p = 0.11). Genotyping errors can have significant consequences in a clinical setting. At the present time, we recommend that all genotyping for this allele be conducted with fluorescent PCR (fPCR).
Project description:Current technologies in next-generation sequencing offer high-throughput reads at low cost, but still suffer from various sequencing errors. Although pyro- and ion semiconductor sequencing both have the advantage of delivering long and high-quality reads, problems might occur when sequencing homopolymer-containing regions, since the repeating identical bases are incorporated during the same synthesis cycle, which leads to uncertainty in base calling. The aim of this study was to evaluate the analytical performance of a pyrosequencing-based next-generation sequencing system in detecting homopolymer sequences using homopolymer-preintegrated plasmid constructs and human DNA samples originating from patients with cystic fibrosis. In the plasmid system average correct genotyping was 95.8% in 4-mers, 87.4% in 5-mers and 72.1% in 6-mers. Despite the low genotyping accuracy observed in 5- and 6-mers, it was possible to generate amplicons with more than a 90% adequate detection rate in every homopolymer tract. When homopolymers in the CFTR gene were sequenced, average accuracy was 89.3%, but varied over a wide range (52.2-99.1%). In all but one case, an optimal amplicon-sequencing primer combination could be identified. In that single case (7A tract in exon 14 (c.2046_2052)), none of the tested primer sets produced the required analytical performance. Our results show that pyrosequencing is most reliable for 4-mers and that accuracy deteriorates as homopolymer length increases. With careful primer selection, the NGS system was able to correctly genotype all but one of the homopolymers in the CFTR gene. In conclusion, we configured a plasmid test system that can be used to assess genotyping accuracy of NGS devices and developed an accurate NGS assay for the molecular diagnosis of CF using self-designed primers for amplification and sequencing.
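Because accuracy in this abstract depends on homopolymer length, it can help to locate homopolymer tracts in a target sequence before designing primers. The following is a minimal sketch of such a scan. The function name, the default minimum length of 4 and the example sequence are illustrative assumptions and are not part of the study.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=4):
    """Return (start, base, length) for each run of identical bases.

    `start` is a 0-based position; only runs of at least `min_length`
    bases are reported.
    """
    runs = []
    position = 0
    for base, group in groupby(sequence.upper()):
        length = len(list(group))      # size of this run of identical bases
        if length >= min_length:
            runs.append((position, base, length))
        position += length             # advance to the start of the next run
    return runs


if __name__ == "__main__":
    # Made-up sequence containing a 5T tract and a 7A tract
    print(homopolymer_runs("ACGTTTTTACAAAAAAAG"))
```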
Project description:BACKGROUND: Variations in the composition of the human intestinal microbiota are linked to diverse health conditions. High-throughput molecular technologies have recently elucidated microbial community structure at much higher resolution than was previously possible. Here we compare two such methods, pyrosequencing and a phylogenetic array, and evaluate classifications based on two variable 16S rRNA gene regions. METHODS AND FINDINGS: Over 1.75 million amplicon sequences were generated from the V4 and V6 regions of 16S rRNA genes in bacterial DNA extracted from four fecal samples of elderly individuals. The phylotype richness, for individual samples, was 1,400-1,800 for V4 reads and 12,500 for V6 reads, and 5,200 unique phylotypes when combining V4 reads from all samples. The RDP-classifier was more efficient for the V4 than for the far less conserved and shorter V6 region, but differences in community structure also affected efficiency. Even when analyzing only 20% of the reads, the majority of the microbial diversity was captured in two samples tested. DNA from the four samples was hybridized against the Human Intestinal Tract (HIT) Chip, a phylogenetic microarray for community profiling. Comparison of clustering of genus counts from pyrosequencing and HITChip data revealed highly similar profiles. Furthermore, correlations of sequence abundance and hybridization signal intensities were very high for lower-order ranks, but lower at family-level, which was probably due to ambiguous taxonomic groupings. CONCLUSIONS: The RDP-classifier consistently assigned most V4 sequences from human intestinal samples down to genus-level with good accuracy and speed. This is the deepest sequencing of single gastrointestinal samples reported to date, but microbial richness levels have still not leveled out. A majority of these diversities can also be captured with five times lower sampling-depth. 
HITChip hybridizations and resulting community profiles correlate well with pyrosequencing-based compositions, especially for lower-order ranks, indicating high robustness of both approaches. However, incompatible grouping schemes make exact comparison difficult.
Project description:BACKGROUND: Arrayed primer extension (APEX) is a microarray-based rapid minisequencing methodology that may have utility in 'personalized medicine' applications that involve genetic diagnostics of single nucleotide polymorphisms (SNPs). However, to date there have been few reports that objectively evaluate the assay completion rate, call rate and accuracy of APEX. We have further developed robust assay design, chemistry and analysis methodologies, and have sought to determine how effective APEX is in comparison to leading 'gold-standard' genotyping platforms. Our methods have been tested against industry-leading technologies in two blinded experiments based on Coriell DNA samples and SNP genotype data from the International HapMap Project. RESULTS: In the first experiment, we genotyped 50 SNPs across the entire 270 HapMap Coriell DNA sample set. For each Coriell sample, DNA template was amplified in a total of 7 multiplex PCRs prior to genotyping. We obtained good results for 41 of the SNPs, with 99.8% genotype concordance with HapMap data, at an automated call rate of 94.9% (not including the 9 failed SNPs). In the second experiment, involving modifications to the initial DNA amplification so that a single 50-plex PCR could be achieved, genotyping of the same 50 SNPs across each of 49 randomly chosen Coriell DNA samples allowed extremely robust 50-plex genotyping from as little as 5 ng of DNA, with 100% assay completion rate, 100% call rate and >99.9% accuracy. CONCLUSION: We have shown our methods to be effective for robust multiplex SNP genotyping using APEX, with 100% call rate and >99.9% accuracy. We believe that such methodology may be useful in future point-of-care clinical diagnostic applications where accuracy and call rate are both paramount.
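Call rate and concordance, the two metrics reported in this abstract, can be computed as in the sketch below. It assumes that calls and reference genotypes are stored in dictionaries keyed by (sample, SNP), that a failed call is represented by `None`, and that concordance is measured only over genotypes called in both sets. These conventions, the function names and the example data are assumptions and are not taken from the study.

```python
def normalise(genotype):
    """Sort the alleles so that 'GA' and 'AG' compare as equal."""
    return "".join(sorted(genotype))


def call_rate_and_concordance(calls, reference):
    """Return (call_rate, concordance) for genotype calls vs. a reference.

    `calls` and `reference` map (sample, snp) to a genotype string,
    or to None when no genotype is available.
    """
    attempted = len(calls)
    called = {key: g for key, g in calls.items() if g is not None}
    call_rate = len(called) / attempted if attempted else 0.0

    # Compare only genotypes that are present in both data sets
    comparable = [key for key in called if reference.get(key) is not None]
    matches = sum(
        1 for key in comparable
        if normalise(called[key]) == normalise(reference[key])
    )
    concordance = matches / len(comparable) if comparable else 0.0
    return call_rate, concordance


if __name__ == "__main__":
    # Made-up example data
    calls = {("s1", "rs1"): "AA", ("s1", "rs2"): "GA",
             ("s2", "rs1"): None, ("s2", "rs2"): "GG"}
    reference = {("s1", "rs1"): "AA", ("s1", "rs2"): "AG",
                 ("s2", "rs1"): "AA", ("s2", "rs2"): "AG"}
    print(call_rate_and_concordance(calls, reference))
```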
Project description:To date, microarray-based genotyping of large, complex plant genomes has been complicated by the need to perform genome complexity reduction to obtain sufficiently strong hybridization signals. Genome complexity reduction techniques are, however, tedious and can introduce unwanted variables into genotyping assays. Here, we report a microarray-based genotyping technology for complex genomes (such as the 2.3 Gb maize genome) that does not require genome complexity reduction prior to hybridization. Approximately 200,000 long oligonucleotide probes were identified as being polymorphic between the inbred parents of a mapping population and used to genotype two recombinant inbred lines. While multiple hybridization replicates provided ~97% accuracy, even a single replicate provided ~95% accuracy. Genotyping accuracy was further increased to >99% by utilizing information from adjacent probes. This microarray-based method provides a simple, high-density genotyping approach for large, complex genomes.
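One way to use information from adjacent probes, as mentioned in this abstract, is to replace each call with the majority call in a window of neighbouring probes, since recombinant inbred lines inherit long parental blocks. The sliding-window majority vote below is an assumed illustration of that idea and is not the authors' published procedure. The function name, the window size and the example calls are hypothetical.

```python
from collections import Counter


def smooth_calls(calls, window=5):
    """Replace each probe call by the majority call among its neighbours.

    `calls` is a list of parental genotype labels ordered by physical
    position; `window` is the (odd) number of probes considered per call.
    Near the ends of the list the window is truncated.
    """
    half = window // 2
    smoothed = []
    for index in range(len(calls)):
        neighbours = calls[max(0, index - half):index + half + 1]
        majority, _count = Counter(neighbours).most_common(1)[0]
        smoothed.append(majority)
    return smoothed


if __name__ == "__main__":
    # Made-up calls along a chromosome: two isolated errors and one
    # genuine switch from parent A to parent B
    raw = list("AAABAAABBBBABB")
    print("".join(smooth_calls(raw)))
```

A window that is too wide will blur true recombination breakpoints, so the window size trades error correction against breakpoint resolution.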
Project description:The advent of phylogenetic DNA microarrays and high-throughput pyrosequencing technologies has dramatically increased the resolution and accuracy of detection of distinct microbial lineages in mixed microbial assemblages. Despite an expanding array of approaches for detecting microbes in a given sample, rapid and robust means of assessing the differential viability of these cells, as a function of phylogenetic lineage, remain elusive. In this study, pre-PCR propidium monoazide (PMA) treatment was coupled with downstream pyrosequencing and PhyloChip DNA microarray analyses to better understand the frequency, diversity and distribution of viable bacteria in spacecraft assembly cleanrooms. Sample fractions not treated with PMA, which were indicative of the presence of both live and dead cells, yielded a great abundance of highly diverse bacterial pyrosequences. In contrast, only 1% to 10% of all of the pyrosequencing reads, arising from a few robust bacterial lineages, originated from sample fractions that had been pre-treated with PMA. The results of PhyloChip analyses of PMA-treated and -untreated sample fractions were in agreement with those of pyrosequencing. The viable bacterial population detected in cleanrooms devoid of spacecraft hardware was far more diverse than that observed in cleanrooms that housed mission-critical spacecraft hardware. The latter was dominated by hardy, robust organisms previously reported to survive in oligotrophic cleanroom environments. Presented here are the findings of the first ever comprehensive effort to assess the viability of cells in low-biomass environmental samples, and correlate differential viability with phylogenetic affiliation.
Project description:Efforts to correlate genetic variations with phenotypic differences are intensifying due to the availability of high-density maps of single nucleotide polymorphisms (SNPs) and the development of high-throughput scoring methods. These recent advances have led to an increased interest in improved multiplex preparations of genetic material to facilitate such whole genome analyses. Here we propose a strategy for the parallel amplification of polymorphic loci based on a reduced set of nucleotides. The technique, denoted Tri-nucleotide Threading (TnT), allows SNPs to be amplified via controlled linear amplification followed by complete removal of the target material and subsequent amplification with a pair of universal primers. A dedicated software tool was developed for this purpose and variable positions in genes associated with different forms of cancer were analyzed using sub-nanogram amounts of starting material. The amplified fragments were then successfully scored using a microarray-based PrASE technique. The results of this study, in which 75 SNPs were analyzed, show that the TnT technique circumvents potential problems associated with multiplex amplification of SNPs from minute amounts of material. The technique is specific, sensitive and can be readily adapted to equipment and genotyping techniques used in other research laboratories without requiring changes to the preferred typing method.