Massively parallel sequencing of micro-manipulated cells targeting a comprehensive panel of disease-causing genes: A comparative evaluation of upstream whole-genome amplification methods.
ABSTRACT: Single Gene Disorders (SGD) are still routinely diagnosed using PCR-based assays that need to be developed and validated for each individual disease-specific gene fragment. The TruSight One sequencing panel currently covers 12 Mb of genomic content, including 4813 genes associated with a clinical phenotype. When only a limited number of cells are available, whole genome amplification (WGA) is required prior to DNA target capture techniques such as the TruSight One panel. In this study, we compared 4 different WGA methods in combination with the TruSight One sequencing panel to perform single nucleotide polymorphism (SNP) genotyping starting from 3 micro-manipulated cells. This setup simulates clinical settings such as day-5 blastocyst biopsy for Preimplantation Genetic Testing (PGT), liquid biopsy of circulating tumor cells (CTCs) and cancer-cell profiling. Bulk cell samples were processed alongside these WGA samples to serve as a performance reference. Target coverage, coverage uniformity and SNP calling accuracy obtained with any of the WGA methods are inferior to the results obtained on bulk cell samples. However, results after REPLI-g come close. Compared to the other WGA methods, REPLI-g WGA results in better coverage of the targeted genomic regions with a more uniform read depth. Consequently, this method also yields more accurate SNP calling and could be considered for clinical genotyping of a limited number of cells.
Project description:<h4>Background</h4>Whole genome amplification (WGA) is currently a prerequisite for single cell whole genome or exome sequencing. Depending on the method used, the rate of artifact formation, allelic dropout and sequence coverage over the genome may differ significantly.<h4>Results</h4>The largest difference between the evaluated protocols was observed when analyzing the target coverage and read depth distribution. These differences also had an impact on the downstream variant calling. In conclusion, the products from the AMPLI1 and MALBAC kits were shown to be most similar to the bulk samples and are therefore recommended for WGA of single cells.<h4>Discussion</h4>In this study four commercial kits for WGA (AMPLI1, MALBAC, Repli-G and PicoPlex) were used to amplify human single cells. The WGA products were exome sequenced together with non-amplified bulk samples from the same source. The resulting data were evaluated in terms of genomic coverage, allelic dropout and SNP calling.
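Allelic dropout, one of the metrics named above, is often estimated by checking how many sites that are heterozygous in the bulk reference appear homozygous after WGA. The following is a minimal Python sketch of that idea only; the abstract does not state the exact definition used in the study, and the function name, the genotype representation (allele tuples keyed by site) and the toy calls are all hypothetical.

```python
def allelic_dropout_rate(bulk_calls, wga_calls):
    """Estimate allelic dropout (ADO) of a WGA sample against a bulk reference.

    Both arguments map a site label to a genotype given as a 2-tuple of alleles.
    Only sites that are heterozygous in the bulk sample AND called in the WGA
    sample are considered (an assumption; no-calls are simply skipped here).
    Returns None when no informative site exists.
    """
    het_sites = [
        site for site, gt in bulk_calls.items()
        if gt[0] != gt[1] and site in wga_calls
    ]
    if not het_sites:
        return None
    # A heterozygous bulk site that is homozygous after WGA counts as a dropout.
    dropped = sum(1 for site in het_sites if wga_calls[site][0] == wga_calls[site][1])
    return dropped / len(het_sites)


# Hypothetical toy genotypes, not data from the study.
bulk_calls = {
    "chr1:100": ("A", "G"),
    "chr1:200": ("C", "C"),
    "chr2:50": ("T", "C"),
    "chr3:10": ("G", "A"),
    "chr4:5": ("A", "T"),   # not called in the WGA sample, so it is skipped
}
wga_calls = {
    "chr1:100": ("A", "A"),  # one allele lost
    "chr1:200": ("C", "C"),
    "chr2:50": ("T", "C"),
    "chr3:10": ("G", "A"),
}

rate = allelic_dropout_rate(bulk_calls, wga_calls)
print(f"ADO rate: {rate:.3f}")
```

Whether no-call sites should be skipped or counted as dropouts is a design choice that changes the estimate, so it should be stated alongside any reported rate.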
Project description:Whole genome amplification (WGA) is required for single cell genotyping. The effectiveness of currently available WGA technologies in combination with next generation sequencing (NGS) and material preservation is still elusive. With respect to the accuracy of SNP/mutation, indel, and copy number aberration (CNA) calling, the HiSeq2000 platform outperformed IonProton in all aspects. Furthermore, SNP/mutation and indel calling was more accurate in single tumor cells obtained from EDTA-collected blood than in those from CellSave-preserved blood, whereas CNA analysis in our study was not detectably affected by fixation. Although MDA-based WGA yielded the highest DNA amount, DNA quality was not adequate for downstream analysis. PCR-based WGA was superior to the combined MDA-PCR technique for SNP and indel analysis in single cells. However, the SNP calling performance of MDA-PCR WGA improves with increasing amounts of input DNA, whereas CNA analysis does not. The performance of PCR-based WGA did not significantly improve with increasing input material. CNA profiles of single cells, amplified with the MDA-PCR technique and sequenced on both HiSeq2000 and IonProton platforms, resembled unamplified DNA the most. We analyzed the performance of PCR-based, multiple-displacement amplification (MDA)-based, and combined MDA-PCR WGA techniques (WGA kits Ampli1, REPLI-g, and PicoPlex, respectively) on single and pooled tumor cells obtained from EDTA- and CellSave-preserved blood and archival material. Amplified DNA underwent exome-Seq with the Illumina HiSeq2000 and ThermoFisher IonProton platforms. We demonstrate the feasibility of single cell genotyping of differently preserved material; nevertheless, WGA and NGS approaches have to be chosen carefully depending on the study aims.
Project description:Genomic characterization of circulating tumor cells (CTCs) may prove useful as a surrogate for conventional tissue biopsies. This is particularly important as studies have shown different mutational profiles between CTCs and ctDNA in some tumor subtypes. However, isolating rare CTCs from whole blood has significant hurdles. Very limited DNA quantities often cannot meet NGS requirements without whole genome amplification (WGA). Moreover, germline contamination from white blood cells (WBCs) may confound CTC somatic mutation analyses. Thus, a good CTC enrichment platform with an efficient WGA and NGS workflow is needed. Here, the Vortex label-free CTC enrichment platform was used to capture CTCs. DNA extraction was optimized, WGA evaluated and targeted NGS tested. We used metastatic colorectal cancer (CRC) as the clinical target, HCT116 as the corresponding cell line, GenomePlex® and REPLI-g as the WGA methods, and the GeneRead DNAseq Human CRC Panel as the 38-gene panel. The workflow was further validated on metastatic CRC patient samples, assaying both tumor and CTCs. WBCs from the same patients were included to eliminate germline contamination. The described workflow performed well on samples with sufficient DNA, but showed bias for rare cells with limited DNA input. REPLI-g provided an unbiased amplification on fresh rare cells, enabling accurate variant calling using the targeted NGS. Somatic variants were detected in patient CTCs and not found in age-matched healthy donors. This demonstrates the feasibility of a simple workflow for clinically relevant monitoring of tumor genetics in real time and over the course of a patient's therapy using CTCs.
Project description:Circulating tumor cells (CTCs) have a great potential for noninvasive diagnosis and real-time monitoring of cancer. A comprehensive evaluation of four whole genome amplification (WGA)/next-generation sequencing workflows for genomic analysis of single CTCs, including PCR-based (GenomePlex and Ampli1), multiple displacement amplification (Repli-g), and hybrid PCR- and multiple displacement amplification-based [multiple annealing and loop-based amplification cycling (MALBAC)] is reported herein. To demonstrate clinical utilities, copy number variations (CNVs) in single CTCs isolated from four patients with squamous non-small-cell lung cancer were profiled. Results indicate that MALBAC and Repli-g WGA have significantly broader genomic coverage compared with GenomePlex and Ampli1. Furthermore, MALBAC coupled with low-pass whole genome sequencing has better coverage breadth, uniformity, and reproducibility and is superior to Repli-g for genome-wide CNV profiling and detecting focal oncogenic amplifications. For mutation analysis, none of the WGA methods were found to achieve sufficient sensitivity and specificity by whole exome sequencing. Finally, profiling of single CTCs from patients with non-small-cell lung cancer revealed potentially clinically relevant CNVs. In conclusion, MALBAC WGA coupled with low-pass whole genome sequencing is a robust workflow for genome-wide CNV profiling at single-cell level and has great potential to be applied in clinical investigations. Nevertheless, data suggest that none of the evaluated single-cell sequencing workflows can reach sufficient sensitivity or specificity for mutation detection required for clinical applications.
Project description:Combining single-cell methods and next-generation sequencing should provide a powerful means to understand single-cell biology and obviate the effects of sample heterogeneity. Here we report a single-cell identification method and seamless cancer gene profiling using semiconductor-based massively parallel sequencing. A549 cells (adenocarcinomic human alveolar basal epithelial cell line) were used as a model. Single-cell capture was performed using laser capture microdissection (LCM) with an Arcturus® XT system, and a captured single cell and a bulk population of A549 cells (≈ 10^6 cells) were subjected to whole genome amplification (WGA). For cell identification, a multiplex PCR method (AmpliSeq™ SNP HID panel) was used to enrich 136 highly discriminatory SNPs with a genotype concordance probability of 10^(31-35). For cancer gene profiling, we used mutation profiling that was performed in parallel using a hotspot panel for 50 cancer-related genes. Sequencing was performed using a semiconductor-based bench top sequencer. The distribution of sequence reads for both HID and Cancer panel amplicons was consistent across these samples. For the bulk population of cells, the percentages of sequence covered at coverage of more than 100× were 99.04% for the HID panel and 98.83% for the Cancer panel, while for the single cell the percentages of sequence covered at coverage of more than 100× were 55.93% for the HID panel and 65.96% for the Cancer panel. Partial amplification failure or randomly distributed non-amplified regions across samples from single cells during the WGA procedures or random allele drop out probably caused these differences. However, comparative analyses showed that this method successfully discriminated a single A549 cancer cell from a bulk population of A549 cells. Thus, our approach provides a powerful means to overcome tumor sample heterogeneity when searching for somatic mutations.
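The percentages quoted above (for example 99.04% versus 55.93% of HID panel sequence at more than 100×) are breadth-of-coverage values at a depth threshold. A minimal Python sketch of that calculation follows, assuming per-base read depths over the panel are already available as a list of integers; the function name and the toy depth values are hypothetical and are not taken from the study.

```python
def breadth_at_depth(depths, min_depth=100):
    """Return the percentage of positions with read depth strictly above min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d > min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-base depths for eight target positions (illustration only).
bulk_depths = [250, 310, 180, 420, 150, 205, 330, 99]
single_cell_depths = [250, 0, 0, 420, 12, 205, 0, 99]

print(breadth_at_depth(bulk_depths))         # bulk sample
print(breadth_at_depth(single_cell_depths))  # single cell after WGA
```

In the toy data the zero-depth positions in the single-cell list stand in for regions that failed to amplify, which is the kind of loss the abstract attributes to WGA failure or allele dropout.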
Project description:Methods of comprehensive microarray-based analyses of single cell DNA are rapidly emerging. Whole genome amplification (WGA) remains a critical component for these methods to be successful. A number of commercially available WGA kits have been independently utilized in previous single cell microarray studies. However, direct comparison of their performance on single cells has not been conducted. The present study demonstrates that among previously published methods, a single cell GenomePlex WGA protocol provides the best combination of speed and accuracy for SNP microarray-based copy number analysis when compared to a REPLI-g or GenomiPhi based protocol. Alternatively, for applications that do not have constraints on turn-around time and that are directed at accurate genotyping rather than copy number assignments, a REPLI-g based protocol may provide the best solution. Affymetrix SNP arrays were processed according to the manufacturer's directions on DNA extracted from human fibroblast cell lines and single fibroblast cells. Affymetrix SNP array analysis was successfully completed on 46 lymphocyte single cell samples, 8 gDNA extracted from cell lines, 11 reference gDNA extracted from cell lines and 3 reference gDNA samples from the RMA of New Jersey DNA bank. GSM617116 to GSM617129: CEL files were processed using GTYPE version 4 (Affymetrix Inc., Genotyping Console 4.0 Manual) using the DM algorithm for genotype calls. Copy number and loss of heterozygosity were calculated from CHP files using CNAT version 4.1 (Affymetrix Inc., Genotyping Console 4.0 Manual) analysis against a reference set consisting of three normal females from the in-house gDNA bank, 11 normal females from Coriell cell lines and 16 normal females from the HapMap database (www.hapmap.org). The 16 normal females are NA10855, NA10863, NA11832, NA12057, NA12234, NA12717, NA12813, NA18505, NA18508, NA18517, NA19137, NA19152, NE00088, NE00091, NE00403, and NE01119.
Project description:The custom-designed single nucleotide polymorphism (SNP) panel amplified 231 autosomal SNPs in one PCR reaction, which were subsequently sequenced with massively parallel sequencing (MPS) technology on the Ion Torrent personal genome machine (PGM). SNPs were chosen from SNPforID, IISNP, HapMap, dbSNP, and related published literature. Full concordance was obtained between available MPS calling and Sanger sequencing with 9947A and 9948 controls. Ten SNPs (rs4606077, rs334355, rs430046, rs2920816, rs4530059, rs1478829, rs1498553, rs7141285, rs12714757 and rs2189011) with low coverage or heterozygote imbalance should be optimized or excluded from the panel. Sequence data had sufficiently high coverage and gave reliable SNP calling for the remaining 221 loci with the custom MPS-SNP panel. A default DNA input amount of 10 ng per reaction was recommended by Ampliseq technology, but sensitivity testing revealed positive results from as little as 1 ng input DNA. Mixture testing with this panel is possible through analysis of the FMAR (frequency of major allele reads) values at most loci with sufficiently high coverage depth and a low level of sequencing noise. These results indicate the potential advantage of the custom MPS-SNP assays and Ion Torrent PGM platform for forensic study.
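The FMAR value mentioned above can be read as the share of reads at a locus that carry the most frequently observed allele. A minimal Python sketch under that reading follows; the function name and read counts are hypothetical, and the abstract gives no interpretation thresholds, so none are assumed here.

```python
def fmar(allele_counts):
    """Frequency of major allele reads: major-allele reads divided by all reads at the locus."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("locus has no reads")
    return max(allele_counts.values()) / total


# Hypothetical read counts per allele at three loci (illustration only).
loci = {
    "balanced_het": {"A": 480, "G": 520},
    "homozygous": {"C": 995, "T": 5},
    "intermediate": {"A": 800, "G": 200},
}

for name, counts in loci.items():
    print(name, round(fmar(counts), 3))
```

In a single-source sample one would expect values near 0.5 at heterozygous loci and near 1.0 at homozygous loci, so loci with intermediate values are the ones that hint at a mixture, provided coverage is deep enough and noise is low.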
Project description:Sequencing key cancer-driver genes using formalin-fixed, paraffin-embedded (FFPE) cancer tissues is becoming the standard for identifying the best treatment regimen. However, about 25% of all samples are rejected for genetic analyses for reasons that include too little tissue to extract enough high-quality DNA. One way to overcome this is to perform whole-genome amplification (WGA) on clinical samples, but only a limited number of studies have tested different WGA methods in FFPE cancer specimens using targeted next-generation sequencing (NGS). We therefore tested the two most commonly used WGA methods, multiple displacement amplification (MDA; Qiagen REPLI-g kit) and the hybrid or modified PCR-based method (Sigma/Rubicon Genomics Inc. GenomePlex kit), in FFPE normal and tumor tissue specimens. For the normalized copy number analysis, the FFPE process caused no or only minimal bias. Variations in copy number were minimal in samples amplified using the GenomePlex kit, but they were statistically significantly higher in samples amplified using the REPLI-g kit. The pattern was similar for variant allele frequencies across the samples, whose variation was minimal for the GenomePlex kit but high for the REPLI-g kit. These findings suggest that each WGA method should be tested thoroughly before using it for clinical cancer samples.
Project description:Methods of comprehensive microarray-based aneuploidy screening in single cells are rapidly emerging. Whole-genome amplification (WGA) remains a critical component for these methods to be successful. A number of commercially available WGA kits have been independently utilized in previous single-cell microarray studies. However, direct comparison of their performance on single cells has not been conducted. The present study demonstrates that among previously published methods, a single-cell GenomePlex WGA protocol provides the best combination of speed and accuracy for single nucleotide polymorphism microarray-based copy number (CN) analysis when compared with a REPLI-g- or GenomiPhi-based protocol. Alternatively, for applications that do not have constraints on turnaround time and that are directed at accurate genotyping rather than CN assignments, a REPLI-g-based protocol may provide the best solution.
Project description:Amplification of minute quantities of DNA is a fundamental challenge in low-biomass metagenomic and microbiome studies because of potential biases in coverage, guanine-cytosine (GC) content, and altered species abundances. Whole genome amplification (WGA), although widely used, is notorious for introducing artifact sequences, either by amplifying laboratory contaminants or by nonrandom amplification of a sample's DNA. In this study, we investigate the effect of REPLI-g multiple displacement amplification (MDA; Qiagen, Valencia, CA, USA) on sequencing data quality and species abundance detection in 8 paired metagenomic samples and 1 titrated, mixed control sample. We extracted and sequenced genomic DNA (gDNA) from 8 environmental samples and compared the quality of the sequencing data for the MDA and their corresponding non-MDA samples. The degree of REPLI-g MDA bias was evaluated by sequence metrics, species composition, and cross-validating observed species abundance and species diversity estimates using the One Codex and MetaPhlAn taxonomic classification tools. Here, we provide evidence of the overall efficacy of REPLI-g MDA on retaining sequencing data quality and species abundance measurements while providing increased yields of high-fidelity DNA. We find that species abundance estimates are largely consistent across samples, even with REPLI-g amplification, as demonstrated by the Spearman's rank order coefficient (R2 > 0.8). However, REPLI-g MDA often produced fewer classified reads at the species, genera, and family level, resulting in decreased species diversity. We also observed some areas with the PCR "jackpot effect," with varying input DNA values for the Metagenomics Research Group (MGRG) controls at specific genomic loci. We visualize this effect in whole genome coverage plots and with sequence composition analyses and note these caveats of the MDA method. 
Despite overall concordance of species abundance between the amplified and unamplified samples, these results demonstrate that amplification of DNA using the REPLI-g method has some limitations. These concerns could be addressed by future improvements in the enzymes or methods for REPLI-g to be considered a >99% robust method for increasing the amount of high-fidelity DNA from low-biomass samples or at the very least, accounted for during computational analysis of MDA samples.
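The concordance of species abundance between amplified and unamplified samples is summarized above with Spearman's rank-order coefficient. A minimal, dependency-free Python sketch of that statistic follows (Pearson correlation of average ranks, which handles ties); the function names and abundance values are hypothetical and the code does not reproduce the study's analysis.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical relative abundances of five species (illustration only).
unamplified = [0.40, 0.25, 0.20, 0.10, 0.05]
amplified = [0.30, 0.35, 0.15, 0.12, 0.08]  # the two most abundant species swap rank

print(round(spearman_rho(unamplified, amplified), 3))
```

Because the statistic depends only on rank order, it can stay high even when amplification shifts the absolute abundance values, which is worth keeping in mind when reading it next to diversity estimates.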