SCCNV: A Software Tool for Identifying Copy Number Variation From Single-Cell Whole-Genome Sequencing.
ABSTRACT: Identification of de novo copy number variations (CNVs) across the genome in single cells requires single-cell whole-genome amplification (WGA) and sequencing. Although many experimental protocols of amplification methods have been developed, all suffer from uneven distribution of read depth across the genome after sequencing of DNA amplicons, which constrains the usage of conventional CNV calling methodologies. Here, we present SCCNV, a software tool for detecting CNVs from whole genome-amplified single cells. SCCNV is a read-depth based approach with adjustment for the WGA bias. We demonstrate its performance by analyzing data obtained with most of the single-cell amplification methods that have been employed for CNV analysis, including DOP-PCR, MDA, MALBAC, and LIANTI. SCCNV is freely available at https://github.com/biosinodx/SCCNV.
Project description: Single-cell resequencing (SCRS) has enabled many biomedical advances in variation detection at the single-cell level, but it currently relies on whole genome amplification (WGA). Three methods are commonly used for WGA: multiple displacement amplification (MDA), degenerate-oligonucleotide-primed PCR (DOP-PCR) and multiple annealing and looping-based amplification cycles (MALBAC). However, a comprehensive comparison of variation detection performance between these WGA methods has not yet been performed. We systematically compared the advantages and disadvantages of the different WGA methods, focusing particularly on variation detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly higher genome recovery sensitivity (~84%) than DOP-PCR (~6%) and MALBAC (~52%) at high sequencing depth. MALBAC and MDA had comparable single-nucleotide variation detection efficiency, false-positive ratio, and allele drop-out ratio. We further demonstrated that SCRS data amplified by either MDA or MALBAC from a gastric cancer cell line could accurately detect gastric cancer CNVs with comparable sensitivity and specificity, including amplifications of 12p11.22 (KRAS) and 9p24.1 (JAK2, CD274, and PDCD1LG2). Our findings provide a comprehensive comparison of variation detection performance using SCRS amplified by different WGA methods. They should guide researchers in determining which WGA method is best suited to individual experimental needs at the single-cell level.
Project description: Single-cell genomic analysis has grown rapidly in recent years and finds widespread applications in various fields of biology, including cancer biology, development, immunology, pre-implantation genetic diagnosis, and neurobiology. To date, the amplification bias, amplification uniformity and reproducibility of the three major single-cell whole genome amplification methods (GenomePlex WGA4, MDA and MALBAC) have not been systematically investigated using mammalian cells. In this study, we amplified genomic DNA from individual hippocampal neurons using these three single-cell DNA amplification methods and sequenced the products at shallow depth. We then systematically evaluated the GC bias, reproducibility, and copy number variations among individual neurons. Our results showed that single-cell genome sequencing results obtained with the MALBAC and WGA4 methods are highly reproducible and have a high success rate. MALBAC displays a significant bias towards high GC content. To correct this GC bias, we developed a bioinformatics pipeline that allows us to call CNVs in single-cell sequencing data; with it, chromosome-level and sub-chromosomal-level CNVs among individual neurons can be detected. We also propose a metric to determine the CNV detection limits. Overall, MALBAC and WGA4 perform better than MDA in detecting CNVs.
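The abstract above mentions correcting GC bias before calling CNVs but does not describe how. One common and simple approach, shown here only as a hedged sketch and not as the authors' pipeline, is to group genomic bins by GC content and rescale each bin's read count by the median count of its GC stratum. The function name `gc_correct`, the `n_strata` parameter and the fixed-width strata are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def gc_correct(counts, gc_fractions, n_strata=10):
    """Rescale per-bin read counts so GC strata share one median depth.

    counts: reads per genomic bin.
    gc_fractions: GC content of each bin, between 0 and 1.
    n_strata: number of equal-width GC strata (illustrative choice).
    """
    if len(counts) != len(gc_fractions):
        raise ValueError("inputs must have equal length")

    def stratum(gc):
        # Clamp so that a GC fraction of exactly 1.0 falls in the last stratum.
        return min(int(gc * n_strata), n_strata - 1)

    by_stratum = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        by_stratum[stratum(gc)].append(count)

    global_median = median(counts)
    stratum_median = {k: median(v) for k, v in by_stratum.items()}

    corrected = []
    for count, gc in zip(counts, gc_fractions):
        m = stratum_median[stratum(gc)]
        # Bins in a stratum with zero median depth cannot be rescaled.
        corrected.append(count * global_median / m if m > 0 else None)
    return corrected


if __name__ == "__main__":
    counts = [80, 80, 100, 100, 120, 120]
    gc = [0.35, 0.35, 0.45, 0.45, 0.55, 0.55]
    print(gc_correct(counts, gc))
```

In the toy input, depth rises with GC content, as reported for MALBAC; after correction all bins sit at the same depth, so any remaining deviation in real data would be a candidate copy number signal rather than a GC effect.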
Project description: Background: Whole genome amplification (WGA) is currently a prerequisite for single-cell whole genome or exome sequencing. Depending on the method used, the rate of artifact formation, allelic dropout and sequence coverage over the genome may differ significantly. Results: The largest difference between the evaluated protocols was observed when analyzing the target coverage and read depth distribution. These differences also had an impact on the downstream variant calling. In conclusion, the products from the AMPLI1 and MALBAC kits were shown to be most similar to the bulk samples and are therefore recommended for WGA of single cells. Discussion: In this study, four commercial kits for WGA (AMPLI1, MALBAC, Repli-G and PicoPlex) were used to amplify human single cells. The WGA products were exome sequenced together with non-amplified bulk samples from the same source. The resulting data were evaluated in terms of genomic coverage, allelic dropout and SNP calling.
Project description: Aim: To select an optimal whole-genome amplification (WGA) method to improve the efficiency of the preimplantation genetic diagnosis and screening (PGD/PGS) of beta-thalassaemia disorders. Methods: Fifty-seven fibroblast samples with defined beta-thalassaemia variations and forty-eight single-blastomere samples were amplified from single-, two-, and five-cell samples by multiple annealing and looping-based amplification cycles (MALBAC) and the multiple displacement amplification (MDA) method. Low-depth, high-throughput sequencing was performed to evaluate and compare the chromosomal copy number variation (CNV) detection rate and the allele dropout (ADO) rate between these two methods. Results: At the single-cell level, the success rates of CNV detection in the fibroblast samples were 100% in the MALBAC group and 91.67% in the MDA group; the coefficient of variation in CNV detection in the MALBAC group was significantly lower than that in the MDA group (0.15 vs 0.37). The total ADO rate in HBB allele detection was 4.55% in the MALBAC group, which was significantly lower than the 22.5% rate observed in the MDA group. However, when five or more cells were used as the starting template, the ADO rate significantly decreased, and the two methods did not differ significantly. Conclusions: For the genetic diagnosis of HBB gene variation at the single-cell level, MALBAC is a more suitable method due to its higher level of uniformity and specificity. When five or more cells are used as the starting template, both methods exhibit similar efficiency, increased accuracy, and a similar success rate in PGD/PGS.
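The two summary statistics reported in the abstract above, the coefficient of variation (0.15 vs 0.37) and the ADO rate, can be computed as in the following sketch. The abstract does not state the exact definitions used, so this assumes the usual ones: standard deviation divided by mean over per-bin read counts, and the fraction of known heterozygous sites at which one allele was not detected. The function names and the input numbers are illustrative and are not taken from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(bin_counts):
    """Population standard deviation of per-bin read counts over their mean."""
    m = mean(bin_counts)
    if m == 0:
        raise ValueError("mean depth is zero; CV is undefined")
    return pstdev(bin_counts) / m


def allele_dropout_rate(n_dropout_sites, n_heterozygous_sites):
    """Fraction of known heterozygous sites where one allele was missed."""
    if n_heterozygous_sites <= 0:
        raise ValueError("need at least one heterozygous site")
    return n_dropout_sites / n_heterozygous_sites


if __name__ == "__main__":
    # Made-up inputs: a perfectly even profile and a noisier one.
    print(coefficient_of_variation([10, 10, 10, 10]))
    print(coefficient_of_variation([8, 12]))
    print(allele_dropout_rate(1, 20))
```

A lower coefficient of variation indicates more uniform amplification across bins, which is why the smaller value is the better one in the comparison above.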
Project description: With the development and clinical application of genomics, increasing attention is being paid to single-cell sequencing. In single-cell sequencing, whole genome amplification (WGA) is a key step to enrich sample DNA. Previous studies have compared the performance of different WGA strategies on Illumina sequencing platforms, but no comparable study has addressed the Ion Proton platform, which is also a popular next-generation sequencing platform. Here, by amplifying cells from six cell lines with different karyotypes, we assessed the data features of four common commercial WGA kits (PicoPLEX WGA Kit, GenomePlex Single Cell Whole Genome Amplification Kit, MALBAC Single Cell Whole Genome Amplification Kit, and REPLI-g Single Cell Kit), including median absolute pairwise difference, uniformity, reproducibility, and fidelity, and examined their performance in copy number variation detection. The results showed that both MALBAC and PicoPLEX yielded high-quality data with high reproducibility and fidelity; in terms of uniformity, PicoPLEX was slightly superior to MALBAC.
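The abstract above names median absolute pairwise difference (MAPD) as one of its quality metrics without defining it. A widely used definition, assumed here, is the median of the absolute differences between log2 copy ratios of adjacent genomic bins; the sketch below follows that definition, and the function name `mapd` is chosen for this example only.

```python
import math
from statistics import median


def mapd(copy_ratios):
    """Median absolute difference of log2 ratios between adjacent bins.

    copy_ratios: normalized read-depth ratios per bin, ordered by
        genomic position; all values must be positive.
    """
    if len(copy_ratios) < 2:
        raise ValueError("need at least two bins")
    if any(r <= 0 for r in copy_ratios):
        raise ValueError("copy ratios must be positive")
    logs = [math.log2(r) for r in copy_ratios]
    # Neighbouring bins usually share a copy number, so their difference
    # mostly reflects amplification and sequencing noise.
    diffs = [abs(b - a) for a, b in zip(logs, logs[1:])]
    return median(diffs)


if __name__ == "__main__":
    print(mapd([1.0, 1.0, 1.0]))
    print(mapd([1.0, 2.0, 4.0, 8.0]))
```

Because the metric uses only differences between neighbours, a true copy number change affects just the bins at its boundary, so lower values mainly indicate less noisy amplification.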
Project description: The genomes of large numbers of single cells must be sequenced to further understanding of the biological significance of genomic heterogeneity in complex systems. Whole genome amplification (WGA) of single cells is generally the first step in such studies, but is prone to nonuniformity that can compromise genomic measurement accuracy. Despite recent advances, robust performance in high-throughput single-cell WGA remains elusive. Here, we introduce droplet multiple displacement amplification (MDA), a method that uses commercially available liquid dispensing to perform high-throughput single-cell MDA in nanoliter volumes. The performance of droplet MDA is characterized using a large dataset of 129 normal diploid cells, and is shown to exceed previously reported single-cell WGA methods in amplification uniformity, genome coverage, and/or robustness. We achieve up to 80% coverage of a single-cell genome at 5× sequencing depth, and demonstrate excellent single-nucleotide variant (SNV) detection using targeted sequencing of droplet MDA product to achieve a median allelic dropout of 15%, and using whole genome sequencing to achieve false and true positive rates of 9.66 × 10^-6 and 68.8%, respectively, in a G1-phase cell. We further show that droplet MDA allows for the detection of copy number variants (CNVs) as small as 30 kb in single cells of an ovarian cancer cell line and as small as 9 Mb in two high-grade serous ovarian cancer samples using only 0.02× depth. Droplet MDA provides an accessible and scalable method for performing robust and accurate CNV and SNV measurements on large numbers of single cells.
Project description: Whole genome amplification (WGA) has become an invaluable tool for copy number variation (CNV) detection in single cells or a limited number of cells. Unfortunately, current WGA methods introduce representation bias that limits the detection of small CNVs. New WGA methods have been introduced that might reduce this bias. We compared the performance of PicoPLEX DNA-Seq (Picoseq), DOPlify, REPLI-g and Ampli-1 WGA for aneuploidy screening and copy number analysis using shallow whole genome massively parallel sequencing (MPS), starting from single cells or a limited number of cells. Although the four WGA methods perform differently, they are all suited for this application.
Project description: Whole-genome amplification (WGA) techniques are used for non-specific amplification of low-copy number DNA, and especially for single-cell genome and transcriptome amplification. A number of WGA methods have been developed over the years. One example is degenerate oligonucleotide-primed PCR (DOP-PCR), which is a very simple, fast and inexpensive WGA technique. Although DOP-PCR has been regarded as one of the pioneering methods for WGA, it provides only low genome coverage and has a high allele dropout rate compared to more modern techniques. Here we describe an improved DOP-PCR (iDOP-PCR). We have modified the classic DOP-PCR by using a new thermostable DNA polymerase (SD polymerase) with strong strand-displacement activity and by adjusting the primer design. We compared iDOP-PCR, classic DOP-PCR and the well-established PicoPlex technique for whole genome amplification of both high- and low-copy number human genomic DNA. The amplified DNA libraries were evaluated by analysis of short tandem repeat genotypes and NGS data. In summary, iDOP-PCR provided better quality amplified DNA libraries than the other WGA methods tested, especially when low amounts of genomic DNA were used as input material.
Project description: Kindred cells can have different genomes because of dynamic changes in DNA. Single-cell sequencing is needed to characterize these genomic differences but has been hindered by whole-genome amplification bias, resulting in low genome coverage. Here, we report on a new amplification method, multiple annealing and looping-based amplification cycles (MALBAC), that offers high uniformity across the genome. Sequencing MALBAC-amplified DNA achieves 93% genome coverage ≥1× for a single human cell at 25× mean sequencing depth. We detected digitized copy-number variations (CNVs) of a single cancer cell. By sequencing three kindred cells, we were able to identify individual single-nucleotide variations (SNVs), with no false positives detected. We directly measured the genome-wide mutation rate of a cancer cell line and found that purine-pyrimidine exchanges occurred unusually frequently among the newly acquired SNVs.
Project description: Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.