High-Throughput Single-Cell Sequencing with Linear Amplification.
ABSTRACT: Conventional methods for single-cell genome sequencing are limited with respect to uniformity and throughput. Here, we describe sci-L3, a single-cell sequencing method that combines combinatorial indexing (sci-) and linear (L) amplification. The sci-L3 method adopts a 3-level (3) indexing scheme that minimizes amplification biases while enabling exponential gains in throughput. We demonstrate the generalizability of sci-L3 with proof-of-concept demonstrations of single-cell whole-genome sequencing (sci-L3-WGS), targeted sequencing (sci-L3-target-seq), and a co-assay of the genome and transcriptome (sci-L3-RNA/DNA). We apply sci-L3-WGS to profile the genomes of >10,000 sperm and sperm precursors from F1 hybrid mice, mapping 86,786 crossovers and characterizing rare chromosome mis-segregation events in meiosis, including instances of whole-genome equational chromosome segregation. We anticipate that sci-L3 assays can be applied to fully characterize recombination landscapes, to couple CRISPR perturbations and measurements of genome stability, and to other goals requiring high-throughput, high-coverage single-cell sequencing.
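The throughput claim for multi-level combinatorial indexing can be illustrated with a little arithmetic. The sketch below is not taken from the sci-L3 protocol: the per-round barcode counts (96 per round) and the uniform random assignment model are assumptions chosen for illustration, and the function names are hypothetical. It shows that the number of distinct barcode combinations grows as the product of the per-round barcode counts, so adding a third round sharply lowers the expected fraction of cells that share a combination with another cell.

```python
from math import prod


def barcode_space(barcodes_per_round):
    """Number of distinct barcode combinations across all indexing rounds."""
    return prod(barcodes_per_round)


def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells sharing a barcode combination with at
    least one other cell, assuming each cell receives a combination
    uniformly at random (an idealized model, not a measured rate)."""
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


# Hypothetical plate layouts: 96 barcodes per round of indexing.
for rounds in ([96, 96], [96, 96, 96]):
    space = barcode_space(rounds)
    frac = expected_collision_fraction(10_000, space)
    print(f"{len(rounds)} rounds: {space} combinations, "
          f"collision fraction for 10,000 cells = {frac:.4f}")
```

Under these assumptions, each additional round multiplies the barcode space by the number of barcodes in that round, which is the sense in which throughput scales exponentially with the number of indexing levels.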
Project description:Single-cell combinatorial indexing (sci) with transposase-based library construction increases the throughput of single-cell genomics assays but produces sparse coverage in terms of usable reads per cell. We develop symmetrical strand sci ('s3'), a uracil-based adapter switching approach that improves the rate of conversion of source DNA into viable sequencing library fragments following tagmentation. We apply this chemistry to assay chromatin accessibility (s3-assay for transposase-accessible chromatin, s3-ATAC) in human cortical and mouse whole-brain tissues, with mouse datasets demonstrating a six- to 13-fold improvement in usable reads per cell compared with other available methods. Application of s3 to single-cell whole-genome sequencing (s3-WGS) and to whole-genome plus chromatin conformation (s3-GCC) yields 148- and 14.8-fold improvements, respectively, in usable reads per cell compared with sci-DNA-sequencing and sci-HiC. We show that s3-WGS and s3-GCC resolve subclonal genomic alterations in patient-derived pancreatic cancer cell lines. We expect that the s3 platform will be compatible with other transposase-based techniques, including sci-MET or CUT&Tag.
Project description:We present a highly scalable assay for whole-genome methylation profiling of single cells. We use our approach, single-cell combinatorial indexing for methylation analysis (sci-MET), to produce 3,282 single-cell bisulfite sequencing libraries and achieve read alignment rates of 68 ± 8%. We apply sci-MET to discriminate the cellular identity of a mixture of three human cell lines and to identify excitatory and inhibitory neuronal populations from mouse cortical tissue.
Project description:To resolve cellular heterogeneity, we developed a combinatorial indexing strategy to profile the transcriptomes of single cells or nuclei, termed sci-RNA-seq (single-cell combinatorial indexing RNA sequencing). We applied sci-RNA-seq to profile nearly 50,000 cells from the nematode Caenorhabditis elegans at the L2 larval stage, which provided >50-fold "shotgun" cellular coverage of its somatic cell composition. From these data, we defined consensus expression profiles for 27 cell types and recovered rare neuronal cell types corresponding to as few as one or two cells in the L2 worm. We integrated these profiles with whole-animal chromatin immunoprecipitation sequencing data to deconvolve the cell type-specific effects of transcription factors. The data generated by sci-RNA-seq constitute a powerful resource for nematode biology and foreshadow similar atlases for other organisms.
Project description:The highly dynamic nature of chromosome conformation and three-dimensional (3D) genome organization leads to cell-to-cell variability in chromatin interactions within a cell population, even if the cells of the population appear to be functionally homogeneous. Hence, although Hi-C is a powerful tool for mapping 3D genome organization, this heterogeneity of higher-order chromosome structure among individual cells limits the interpretive power of population-based bulk Hi-C assays. Moreover, single-cell studies have the potential to enable the identification and characterization of rare cell populations or cell subtypes in a heterogeneous population. However, achieving statistically meaningful observations in single-cell studies may require surveying relatively large numbers of single cells. By applying combinatorial cellular indexing to chromosome conformation capture, we developed single-cell combinatorial indexed Hi-C (sci-Hi-C), a high-throughput method that enables mapping chromatin interactomes in large numbers of single cells. We demonstrated the use of sci-Hi-C data to separate cells by karyotypic and cell-cycle state differences and to identify cellular variability in mammalian chromosomal conformation. Here, we provide a detailed description of method design and step-by-step working protocols for sci-Hi-C.
Project description:Background: Massively parallel sequencing, coupled with sample multiplexing, has made genetic tests broadly affordable. However, intractable index mis-assignments (commonly exceeding 1%) have been repeatedly reported on some widely used sequencing platforms. Results: Here, we investigated this quality issue on BGI sequencers using three library preparation methods: whole-genome sequencing (WGS) with PCR, PCR-free WGS, and two-step targeted PCR. BGI's sequencers utilize a unique DNA nanoball (DNB) technology that uses rolling circle replication for DNA nanoball preparation; this linear amplification is PCR-free and can avoid error accumulation. We demonstrated that single index mis-assignment from free indexed oligos occurs at a rate of one in 36 million reads, suggesting virtually no index hopping during DNB creation and arraying. Furthermore, the DNB-based NGS libraries achieved an unprecedentedly low sample-to-sample mis-assignment rate of 0.0001 to 0.0004% under recommended procedures. Conclusions: Single indexing with DNB technology provides a simple but effective method for sensitive genetic assays with large sample numbers.
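To put the reported rates on a common scale, the following sketch converts each one into an expected number of mis-assigned reads. The run size of one billion reads is a hypothetical figure chosen for illustration, and the helper names are not from the study; only the rates themselves (1%, 0.0001 to 0.0004%, and one in 36 million) come from the abstract.

```python
def percent_to_fraction(percent):
    """Convert a percentage (e.g. 0.0004 for 0.0004%) to a fraction."""
    return percent / 100.0


def expected_misassigned(total_reads, rate):
    """Expected number of mis-assigned reads at a given per-read rate."""
    return total_reads * rate


# Hypothetical run size, chosen only to make the comparison concrete.
TOTAL_READS = 1_000_000_000

rates = {
    "commonly reported on other platforms (1%)": percent_to_fraction(1.0),
    "DNB sample-to-sample, upper bound (0.0004%)": percent_to_fraction(0.0004),
    "DNB sample-to-sample, lower bound (0.0001%)": percent_to_fraction(0.0001),
    "free indexed oligos (1 in 36 million)": 1 / 36_000_000,
}

for label, rate in rates.items():
    reads = expected_misassigned(TOTAL_READS, rate)
    print(f"{label}: about {reads:,.0f} reads per {TOTAL_READS:,}")
```

Expressed this way, the gap between a 1% rate and the reported DNB rates spans several orders of magnitude, which matters most for assays that call rare variants from pooled samples.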
Project description:Fowl cholera, caused by Pasteurella multocida, continues to be a challenge in meat-chicken-breeder operations and has emerged as a problem for free-range meat chickens. Here, using whole-genome sequencing (WGS) and phylogenomic analysis, we investigate isolate relatedness during outbreaks of fowl cholera on a free-range meat chicken farm over a 5-year period. Our genomic analysis revealed that while all outbreak isolates were sequence type (ST) 20, they could be separated into two distinct clades (clade 1 and clade 2), consistent with differences in their lipopolysaccharide (LPS) type. The isolates from the earlier outbreaks (clade 1) carried LPS type L3, while those from the more recent outbreaks (clade 2) were LPS type L1. Additionally, WGS data indicated high inter- and intra-chicken genetic diversity during a single outbreak. Furthermore, we demonstrate that while a killed autogenous vaccine carrying LPS type L3 had been successful in protecting against challenge from L3 isolates, it might have driven the emergence of the closely related clade 2, against which the vaccine was ineffective. The genomic results also revealed a 14 bp deletion in the galactosyltransferase gene gatG in LPS type L3 isolates, which would result in a semi-truncated LPS in those isolates. In conclusion, our study clearly demonstrates the advantages of genomic analysis over conventional PCR-based approaches in providing clear insights into the linkage of isolates within and between outbreaks. More importantly, it provides more detailed information than multiplex PCR on the possible structure of the outer LPS, which is very important for strain selection for killed autogenous vaccines.
Project description:Gene expression programs change over time, differentiation and development, and in response to stimuli. However, nearly all techniques for profiling gene expression in single cells do not directly capture transcriptional dynamics. In the present study, we present a method for combined single-cell combinatorial indexing and messenger RNA labeling (sci-fate), which uses combinatorial cell indexing and 4-thiouridine labeling of newly synthesized mRNA to concurrently profile the whole and newly synthesized transcriptome in each of many single cells. We used sci-fate to study the cortisol response in >6,000 single cultured cells. From these data, we quantified the dynamics of the cell cycle and glucocorticoid receptor activation, and explored their intersection. Finally, we developed software to infer and analyze cell-state transitions. We anticipate that sci-fate will be broadly applicable to quantitatively characterize transcriptional dynamics in diverse systems.
Project description:Purpose: The purpose of this study was to develop a feasible approach for single sperm isolation and chromosome analysis by next-generation sequencing (NGS). Methods: Single sperm cells were isolated from semen samples of a normozoospermic male and an infertile reciprocal translocation (RcT) carrier with the 46,XY,t(7;13)(p12;q12.1) karyotype using an optimized fluorescence-activated cell sorting (FACS) technique. Genome profiling was performed using NGS. Results: Following whole-genome amplification, NGS, and quality control, the final chromosome analysis was performed on 31 and 6 single-cell samples derived from the RcT carrier and the normozoospermic male, respectively. All sperm cells from the normozoospermic male showed a normal haploid 23-chromosome profile. For the RcT carrier, the sequencing data revealed that 64.5% of sperm cells harbored different variants of chromosome aberrations, involving deletion of 7p or 7q, duplication of 7p, and duplication of 13q, which is concordant with the expected chromosome segregation patterns observed in balanced translocation carriers. In one sample, a duplication of 9q was also detected. Conclusions: We optimized a FACS protocol for simple and efficient isolation of single human sperm cells, which subsequently enabled successful genome-wide chromosome profiling and identification of segmental aneuploidies in these individual cells by NGS analysis. This approach may be useful for analyzing semen samples of infertile men or chromosomal aberration carriers to facilitate reproductive risk assessment.
Project description:High-throughput single-cell RNA sequencing has transformed our understanding of complex cell populations, but it does not provide phenotypic information such as cell-surface protein levels. Here, we describe cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), a method in which oligonucleotide-labeled antibodies are used to integrate cellular protein and transcriptome measurements into an efficient, single-cell readout. CITE-seq is compatible with existing single-cell sequencing approaches and scales readily with throughput increases.
Project description:Due to the potential of enterohemorrhagic Escherichia coli (EHEC) serogroup O157 to cause large foodborne outbreaks, national and international surveillance is necessary. To develop an effective method of molecular surveillance, a conventional method, multilocus variable-number tandem-repeat analysis (MLVA), and whole-genome sequencing (WGS) analysis were compared. WGS data from 369 isolates of EHEC O157 belonging to 7 major MLVA types and their relatives were subjected to comprehensive in silico typing, core genome single nucleotide polymorphism (cgSNP), and core genome multilocus sequence typing (cgMLST) analyses. The typing resolution was highest in cgSNP analysis. However, determination of the sequence of the mismatch repair protein gene mutS is necessary because spontaneous deletion of the gene could lead to a hypermutator phenotype. MLVA had sufficient typing resolution for a short-term outbreak investigation and had advantages in rapidity and high throughput. cgMLST showed less typing resolution than cgSNP, but it is less time-consuming and does not require as much computer power. Therefore, cgMLST is suitable for comparisons using large data sets (e.g., international comparison using public databases). In conclusion, screening using MLVA followed by cgMLST and cgSNP analyses would provide the highest typing resolution and improve the accuracy and cost-effectiveness of EHEC O157 surveillance. IMPORTANCE: Intensive surveillance for enterohemorrhagic Escherichia coli (EHEC) serogroup O157 is important to detect outbreaks and to prevent the spread of the bacterium. Recent advances in sequencing technology have made molecular surveillance using whole-genome sequencing (WGS) realistic. To develop rapid, high-throughput, and cost-effective typing methods for real-time surveillance, the typing resolution of WGS and of a conventional typing method, multilocus variable-number tandem-repeat analysis (MLVA), was evaluated. Nation-level systematic comparison of MLVA, core genome single nucleotide polymorphism (cgSNP), and core genome multilocus sequence typing (cgMLST) indicated that a combination of WGS and MLVA is a realistic approach to improve EHEC O157 surveillance.