Project description:Calling cards technology using self-reporting transposons enables the identification of DNA-protein interactions through RNA sequencing . Here, we have drastically reduced the cost and labor requirements of calling card experiments in bulk populations of cells by introducing a DNA barcode into the calling card itself. An additional barcode incorporated during reverse transcription enables simultaneous transcriptome measurement in a facile and affordable protocol. We demonstrate that barcoded self-reporting transposons recover in vitro binding sites for four basic helix-loop-helix transcription factors with important roles in cell fate specification: ASCL1, MYOD1, NEUROD2, and NGN1. Further, simultaneous calling cards and transcriptional profiling during transcription factor overexpression identified both binding sites and gene expression changes for two of these factors. In sum, RNA-based identification of transcription factor binding sites and gene expression through barcoded self-reporting transposon calling cards and transcriptomes is an efficient and powerful method to infer gene regulatory networks in a population of cells.
Project description:Droplet-based single-cell sequencing techniques have provided unprecedented insight into cellular heterogeneities within tissues. However, these approaches only allow for the measurement of the distal parts of a transcript following short-read sequencing. Therefore, splicing and sequence diversity information is lost for the majority of the transcript. The application of long-read Nanopore sequencing to droplet-based methods is challenging because of the low base-calling accuracy currently associated with Nanopore sequencing. Although several approaches that use additional short-read sequencing to error-correct the barcode and UMI sequences have been developed, these techniques are limited by the requirement to sequence a library using both short- and long-read sequencing. Here we introduce a novel approach termed single-cell Barcode UMI Correction sequencing (scBUC-seq) to efficiently error-correct barcode and UMI oligonucleotide sequences synthesized by using blocks of dimeric nucleotides. The method can be applied to correct both short-read and long-read sequencing, thereby allowing users to recover more reads per cell that permits direct single-cell Nanopore sequencing for the first time. We illustrate our method by using species-mixing experiments to evaluate barcode assignment accuracy and multiple myeloma cell lines to evaluate differential isoform usage and Ewing’s sarcoma cells to demonstrate Ig fusion transcript analysis.
Project description:Azithromycin (AZM) reduces pulmonary inflammation and exacerbations in chronic obstructive pulmonary disease patients with emphysema. The antimicrobial effects of AZM on the lung microbiome are not known and may contribute to its beneficial effects. Methods. Twenty smokers with emphysema were randomized to receive AZM 250 mg or placebo daily for 8 weeks. Bronchoalveolar lavage (BAL) was performed at baseline and after treatment. Measurements included: rDNA gene quantity and sequence. Results. Compared with placebo, AZM did not alter bacterial burden but reduced α-diversity, decreasing 11 low abundance taxa, none of which are classical pulmonary pathogens. Conclusions. AZM treatment the lung microbiome Randomized trial comparing azithromycin (AZM) treatment with placebo for eight weeks. Bronchoalveolar lavage (BAL) samples were obtained before and after treatment to explore the effects of AZM on microbiome, in the lower airways. 16S rRNA was quantified and sequenced (MiSeq) The amplicons from total 39 samples are barcoded and the barcode is provided in the metadata_complete.txt file.
Project description:Transcription factors direct gene expression, and so there is much interest in mapping their genome-wide binding locations. Current methods do not allow for the multiplexed analysis of TF binding, and this limits their throughput. We describe a novel method for determining the genomic target genes of multiple transcription factors simultaneously. DNA-binding proteins are endowed with the ability to direct transposon insertions into the genome near to where they bind. The transposon becomes a “Calling Card” marking the visit of the DNA-binding protein to that location. A unique sequence “barcode” in the transposon matches it to the DNA-binding protein that directed its insertion. The sequences of the DNA flanking the transposon (which reveal where in the genome the transposon landed) and the barcode within the transposon (which identifies the TF that put it there) are determined by massively-parallel DNA sequencing. To demonstrate the method’s feasibility, we determined the genomic targets of eight transcription factors in a single experiment. The Calling Card method promises to significantly reduce the cost and labor needed to determine the genomic targets of many transcription factors in different environmental conditions and genetic backgrounds.
Project description:RNA-Seq is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable even in the presence of multiple sequencing errors. This method allows counting with single copy resolution despite sequence-dependent bias and PCR amplification noise, and is analogous to digital PCR but amendable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of E. coli with more accurate and reproducible quantification than conventional RNA-Seq.
Project description:Transcription factors direct gene expression, and so there is much interest in mapping their genome-wide binding locations. M-BM- Current methods do not allow for the multiplexed analysis of TF binding, and this limits their throughput. We describe a novel method for determining the genomic target genes of multiple transcription factors simultaneously. DNA-binding proteins are endowed with the ability to direct transposon insertions into the genome near to where they bind. The transposon becomes a M-bM-^@M-^\Calling CardM-bM-^@M-^] marking the visit of the DNA-binding protein to that location. A unique sequence M-bM-^@M-^\barcodeM-bM-^@M-^] in the transposon matches it to the DNA-binding protein that directed its insertion. The sequences of the DNA flanking the transposon (which reveal where in the genome the transposon landed) and the barcode within the transposon (which identifies the TF that put it there) are determined by massively-parallel DNA sequencing. To demonstrate the methodM-bM-^@M-^Ys feasibility, we determined the genomic targets of eight transcription factors in a single experiment. The Calling Card method promises to significantly reduce the cost and labor needed to determine the genomic targets of many transcription factors in different environmental conditions and genetic backgrounds. These data contain Ty5 insertion sites mapped by an Illumina GAII analyzer in the S. cerevisiae genome for the background strain without any Sir4 present (1 run), in strains expressing Sir4-tagged copies of three well-characterized TFs: Gal4, Leu3, and Gcn4 (1 run each), and a multiplex of eight Sir4-tagged TFs pooled in a single experiment (2 biological replicates), and insertions from the Thi2-Sir4 fusion expressed from its native locus in two conditions (1 run each). The format of each insertions file is [chromosome number] [position of genomic base] [direction of insertion] [number of reads at that position]. Raw sequencing data comes in two varieties. Paired-end data contains a 5 bp barcode at the beginning of read #2. Single-end data contains a 2 bp barcode on the beggining of read #1.
Project description:RNA-Seq is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable even in the presence of multiple sequencing errors. This method allows counting with single copy resolution despite sequence-dependent bias and PCR amplification noise, and is analogous to digital PCR but amendable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of E. coli with more accurate and reproducible quantification than conventional RNA-Seq. We analyzed two replicates of the same bulk E. coli transcriptome sample. In each sample, we included internal standards to demonstrate that the digital RNA-Seq system may accurately count fragments correctly.