Project description:Twenty random DNA barcodes were designed in silico and transfected into PC3 cells. The barcodes were sequenced using Illumina MiSeq technology to determine their sequences and respective copy numbers. The current file contains the raw data for these DNA barcodes in FASTQ format.
Project description:Twenty random DNA barcodes were designed in silico and transfected into PC3 cells. The barcodes were sequenced using Illumina MiSeq technology to determine their sequences and respective copy numbers. The current file contains the raw data for these DNA barcodes in FASTQ format. The data were used to validate an algorithm called SRiD, which generates random DNA barcodes that do not match a genome of interest, in this case the human genome. Twenty DNA barcodes were used for this validation.
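The idea of generating random barcodes that do not match a genome of interest can be illustrated with a simple rejection-sampling sketch. This is not the SRiD algorithm itself, whose details are not given here. The function names, the 20-nt barcode length, the exact-match criterion, and the toy reference string are all assumptions made for illustration; a check against the real human genome would need an indexed aligner rather than substring matching.

```python
import random

# Translation table used to build the reverse complement of a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def random_barcode(length, rng):
    """Draw one uniformly random DNA sequence of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def design_barcodes(reference, n_barcodes=20, length=20, seed=0, max_tries=100000):
    """Rejection sampling (hypothetical helper, not SRiD): keep random
    candidates that occur neither in the reference nor on its reverse strand."""
    rng = random.Random(seed)
    accepted = []
    seen = set()
    for _ in range(max_tries):
        if len(accepted) == n_barcodes:
            break
        candidate = random_barcode(length, rng)
        if candidate in seen:
            continue  # skip duplicates of already accepted barcodes
        if candidate in reference or reverse_complement(candidate) in reference:
            continue  # reject anything that matches the reference exactly
        seen.add(candidate)
        accepted.append(candidate)
    if len(accepted) < n_barcodes:
        raise RuntimeError("could not find enough non-matching barcodes")
    return accepted


# Toy stand-in for a genome; an assumption for illustration only.
toy_reference = "ACGTTGCAAGGCTTAACCGGATCGATCGGCTA" * 4
barcodes = design_barcodes(toy_reference)
```

Exact substring matching is the simplest possible filter; a practical design would more likely also reject near matches within some mismatch tolerance.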
Project description:Following the removal of implanted mammary tumors, nude mice develop multiple-organ metastases at a late stage. The metastases may originate from the primary tumors before the resection surgery or, alternatively, from established metastases. Using multiple approaches, we have shown that the bone environment can invigorate cancer cells for further dissemination. This study aims to examine whether metastatic dissemination from bone to other sites occurs in the natural setting of metastatic spread. Here we apply an evolving barcode system based on homing guide RNA (hgRNA)/Cas9 to trace metastasis formation in mice. hgRNA/Cas9 is a self-targeting CRISPR system in which mutations occur in the DNA sequence encoding the guide RNA itself. Tumor cells were labelled with a doxycycline-inducible evolving barcoding system. Upon doxycycline treatment, the hgRNA DNA sequence accumulates mutations over time. The diversity of barcodes in each lesion can be used to infer the timing of seeding, while the mutation patterns of the barcodes suggest the phylogenetic relationships among metastases. Several findings were made in this study. First, at the terminal stage, multi-organ metastases are not genetically grouped according to metastatic site. Nonnegative Matrix Factorization (NMF) analysis of mutant barcodes suggested that the early-disseminated metastases, which have the highest Shannon entropy, share a common cluster of mutant barcodes irrespective of their locations. Second, most metastases are potentially multiclonal, as indicated by multiple clusters of independent mutant barcodes. Third, when Shannon entropy is used as an index of metastasis age, putative parent-child relationships between metastases with unique mutant barcodes clearly exemplify secondary metastatic seeding from bone to other organs.
Finally, we did not observe a clear correlation between tumor burden and Shannon entropy across different metastases, suggesting that putative parental metastases might remain small after seeding further metastases.
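The Shannon entropy used above as an index of metastasis age can be computed from the barcode read counts of a lesion as H = -Σ p_i log2(p_i), where p_i is the fraction of reads carrying barcode i. The following is a minimal sketch, assuming per-lesion counts of each mutant barcode are available as a dictionary. The function name, the base-2 logarithm, and the toy counts are illustrative assumptions, not the study's actual pipeline.

```python
import math


def shannon_entropy(barcode_counts):
    """Shannon entropy (in bits) of a barcode count distribution.

    barcode_counts maps a barcode identifier to its read count
    (a hypothetical input format chosen for this sketch).
    """
    total = sum(barcode_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in barcode_counts.values():
        if count > 0:
            p = count / total  # fraction of reads carrying this barcode
            entropy -= p * math.log2(p)
    return entropy


# Toy lesions (made-up counts): an evenly mixed barcode pool versus a
# lesion dominated by a single barcode.
diverse_lesion = {"bc1": 25, "bc2": 25, "bc3": 25, "bc4": 25}
dominated_lesion = {"bc1": 97, "bc2": 3}

h_diverse = shannon_entropy(diverse_lesion)
h_dominated = shannon_entropy(dominated_lesion)
```

In this toy case the evenly mixed lesion has the higher entropy, which under the interpretation described above would correspond to greater barcode diversity.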
Project description:High-throughput single-cell assays increasingly require special consideration in experimental design, sample multiplexing, batch effect removal, and data interpretation. Here, we describe a lentiviral barcode-based multiplexing approach, CellTag Indexing, which uses predefined genetic barcodes that are also heritable, enabling cell populations to be tagged, pooled, and tracked over time in the same experimental replicate. We demonstrate the utility of CellTag Indexing by sequencing transcriptomes from a variety of cell types, including long-term tracking of cell engraftment and differentiation in vivo. Together, these results present CellTag Indexing as a broadly applicable genetic multiplexing tool that is complementary to existing single-cell technologies.
Project description:Analysis of microbial community composition in arctic tundra and boreal forest soils using serial analysis of ribosomal sequence tags (SARST). Keywords: other
Project description:A widespread assumption for single-cell analyses specifies that one cell’s nucleic acids are predominantly captured by one oligonucleotide barcode. However, we show that ~13-21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call “barcode multiplets”. We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that approximately 4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that approximately 5% of beads may contain detectable levels of multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.
Project description:Accurate description of a microbial community is an important first step in understanding the role of its components in ecosystem function. A method for surveying microbial communities termed Serial Analysis of Ribosomal DNA (SARD) is described here. Through a series of molecular cloning steps, short DNA sequence tags are recovered from the fifth variable (V5) region of the prokaryotic 16S rRNA gene from microbial communities. These tags are ligated to form concatemers composed of 20-40 tags, which are cloned and identified by DNA sequencing. Four agricultural soil samples were profiled with SARD to assess the method’s utility. A total of 37,008 SARD tags comprising 3,127 unique sequences were identified. Comparison of duplicate profiles from one soil genomic DNA preparation revealed that the method was highly reproducible. The large number of singleton tags, together with non-parametric richness estimates, indicated that a significant amount of sequence tag diversity remained undetected at this level of sampling. The abundance classes of the observed tags were scale-free and conformed to a power law distribution. Numerically, the majority of the total tags observed belonged to abundance classes that were each present at less than 1% of the community. Over 99% of the unique tags individually made up less than 1% of the community. Therefore, from either a numerical or a diversity standpoint, low-abundance taxa comprised a significant proportion of the microbial communities examined and could potentially make a large contribution to ecosystem function. SARD may provide a means to explore the ecological role of these rare members of microbial communities in qualitative and quantitative terms. Keywords: SARD profiles, culture-independent study, microbial community survey, microbial census
Project description:Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in the life sciences. In this respect, metaproteomics (the study of the whole protein complement of a microbial community) can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially those related to the choice of proper sequence databases for protein identification. Here we present a systematic investigation of variables concerning database construction and annotation, and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. Taxonomic and functional results proved to be strongly database-dependent, especially when dealing with mouse samples. As a striking example, in mouse the Firmicutes/Bacteroidetes ratio varied up to 10-fold depending on the database used. Finally, we provide recommendations regarding metagenomic sequence processing aimed at maximizing gut metaproteome characterization, and contribute to identifying an optimized pipeline for metaproteomic data analysis.