Project description:We describe an R package designed for processing aligned reads from chromatin-oriented high-throughput sequencing experiments. Pasha (preprocessing of aligned sequences from HTS analyses) allows easy manipulation of aligned reads from short-read sequencing technologies (ChIP-seq, FAIRE-seq, MNase-seq, etc.) and offers innovative approaches to process and extract relevant information.
Project description:mRNA profiles of control human trophoblast stem (HTS) cells and TEAD4-knockdown HTS cells were generated by deep sequencing, in triplicate, using the Illumina NovaSeq 6000 platform. TEAD4 knockdown in HTS cells was confirmed by RT-PCR analysis and immunostaining.
Project description:Barcode swapping results in the mislabeling of sequencing reads between multiplexed samples on the new patterned flow cell Illumina sequencing machines. This may compromise the validity of numerous genomic assays, especially for single-cell studies where many samples are routinely multiplexed together. The severity and consequences of barcode swapping for single-cell transcriptomic studies remain poorly understood. We have used two statistical approaches to robustly quantify the fraction of swapped reads in each of two plate-based single-cell RNA sequencing datasets. We found that approximately 2.5% of reads were mislabeled between samples on the HiSeq 4000 machine, which is lower than previous reports. We observed no correlation between the swapped fraction of reads and the concentration of free barcode across plates. Furthermore, we have demonstrated that barcode swapping may generate complex but artefactual cell libraries in droplet-based single-cell RNA sequencing studies. To eliminate these artefacts, we have developed an algorithm to exclude individual molecules that have swapped between samples in 10X Genomics experiments, exploiting the combinatorial complexity present in the data. This permits the continued use of cutting-edge sequencing machines for droplet-based experiments while avoiding the confounding effects of barcode swapping. This data repository contains the sequencing files associated with the droplet-based scRNA-seq dataset in Griffiths et al. (2018). The data presented here should be used purely for technical analysis; the biological motivation is nonetheless briefly described as follows: The mammary gland is a unique organ as it undergoes most of its development during puberty and adulthood. Characterising the hierarchy of the various mammary epithelial cells and how they are regulated in response to gestation, lactation and involution is important for understanding how breast cancer develops.
Recent studies have used numerous markers to enrich, isolate and characterise the different epithelial cell compartments within the adult mammary gland. However, in all of these studies only a handful of markers were used to define and trace cell populations. Therefore, there is a need for an unbiased and comprehensive description of mammary epithelial cells within the gland at different developmental stages. To this end, we used single-cell RNA sequencing (scRNA-seq) to determine the gene expression profile of individual mammary epithelial cells across four adult developmental stages: nulliparous, mid-gestation, lactation and post-weaning (full natural involution).
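The molecule-exclusion idea mentioned in the description above can be illustrated with a minimal sketch. It assumes, as an interpretation of "combinatorial complexity", that the same combination of cell barcode, UMI and gene is very unlikely to arise independently in two multiplexed samples, so a combination seen in several samples is treated as one molecule whose reads have partly swapped. The function name, the input tuple layout and the `min_fraction` threshold are hypothetical and are not taken from the authors' implementation.

```python
from collections import defaultdict


def remove_swapped_molecules(reads, min_fraction=0.8):
    """Keep each molecule only in the sample that holds most of its reads.

    reads: iterable of (sample, cell_barcode, umi, gene, n_reads) tuples.
    min_fraction: hypothetical cutoff; a molecule is kept only if one sample
    holds at least this fraction of all reads observed for it.
    """
    # Group read counts by molecule identity (cell barcode, UMI, gene).
    by_molecule = defaultdict(dict)
    for sample, cell, umi, gene, n_reads in reads:
        counts = by_molecule[(cell, umi, gene)]
        counts[sample] = counts.get(sample, 0) + n_reads

    kept = []
    for (cell, umi, gene), counts in by_molecule.items():
        total = sum(counts.values())
        best_sample, best_reads = max(counts.items(), key=lambda kv: kv[1])
        # Ambiguous molecules (no clearly dominant sample) are discarded.
        if best_reads / total >= min_fraction:
            kept.append((best_sample, cell, umi, gene))
    return sorted(kept)


# Toy input: the first molecule is dominated by sample A, the second is
# split evenly between samples, the third appears in sample B only.
example_reads = [
    ("A", "AAAC", "UMI1", "Gene1", 9),
    ("B", "AAAC", "UMI1", "Gene1", 1),
    ("A", "AAAG", "UMI2", "Gene2", 5),
    ("B", "AAAG", "UMI2", "Gene2", 5),
    ("B", "CCCA", "UMI3", "Gene1", 4),
]
print(remove_swapped_molecules(example_reads))
```

In this toy input the evenly split molecule is dropped from both samples, while the other two are assigned to a single sample each.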
Project description:Trypanosoma brucei library consisting of a pool of reporters whose 14Ts polypyrimidine tract is replaced with 11 random nucleotides. The library is treated with puromycin at a concentration of 0.2 µg/ml, 0.4 µg/ml or 1 µg/ml. The reporters are recovered, and PCR amplicons targeting the random sequences are identified by HTS. The results are paired reads, where the suffixes _1 and _2 denote reads 1 and 2, respectively. Also included are sequencing results from the plasmid library.
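As a rough illustration of how the 11-nucleotide random sequences might be tallied from such paired reads, the sketch below extracts the 11 bases found between two constant flanking sequences in read 1 and counts them only when the reverse complement of read 2 agrees. The flanking sequences `UPSTREAM` and `DOWNSTREAM` are placeholders, not the real reporter sequences, and the assumption that read 2 covers the same region on the opposite strand is ours, not stated in the description.

```python
import re
from collections import Counter

# Hypothetical constant flanks around the randomized region (placeholders).
UPSTREAM = "ACGTAC"
DOWNSTREAM = "GGATCC"
PATTERN = re.compile(UPSTREAM + "([ACGT]{11})" + DOWNSTREAM)
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_insert(seq):
    """Return the 11-nt sequence between the flanks, or None if absent."""
    match = PATTERN.search(seq)
    return match.group(1) if match else None


def count_inserts(read_pairs):
    """Count random inserts supported by both mates of each read pair.

    read_pairs: iterable of (read1_sequence, read2_sequence) strings.
    """
    counts = Counter()
    for read1, read2 in read_pairs:
        from_read1 = extract_insert(read1)
        from_read2 = extract_insert(reverse_complement(read2))
        # Require both mates to report the same 11-mer.
        if from_read1 is not None and from_read1 == from_read2:
            counts[from_read1] += 1
    return counts
```

Counts obtained this way for each puromycin concentration could then be compared with counts from the plasmid library, although the comparison itself is outside the scope of this sketch.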
Project description:We discovered that PCR-mediated template switching poses a significant challenge in ensemble tagged PCR, particularly for template sequences with high similarity, which we successfully addressed by introducing a dual barcode. Template switching is present in repair sequencing (repair-seq) and severely interferes with the interpretation of the association between repair outcomes and the sgRNA in the vicinity. Among the 67 DNA damage response genes, we found that ERCC6L2 was crucial for preventing DNA end resection.
Project description:Background: Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. Results: The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. Conclusions: The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild.
The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker.
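The matching step described above (BLAST hits filtered by a blacklist, names reconciled, then compared with the CITES appendices) can be sketched as follows. This is not the pipeline's actual code or interface: the function name, the tuple layout of the hits, and all accessions and taxon names are made-up placeholders used only to show the logic.

```python
def flag_cites_hits(blast_hits, cites_names, synonyms, blacklist):
    """Return reads whose best hit resolves to a CITES-listed name.

    blast_hits:  iterable of (read_id, accession, taxon_name) tuples.
    cites_names: dict mapping accepted taxon name -> CITES appendix.
    synonyms:    dict mapping alternative names -> accepted names.
    blacklist:   set of accessions known to be incorrectly annotated.
    """
    flagged = []
    for read_id, accession, taxon in blast_hits:
        # Skip reference sequences known to carry a wrong annotation.
        if accession in blacklist:
            continue
        # Reconcile the reference database name with the CITES taxonomy.
        accepted = synonyms.get(taxon, taxon)
        if accepted in cites_names:
            flagged.append((read_id, accepted, cites_names[accepted]))
    return flagged


# Placeholder data; none of these names or accessions are real.
hits = [
    ("read1", "ACC001", "Exampleus protectus"),
    ("read2", "ACC002", "Exampleus oldname"),
    ("read3", "ACC003", "Exampleus protectus"),
    ("read4", "ACC004", "Commonus plantus"),
]
cites = {"Exampleus protectus": "I", "Exampleus renamed": "II"}
name_map = {"Exampleus oldname": "Exampleus renamed"}
bad_accessions = {"ACC003"}

for row in flag_cites_hits(hits, cites, name_map, bad_accessions):
    print(row)
```

Here the blacklist suppresses a would-be false positive (read3), while the synonym table recovers a listed species that would otherwise be missed under its older name (read2).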