Project description:As transposon sequencing (TnSeq) assays have become prolific in the microbiology field, it is of interest to scrutinize their potential drawbacks. TnSeq results are determined by counting transposon insertions following the PCR-based enrichment and subsequent deep sequencing of transposon insertions. Here we explore the possibility that PCR amplification of transposon insertions in a TnSeq library skews the results by introducing bias into the detection and/or enumeration of insertions. We compared the detection and frequency of mapped insertions when altering the number of PCR cycles in the enrichment step. In addition, we devised and validated a novel, PCR-free TnSeq method where the insertions are enriched via CRISPR/Cas9-targeted transposon cleavage and subsequent Oxford Nanopore sequencing. These PCR-based and PCR-free experiments demonstrate that, overall, PCR amplification does not significantly bias the results of the TnSeq assay insofar as insertions in the majority of genes represented in our library were similarly detected regardless of PCR cycle number and whether or not PCR amplification was employed. However, the detection of a small subset of genes which had been previously described as essential is indeed sensitive to the number of PCR cycles. We conclude that PCR-based enrichment of transposon insertions in a TnSeq assay is reliable but researchers interested in profiling essential genes should carefully weigh the number of amplification cycles employed in their library preparation protocols. In addition, we present a PCR-free TnSeq alternative that is comparable to traditional PCR-based methods although the latter remain superior owing to their accessibility and high sequencing depth.
Project description:We amplified DNA fragments randomly sheared from a PER1 BAC library using four different numbers of PCR cycles (3, 6, 12, and 18). We report that the effect of Gibbs free energy bias on coverage increases significantly with the number of PCR cycles, especially for fragments with high Gibbs free energy (usually corresponding to low GC content).
Project description:Enterococcus faecalis is a common commensal organism and a prolific nosocomial pathogen that causes biofilm-associated infections. Numerous E. faecalis OG1RF genes required for biofilm formation have been identified, but few studies have compared genetic determinants of biofilm formation and biofilm morphology across multiple conditions. Here, we cultured transposon (Tn) libraries in CDC biofilm reactors in two different media and used Tn sequencing (TnSeq) to identify core and accessory biofilm determinants, including many genes that are poorly characterized or annotated as hypothetical. Multiple secondary assays (96-well plates, submerged Aclar, and MultiRep biofilm reactors) were used to validate phenotypes of new biofilm determinants.
Project description:Several recent studies have suggested that genes that are over 100 kb in length are particularly likely to be misregulated in neurological diseases associated with synaptic dysfunction, such as autism, Fragile X syndrome, and Rett syndrome. These length-dependent transcriptional changes seem to be modest, but, given the low sensitivity of high-throughput transcriptome profiling technology, the statistical significance of these results needs to be reevaluated. Here we show that transcriptional changes reflected in microarray and RNA-Sequencing benchmark datasets from the SEQC Consortium show a bias toward genes of greater length, even in the comparison of technical replicates. We hypothesized that PCR amplification, which is used in both microarray and RNA-Seq technologies, could be introducing this bias. We found that, when the fold-change values are small, PCR amplification in microarray and RNA-Seq technologies does produce a bias toward longer genes; we found no similar bias with nCounter technology, which is not based on PCR amplification. We provide an approach to more rigorously assess length-dependent changes that begins with comparing randomized control samples to estimate baseline gene length dependency and evaluate the statistical significance of gene length regulation.
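The baseline idea in the last sentence above can be illustrated with a small sketch. This is not the authors' pipeline: the statistic (mean log2 fold change of long genes minus that of the remaining genes), the function names, and the data layout are all assumptions chosen for illustration. Control samples are repeatedly split at random into two pseudo-groups to estimate how much length dependency arises with no biological difference, and the observed case-versus-control statistic is compared against that baseline.

```python
import random
from statistics import mean

LONG_GENE_BP = 100_000  # threshold taken from the abstract's "over 100 kb"


def length_statistic(lengths, group_a, group_b):
    """Mean log2 fold change (b vs. a) of long genes minus that of all other genes.

    lengths: gene lengths in bp, one per gene.
    group_a, group_b: lists of samples; each sample is a list of log2
    expression values in the same gene order as `lengths` (hypothetical layout).
    """
    log_fc = [
        mean(s[g] for s in group_b) - mean(s[g] for s in group_a)
        for g in range(len(lengths))
    ]
    long_fc = [fc for fc, n in zip(log_fc, lengths) if n > LONG_GENE_BP]
    other_fc = [fc for fc, n in zip(log_fc, lengths) if n <= LONG_GENE_BP]
    return mean(long_fc) - mean(other_fc)


def baseline_p_value(lengths, controls, cases, n_iter=1000, seed=0):
    """Compare the observed statistic with a baseline from randomized control splits.

    Needs at least two control samples and at least one gene on each side
    of the length threshold.
    """
    rng = random.Random(seed)
    observed = length_statistic(lengths, controls, cases)
    half = len(controls) // 2
    null = []
    for _ in range(n_iter):
        shuffled = controls[:]
        rng.shuffle(shuffled)
        # Two pseudo-groups drawn from controls only: any length
        # dependency here is technical, not biological.
        null.append(length_statistic(lengths, shuffled[:half], shuffled[half:]))
    extreme = sum(abs(x) >= abs(observed) for x in null)
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (extreme + 1) / (n_iter + 1)
```

A call such as `baseline_p_value(lengths, controls, cases)` returns the observed statistic and an empirical p-value; a small p-value suggests the length dependency exceeds what randomized controls alone produce.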
Project description:Spot intensity serves as a proxy for gene expression in dual-label microarray experiments. Dye bias is defined as an intensity difference between samples labeled with different dyes attributable to the dyes instead of the gene expression in the samples. Dye bias that is not removed by array normalization can introduce bias into comparisons between samples of interest. But if the bias is consistent across the samples for the same gene, it can be corrected by proper experimental design and analysis. If the dye bias is not consistent across samples for the same gene, but is different for different samples, then removing the bias becomes more problematic, perhaps indicating a technical limitation to the ability of fluorescent signals to accurately represent gene expression. Thus, it is important to characterize dye bias to determine: (1) whether it will be removed for all genes by array normalization, (2) whether it will not be removed by normalization but can be removed by proper experimental design and analysis and (3) whether dye bias correction is more problematic than either of these and is not easily removable. Keywords: dye swap design
Project description:Coupling molecular biology to high-throughput sequencing has revolutionized the study of biology. Molecular genomics techniques are continually refined to provide higher-resolution mapping of nucleic acid interactions and nucleic acid structure. These assays are converging on single-nucleotide resolution measurements, but the sequence preferences of molecular biology enzymes can interfere with the accurate interpretation of the data. Enzymatic sequence preferences manifest more prominently as the resolution of these assays increases. We developed seqOutBias to seek out enzymatic sequence bias from experimental data and scale individual sequence reads to correct the bias. We show that this software efficiently and successfully corrects the sequence bias in DNase-seq, TACh-seq, ATAC-seq, MNase-seq, and PRO-seq data.
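The abstract does not spell out how reads are scaled, so the following is only a simplified sketch of the general idea of k-mer-based bias correction, not seqOutBias itself. It assumes that each read is weighted by the ratio of the background frequency of the k-mer at its start position to the frequency with which that k-mer is observed at read starts. The function names, the single-strand toy genome, and the choice of `k` are hypothetical.

```python
from collections import Counter


def kmer_scale_factors(genome, read_starts, k=4):
    """Weight per k-mer: background fraction divided by observed fraction.

    genome: reference sequence as a string (single strand, for simplicity).
    read_starts: 0-based positions where reads begin (assumed cut sites).
    """
    background = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    observed = Counter(
        genome[p:p + k] for p in read_starts if p + k <= len(genome)
    )
    bg_total = sum(background.values())
    obs_total = sum(observed.values())
    # k-mers the enzyme prefers are over-represented at read starts and get
    # a weight below 1; disfavoured k-mers get a weight above 1.
    return {
        kmer: (background[kmer] / bg_total) / (count / obs_total)
        for kmer, count in observed.items()
    }


def scaled_signal(genome, read_starts, k=4):
    """Sum the per-read weights at each start position."""
    factors = kmer_scale_factors(genome, read_starts, k)
    signal = Counter()
    for p in read_starts:
        if p + k <= len(genome):
            signal[p] += factors[genome[p:p + k]]
    return dict(signal)
```

With a toy input such as `scaled_signal("ACGTACGTTTTT", [0, 4, 4, 7], k=2)`, reads starting at the over-represented `AC` are down-weighted and the read at `TT` is up-weighted. A real tool would also need to handle both strands, mappability, and the position of the k-mer window relative to the cut site.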
Project description:High-throughput sequencing (HTS) has become a powerful tool for the detection and sequence characterization of microRNAs (miRNA) and other small RNAs (sRNA). Unfortunately, the use of HTS data to determine the relative quantity of different miRNAs in a sample has been shown to be inconsistent with quantitative PCR and Northern blot results. Several recent studies have concluded that the major contributor to this inconsistency is bias introduced during the construction of sRNA libraries for HTS and that the bias is primarily derived from the adaptor ligation steps, specifically where single-stranded adaptors are sequentially ligated to the 3' and 5' ends of sRNAs using T4 RNA ligases. In this study we investigated the effects of ligation bias by using a pool of randomized ligation substrates, defined mixtures of miRNA sequences and several combinations of adaptors in HTS library construction. We show that, like the 3' adaptor ligation step, the 5' adaptor ligation is also biased, not because of primary sequence but because of secondary structures of the two ligation substrates. We find that multiple secondary structural factors influence final representation in HTS results. Our results provide insight into the nature of ligation bias and allowed us to design adaptors that reduce ligation bias and produce HTS results that more accurately reflect the actual concentrations of miRNAs in the defined starting material.