Project description:Somatic transposon mutagenesis in mice is an efficient strategy to investigate the genetic mechanisms of tumorigenesis. The identification of tumor driving transposon insertions traditionally requires the generation of large tumor cohorts to obtain information about common insertion sites. Tumor driving insertions are also characterized by their clonal expansion in tumor tissue, a phenomenon that is facilitated by the slow and evolving transformation process of transposon mutagenesis. We describe here an improved approach for the detection of tumor driving insertions that assesses the clonal expansion of insertions by quantifying the relative proportion of sequence reads obtained in individual tumors. To this end, we have developed a protocol for insertion site sequencing that utilizes acoustic shearing of tumor DNA and Illumina sequencing. We analyzed various solid tumors generated by PiggyBac mutagenesis and for each tumor >10^6 reads corresponding to >10^4 insertion sites were obtained. In each tumor, 9 to 25 insertions stood out by their enriched sequence read frequencies when compared to frequencies obtained from tail DNA controls. These enriched insertions are potential clonally expanded tumor driving insertions, and thus identify candidate cancer genes. The candidate cancer genes of our study comprised many established cancer genes, but also novel candidate genes such as Mastermind-like1 (Mamld1) and Diacylglycerolkinase delta (Dgkd). We show that clonal expansion analysis by high-throughput sequencing is a robust approach for the identification of candidate cancer genes in insertional mutagenesis screens on the level of individual tumors. Solid tumors in mice were generated by somatic transposon mutagenesis with a PiggyBac transposon system. Insertion sites of transposons in 11 tumors and 6 non-cancerous tail controls were determined by Illumina high-throughput sequencing. 
Insertions were determined on both the 5' and 3' sides of the transposon (PB5 and PB3, respectively). Quantitative analysis of read numbers revealed enrichment of certain insertions in tumors but not in controls; these enriched insertions identify candidate cancer genes.
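The read-frequency comparison described in this entry can be illustrated with a minimal Python sketch. The record does not state the enrichment criterion that was used, so the cutoff rule here (tumor frequency above `fold` times the highest frequency of any insertion in any tail control), the default `fold` value, and the function names are all assumptions made for illustration.

```python
def relative_frequencies(read_counts):
    """Convert per-insertion read counts into fractions of all reads in the sample."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {site: n / total for site, n in read_counts.items()}


def enriched_insertions(tumor_counts, control_counts_list, fold=10.0):
    """Return tumor insertions whose read frequency exceeds `fold` times the
    highest frequency seen for any insertion in any control sample.

    The cutoff rule and the default `fold` are illustrative assumptions,
    not the criterion used in the study.
    """
    control_max = 0.0
    for counts in control_counts_list:
        freqs = relative_frequencies(counts)
        if freqs:
            control_max = max(control_max, max(freqs.values()))

    tumor_freqs = relative_frequencies(tumor_counts)
    hits = {site: f for site, f in tumor_freqs.items() if f > fold * control_max}
    # Most strongly enriched insertions first.
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)
```

Sites can be any hashable key, for example a `(chromosome, position, strand)` tuple. Because frequencies are normalized within each sample, tumors and controls sequenced to different depths can be compared directly.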
Project description:Transcription factors direct gene expression, and so there is much interest in mapping their genome-wide binding locations. Current methods do not allow for the multiplexed analysis of TF binding, and this limits their throughput. We describe a novel method for determining the genomic target genes of multiple transcription factors simultaneously. DNA-binding proteins are endowed with the ability to direct transposon insertions into the genome near to where they bind. The transposon becomes a “Calling Card” marking the visit of the DNA-binding protein to that location. A unique sequence “barcode” in the transposon matches it to the DNA-binding protein that directed its insertion. The sequences of the DNA flanking the transposon (which reveal where in the genome the transposon landed) and the barcode within the transposon (which identifies the TF that put it there) are determined by massively-parallel DNA sequencing. To demonstrate the method’s feasibility, we determined the genomic targets of eight transcription factors in a single experiment. The Calling Card method promises to significantly reduce the cost and labor needed to determine the genomic targets of many transcription factors in different environmental conditions and genetic backgrounds. These data contain Ty5 insertion sites mapped by an Illumina GAII analyzer in the S. cerevisiae genome for the background strain without any Sir4 present (1 run), in strains expressing Sir4-tagged copies of three well-characterized TFs: Gal4, Leu3, and Gcn4 (1 run each), and a multiplex of eight Sir4-tagged TFs pooled in a single experiment (2 biological replicates), and insertions from the Thi2-Sir4 fusion expressed from its native locus in two conditions (1 run each). The format of each insertions file is [chromosome number] [position of genomic base] [direction of insertion] [number of reads at that position].
Raw sequencing data comes in two varieties. Paired-end data contains a 5 bp barcode at the beginning of read #2. Single-end data contains a 2 bp barcode at the beginning of read #1.
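As a rough illustration of the four-column insertions format and the barcode layout described in this entry, the following Python sketch parses insertion records and splits a barcode off the start of a read. It assumes whitespace-separated columns with no header line, and it keeps the direction field as an uninterpreted string because the record does not say how direction is encoded. The names `Insertion`, `parse_insertions`, and `split_barcode` are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Insertion(NamedTuple):
    chromosome: str   # chromosome number, kept as text
    position: int     # position of the genomic base
    direction: str    # direction of insertion, encoding not specified in the record
    reads: int        # number of reads at that position


def parse_insertions(lines: Iterable[str]) -> Iterator[Insertion]:
    """Yield one Insertion per four-column line; skip blank or short lines.

    Assumes whitespace-separated columns and no header line.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        chrom, pos, direction, reads = fields
        yield Insertion(chrom, int(pos), direction, int(reads))


# Barcode lengths stated in the record: 5 bp (paired-end), 2 bp (single-end).
BARCODE_LENGTH = {"paired": 5, "single": 2}


def split_barcode(read: str, layout: str) -> Tuple[str, str]:
    """Split a read into (barcode, remainder).

    For "paired" data pass read #2; for "single" data pass read #1.
    """
    n = BARCODE_LENGTH[layout]
    return read[:n], read[n:]
```

A typical use would be `list(parse_insertions(open(path)))`, where `path` points at one insertions file, followed by grouping reads by the barcode returned from `split_barcode` to assign them to a transcription factor.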
Project description:As transposon sequencing (TnSeq) assays have become prolific in the microbiology field, it is of interest to scrutinize their potential drawbacks. TnSeq results are determined by counting transposon insertions following the PCR-based enrichment and subsequent deep sequencing of transposon insertions. Here we explore the possibility that PCR amplification of transposon insertions in a TnSeq library skews the results by introducing bias into the detection and/or enumeration of insertions. We compared the detection and frequency of mapped insertions when altering the number of PCR cycles in the enrichment step. In addition, we devised and validated a novel, PCR-free TnSeq method where the insertions are enriched via CRISPR/Cas9-targeted transposon cleavage and subsequent Oxford Nanopore sequencing. These PCR-based and PCR-free experiments demonstrate that, overall, PCR amplification does not significantly bias the results of the TnSeq assay insofar as insertions in the majority of genes represented in our library were similarly detected regardless of PCR cycle number and whether or not PCR amplification was employed. However, the detection of a small subset of genes which had been previously described as essential is indeed sensitive to the number of PCR cycles. We conclude that PCR-based enrichment of transposon insertions in a TnSeq assay is reliable but researchers interested in profiling essential genes should carefully weigh the number of amplification cycles employed in their library preparation protocols. In addition, we present a PCR-free TnSeq alternative that is comparable to traditional PCR-based methods although the latter remain superior owing to their accessibility and high sequencing depth.
Project description:The National Cancer Institute-60 (NCI-60) cell lines are among the most widely used models of human cancer. They provide a platform to integrate DNA sequence information, epigenetic data, RNA and protein expression, and pharmacologic susceptibilities in studies of cancer cell biology. Genome-wide studies of the NCI-60 have included exome sequencing, karyotyping, and copy number analyses but have not targeted repetitive sequences. Interspersed repeats are a significant source of heritable genetic variation, and insertions of active elements can occur somatically in malignancy. To approach a functional understanding of these sequences in transformed cells, we used transposon insertion profiling (TIP) to map Long INterspersed Element-1 (LINE-1, L1) and Alu Short INterspersed Element (SINE) insertions in cancer genes in NCI-60 cells. As expected, this identified known insertions, polymorphisms shared in unrelated tumor cell lines, as well as unique, potentially tumor-specific insertions. Here, we report a map of these insertion sites and conduct association analyses relating individual insertions to a variety of cellular phenotypes.
Project description:Genomic DNA from pools of H. pylori strain G27 clones, as indicated (pools of 300 (300p) or insertions in specific mapped genes), was amplified using the MATT method to label DNA adjacent to the site of transposon insertion with the primer pairs indicated. The left side of the transposon was labeled in the Cy3 channel (Primer S) and the right side of the transposon was labeled in the Cy5 channel (Primer N). Keywords: reference_design
Project description:Microarray Tracking of transposon mutants for a H. pylori mouse colonization screen described in Baldwin DN et al. Screen in NSH57 H. pylori strain background. Original 50,000 clone transposon library was plated and patched to make 25 pools of 48 clones. Clones were infected into 4-8 C57Bl/6 mice and stomach bacteria from at least two mice were harvested at 1 week or one month. Semi-random PCR was used to amplify and label the DNA next to the transposon insertion from the input (Cy3) and output pool (Cy5) genomic DNA for each array. Two arrays were done per mouse. One array labeled from the left side of transposon (primers S, 2C) and one array labeled from the right side of the transposon (primers N3, 2C). Transposon insertions were defined by spots with signal four standard deviations above background in both arrays. We also counted insertions where two adjacent gene spots (after arranging the data in genome order) gave signal from the two different sides of the transposon (but not both). A pathogenicity experiment design type is where an infective agent such as a bacterium, virus, protozoan, fungus etc. infects a host organism(s) and the infective agent is assayed. Keywords: pathogenicity_design
Project description:Microarray Tracking of transposon mutants for a H. pylori mouse colonization screen described in Baldwin DN et al. 2007. Screen in NSH79 H. pylori strain background. Original 2000 clone transposon library was plated and patched to make 25 pools of 48 clones. Clones were infected into 4-8 C57Bl/6 mice and stomach bacteria from at least two mice were harvested at 1 week or one month. Semi-random PCR was used to amplify and label the DNA next to the transposon insertion from the input (Cy3) and output pool (Cy5) genomic DNA for each array. Two arrays were done per mouse. One array labeled from the left side of transposon (primers S, 2C) and one array labeled from the right side of the transposon (primers N3, 2C). Transposon insertions were defined by spots with signal four standard deviations above background in both arrays. We also counted insertions where two adjacent gene spots (after arranging the data in genome order) gave signal from the two different sides of the transposon (but not both). A pathogenicity experiment design type is where an infective agent such as a bacterium, virus, protozoan, fungus etc. infects a host organism(s) and the infective agent is assayed. Keywords: pathogenicity_design
Project description:Microarray Tracking of transposon mutants for a H. pylori mouse colonization screen described in Baldwin DN et al. 2007, I&I, 75(2):??, doi:10.1128/IAI.01176-06. Screen in NSH57 H. pylori strain background. Original 50,000 clone transposon library was plated and patched to make 25 pools of 48 clones. Clones were infected into 4-8 C57Bl/6 mice and stomach bacteria from at least two mice were harvested at 1 week or one month. Semi-random PCR was used to amplify and label the DNA next to the transposon insertion from the input (Cy3) and output pool (Cy5) genomic DNA for each array. Two arrays were done per mouse. One array labeled from the left side of transposon (primers S, 2C) and one array labeled from the right side of the transposon (primers N3, 2C). Transposon insertions were defined by spots with signal four standard deviations above background in both arrays. We also counted insertions where two adjacent gene spots (after arranging the data in genome order) gave signal from the two different sides of the transposon (but not both).
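The spot-calling rule described in the H. pylori screen entries (signal four standard deviations above background in both arrays, plus adjacent gene spots that light up from opposite sides of the transposon) might be sketched in Python as follows. This is one reading of the text, not the authors' pipeline: it assumes a single signal value per spot per array, a background summarized as a (mean, standard deviation) pair, and it interprets "(but not both)" as each of the two adjacent spots being positive on exactly one side. The function name and argument layout are hypothetical.

```python
def call_insertions(spots, left_bg, right_bg, n_sd=4.0):
    """Call transposon insertions from left-side and right-side array signals.

    spots    : list of (gene, left_signal, right_signal), in genome order
    left_bg  : (mean, sd) of background on the left-side array
    right_bg : (mean, sd) of background on the right-side array

    Returns (same_spot, adjacent_pairs):
      same_spot      - genes positive on both arrays
      adjacent_pairs - neighbouring genes, each positive on exactly one side,
                       with the two sides being different (assumed reading)
    """
    left_cut = left_bg[0] + n_sd * left_bg[1]
    right_cut = right_bg[0] + n_sd * right_bg[1]

    # Per-spot flags: is the signal above the cutoff on each array?
    flags = [(gene, left > left_cut, right > right_cut)
             for gene, left, right in spots]

    same_spot = [gene for gene, left_on, right_on in flags
                 if left_on and right_on]

    adjacent_pairs = []
    for (g1, l1, r1), (g2, l2, r2) in zip(flags, flags[1:]):
        each_one_sided = (l1 != r1) and (l2 != r2)
        if each_one_sided and l1 != l2:
            adjacent_pairs.append((g1, g2))

    return same_spot, adjacent_pairs
```

Keeping the two kinds of call in separate lists makes it easy to report how many insertions rest on the stricter same-spot criterion and how many on the adjacent-spot criterion.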