Project description: The birth of new genes from non-coding sequences has been postulated to be preceded by a proto-gene phase, in which a sequence is translated into protein but does not exhibit hallmarks of a clear function. Despite the abundance of such proto-genes in bacterial genomes, the frequency of their emergence and whether they actually act as precursors of new genes in natural populations are still open questions. To address these issues, we applied a combination of transcriptomic, proteomic and comparative genomic approaches to identify and analyze hundreds of novel bacterial protein-coding genes that have previously escaped annotation. These novel proteins, including many that are widely conserved across genera, display sequence properties indistinguishable from those of the non-coding regions of the genome, suggesting that the vast majority are evolving neutrally. We provide evidence of de novo emergence of three proto-genes within the history of the E. coli species; however, most such elements are formed via the mutational modification of existing open reading frames. Contrary to expectations, we discover that proto-genes emerge at a uniform rate across distant bacterial taxa despite significant differences in their genomic characteristics, suggesting the presence of taxon-specific mechanisms that regulate their origination and persistence.
Project description: Mutations such as gene fusion, translocation and focal amplification are a frequent cause of proto-oncogene activation during tumorigenesis, but such mutations do not explain all cases of proto-oncogene activation. Here we show that disruption of local chromosome conformation can also activate proto-oncogenes in human cells. We mapped chromosome structures in T-cell acute lymphoblastic leukemia (T-ALL), and found that active oncogenes and silent proto-oncogenes generally occur within insulated neighborhoods formed by the looping of two interacting CTCF sites co-occupied by cohesin. Recurrent microdeletions frequently overlap neighborhood boundary sites in T-ALL genomes, and we demonstrate that site-specific perturbation of loop boundaries is sufficient to activate the respective proto-oncogenes in non-malignant cells. We found somatic genomic rearrangements affecting loop boundaries in many cancers. These results suggest that chromosome structural organization is fundamental to identifying functional somatic alterations in cancer genomes.
Project description: Analyses of new genomic, transcriptomic or proteomic data commonly discard many unidentified data that escape the ‘canonical’ DNA-RNA-protein scheme. Testing systematic exchanges of nucleotides over long stretches yields transformed RNA sequences (here named “swinger” RNA) that differ from their template DNA. These may explain some of the discarded data. Here, analyses of genomic, transcriptomic and proteomic data of the pathogen Tropheryma whipplei according to canonical genomic, transcriptomic and translational 'rules' resulted in discarding 58.9% of DNA, 37.7% of RNA and about 85% of mass spectra (corresponding to peptides). Among the discarded data, we found numerous DNA/RNA fragments compatible with “swinger” polymerization. Genomic sequences covered by “swinger” DNA and RNA are three times more frequent than expected by chance and explain 12.4% and 20.8% of the rejected DNA and RNA sequences, respectively. As for peptides, several match “swinger” RNAs, including some chimeras translated from both regular and “swinger” transcripts, notably for ribosomal RNAs. Congruence of DNA, RNA and peptides resulting from the same swinging process suggests that systematic nucleotide exchanges increase coding potential and may add to the evolutionary diversification of bacterial populations.