Project description:DNA methylation has crucial roles in regulating the expression of developmental genes during mammalian pre-implantation embryonic development (PED). However, the dynamic DNA methylation patterns of long noncoding RNA (lncRNA) genes, one type of epigenetic regulator, in human PED have not yet been characterized. Here, we performed a comprehensive analysis of lncRNA genes in human PED based on public reduced representation bisulphite sequencing (RRBS) data. We observed that both lncRNA and protein-coding genes complete the major demethylation wave at the 2-cell stage, whereas the promoters of lncRNA genes show higher methylation levels than those of protein-coding genes during PED. A similar methylation distribution was observed across the transcription start sites (TSS) of lncRNA and protein-coding genes, contrary to previous observations in tissues. In addition, both the gamete-specific differentially methylated regions (G-DMRs) and the embryonic developmental-specific DMRs (D-DMRs) showed more paternal bias, especially in the promoter regions of lncRNA genes. Moreover, coding-non-coding gene co-expression network analysis of genes containing D-DMRs suggested that lncRNA genes involved in PED are associated with gene expression regulation through several means, such as mRNA splicing, translational regulation and mRNA catabolism. This study provides the first methylation profiles of lncRNA genes in human PED and improves the understanding of the involvement of lncRNA genes in human PED.
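As an illustration of what a "methylation level" means for bisulphite sequencing data, the sketch below computes the fraction of reads that remain unconverted (methylated) at a CpG site and averages it over a region such as a promoter. The function names and the minimum-coverage cutoff of 5 reads are assumptions made for this example; the abstract does not state how the authors computed or filtered methylation levels.

```python
from typing import Iterable, Optional, Tuple


def cpg_methylation_level(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads supporting methylation at a single CpG site."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return methylated_reads / total


def region_methylation_level(
    sites: Iterable[Tuple[int, int]],
    min_coverage: int = 5,  # assumed cutoff, not taken from the abstract
) -> Optional[float]:
    """Mean per-site methylation over (methylated, unmethylated) read counts.

    Sites covered by fewer than `min_coverage` reads are skipped.
    Returns None when no site passes the coverage filter.
    """
    levels = [
        cpg_methylation_level(m, u)
        for m, u in sites
        if m + u >= min_coverage
    ]
    if not levels:
        return None
    return sum(levels) / len(levels)
```

Averaging per-site levels weights every CpG equally; pooling reads across the region before dividing is an alternative that weights sites by coverage.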
Project description:The compact genome of the unicellular eukaryote Paramecium tetraurelia contains noncoding DNA (ncDNA) distributed into >39,000 intergenic sequences and >90,000 introns of 390 base pairs (bp) and 25 bp on average, respectively. Here we analyzed the molecular features of the ncRNA genes, introns, and intergenic sequences of this genome. We mainly used computational programs and comparative genomics, the latter made possible because the P. tetraurelia genome was shaped by whole-genome duplications (WGDs). We characterized 417 putative 5S rRNA, snRNA, snoRNA, SRP RNA, and tRNA genes, 415 of which map within intergenic sequences and two within introns. The evolution of these ncRNA genes appears to have mainly involved purifying selection and gene deletion. We then compared the introns that interrupt the protein-coding gene duplicates that arose from the recent WGD and identified a population of a few thousand introns that have evolved under the most stringent constraints (>95% identity). We also showed that low nucleotide substitution levels characterize the 50 and 80-115 base pairs flanking, respectively, the stop and start codons of the protein-coding genes. Lower substitution levels mark the base pairs flanking the highly transcribed genes, or the start codons of the genes of the sets with a high number of WGD-related sequences. Finally, adjacent to protein-coding genes, we characterized 32 DNA motifs that are able to encode stable and evolutionarily conserved RNA secondary structures and that define putative expression-controlling elements. Fourteen DNA motifs with similar properties map far from protein-coding genes and may encode regulatory ncRNAs.
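The >95% identity criterion for introns in WGD-derived gene duplicates can be illustrated with a minimal sketch. It assumes the two introns have already been aligned (gaps written as "-") and that identity is computed over ungapped columns only; the abstract does not specify how gaps were treated, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_highly_constrained(aligned_a: str, aligned_b: str,
                          threshold: float = 95.0) -> bool:
    """True when the pair exceeds the identity threshold (strictly greater)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

With introns averaging 25 bp, a single mismatch already lowers identity to 96%, and two mismatches bring it to 92%, so at this length the threshold effectively allows at most one substitution.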
Project description:Replication, heredity, and evolution are characteristic of Life. We and others have postulated that the reconstruction of a synthetic living system in the laboratory will be contingent on the development of a genetic self-replicator capable of undergoing Darwinian evolution. Although DNA-based life dominates, the in vitro reconstitution of an evolving DNA self-replicator has remained challenging. Here we emulate, in liposome compartments, the principles according to which life propagates information and evolves. Using two different experimental configurations supporting intermittent or semi-continuous evolution (i.e., with or without DNA extraction, PCR, and re-encapsulation), we demonstrate sustainable replication of a linear DNA template - encoding the DNA polymerase and terminal protein from the Phi29 bacteriophage - expressed in the 'protein synthesis using recombinant elements' (PURE) system. The self-replicator can survive across multiple rounds of replication-coupled transcription-translation reactions in liposomes and, within only ten evolution rounds, accumulates mutations conferring a selection advantage. Data from next-generation sequencing, combined with reverse engineering of some of the enriched mutations, reveal nontrivial and context-dependent effects of the introduced mutations. The present results are foundational for building up genetic complexity in an evolving synthetic cell, as well as for studying evolutionary processes in a minimal cell-free system.
Project description:Viruses are known for their extremely compact genomes composed almost entirely of protein-coding genes. Nonetheless, four long noncoding RNAs (lncRNAs) are encoded by human cytomegalovirus (HCMV). Although these RNAs accumulate to high levels during lytic infection, their functions remain largely unknown. Here, we show that HCMV-encoded lncRNA4.9 localizes to the viral nuclear replication compartment, and that its depletion restricts viral DNA replication and viral growth. RNA4.9 is transcribed from the HCMV origin of replication (oriLyt) and forms an RNA-DNA hybrid (R-loop) through its G+C-rich 5' end, which may be important for the initiation of viral DNA replication. Furthermore, targeting the RNA4.9 promoter with CRISPR-Cas9 or genetic relocalization of oriLyt leads to reduced levels of the viral single-stranded DNA-binding protein (ssDBP), suggesting that the levels of ssDBP are coupled to the oriLyt activity. We further identified a similar, oriLyt-embedded, G+C-rich lncRNA in murine cytomegalovirus (MCMV). These results indicate that HCMV RNA4.9 plays an important role in regulating viral DNA replication, that the levels of ssDBP are coupled to the oriLyt activity, and that these regulatory features may be conserved among betaherpesviruses.
Project description:Genomic imprinting is an epigenetic phenomenon where autosomal genes display uniparental expression depending on whether they are maternally or paternally inherited. Genomic imprinting can arise from parental conflicts over resource allocation to the offspring, which could drive imprinted loci to evolve by positive selection. We investigate whether positive selection is associated with genomic imprinting in the inbreeding species Arabidopsis thaliana. Our analysis of 140 genes regulated by genomic imprinting in the A. thaliana seed endosperm demonstrates they are evolving more rapidly than expected. To investigate whether positive selection drives this evolutionary acceleration, we identified orthologs of each imprinted gene across 34 plant species and elucidated their evolutionary trajectories. We tested for increased positive selection by comparing its incidence among imprinted genes with that among nonimprinted controls. Strikingly, we find a statistically significant enrichment of imprinted paternally expressed genes (iPEGs) evolving under positive selection, 50.6% of the total, but no such enrichment for positive selection among imprinted maternally expressed genes (iMEGs). This suggests that maternally and paternally expressed imprinted genes are subject to different selective pressures. Almost all positively selected amino acids were fixed across 80 sequenced A. thaliana accessions, suggestive of selective sweeps in the A. thaliana lineage. The imprinted genes under positive selection are involved in processes important for seed development, including auxin biosynthesis and epigenetic regulation. Our findings support a genomic imprinting model for plants where positive selection can affect paternally expressed genes due to continued conflict with maternal sporophyte tissues, even when parental conflict is reduced in predominantly inbreeding species.
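Comparing the incidence of positive selection between imprinted genes and nonimprinted controls amounts to testing a 2x2 contingency table for enrichment. The abstract does not name the test that was used; the sketch below shows one common choice, a one-sided Fisher's exact test, implemented from the hypergeometric distribution. The function name is hypothetical and the abstract reports no gene counts, so any counts passed to it are the caller's own.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    Table layout (an assumption of this sketch):
                        selected   not selected
        imprinted          a            b
        control            c            d

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` selected genes in the imprinted group by chance.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1 = a + b          # imprinted genes
    col1 = a + c          # genes under positive selection
    n = a + b + c + d     # all genes
    denom = comb(n, row1)
    upper = min(row1, col1)
    numer = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numer / denom
```

A small p-value indicates that positively selected genes are over-represented among the imprinted group relative to controls; the same table could be built separately for iPEGs and iMEGs.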
Project description:Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of approximately 24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation: the alternative hypothesis is that most of these ORFs are actually valid human genes that reflect gene innovation in the primate lineage or gene loss in the other lineages. Here, we reject this hypothesis by carefully analyzing the nonconserved ORFs, specifically their properties in other primates. We show that the vast majority of these ORFs are random occurrences. The analysis yields, as a by-product, a major revision of the current human catalogs, cutting the number of protein-coding genes to approximately 20,500. Specifically, it suggests that nonconserved ORFs should be added to the human gene catalog only if there is clear evidence of an encoded protein. It also provides a principled methodology for evaluating future proposed additions to the human gene catalog. Finally, the results indicate that there has been relatively little true innovation in mammalian protein-coding genes.
Project description:The three-dimensional molecular structure of DNA, specifically the shape of the backbone and grooves of genomic DNA, can be dramatically affected by nucleotide changes, which can cause differences in protein-binding affinity and phenotype. We developed an algorithm to measure constraint on the basis of similarity of DNA topography among multiple species, using hydroxyl radical cleavage patterns to interrogate the solvent-accessible surface area of DNA. This algorithm found that 12% of bases in the human genome are evolutionarily constrained, double the number detected by nucleotide sequence-based algorithms. Topography-informed constrained regions correlated with functional noncoding elements, including enhancers, better than did regions identified solely on the basis of nucleotide sequence. These results support the idea that the molecular shape of DNA is under selection and can identify evolutionary history.
Project description:We earlier suggested that type A human influenza virus genes undergo positive Darwinian selection through immune surveillance. This requires more favorable amino acid replacements fixed in antigenic sites among the surviving lineages than among the extinct lineages. We now show that viral hemagglutinins fix proportionately more amino acid replacements in antigenic sites in the trunk of the evolutionary tree (survivors) than in the branches (nonsurvivors), demonstrating that type A human influenza virus is undergoing positive Darwinian evolution. The hemagglutinin gene is evolving 3 times faster than the nonstructural gene and the average age of the sampled nonsurvivors is only 1.6 years, so that extinction is not only common but rapid.
Project description:All cells are equipped with intricate signaling networks to meet energy demands and respond to nutrient availability in the body. AMP-activated protein kinase (AMPK) is among the most potent regulators of cellular energy balance. Under ATP-deprived conditions, AMPK phosphorylates substrates and affects various biological processes, such as lipid/glucose metabolism and protein synthesis. These actions further affect cell growth, death, and function, altering cellular outcomes in energy-restricted environments. AMPK plays vital roles in maintaining good health. AMPK dysfunction is observed in various chronic diseases, making it a promising target for preventing and alleviating such diseases. Herein, we highlight the different AMPK functions, especially in allergy, aging, and cancer, to facilitate the development of new therapeutic approaches in the future.
Project description:Dopaminergic (DA) neurons derived from human pluripotent stem cells (hPSCs) represent a renewable and available source of cells useful for understanding development, building disease models, and developing stem-cell therapies for Parkinson's disease (PD). To assess the utility of stem cell cultures as an in vitro model system of human DA neurogenesis, we performed high-throughput transcriptional profiling of ~20,000 ventral midbrain (VM)-patterned stem cells at different stages of maturation using droplet-based single-cell RNA sequencing (scRNAseq). Using this dataset, we defined the cellular composition of human VM cultures at different timepoints and found high-purity DA progenitor formation at an early stage of differentiation. DA neurons with molecular identities similar to those of authentic DA neurons derived from human fetal VM were the major cell type after two months in culture. We also developed a bioinformatic pipeline that provided a comprehensive long noncoding RNA landscape based on temporal and cell-type specificity, which may contribute to unraveling the intricate regulatory network of coding and noncoding genes in DA neuron differentiation. Our findings serve as a valuable resource to elucidate the molecular steps of development, maturation, and function of human DA neurons, and to identify novel candidate coding and noncoding genes driving specification of progenitors into functionally mature DA neurons.