Transcriptomics

Dataset Information

The evolution of lncRNA repertoires and expression patterns in tetrapods


ABSTRACT: Only a minuscule fraction of long non-coding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but comparative analyses have been precluded by our ignorance of lncRNAs in non-model organisms. Here, we use RNA sequencing to identify lncRNAs in eleven tetrapod species and we present the first large-scale evolutionary study of lncRNA repertoires and expression patterns. We identify ~11,000 primate-specific lncRNA families, which show evidence for selective constraint during recent evolution, and ~2,400 highly conserved lncRNAs (including ~400 genes that likely originated more than 300 million years ago). We find that lncRNAs, in particular ancient ones, are generally actively regulated and may predominantly function in embryonic development. lncRNA X-inactivation patterns reveal an extremely female-biased monotreme-specific lncRNA, which may partially compensate X-dosage in this lineage. Most lncRNAs evolve rapidly in terms of sequence and expression levels, but global patterns like tissue specificities are often conserved. We compared expression patterns of homologous lncRNA and protein-coding families across tetrapods to reconstruct an evolutionarily conserved co-expression network. This network, which surprisingly contains many lncRNA hubs, suggests potential functions for lncRNAs in fundamental processes like spermatogenesis or synaptic transmission, but also in more specific mechanisms such as placenta growth suppression through miRNA production. [Batch 1 and 2] To broaden our understanding of lncRNA evolution, we used an extensive RNA-seq dataset to establish lncRNA repertoires and homologous gene families in 11 tetrapod species.
We analyzed the polyadenylated transcriptomes of 8 organs (cortex/whole brain without cerebellum, cerebellum, heart, kidney, liver, placenta, ovary and testis) and 11 species (human, chimpanzee, bonobo, gorilla, orangutan, macaque, mouse, opossum, platypus, chicken and the frog Xenopus tropicalis), which shared a common ancestor ~370 million years (MY) ago. Our dataset included 47 strand-specific samples, which allowed us to confirm the orientation of gene predictions and to address the evolution of sense-antisense transcripts. See also GSE43721 (Soumillon et al., Cell Reports, 2013) for three strand-specific samples for mouse brain, liver and testis.

SUBMITTER: Julie Baker, Henrik Kaessmann, Anamaria Necsulea, Frank Grutzner, Angélica Liechti, Magali Soumillon, Tasman Daish, Ulrich Zeller, Maria Warnefors

PROVIDER: E-GEOD-43520 | ArrayExpress | 2014-01-19

SECONDARY ACCESSION(S): GSE43520, SRP017959, PRJNA186646

REPOSITORIES: GEO, ArrayExpress, ENA
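
The accessions listed for this record follow the usual prefix conventions: GSE for a GEO series, SRP for an SRA study, PRJNA for a BioProject, and E-GEOD for an ArrayExpress import of a GEO series. As a minimal sketch, assuming only those four prefixes matter, the snippet below pulls accessions out of a free-text field and labels each one. The names `ACCESSION_PATTERNS` and `classify_accessions` are hypothetical and are not part of any repository's tooling.

```python
import re

# Prefix patterns for the accession types that appear in this record.
# Assumption: each accession is a fixed prefix followed by digits.
ACCESSION_PATTERNS = {
    "GEO series": r"GSE\d+",
    "SRA study": r"SRP\d+",
    "BioProject": r"PRJNA\d+",
    "ArrayExpress experiment": r"E-GEOD-\d+",
}


def classify_accessions(text):
    """Return (accession, kind) pairs in order of appearance in text."""
    hits = []
    for kind, pattern in ACCESSION_PATTERNS.items():
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.group(), kind))
    # Sort by position so the output follows the order in the input text.
    return [(accession, kind) for _, accession, kind in sorted(hits)]


if __name__ == "__main__":
    for accession, kind in classify_accessions("E-GEOD-43520 GSE43520 SRP017959 PRJNA186646"):
        print(f"{accession}\t{kind}")
```

Because every pattern ends in a run of digits and every prefix starts with a letter, the same function also separates accessions that have been run together without delimiters.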

Publications


Only a very small fraction of long noncoding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but the absence of lncRNA annotations in non-model organisms has precluded comparative analyses. Here we present a large-scale evolutionary study of lncRNA repertoires and expression patterns in 11 tetrapod species. We identify approximately 11,000 primate-specific lncRNAs and 2,500 highly conserved lncRNAs, including approximately ...

Similar Datasets

2013-01-01 | S-EPMC3636048 | BioStudies
1000-01-01 | S-EPMC4132921 | BioStudies
2019-01-01 | S-EPMC6438004 | BioStudies
2009-01-01 | S-EPMC2801517 | BioStudies
2013-06-20 | E-GEOD-43721 | ArrayExpress
2013-06-20 | E-GEOD-43717 | ArrayExpress
2010-01-01 | S-EPMC2877528 | BioStudies
2018-01-01 | S-EPMC5834995 | BioStudies
2016-01-07 | E-GEOD-64818 | ArrayExpress
2012-01-01 | S-EPMC3406015 | BioStudies