Systematic identification and characterization of regulatory elements derived from human endogenous retroviruses.
ABSTRACT: Human endogenous retroviruses (HERVs) and other long terminal repeat (LTR)-type retrotransposons (HERV/LTRs) have regulatory elements that possibly influence the transcription of host genes. We systematically identified and characterized these regulatory elements based on publicly available datasets of ChIP-Seq of 97 transcription factors (TFs) provided by ENCODE and Roadmap Epigenomics projects. We determined transcription factor-binding sites (TFBSs) using the ChIP-Seq datasets and identified TFBSs observed on HERV/LTR sequences (HERV-TFBSs). Overall, 794,972 HERV-TFBSs were identified. Subsequently, we identified "HERV/LTR-shared regulatory elements (HSREs)," each defined as a TF-binding motif in HERV-TFBSs that is shared within a substantial fraction of a HERV/LTR type. HSREs may indicate that the regulatory elements of HERV/LTRs were already present before their insertions. We identified 2,201 HSREs, comprising specific associations of 354 HERV/LTRs and 84 TFs. Clustering analysis showed that HERV/LTRs can be grouped according to their TF binding patterns; HERV/LTR groups bound by pluripotent TFs (e.g., SOX2, POU5F1, and NANOG), embryonic endoderm/mesendoderm TFs (e.g., GATA4/6, SOX17, and FOXA1/2), hematopoietic TFs (e.g., SPI1 (PU1), GATA1/2, and TAL1), and CTCF were identified. Regulatory elements of HERV/LTRs tended to be located near, and/or to interact three-dimensionally with, genes involved in immune responses, indicating that the regulatory elements play an important role in controlling the immune regulatory network. Further, we demonstrated subgroup-specific TF binding within LTR7, LTR5B, and LTR5_Hs, indicating that gains or losses of the regulatory elements occurred during genomic invasions of the HERV/LTRs. Finally, we constructed dbHERV-REs, an interactive database of HERV/LTR regulatory elements (http://herv-tfbs.com/). 
This study provides fundamental information in understanding the impact of HERV/LTRs on host transcription, and offers insights into the transcriptional modulation systems of HERV/LTRs and ancestral HERVs.
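The two steps described in this abstract, finding TFBSs that fall on HERV/LTR sequences and then asking what fraction of a HERV/LTR type carries a site for a given TF, can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the authors' pipeline: it uses plain interval overlap between peaks and LTR copies and does not perform the motif analysis that the HSRE definition requires. The function names, the half-open coordinate convention, and all coordinates are hypothetical toy values.

```python
from collections import defaultdict

def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end

def herv_tfbs(peaks, ltr_copies):
    """Return (tf, ltr_type, copy_id) triples for peaks that overlap an LTR copy."""
    hits = set()
    for chrom, p_start, p_end, tf in peaks:
        for copy_id, (c_chrom, c_start, c_end, ltr_type) in ltr_copies.items():
            if chrom == c_chrom and overlaps(p_start, p_end, c_start, c_end):
                hits.add((tf, ltr_type, copy_id))
    return hits

def shared_fraction(hits, ltr_copies):
    """Fraction of copies of each LTR type that carry at least one site for each TF."""
    copies_per_type = defaultdict(int)
    for _chrom, _start, _end, ltr_type in ltr_copies.values():
        copies_per_type[ltr_type] += 1
    bound = defaultdict(set)
    for tf, ltr_type, copy_id in hits:
        bound[(tf, ltr_type)].add(copy_id)
    return {key: len(ids) / copies_per_type[key[1]] for key, ids in bound.items()}

# Hypothetical toy data: (chrom, start, end, TF) peaks and annotated LTR copies.
peaks = [
    ("chr1", 100, 200, "NANOG"),
    ("chr1", 950, 1050, "NANOG"),
    ("chr2", 10, 60, "CTCF"),   # ends exactly where LTR5_a starts: no overlap
]
ltr_copies = {
    "LTR7_a": ("chr1", 150, 600, "LTR7"),
    "LTR7_b": ("chr1", 1000, 1400, "LTR7"),
    "LTR7_c": ("chr3", 0, 400, "LTR7"),
    "LTR5_a": ("chr2", 60, 500, "LTR5_Hs"),
}

hits = herv_tfbs(peaks, ltr_copies)
fractions = shared_fraction(hits, ltr_copies)
print(fractions)
```

In this toy input, two of the three LTR7 copies overlap a NANOG peak. The threshold for calling such a fraction "substantial" is not stated in the abstract and is left open here. The nested loop is quadratic; a real analysis over hundreds of thousands of sites would use a sorted or indexed interval structure instead.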
Project description:Human endogenous retroviruses (HERVs) are globally silent in somatic cells. However, some HERVs display high transcription in physiological conditions. In particular, ERVWE1, ERVFRDE1 and ERV3, three proviruses of distinct families, are highly transcribed in placenta and produce envelope proteins associated with placenta development. As silencing of repeated elements is thought to occur mainly by DNA methylation, we compared the methylation of ERVWE1 and related HERVs to appreciate whether HERV methylation relies upon the family, the integration site, the tissue, the long terminal repeat (LTR) function or the associated gene function. CpG methylation of HERV-W LTRs in placenta-associated tissues was heterogeneous but a joint epigenetic control was found for ERVWE1 5'LTR and its juxtaposed enhancer, a mammalian apparent LTR retrotransposon. Additionally, ERVWE1, ERVFRDE1 and ERV3 5'LTRs were all essentially hypomethylated in cytotrophoblasts during pregnancy, but showed distinct and stage-dependent methylation profiles. In non-cytotrophoblastic cells, they also exhibited different methylation profiles, compatible with their respective transcriptional activities. Comparative analyses of transcriptional activity and LTR methylation in cell lines further sustained a role for methylation in the control of functional LTRs. These results suggest that HERV methylation might not be family related but copy-specific, and related to the LTR function and the tissue. In particular, ERVWE1 and ERV3 could be developmentally epigenetically regulated HERVs.
Project description:BACKGROUND:Human Endogenous Retroviruses (HERVs) and Mammalian apparent LTR-retrotransposons (MaLRs) represent about 8% of our genome and are distributed among our 46 chromosomes. These LTR-retrotransposons are thought to be essentially silent except in cancer, autoimmunity and placental development. Their Long Terminal Repeats (LTRs) constitute putative promoter or polyA regulatory sequences. In this study, we used a recently described high-density microarray which can be used to study HERV/MaLR transcriptome including 353,994 HERV/MaLR loci and 1559 immunity-related genes. RESULTS:We described, for the first time, the HERV transcriptome in peripheral blood mononuclear cells (PBMCs) using a cellular model mimicking inflammatory response and monocyte anergy observed after septic shock. About 5.6% of the HERV/MaLR repertoire is transcribed in PBMCs. Roughly one-tenth [5.7-13.1%] of LTRs exhibit a putative constitutive promoter or polyA function while one-quarter [19.5-27.6%] may shift from silent to active. Evidence was given that some HERVs/MaLRs and genes may share similar regulation control under lipopolysaccharide (LPS) stimulation conditions. Stimulus-dependent response confirms that HERV expression is tightly regulated in PBMCs. Altogether, these observations make it possible to integrate 62 HERVs/MaLRs and 26 genes in 11 canonical pathways and suggest a link between HERV expression and immune response. The transcriptional modulation of HERVs located close to genes such as OAS2/3 and IFI44/IFI44L or at a great distance from genes was discussed. CONCLUSION:This microarray-based approach revealed the expression of about 47,466 distinct HERV loci and identified 951 putative promoter LTRs and 744 putative polyA LTRs in PBMCs. HERV/MaLR expression was shown to be tightly modulated under several stimuli including high-dose and low-dose LPS and Interferon-γ (IFN-γ). 
HERV incorporation at the crossroads of immune response pathways paves the way for further functional studies and analyses of the HERV transcriptome in altered immune responses in vivo such as in sepsis.
Project description:Social behavior and neuronal connectivity in rodents have been shown to be shaped by the prototypical T lymphocyte-derived pro-inflammatory cytokine Interferon-gamma (IFNγ). It has also been demonstrated that STAT1 (Signal Transducer And Activator Of Transcription 1), a transcription factor (TF) crucially involved in the IFNγ pathway, binds consensus sequences that, in humans, are located with a high frequency in the LTRs (Long Terminal Repeats) of the MER41 family of primate-specific HERVs (Human Endogenous Retroviruses). However, the putative role of an IFNγ/STAT1/MER41 pathway in human cognition and/or behavior is still poorly documented. Here, we present evidence that the promoter regions of intellectual disability-associated genes are uniquely enriched in LTR sequences of the MER41 HERVs. This observation is specific to MER41 among more than 130 HERVs examined. Moreover, we have not found such a significant enrichment in the promoter regions of genes that associate with autism spectrum disorder (ASD) or schizophrenia. Interestingly, ID-associated genes exhibit promoter-localized MER41 LTRs that harbor TF binding sites (TFBSs) for not only STAT1 but also other immune TFs such as, in particular, NFKB1 (Nuclear Factor Kappa B Subunit 1) and STAT3 (Signal Transducer And Activator Of Transcription 3). Moreover, IL-6 (Interleukin 6) rather than IFNγ, is identified as the main candidate cytokine regulating such an immune/MER41/cognition pathway. Of note, differences between humans and chimpanzees are observed regarding the insertion sites of MER41 LTRs in the promoter regions of ID-associated genes. Finally, a survey of the human proteome has allowed us to map a protein-protein network which links the identified immune/MER41/cognition pathway to FOXP2 (Forkhead Box P2), a key TF involved in the emergence of human speech. 
Our work suggests that together with the evolution of immune genes, the stepped self-domestication of MER41 in the genomes of primates could have contributed to cognitive evolution. We further propose that non-inherited forms of ID might result from the untimely or quantitatively inappropriate expression of immune signals, notably IL-6, that putatively regulate cognition-associated genes via promoter-localized MER41 LTRs.
Project description:Many phenotypic differences exist between Homo sapiens and its closest relatives, chimpanzees, and these differences can arise as a result of variations in the regulation of certain genes common to these closely related species. Human-specific endogenous retroviruses (HERVs) and their solitary long terminal repeats (LTRs) are probable candidates for such a role due to the presence of regulatory elements, such as enhancers, promoters, splice sites, and polyadenylation signals. In this study we show for the first time that HERVs can participate in the specific antisense regulation of human gene expression owing to their LTR promoter activity. We found that two HERV LTRs situated in the introns of genes SLC4A8 (for sodium bicarbonate cotransporter) and IFT172 (for intraflagellar transport protein 172) in the antisense orientation serve in vivo as promoters for generating RNAs complementary to the exons of enclosing genes. The antisense transcripts formed from LTR promoter were shown to decrease the mRNA level of the corresponding genes. The human-specific regulation of these genes suggests their involvement in the evolutionary process.
Project description:Several distinct families of endogenous retrovirus-like sequences (HERVs) exist in the genomes of humans and other primates. One of these families, the HERV-K group, contains members that encode functional proteins and that have been implicated in the etiology of insulin-dependent diabetes mellitus (IDDM). Because of potential functional and disease relevance, it is important to determine if there are HERV-K-associated genetic differences between individuals. In this study, we have investigated the divergence and evolutionary age of HERV-K long terminal repeats (LTRs). Thirty-seven LTRs, taken primarily from random human clones in GenBank, were aligned and grouped into nine clusters with decreasing sequence divergence. Cluster 1 sequences are 8.6% divergent, on average, whereas cluster 9 LTRs, represented by the LTRs of the fully sequenced HERV-K10 clone, show an average of only 1.1% divergence from each other. The evolutionary age of 18 LTRs from different clusters was then investigated by genomic PCR to determine presence or absence of the retroviral element in different primate species. LTRs from clusters of higher divergence were detected in monkeys and apes, whereas LTRs in clusters with lower divergence were acquired later in evolution. Notably, LTRs of cluster 9 were found only in humans at all nine loci examined. Genomic Southern analysis with an oligonucleotide probe specific for cluster 9 LTRs suggests that HERV-K elements with this type of LTR expanded independently in the genomes of humans and the great apes. This is the first report of endogenous retroviral integrations that are specific to humans and indicates that some HERVs have amplified much later than previously thought. These elements may still be actively transposing and may therefore represent a source of genetic variation linked to disease development.
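The per-cluster divergence figures quoted above (for example, 8.6% for cluster 1 and 1.1% for cluster 9) are averages over pairs of aligned LTRs. As a rough illustration only, the Python sketch below computes a mean pairwise divergence as the uncorrected fraction of mismatching positions, skipping alignment columns with a gap. Whether the study used this measure or a corrected distance is not stated in the abstract, so the measure, the function names, and the toy sequences are all assumptions.

```python
from itertools import combinations

def pairwise_divergence(a, b):
    """Fraction of mismatches over columns where neither aligned sequence has a gap."""
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0

def mean_divergence(aligned_seqs):
    """Average pairwise divergence over all pairs of aligned sequences."""
    pairs = list(combinations(aligned_seqs, 2))
    return sum(pairwise_divergence(a, b) for a, b in pairs) / len(pairs)

# Hypothetical toy alignment of three equal-length sequences.
cluster = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
mean_div = mean_divergence(cluster)
print(f"{100 * mean_div:.1f}% average divergence")
```

Here the three pairs differ at 1, 1, and 2 of 10 positions, so the mean is (0.1 + 0.1 + 0.2) / 3.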
Project description:Up-regulation of human endogenous retroviruses (HERVs) is associated with many diseases, including cancer. In this study, an H family HERV (HERV-H)-related gene was identified and characterized. Its spliced transcript lacks protein-coding capacity and may belong to the emerging class of noncoding RNAs (ncRNAs). The 1.3-kb RNA consisting of four exons is transcribed from an Alu element upstream of a 5.0-kb structurally incomplete HERV-H element. RT-PCR and quantitative RT-PCR results indicated that expression of this HERV-related transcript was negatively associated with colon, stomach, and kidney cancers. Its expression was induced upon treatment with DNA methylation and histone deacetylation inhibitors. A BLAT search using long terminal repeats (LTRs) identified 50 other LTR homogenous HERV-H elements. Further analysis of these elements revealed that all are structurally incomplete and only five exert transcriptional activity. The results presented here recommend further investigation into a potentially functional HERV-H-related ncRNA.
Project description:Human endogenous retroviruses (HERVs) are a potential source of genetic diversity in the human genome. Although many of these elements have been inactivated over time by the accumulation of deleterious mutations or internal recombination leading to solo-LTR formation, several members of the HERV-K family have been identified that remain nearly intact and probably represent recent integration events. To determine whether HERV-K elements have caused recent changes in the human genome, we have undertaken a study of the level of HERV-K polymorphism that exists in the human population. By using a high-resolution unblotting technique, we analyzed 13 human-specific HERV-K elements in 18 individuals. We found that solo LTRs have formed at five of these loci. These results enable the estimation of HERV solo-LTR formation in the human genome and indicate that these events occur much more frequently than described in inbred mice. Detailed sequence analysis of one provirus shows that solo-LTR formation occurred at least three separate times in recent history. An unoccupied preintegration site also was present at this locus in two individuals, indicating that although the age of this provirus is estimated to be approximately 1.2 million years, it has not yet become fixed in the human population.
Project description:Endogenous retroviruses (ERVs) are an inherited part of the eukaryotic genomes, and represent approximately 400,000 loci in the human genome. Human endogenous retroviruses (HERVs) can be divided into distinct families, composed of phylogenetically related but structurally heterogeneous elements. The majority of HERVs are silent in most physiological contexts, whereas a significant expression is observed in pathological contexts, such as cancers. Owing to their repetitive nature, few of the active HERV elements have been accurately identified. In addition, there are no criteria defining the active promoters among HERV long-terminal repeats (LTRs). Hence, it is difficult to understand the HERV (de)regulation mechanisms and their implication on the physiopathology of the host. We developed a microarray to specifically detect the LTR-containing transcripts from the HERV-H, HERV-E, HERV-W and HERV-K(HML-2) families. HERV transcriptome was analyzed in the placenta and seven normal/tumoral match-pair samples. We identified six HERV-W loci overexpressed in testicular cancer, including a usually placenta-restricted transcript of ERVWE1. For each locus, specific overexpression was confirmed by quantitative RT-PCR, and comparison of the activity of U3 versus U5 regions suggested a U3-promoted transcription coupled with 5'R initiation. The analysis of DNA from tumoral versus normal tissue revealed that hypomethylation of U3 promoters in tumors is a prerequisite for their activation.
Project description:A significant proportion of the human genome consists of stably inherited retroviral sequences. Most human endogenous retroviruses (HERVs) became defective over time. The HERV-K(HML-2) family is exceptional because of its coding capacity and the possible involvement in germ cell tumor (GCT) development. HERV-K(HML-2) transcription is strongly upregulated in GCTs. However, regulation of HERV-K(HML-2) transcription remains poorly understood. We investigated in detail the role of CpG methylation on the transcriptional activity of HERV-K(HML-2) long terminal repeats (LTRs). We find that CpG sites in various HERV-K(HML-2) proviral 5' LTRs are methylated at different levels in the cell line Tera-1. Methylation levels correlate with previously observed transcriptional activities of these proviruses. CpG-mediated silencing of HERV-K(HML-2) LTRs is further corroborated by transcriptional inactivity of in vitro-methylated 5' LTR reporter plasmids. However, CpG methylation levels do not solely regulate HERV-K(HML-2) 5' LTR activity, as evidenced by different LTR activities in the cell line T47D. A significant number of mutated CpG sites in evolutionarily old HERV-K(HML-2) 5' LTRs suggests that CpG methylation had already silenced HERV-K(HML-2) proviruses millions of years ago. Direct silencing of HERV-K(HML-2) expression by CpG methylation helps explain the upregulated HERV-K(HML-2) expression in usually hypomethylated GCT tissue.
Project description:Human endogenous retroviruses (HERVs) are spread throughout the genome and their long terminal repeats (LTRs) constitute a wide collection of putative regulatory sequences. Phylogenetic similarities and the profusion of integration sites, two inherent characteristics of transposable elements, make it difficult to study individual locus expression in a large-scale approach, and historically apart from some placental and testis-regulated elements, it was generally accepted that HERVs are silent due to epigenetic control. Herein, we have introduced a generic method aiming to optimally characterize individual loci associated with 25-mer probes by minimizing cross-hybridization risks. We therefore set up a microarray dedicated to a collection of 5,573 HERVs that can reasonably be assigned to a unique genomic position. We obtained a first view of the HERV transcriptome by using a composite panel of 40 normal and 39 tumor samples. The experiment showed that almost one third of the HERV repertoire is indeed transcribed. The HERV transcriptome follows tropism rules, is sensitive to the state of differentiation and, unexpectedly, seems not to correlate with the age of the HERV families. The probeset definition within the U3 and U5 regions was used to assign a function to some LTRs (i.e. promoter or polyA) and revealed that (i) autonomous active LTRs are broadly subjected to operational determinism (ii) the cellular gene density is substantially higher in the surrounding environment of active LTRs compared to silent LTRs and (iii) the configuration of neighboring cellular genes differs between active and silent LTRs, showing an approximately 8 kb zone upstream of promoter LTRs characterized by a drastic reduction in sense cellular genes. These gathered observations are discussed in terms of virus/host adaptive strategies, and together with the methods and tools developed for this purpose, this work paves the way for further HERV transcriptome projects.