Project description: Transversions (Tv's) are more likely to alter the amino acid sequence of proteins than transitions (Ts's), and local deviations in the Ts:Tv ratio are indicative of evolutionary selection on genes. Whether the two types of mutations have different effects in non-protein-coding sequences remains unknown. Here, we provide multiple lines of evidence demonstrating that Tv's have larger impacts on regulatory DNA, including analyses of transcription factor (TF) binding motifs, allele-specific TF binding, and genome-wide Tv density within accessible chromatin. Using massively parallel population-scale reporter assays, we also provide empirical evidence that Tv's have larger effects than Ts's on the activity of human gene regulatory elements. Understanding the features of functional non-coding variation will be valuable for revealing the genetic underpinnings of complex traits and diseases.
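For readers unfamiliar with the Ts:Tv ratio mentioned above, the following is a minimal illustrative sketch, not the authors' pipeline. It uses the standard definitions: a transition swaps a purine for a purine (A<->G) or a pyrimidine for a pyrimidine (C<->T), and a transversion swaps a purine for a pyrimidine or vice versa. The function names `classify_substitution` and `ts_tv_ratio` and the input format (a list of reference/alternate base pairs) are assumptions made for illustration only.

```python
# Hypothetical helper names; standard purine/pyrimidine definitions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return "Ts" for a transition or "Tv" for a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "Ts" if same_class else "Tv"


def ts_tv_ratio(substitutions) -> float:
    """Compute the Ts:Tv ratio over an iterable of (ref, alt) pairs."""
    counts = {"Ts": 0, "Tv": 0}
    for ref, alt in substitutions:
        counts[classify_substitution(ref, alt)] += 1
    if counts["Tv"] == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return counts["Ts"] / counts["Tv"]
```

A local Ts:Tv ratio would be obtained by applying `ts_tv_ratio` to the substitutions falling within a genomic window and comparing it to the genome-wide value. How windows are defined is left open here.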
Project description: Bidirectional transcription initiates at both coding and non-coding genomic elements, including mRNA and long non-coding RNA (lncRNA) promoters and enhancer RNAs (eRNAs). However, each class has different tissue-specific expression profiles, with lncRNAs and eRNAs being the most tissue-specific. How these complex differences in expression profiles and tissue-specificities are encoded in a single DNA sequence remains an open question. Here, we address this question using multiple computational and experimental approaches, including massively parallel reporter assays (MPRA). As most transcription factors (TFs) are enriched near the transcription start sites (TSSs) of both promoters and enhancers, we focus our analyses on these core promoter regions. We find that divergent lncRNA and mRNA core promoters have higher capacities to drive transcription than non-divergent lncRNA and mRNA core promoters, respectively. Conversely, lincRNAs and eRNAs are more tissue-specific than divergent genes. This higher tissue-specificity is strongly associated with having less complex TF motif profiles at the core promoter. We confirm these findings using single-nucleotide deletions in MPRA, and we identify specific TFs regulating a set of disease-related lncRNAs. Finally, we assess the effects of genetic variation at core promoters and find that 22% of common single nucleotide polymorphisms show significant regulatory effects. Collectively, our findings characterize the important role of core promoter sequences in determining expression levels across both coding and non-coding gene classes and highlight an unexpected role of TF motif architecture in explaining the more restricted expression patterns of lncRNAs and eRNAs.
2019-01-08 | GSE117594 | GEO
Project description: CoolMPS: evaluation of antibody-labeling-based massively parallel non-coding RNA sequencing