Prioritizing long range interactions in noncoding regions using GWAS and deletions perturbed TADs.
ABSTRACT: Genome-wide association studies (GWAS) have contributed significantly to understanding disease etiology by associating single nucleotide polymorphisms (SNPs) with complex diseases. However, most GWAS-SNPs lie in noncoding regions, where they may affect distal genes via long range enhancer-promoter interactions. Thus, common practice in interpreting GWAS discoveries cannot fully reveal the molecular mechanisms underpinning complex diseases. Perturbations of topologically associating domains (TADs) are known to lead to long range interactions that underlie disease etiology. To identify the probable long range interactions in noncoding regions via GWAS and TADs perturbed by deletions, we integrated datasets of GWAS-SNPs, enhancers, TADs, and deletions. After ranking and clustering, we prioritized 201,132 high-confidence pairs of GWAS-SNPs and target genes. In this study, we performed a systematic inference on noncoding regions via GWAS-SNPs and deletion-perturbed TADs to boost GWAS discovery power. The high-confidence pairs of GWAS-SNPs and target genes (SE-Gs) provide promising candidates for understanding the molecular mechanisms underlying complex diseases, with emphasis on the three-dimensional genome.
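The integration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not give the ranking or clustering procedure, so the code below only shows one plausible way to nominate candidate SNP-gene pairs. It assumes that a deletion "perturbs" two neighbouring TADs when it spans the boundary between them, and that an enhancer SNP in one TAD may then reach genes in the other. The `Interval` type, the `candidate_pairs` function, and all coordinates are hypothetical.

```python
from typing import List, NamedTuple, Set, Tuple


class Interval(NamedTuple):
    """A named half-open genomic interval [start, end) on one chromosome."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def candidate_pairs(
    snps: List[Tuple[str, str, int]],  # (rsid, chrom, position)
    enhancers: List[Interval],
    tads: List[Interval],
    deletions: List[Interval],
    genes: List[Interval],
) -> Set[Tuple[str, str]]:
    """Return (rsid, gene) pairs linked across a deleted TAD boundary.

    Assumption (not from the source): a deletion perturbs two adjacent
    TADs if it covers the whole boundary region between them.
    """
    pairs: Set[Tuple[str, str]] = set()
    # Sort so that neighbouring TADs on a chromosome are consecutive.
    ordered = sorted(tads, key=lambda t: (t.chrom, t.start))
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            continue
        boundary_deleted = any(
            d.chrom == left.chrom and d.start <= left.end and d.end >= right.start
            for d in deletions
        )
        if not boundary_deleted:
            continue
        for rsid, chrom, pos in snps:
            # Keep only SNPs that fall inside an annotated enhancer.
            in_enhancer = any(
                e.chrom == chrom and e.start <= pos < e.end for e in enhancers
            )
            if not in_enhancer:
                continue
            # Pair the SNP with genes in the TAD on the other side.
            for snp_tad, gene_tad in ((left, right), (right, left)):
                if snp_tad.chrom == chrom and snp_tad.start <= pos < snp_tad.end:
                    for gene in genes:
                        if overlaps(gene, gene_tad):
                            pairs.add((rsid, gene.name))
    return pairs
```

In a real analysis the candidate pairs would still need the ranking and clustering mentioned in the abstract, and an interval index would replace the nested loops for genome-scale inputs.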
Project description:Genome-wide association studies (GWASs) for many complex diseases, including inflammatory bowel disease (IBD), have produced hundreds of disease-associated loci, the majority of which are noncoding. The number of GWAS loci is increasing very rapidly, but the process of translating single nucleotide polymorphisms (SNPs) from these loci to genomic medicine is lagging. In this study, we investigated 4,734 variants from 152 IBD-associated GWAS loci (152 IBD-associated lead noncoding SNPs identified from pooled GWAS results + 4,582 variants in strong linkage disequilibrium (LD) (r² ≥ 0.8) for the EUR population of the 1K Genomes Project) using four publicly available bioinformatics tools, namely dbPSHP, CADD, GWAVA, and RegulomeDB, to annotate and prioritize putative regulatory variants. Of the 152 lead noncoding SNPs, around 11% are under strong negative selection (GERP++ RS ≥ 2), and ~30% are under balancing selection (Tajima's D score > 2) in the CEU population (1K Genomes Project), though these regions are positively selected (GERP++ RS < 0) in mammalian evolution. The analysis of 4,734 variants using three integrative annotation tools produced 929 putative functional SNPs, of which 18 SNPs (from 15 GWAS loci) are in concordance with all three classifiers. These prioritized noncoding SNPs may contribute to IBD pathogenesis by dysregulating the expression of nearby genes. This study showed the usefulness of integrative annotation for prioritizing fewer functional variants from a large number of GWAS markers.
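The concordance step above can be read as a set operation over per-tool calls. The sketch below is one such reading, not the study's code: it assumes each classifier yields a set of SNP identifiers it flags as putatively functional (the score thresholds are not given in the abstract), then reports the SNPs flagged by any tool and those flagged by all tools. The function name `summarize_concordance` and the toy rsIDs are hypothetical.

```python
from typing import Dict, Set, Tuple


def summarize_concordance(calls: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (flagged by at least one tool, flagged by every tool).

    `calls` maps a tool name to the set of SNP IDs that tool flags;
    how each tool's scores are thresholded is left to the caller.
    """
    if not calls:
        return set(), set()
    flagged_by_any = set().union(*calls.values())
    flagged_by_all = set.intersection(*calls.values())
    return flagged_by_any, flagged_by_all


# Toy input with made-up identifiers, for illustration only.
example_calls = {
    "CADD": {"rsA", "rsB", "rsC"},
    "GWAVA": {"rsB", "rsC", "rsD"},
    "RegulomeDB": {"rsC", "rsE"},
}
any_tool, all_tools = summarize_concordance(example_calls)
```

Under this reading, the 929 putative functional SNPs would correspond to the first set and the 18 concordant SNPs to the second, although the abstract does not state this explicitly.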
Project description:Single nucleotide polymorphisms (SNPs) occurring in noncoding sequences have largely been ignored in genome-wide association studies (GWAS). Yet mounting evidence suggests that many noncoding SNPs, especially those in the vicinity of protein-coding genes, play important roles in shaping chromatin structure and regulating gene expression and, as such, are implicated in a wide variety of diseases. One such regulatory SNP (rSNP) is the E-cadherin (CDH1) promoter -160C/A SNP (rs16260), which is known to affect E-cadherin promoter transcription by displacing transcription factor binding and has been extensively scrutinized for its association with several diseases, especially malignancies. Findings from studying this SNP highlight the clinical relevance of rSNPs and justify their inclusion in future GWAS to identify novel disease-causing SNPs.
Project description:Many disease-related single nucleotide polymorphisms (SNPs) have been inferred from genome-wide association studies (GWAS) in recent years. Numerous studies have shown that some SNPs located in protein-coding regions are associated with numerous diseases by affecting gene expression. However, in noncoding regions, the mechanism of how SNPs contribute to disease susceptibility remains unclear. Enhancer elements are functional segments of DNA located in noncoding regions that play an important role in regulating gene expression. The SNPs located in enhancer elements may affect gene expression and lead to disease. We presented a method for identifying liver cancer-related enhancer SNPs through integrating GWAS and histone modification ChIP-seq data. We identified 22 liver cancer-related enhancer SNPs, 9 of which were regulatory SNPs involved in distal transcriptional regulation. The results highlight that these enhancer SNPs may play important roles in liver cancer.
Project description:There is increasing evidence that a history of childhood abuse and neglect is not uncommon among individuals who experience mental disorder and that childhood trauma experiences are associated with adult psychopathology. Although several interview and self-report instruments for retrospective trauma assessment have been developed, many focus on sexual abuse (SexAb) rather than on multiple types of trauma or adversity. Within the European Prediction of Psychosis Study, the Trauma and Distress Scale (TADS) was developed as a new self-report assessment of multiple types of childhood trauma and distressing experiences. The TADS includes 43 items and, following previous measures including the Childhood Trauma Questionnaire, focuses on five core domains: emotional neglect (EmoNeg), emotional abuse (EmoAb), physical neglect (PhyNeg), physical abuse (PhyAb), and SexAb. This study explores the psychometric properties of the TADS (internal consistency and concurrent validity) in 692 participants drawn from the general population who completed a mailed questionnaire, including the TADS, a depression self-report and questions on help-seeking for mental health problems. Inter-method reliability was examined in a random sample of 100 responders who were reassessed in telephone interviews. After minor revisions of PhyNeg and PhyAb, internal consistencies were good for TADS totals and the domain raw score sums. Intra-class coefficients for the TADS total score and the five revised core domains were all good to excellent when compared to the interviewed TADS as a gold standard. In the concurrent validity analyses, the total TADS and all its core domains were significantly associated with depression and help-seeking for mental problems as proxy measures for traumatisation. In addition, robust cutoffs for the total TADS and its domains were calculated. Our results suggest the TADS is a valid, reliable, and clinically useful instrument for assessing retrospectively reported childhood traumatisation.
Project description:Deciphering the rules of genome folding in the cell nucleus is essential to understand its functions. Recent chromosome conformation capture (Hi-C) studies have revealed that the genome is partitioned into topologically associating domains (TADs), which demarcate functional epigenetic domains defined by combinations of specific chromatin marks. However, whether TADs are true physical units in each cell nucleus or whether they reflect statistical frequencies of measured interactions within cell populations is unclear. Using a combination of Hi-C, three-dimensional (3D) fluorescent in situ hybridization, super-resolution microscopy, and polymer modeling, we provide an integrative view of chromatin folding in Drosophila. We observed that repressed TADs form a succession of discrete nanocompartments, interspersed by less condensed active regions. Single-cell analysis revealed a consistent TAD-based physical compartmentalization of the chromatin fiber, with some degree of heterogeneity in intra-TAD conformations and in cis and trans inter-TAD contact events. These results indicate that TADs are fundamental 3D genome units that engage in dynamic higher-order inter-TAD connections. This domain-based architecture is likely to play a major role in regulatory transactions during DNA-dependent processes.
Project description:The spatial organization of chromatin in the nucleus has been implicated in regulating gene expression. Maps of high-frequency interactions between different segments of chromatin have revealed topologically associating domains (TADs), within which most of the regulatory interactions are thought to occur. TADs are not homogeneous structural units but appear to be organized into a hierarchy. We present OnTAD, an optimized nested TAD caller from Hi-C data, to identify hierarchical TADs. OnTAD reveals new biological insights into the role of different TAD levels, boundary usage in gene regulation, the loop extrusion model, and compartmental domains. OnTAD is available at https://github.com/anlin00007/OnTAD.
Project description:Genome-wide association studies (GWAS) routinely identify risk variants in noncoding DNA, as exemplified by reports of multiple single nucleotide polymorphisms (SNPs) associated with prostate cancer in five independent regions in a gene desert on 8q24. Two of these regions also have been associated with breast and colorectal cancer. These findings implicate functional variation within long-range cis-regulatory elements in disease etiology. We used an in vivo bacterial artificial chromosome (BAC) enhancer-trapping strategy in mice to scan a half-megabase of the 8q24 gene desert encompassing the prostate cancer-associated regions for long-range cis-regulatory elements. These BAC assays identified both prostate and mammary gland enhancer activities within the region. We demonstrate that the 8q24 cancer-associated variant rs6983267 lies within an in vivo prostate enhancer whose expression mimics that of the nearby MYC proto-oncogene. Additionally, we show that the cancer risk allele increases prostate enhancer activity in vivo relative to the non-risk allele. This allele-specific enhancer activity is detectable during early prostate development and throughout prostate maturation, raising the possibility that this SNP could assert its influence on prostate cancer risk before tumorigenesis occurs. Our study represents an efficient strategy to build experimentally on GWAS findings with an in vivo method for rapidly scanning large regions of noncoding DNA for functional cis-regulatory sequences harboring variation implicated in complex diseases.
Project description:One of the formative goals of genetics research is to understand how genetic variation leads to phenotypic differences and human disease. Genome-wide association studies (GWASs) bring us closer to this goal by linking variation with disease faster than ever before. Despite this, GWASs alone are unable to pinpoint disease-causing single nucleotide polymorphisms (SNPs). Noncoding SNPs, which represent the majority of GWAS SNPs, present a particular challenge. To address this challenge, an array of computational tools designed to prioritize and predict the function of noncoding GWAS SNPs have been developed. However, fewer than 40% of GWAS publications from 2015 utilized these tools. We discuss several leading methods for annotating noncoding variants and how they can be integrated into research pipelines in hopes that they will be broadly applied in future GWAS analyses.
Project description:Cancer cell growth is a complicated process that is regulated and controlled by multiple factors, including cell cycle, migration and apoptosis. In the present study, we report that TADs, a novel derivative of taspine, has an essential role in inhibiting hepatocellular carcinoma growth (including cell cycle arrest) and migration, and in inducing cell apoptosis. Our findings demonstrated that TADs effectively inhibited hepatoma cell growth and migration and induced apoptosis. Using genome-wide microarray analysis, we identified down-regulated growth and apoptosis factors, and selected down-regulated genes were confirmed by Western blot. Knockdown of the checkpoint c-Myc by siRNA significantly attenuated the tumor inhibition and apoptosis effects of TADs. Moreover, our results indicated that TADs could simultaneously increase cyclin D1 protein levels and decrease the amounts of the cell cycle proteins cyclin E, cyclin B1 and cdc2; TADs also reduced Bcl-2 expression and upregulated Bad, Bak and Bax activities. In conclusion, these results illustrate that TADs is a key modulator of growth and apoptosis signaling and has potential in cancer therapy.
Project description:Most variants associated with diseases are located in non-coding regions of the genome and are thought to be regulatory in nature. Cis-regulatory SNPs (cis-rSNPs) impacting transcription can be identified through the mapping of differences in allelic expression (AE). We mapped common cis-rSNPs of protein-coding and non-coding genes in 3 distinct cell-types. We show 70% sharing across tissues and similar genetically controlled transcription for protein-coding genes and lincRNAs. Candidate cis-rSNPs altering the expression of 42 non-coding RNAs overlap SNPs underlying GWAS associations for 39 diseases. We uncover a new class of cis-rSNPs leading to disruption of footprint-derived de novo motifs, predominantly bound by repressive factors and implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide proof-of-principle for a new approach for genome-wide functional validation of transcription factor-SNP interactions. We perturbed NFκB action in lymphoblasts and identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we performed a comprehensive analysis of cis-variation in four cell-populations, and provide new tools for the identification of functional variants associated with complex diseases. Mapping cis-rSNPs across 3 distinct cell types in humans