
Dataset Information


Ranking of non-coding pathogenic variants and putative essential regions of the human genome.


ABSTRACT: A gene is considered essential if loss of function results in loss of viability, fitness or in disease. This concept is well established for coding genes; however, non-coding regions are thought less likely to be determinants of critical functions. Here we train a machine learning model using functional, mutational and structural features, including new genome essentiality metrics, 3D genome organization and enhancer reporter data to identify deleterious variants in non-coding regions. We assess the model for functional correlates by using data from tiling-deletion-based and CRISPR interference screens of activity of cis-regulatory elements in over 3 Mb of genome sequence. Finally, we explore two use cases that involve indels and the disruption of enhancers associated with a developmental disease. We rank variants in the non-coding genome according to their predicted deleteriousness. The model prioritizes non-coding regions associated with regulation of important genes and with cell viability, an in vitro surrogate of essentiality.

SUBMITTER: Wells A 

PROVIDER: S-EPMC6868241 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature


Publications

Ranking of non-coding pathogenic variants and putative essential regions of the human genome.

Alex Wells, David Heckerman, Ali Torkamani, Li Yin, Jonathan Sebat, Bing Ren, Amalio Telenti, Julia di Iulio

Nature Communications. 2019 Nov 20; 1



Similar Datasets

| S-EPMC5550444 | biostudies-literature
| S-EPMC8754628 | biostudies-literature
| S-EPMC9295495 | biostudies-literature
| S-EPMC9094564 | biostudies-literature
| S-EPMC7027195 | biostudies-literature
| S-EPMC4572001 | biostudies-literature
| S-EPMC6817816 | biostudies-literature
| S-EPMC3349421 | biostudies-other
| S-EPMC5929547 | biostudies-literature
| S-EPMC9896476 | biostudies-literature