Analysis of the Drosophila and Human DPR Elements Reveals a Distinct Human Variant Whose Specificity Can Be Enhanced by Machine Learning

ABSTRACT: Analysis of the Drosophila and Human DPR Elements Reveals a Distinct Human Variant Whose Specificity Can Be Enhanced by Machine Learning

PROVIDER: PRJNA936415 | ENA

REPOSITORIES: ENA

Dataset's files

SRR23516359.fastq.gz	Fastqsanger.gz
SRR23516360.fastq.gz	Fastqsanger.gz

Similar Datasets

Analysis of the Drosophila and Human DPR Elements Reveals a Distinct Human Variant Whose Specificity Can Be Enhanced by Machine Learning

Project description: The RNA polymerase II core promoter is the site of convergence of the signals that lead to the initiation of transcription. Here, we perform a comparative analysis of the downstream core promoter region (DPR) in Drosophila and humans by using machine learning. These studies revealed a distinct human-specific version of the DPR and led to the use of the machine learning models for the identification of synthetic extreme DPR motifs with specificity for human transcription factors relative to Drosophila factors, and vice versa. More generally, machine learning models could be analogously used to design synthetic promoter elements with customized functional properties.

2023-04-06 | GSE225570 | GEO

Identification of the human DPR core promoter element using machine learning

Project description: The RNA polymerase II (Pol II) core promoter is the strategic site of convergence of the signals that lead to the initiation of DNA transcription, but the downstream core promoter in humans has been difficult to understand. Here we analyse the human Pol II core promoter and use machine learning to generate predictive models for the downstream core promoter region (DPR) and the TATA box. We developed a method termed HARPE (high-throughput analysis of randomized promoter elements) to create hundreds of thousands of DPR (or TATA box) variants, each with known transcriptional strength. We then analysed the HARPE data by support vector regression (SVR) to provide comprehensive models for the sequence motifs, and found that the SVR-based approach is more effective than a consensus-based method for predicting transcriptional activity. These results show that the DPR is a functionally important core promoter element that is widely used in human promoters. Notably, there appears to be a duality between the DPR and the TATA box, as many promoters contain one or the other element. More broadly, these findings show that functional DNA motifs can be identified by machine learning analysis of a comprehensive set of sequence variants.

2020-06-04 | GSE139635 | GEO

Identification of the Human DPR Promoter Element by using Machine Learning

Project description: Identification of the Human DPR Promoter Element by using Machine Learning

PRJNA583471 | ENA

Machine Learning Analysis of the Human Initiator Region Reveals Key Features of Different Types of Core Promoters

Project description: The initiator (Inr) is the starting point for the transcription of many genes. Here, we generated highly predictive machine learning models of the human Inr region, and determined that the Inr is present in about 60% of focused human promoters, identified a novel TATA-specific Inr, and detected the overlapping but functionally distinct TCT motif. Quantitative genome-wide analyses revealed a strict and synergistic interaction between the Inr and DPR, an inverse relationship between the TATA and DPR, a flexible and sometimes independent function of the TATA box in relation to the Inr, and different properties of the TCT motif in humans versus Drosophila.

2026-04-01 | GSE314877 | GEO

Modifying inhibitor specificity for homologous enzymes by machine learning

Project description: Modifying inhibitor specificity for homologous enzymes by machine learning

PRJNA1230874 | ENA

Machine learning-based prediction of the activity and specificity of Cas9 variants in gene editing

Project description: Machine learning-based prediction of the activity and specificity of Cas9 variants in gene editing

2024-08-22 | GSE231840 | GEO

Modifying inhibitor specificity for homologous enzymes by machine learning

Project description: Selective inhibitors are essential for targeted therapeutics and for probing enzyme functions in various biological systems. The two main challenges in identifying such inhibitors lie in the extensive experimental effort required, including the generation of large libraries, and in tailoring the selectivity of inhibitors to enzymes with homologous structures. To address these challenges, machine learning (ML) is being used to improve protein design by training on targeted libraries and identifying key interface mutations that enhance affinity and specificity. However, such ML-based methods are limited by inaccurate energy calculations and difficulties in predicting the structural impacts of multiple mutations. Here, we present an ML-based method that leverages HTS data to streamline the design of selective inhibitors. To demonstrate its utility, we applied our new method to finding inhibitors of matrix metalloproteinases (MMPs), a family of homologous enzymes involved in both physiological and pathological processes. By training ML models on binding data for three MMPs (MMP-1, MMP-3, and MMP-9), we successfully designed a novel N-TIMP2 variant with a differential specificity profile, namely, high affinity for MMP-9, moderate affinity for MMP-3, and low affinity for MMP-1. Our experimental validation showed that this novel variant exhibited a significant specificity shift and enhanced selectivity compared to wild-type N-TIMP2. Through molecular modeling and energy minimization, we obtained structural insights into the variant’s enhanced selectivity. Our findings highlight the power of ML-based methods to reduce experimental workloads, facilitate the rational design of selective inhibitors, and advance the understanding of specific inhibitor-enzyme interactions in homologous enzyme systems.

2025-03-10 | GSE290918 | GEO

Prediction of Breast Cancer Estrogen Receptor Status using Machine Learning

Project description: Gene expression profiles were generated from 199 primary breast cancer patients. Samples 1-176 were used in another study, GEO Series GSE22820, and form the training data set in this study. Sample numbers 200-222 form a validation set. This data is used to model a machine learning classifier for Estrogen Receptor Status. RNA was isolated from 199 primary breast cancer patients. A machine learning classifier was built to predict ER status using only three gene features.

2013-01-01 | E-GEOD-29210 | biostudies-arrayexpress

Machine learning-based prediction of the activity and specificity of Cas9 variants in gene editing

Project description: Machine learning-based prediction of the activity and specificity of Cas9 variants in gene editing

PRJNA968010 | ENA

Epigenetic signature of human induced pluripotent stem cells identified with the linear machine learning model

Project description: Human induced pluripotent stem cells (iPSCs) were established as artificial embryonic stem cells (ESCs) to avoid immune rejection and ethical issues in regenerative medicine, and for biological research. Comparative analyses in previous studies revealed no hot spot that distinguishes iPSCs from ESCs. Here we established a learning model using Jubatus, a machine learning platform, with a linear classification model to distinguish human iPSCs from ESCs based on DNA methylation profiles. We found that linear model classification is most suitable for the analysis of human iPSCs when the number of cell lines is in the practical range of 10 to 100. The learning models discriminated ESCs and iPSCs with accuracies of ≥ 85.71% and ≥ 90.91%, respectively. In addition, the epigenetic signature of iPSCs was identified by component analysis of the learning models. The iPSC-specific fluctuated methylation regions were abundant on chromosomes 7, 8, 12, and 22. The method can be used with comprehensive data and can also be widely applied to many aspects of molecular biology research.

2019-12-06 | GSE141521 | GEO
