Project description: The DNA sequence preferences of the vast majority of eukaryotic transcription factors (TFs) are unknown. Using an approach designed to broadly sample both DNA-binding domain types and eukaryotic clades, we have determined DNA-binding motifs for 1,033 TFs from 131 diverse eukaryotes, encompassing 54 domain types. Closely related orthologs and paralogs typically have very similar sequence preferences; this property allows inference of motifs for roughly one third of the 166,851 known or predicted eukaryotic TFs. While the origins of most motifs can be dated to hundreds of millions of years ago, we also characterize more recent TF expansions. Sequences matching the motifs are enriched upstream of transcription start sites (TSSs) in most eukaryotic lineages, and at informative eQTL SNPs in Arabidopsis promoters, demonstrating their utility in mapping transcriptional networks. The motifs are housed at http://cisbp.ccbr.utoronto.ca. Protein binding microarray (PBM) experiments were performed for a set of 1,048 diverse eukaryotic transcription factors. Briefly, the PBMs involved binding GST-tagged DNA-binding proteins to two double-stranded 44K Agilent microarrays, each containing a different de Bruijn sequence design, in order to determine their sequence preferences. Details of the PBM protocol are described in Berger et al., Nature Biotechnology 2006.

2014-08-01 | E-GEOD-53348 | biostudies-arrayexpress

Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity

2014-08-01 | GSE53348 | GEO

Gut metagenome/FR 2002 (Salosensaari et al. Nature Comms 2021)

Project description: The main goal of the project is to study the associations between the gut metagenome and human health. The dataset contains data for 7,211 FINRISK 2002 participants who underwent fecal sampling. Demultiplexed shallow shotgun metagenomic sequences were quality-filtered and adapter-trimmed using Atropos (Didion et al., 2017), and human reads were filtered out using Bowtie2 (Langmead and Salzberg, 2012).

| EGAS00001005038 | EGA

Gut metagenome/FINRISK 2002 (Salosensaari et al. Nature Comms 2021)

| EGAS00001005020 | EGA
