Dataset Information

An algorithm for identifying novel targets of transcription factor families: application to hypoxia-inducible factor 1 targets.


ABSTRACT: Efficient and effective analysis of the growing genomic databases requires the development of adequate computational tools. We introduce a fast method based on the suffix tree data structure for predicting novel targets of hypoxia-inducible factor 1 (HIF-1) from large genome databases. The suffix tree data structure has two powerful applications here: one is to extract unknown patterns from multiple strings/sequences in linear time; the other is to search multiple strings/sequences using multiple patterns in linear time. Using 15 known HIF-1 target gene sequences as a training set, we used suffix trees to extract 105 common patterns that occur in all 15 training genes. Using these 105 common patterns along with known subsequences surrounding HIF-1 binding sites from the literature, the algorithm searched a genome database that contains 2,078,786 DNA sequences. It reported 258 potentially novel HIF-1 targets, including 25 known HIF-1 targets. Among these 258 genes, 17 putative targets were confirmed, based on microarray studies from the literature, to be upregulated by HIF-1 or hypoxia. We further studied one of the potential targets, COX-2, in the laboratory and showed that it is a biologically relevant HIF-1 target. These results demonstrate that our methodology is an effective computational approach for identifying novel HIF-1 targets.
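The two steps described in the abstract (extracting patterns shared by all training sequences, then searching a database with those patterns) can be illustrated with a minimal Python sketch. This is not the authors' suffix tree implementation and is not linear-time: it substitutes a plain fixed-length substring (k-mer) set intersection for pattern extraction and a naive substring scan for the search. The function names, the pattern length `k`, the `min_hits` threshold, and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def common_patterns(seqs: List[str], k: int) -> Set[str]:
    """Return every length-k substring that occurs in all sequences.

    Stand-in for suffix-tree pattern extraction: simple set intersection,
    not a linear-time algorithm.
    """
    if not seqs:
        return set()

    def kmers(s: str) -> Set[str]:
        return {s[i:i + k] for i in range(len(s) - k + 1)}

    common = kmers(seqs[0])
    for s in seqs[1:]:
        common &= kmers(s)
    return common


def search(database: Dict[str, str], patterns: Set[str], min_hits: int) -> List[str]:
    """Return ids of sequences that contain at least min_hits distinct patterns."""
    hits = []
    for seq_id, seq in database.items():
        n = sum(1 for p in patterns if p in seq)
        if n >= min_hits:
            hits.append(seq_id)
    return hits


# Toy data (hypothetical, not taken from the study).
training = ["TTACGTGCA", "GGACGTGCT", "CACGTGCAA"]
database = {"geneA": "AAACGTGCTT", "geneB": "TTTTTTTT", "geneC": "GACGTGAA"}

patterns = common_patterns(training, k=5)
candidates = search(database, patterns, min_hits=2)
print(sorted(patterns))  # ['ACGTG', 'CGTGC']
print(candidates)        # ['geneA']
```

A generalized suffix tree would replace both helper functions when the training set and database are large, since the naive scan above costs time proportional to the number of patterns times the database size.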

SUBMITTER: Jiang Y 

PROVIDER: S-EPMC2664698 | biostudies-literature | 2009

REPOSITORIES: biostudies-literature

Publications

An algorithm for identifying novel targets of transcription factor families: application to hypoxia-inducible factor 1 targets.

Jiang Y, Cukic B, Adjeroh DA, Skinner HD, Lin J, Shen QJ, Jiang BH

Cancer Informatics, 2009-03-04

Similar Datasets

| S-EPMC1567694 | biostudies-literature
| S-EPMC3910014 | biostudies-literature
| S-EPMC6025987 | biostudies-literature
| S-EPMC2662309 | biostudies-literature
| S-EPMC3454324 | biostudies-literature