
Dataset Information


Identification of DNA-binding proteins using support vector machine with sequence information.


ABSTRACT: DNA-binding proteins are fundamentally important for understanding cellular processes, so identifying them has important practical applications in fields such as drug design. We propose a novel method for predicting DNA-binding proteins using only sequence information. The prediction model is built with the support vector machine-sequential minimal optimization (SVM-SMO) algorithm in conjunction with a hybrid feature. The hybrid feature incorporates an evolutionary information feature, a physicochemical property feature, and two novel attributes. These two attributes use the DNA-binding and nonbinding residues in a query protein to obtain its DNA-binding propensity and nonbinding propensity. The results show that our SVM-SMO model achieves a Matthews correlation coefficient (MCC) of 0.67 and an overall accuracy of 89.6%, with 88.4% sensitivity and 90.8% specificity. Performance comparisons across feature sets indicate that the two novel attributes contribute to the performance improvement. In addition, our SVM-SMO model outperforms state-of-the-art methods on an independent test dataset.
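The four figures reported in the abstract (sensitivity, specificity, overall accuracy, and MCC) are standard binary-classification metrics derived from a confusion matrix. As a minimal sketch of how they relate, the function below computes them from raw counts. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)          # fraction of true binders recovered
    specificity = tn / (tn + fp)          # fraction of non-binders correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # MCC denominator is zero when any row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention, used here as an assumption.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the paper's results).
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the binding and nonbinding classes differ in size.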

SUBMITTER: Ma X 

PROVIDER: S-EPMC3787635 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

Identification of DNA-binding proteins using support vector machine with sequence information.

Ma Xin, Wu Jiansheng, Xue Xiaoyun

Computational and Mathematical Methods in Medicine, 2013-09-16



Similar Datasets

| S-EPMC2216048 | biostudies-literature
| S-EPMC4331676 | biostudies-literature
| S-EPMC2638146 | biostudies-literature
| S-EPMC4670226 | biostudies-literature
| S-EPMC1802617 | biostudies-literature
| S-EPMC4057401 | biostudies-literature
| S-EPMC3264588 | biostudies-other
| S-EPMC2948896 | biostudies-literature