Dataset Information

Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties.


ABSTRACT: Single amino acid variations (SAVs) can alter biological function, causing disease or contributing to natural differences between individuals. Identifying the relationship between a SAV and a particular disease is the starting point for understanding the mechanisms behind specific associations, and can aid the prevention and diagnosis of inherited disease. We propose PredSAV, a computational method that predicts how likely a SAV is to be associated with disease by combining the gradient tree boosting (GTB) algorithm with optimally selected neighborhood features. A two-step feature selection approach is used to identify the most relevant and informative neighborhood properties for predicting disease association across a wide range of sequence and structural features, including some novel structural neighborhood features. In cross-validation experiments on the benchmark dataset, PredSAV achieves an AUC of 0.908 and a specificity of 0.838, which are significantly better than those of existing methods. We further validate the method in an independent test, where it also performs competitively. PredSAV can thus return reliable predictions when distinguishing disease-associated from neutral variants. Compared with existing methods, it shows improved specificity as well as better overall performance.
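
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how AUC and specificity are commonly computed from predicted scores. It is not PredSAV's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with label 1 standing for a disease-associated variant and label 0 for a neutral one.

```python
from typing import Sequence


def specificity(labels: Sequence[int], scores: Sequence[float],
                threshold: float = 0.5) -> float:
    """Return TN / (TN + FP): the share of neutral variants (label 0)
    whose score falls below the (assumed) decision threshold."""
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not negatives:
        raise ValueError("need at least one negative example")
    true_negatives = sum(1 for s in negatives if s < threshold)
    return true_negatives / len(negatives)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the area under the ROC curve, computed as the probability
    that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the PredSAV benchmark.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_specificity = specificity(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
```

Specificity depends on the chosen threshold, whereas AUC summarizes ranking quality over all thresholds, which is why the two figures are usually reported together.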

SUBMITTER: Pan Y 

PROVIDER: S-EPMC5470696 | biostudies-literature | 2017

REPOSITORIES: biostudies-literature

Publications

Accurate prediction of functional effects for variants by combining gradient tree boosting with optimal neighborhood properties.

Pan Yuliang, Liu Diwei, Deng Lei

PLoS ONE, 2017-06-14, issue 6


Similar Datasets

| S-EPMC5849212 | biostudies-literature
| S-EPMC6031027 | biostudies-literature
| S-EPMC5908232 | biostudies-literature
| S-EPMC6555260 | biostudies-literature
| S-EPMC7724862 | biostudies-literature
| S-EPMC6162650 | biostudies-literature