Dataset Information


In silico Identification and Taxonomic Distribution of Plant Class C GH9 Endoglucanases.

ABSTRACT: The glycoside hydrolase 9 superfamily, mainly comprising the endoglucanases, is represented in all three domains of life. The current division of GH9 enzymes into three subclasses, namely A, B, and C, is centered on parameters derived from sequence information alone. However, this classification is ambiguous, and is limited by the paralogous ancestry of class B and C endoglucanases and the paucity of biochemical and structural data. Here, we extend this classification schema to putative GH9 endoglucanases present in green plants, with an emphasis on identifying novel members of the class C subset. These enzymes cleave the β(1 → 4) linkage between non-terminal adjacent D-glucopyranose residues in both amorphous and crystalline regions of cellulose. We utilized non-redundant plant GH9 enzymes with characterized molecular data as the training set to construct Hidden Markov Models (HMMs) and train an Artificial Neural Network (ANN). The parameters used for predicting dominant enzyme function were derived from this training set and subsequently refined on 147 sequences with available expression data. Our knowledge-based approach can ascribe differential endoglucanase activity (A, B, or C) to a query sequence with high confidence, and was used to construct a local repository of class C GH9 endoglucanases (GH9C = 241) from 32 sequenced green plants.


PROVIDER: S-EPMC4981690 | BioStudies | 2016-01-01

REPOSITORIES: biostudies

Similar Datasets

2019-01-01 | S-EPMC7385011 | BioStudies
2018-01-01 | S-EPMC5977491 | BioStudies
2018-01-01 | S-EPMC6096475 | BioStudies
2011-01-01 | S-EPMC3047435 | BioStudies
1000-01-01 | S-EPMC2737977 | BioStudies
1985-01-01 | S-EPMC1152705 | BioStudies
2018-01-01 | S-EPMC6132078 | BioStudies
2013-01-01 | S-EPMC3856469 | BioStudies
2013-01-01 | S-EPMC3865294 | BioStudies
2016-01-01 | S-EPMC4954948 | BioStudies