In silico Identification and Taxonomic Distribution of Plant Class C GH9 Endoglucanases.
ABSTRACT: The glycoside hydrolase 9 superfamily, mainly comprising the endoglucanases, is represented in all three domains of life. The current division of GH9 enzymes into three subclasses, namely A, B, and C, is centered on parameters derived from sequence information alone. However, this classification is ambiguous, and is limited by the paralogous ancestry of class B and C endoglucanases and by the paucity of biochemical and structural data. Here, we extend this classification schema to putative GH9 endoglucanases present in green plants, with an emphasis on identifying novel members of the class C subset. These enzymes cleave the β(1 → 4) linkage between non-terminal adjacent D-glucopyranose residues in both amorphous and crystalline regions of cellulose. We utilized non-redundant plant GH9 enzymes with characterized molecular data as the training set to construct Hidden Markov Models (HMMs) and train an Artificial Neural Network (ANN). The parameters that were used for predicting dominant enzyme function were derived from this training set, and subsequently refined on 147 sequences with available expression data. Our knowledge-based approach can ascribe differential endoglucanase activity (A, B, or C) to a query sequence with high confidence, and was used to construct a local repository of class C GH9 endoglucanases (GH9C = 241) from 32 sequenced green plants.
Project description:Biofuels such as γ-valerolactone, bioethanol, and biodiesel are derived from potentially fermentable cellulose and vegetable oils. Plant class C GH9 endoglucanases are CBM49-encompassing hydrolases that cleave the β(1 → 4) glycosidic linkage of contiguous D-glucopyranose residues of crystalline cellulose. Here, I analyse 3D-homology models of characterised and putative class C enzymes to glean insights into the contribution of the GH9, linker, and CBM49 to the mechanism(s) of crystalline cellulose digestion. Crystalline cellulose may be accommodated in a surface groove which is imperfectly bounded by the GH9_CBM49, GH9_linker, and linker_CBM49 surfaces and thence digested in a solvent accessible subsurface cavity. The physical dimensions of the groove, and distortions thereof, are mediated in part by the bulky side chains of the aromatic amino acids that comprise it and may also result in a strained geometry of the bound cellulose polymer. These data, together with an almost complete absence of measurable cavities, a poorly conserved, hydrophobic, and heterogeneous amino acid composition, increased atomic motion of the CBM49_linker junction, and docking experiments with ligands of lower degrees of polymerization, suggest a modulatory rather than direct role for CBM49 in catalysis. Crystalline cellulose is the de facto substrate for CBM-containing plant and non-plant GH9 enzymes, a finding supported by exceptional sequence- and structural-homology. However, despite the implied similarity in general acid-base catalysis of crystalline cellulose, this study also highlights qualitative differences in substrate binding and glycosidic bond cleavage amongst class C members. Results presented may aid the development of novel plant-based GH9 endoglucanases that could extract and utilise potential fermentable carbohydrates from biomass.
Graphical Abstract Crystalline cellulose digestion by plant class C GH9 endoglucanases - an in silico assessment of function.
Project description:BACKGROUND: Glycoside hydrolases of the GH9 family encode cellulases that predominantly function as endoglucanases and have wide applications in the food, paper, pharmaceutical, and biofuel industries. The partitioning of plant GH9 endoglucanases into classes A, B, and C is based on the differential presence of transmembrane, signal peptide, and carbohydrate binding module (CBM49) regions. There is considerable debate on the distribution and the functions of these enzymes, which may vary in different organisms. In light of these findings, we examined the origin, emergence, and subsequent divergence of plant GH9 endoglucanases, with an emphasis on elucidating the role of CBM49 in the digestion of crystalline cellulose by class C members. RESULTS: Since the digestion of crystalline cellulose mandates the presence of a well-defined set of aromatic and polar amino acids and/or an attributable domain that can mediate this conversion, we hypothesize a vertical mode of transfer of genes that could favour the emergence of class C like GH9 endoglucanase activity in land plants from potentially ancestral non-plant taxa. We demonstrated the concomitant occurrence of a GH9 domain with CBM49 and other homologous carbohydrate binding modules in putative endoglucanase sequences from several non-plant taxa. In the absence of comparable full length CBMs, we have characterized several low strength patterns that could approximate the CBM49, thereby extending support for digestion of crystalline cellulose to other segments of the protein. We also provide data suggestive of the ancestral role of putative class C GH9 endoglucanases in land plants, which include detailed phylogenetics and the presence and subsequent loss of CBM49, transmembrane, and signal peptide regions in certain populations of early land plants. These findings suggest that classes A and B of modern vascular land plants may have emerged by diverging directly from CBM49-encompassing putative class C enzymes. CONCLUSION: Our detailed phylogenetic and bioinformatics analysis of putative GH9 endoglucanase sequences across major taxa suggests that plant class C enzymes, despite their recent discovery, could function as the last common ancestor of classes A and B. Additionally, research into their ability to digest or inter-convert crystalline and amorphous forms of cellulose could make them lucrative candidates for engineering biofuel feedstock.
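The partitioning by transmembrane, signal peptide, and CBM49 regions mentioned in this abstract can be sketched as a three-flag rule. This is a minimal illustration, not the authors' method (which relied on phylogenetics and pattern searches). It assumes the commonly cited mapping of class A to a transmembrane anchor, class B to a signal peptide without CBM49, and class C to the presence of CBM49. The names `DomainCall` and `assign_gh9_class`, the precedence given to CBM49, and the "unassigned" fallback are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainCall:
    """Presence/absence calls for one putative GH9 sequence (hypothetical inputs)."""
    has_transmembrane: bool
    has_signal_peptide: bool
    has_cbm49: bool


def assign_gh9_class(call: DomainCall) -> str:
    """Return 'A', 'B', 'C', or 'unassigned' from three domain flags.

    Assumption: CBM49 takes precedence, so any sequence carrying it is
    called class C regardless of the other two flags.
    """
    if call.has_cbm49:
        return "C"
    if call.has_transmembrane:
        return "A"
    if call.has_signal_peptide:
        return "B"
    return "unassigned"


if __name__ == "__main__":
    # Toy flag combinations; these are not calls for real sequences.
    for flags in [(True, False, False), (False, True, False), (False, True, True)]:
        print(flags, assign_gh9_class(DomainCall(*flags)))
```

In practice the three flags would come from separate prediction tools, and sequences with conflicting or missing calls are where a rule this simple would be least reliable.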
Project description:Glycoside hydrolase family 9 (GH9) of carbohydrate-processing enzymes primarily consists of inverting endoglucanases. A subgroup of GH9 enzymes are believed to act as exo-glucosidases or exo-glucosaminidases, with many being found in organisms of the family Vibrionaceae, where they are proposed to function within the chitin-catabolism pathway. Here, it is shown that the GH9 enzyme from the pathogen Vibrio cholerae (hereafter referred to as VC0615) is active on both chitosan-derived and β-glucoside substrates. The structure of VC0615 at 3.17 Å resolution is reported from a crystal form with poor diffraction and lattice disorder. VC0615 was highly refractory to crystallization efforts, with crystals only appearing using a high protein concentration under conditions containing the precipitant poly-γ-glutamic acid (PGA). The structure is highly mobile within the crystal lattice, which is likely to reflect steric clashes between symmetry molecules which destabilize crystal packing. The overall tertiary structure of VC0615 is well resolved even at 3.17 Å resolution, which has allowed the structural basis for the exo-glucosidase/glucosaminidase activity of this enzyme to be investigated.
Project description:BACKGROUND: Endoglucanases are usually considered to be synergistically involved in the initial stages of cellulose breakdown, an essential step in the bioprocessing of lignocellulosic plant materials into bioethanol. Despite their economic importance, we currently lack a basic understanding of how some endoglucanases can sustain their ability to function at the elevated temperatures required for bioprocessing, while others cannot. In this study, we present a detailed comparative analysis of both thermophilic and mesophilic endoglucanases in order to gain insights into the origins of thermostability. We analyzed the sequences and structures for sets of endoglucanase proteins drawn from the Carbohydrate-Active enZymes (CAZy) database. RESULTS: Our results demonstrate that thermophilic endoglucanases and their mesophilic counterparts differ significantly in their amino acid compositions. Strikingly, these compositional differences are specific to protein folds and enzyme families, and lead to differences in intramolecular interactions in a fold-dependent fashion. CONCLUSIONS: Here, we provide fold-specific guidelines to control thermostability in endoglucanases that will aid in making production of biofuels from plant biomass more efficient.
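The kind of compositional comparison this abstract describes can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline. It assumes plain one-letter protein sequences and compares mean per-residue fractions between two groups. The function names and the toy sequences are hypothetical, and no statistical testing or fold-level grouping is attempted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq: str) -> dict:
    """Fraction of each standard amino acid in one sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(seqs: list) -> dict:
    """Average the per-sequence fractions over a non-empty group of sequences."""
    comps = [composition(s) for s in seqs]
    return {aa: sum(c[aa] for c in comps) / len(comps) for aa in AMINO_ACIDS}


def composition_shift(thermophilic: list, mesophilic: list) -> dict:
    """Per-residue difference: thermophilic mean minus mesophilic mean."""
    t = mean_composition(thermophilic)
    m = mean_composition(mesophilic)
    return {aa: t[aa] - m[aa] for aa in AMINO_ACIDS}


if __name__ == "__main__":
    # Placeholder sequences for illustration only; not real endoglucanases.
    shift = composition_shift(["MKEELRKK", "MEEKLKRE"], ["MQSNTLQN", "MSNQTSAQ"])
    for aa, delta in sorted(shift.items(), key=lambda kv: kv[1], reverse=True)[:3]:
        print(aa, round(delta, 3))
```

A fold-specific analysis such as the one reported would apply the same comparison separately within each fold or enzyme family rather than pooling all sequences.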
Project description:Bacteria and fungi are thought to degrade cellulose through the activity of either a complexed or a noncomplexed cellulolytic system composed of endoglucanases and cellobiohydrolases. The marine bacterium Saccharophagus degradans 2-40 produces a multicomponent cellulolytic system that is unusual in its abundance of GH5-containing endoglucanases. Secreted enzymes of this bacterium release high levels of cellobiose from cellulosic materials. Through cloning and purification, the predicted biochemical activities of the one annotated cellobiohydrolase Cel6A and the GH5-containing endoglucanases were evaluated. Cel6A was shown to be a classic endoglucanase, but Cel5H showed significantly higher activity on several types of cellulose, was the highest expressed, and processively released cellobiose from cellulosic substrates. Cel5G, Cel5H, and Cel5J were found to be members of a separate phylogenetic clade and were all shown to be processive. The processive endoglucanases are functionally equivalent to the endoglucanases and cellobiohydrolases required for other cellulolytic systems, thus providing a cellobiohydrolase-independent mechanism for this bacterium to convert cellulose to glucose.
Project description:The properties and enzymic activity of endoglucanases (EC 3.2.1.4) of the fungus Trichoderma reesei were studied by means of immunological methods and by using polyglycosidic substrates. Endoglucanases exist in the culture liquid as a series of immunologically related components. The most active endoglucanase component has an Mr of 43 000 and pI value of 4.0. The most abundant components have a value of pI about 5.0, an Mr of 56 000-67 000 and specific activity only one-fifth of that of the pI-4.0 component. During purification and storage the endoglucanases are spontaneously modified; the relative proportion of components having greater Mr values, more alkaline pI values and lower specific activities is increased. The hexose content of the endoglucanase components is 2-7%. Endoglucanases hydrolyse soluble beta-1,4 glycans. The enzymes described here differ from endoglucanase preparations described previously in not showing activity towards insoluble substrates. The role of endoglucanases in wood hydrolysis is consequently limited to the stage where wood constituents are already in soluble form.
Project description:We report here the annotated draft genome sequence of the thermophilic zygomycete Rhizomucor pusillus strain FCH 5.7, isolated from compost soil in Vietnam. The genome assembly contains 25.59 Mb with an overall GC content of 44.95%, and comprises 10,898 protein coding genes. Genes encoding putative cellulose-, xylan- and chitin-degrading proteins were identified, including two putative endoglucanases (EC 3.2.1.4) from glycoside hydrolase family 9, which have so far been mostly assigned to bacteria and plants.
Project description:Obtaining bioethanol from cellulosic biomass involves numerous steps, among which the enzymatic conversion of the polymer to individual sugar units has been a main focus of the biotechnology industry. Among the cellulases that break down the polymeric cellulose are endoglucanases that act synergistically for subsequent hydrolytic reactions. The endoglucanases that have garnered relatively more attention are those that can withstand high temperatures, i.e., are thermostable. Although our understanding of thermostability in endoglucanases is incomplete, some molecular features that are responsible for increased thermostability have been recently identified. This review focuses on the investigations of endoglucanases and their implications for biofuel applications.
Project description:During growth on crystalline cellulose, the thermophilic bacterium Caldicellulosiruptor bescii secretes several cellulose-degrading enzymes. Among these enzymes is CelA (CbCel9A/Cel48A), which is reported as the most highly secreted cellulolytic enzyme in this bacterium. CbCel9A/Cel48A is a large multi-modular polypeptide, composed of an N-terminal catalytic glycoside hydrolase family 9 (GH9) module and a C-terminal GH48 catalytic module that are separated by a family 3c carbohydrate-binding module (CBM3c) and two identical CBM3bs. The wild-type CbCel9A/Cel48A and its truncational mutants were expressed in Bacillus megaterium and Escherichia coli, respectively. The wild-type polypeptide released twice the amount of glucose equivalents from Avicel as its truncational mutant that lacks the GH48 catalytic module. The truncational mutant harboring the GH9 module and the CBM3c was more thermostable than the wild-type protein, likely due to its compact structure. The main hydrolytic activity was present in the GH9 catalytic module, while the truncational mutant containing the GH48 module and the three CBMs was ineffective in degradation of either crystalline or amorphous cellulose. Interestingly, the GH9 and/or GH48 catalytic modules containing the CBM3bs form low-density particles during hydrolysis of crystalline cellulose. Moreover, TM3 (GH9/CBM3c) and TM2 (GH48 with three CBM3 modules) synergistically hydrolyze crystalline cellulose. Deletion of the CBM3bs or mutations that compromised their binding activity suggested that these CBMs are important during hydrolysis of crystalline cellulose. In agreement with this observation, seven of nine genes in a C. bescii gene cluster predicted to encode cellulose-degrading enzymes harbor CBM3bs. Based on our results, we hypothesize that C. bescii uses the GH48 module and the CBM3bs in CbCel9A/Cel48A to destabilize certain regions of crystalline cellulose for attack by the highly active GH9 module and other endoglucanases produced by this hyperthermophilic bacterium.
Project description:Ruminococcus albus 8 is a specialist plant cell wall degrading ruminal bacterium capable of utilizing hemicellulose and cellulose. Cellulose degradation requires a suite of enzymes including endoglucanases, exoglucanases, and β-glucosidases. The enzymes employed by R. albus 8 in degrading cellulose are yet to be completely elucidated. Through bioinformatic analysis of a draft genome sequence of R. albus 8, seventeen putatively cellulolytic genes were identified. The genes were heterologously expressed in E. coli, and purified to near homogeneity. On biochemical analysis with cellulosic substrates, seven of the gene products (Ra0185, Ra0259, Ra0325, Ra0903, Ra1831, Ra2461, and Ra2535) were identified as endoglucanases, releasing predominantly cellobiose and cellotriose. Each of the R. albus 8 endoglucanases, except for Ra0259 and Ra0325, bound to the model crystalline cellulose Avicel, confirming functional carbohydrate binding modules (CBMs). The polypeptides for Ra1831 and Ra2535 were found to contain distantly related homologs of CBM65. Mutational analysis of residues within the CBM65 of Ra1831 identified key residues required for binding. Phylogenetic analysis of the endoglucanases revealed three distinct subfamilies of glycoside hydrolase family 5 (GH5). Our results demonstrate that this fibrolytic bacterium uses diverse GH5 catalytic domains appended with different CBMs, including novel forms of CBM65, to degrade cellulose.