Dataset Information

Modernized uniform representation of carbohydrate molecules in the Protein Data Bank.


ABSTRACT: Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability-Accessibility-Interoperability-Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange/Macromolecular Crystallographic Information File (PDBx/mmCIF) data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
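
To illustrate what "a collection of standardized monosaccharides" with explicit branching can look like in practice, here is a minimal Python sketch. It is not the PDBx/mmCIF schema or any PDB tool: the Link class, the field names and the helper functions are hypothetical, chosen only for illustration. The example data is the common N-glycan core pentasaccharide, written with the PDB chemical component IDs NAG, BMA and MAN.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One glycosidic bond from a child residue's C1 to a parent hydroxyl oxygen."""
    child: int        # residue number of the monosaccharide donating C1
    parent: int       # residue number of the monosaccharide it attaches to
    parent_atom: str  # acceptor atom on the parent, e.g. "O4" or "O6"


# Illustrative data (assumed numbering): the N-glycan core pentasaccharide.
residues = {1: "NAG", 2: "NAG", 3: "BMA", 4: "MAN", 5: "MAN"}
links = [
    Link(child=2, parent=1, parent_atom="O4"),
    Link(child=3, parent=2, parent_atom="O4"),
    Link(child=4, parent=3, parent_atom="O3"),
    Link(child=5, parent=3, parent_atom="O6"),
]


def children_of(links):
    """Group links by parent residue so the branched entity can be walked as a tree."""
    tree = defaultdict(list)
    for link in links:
        tree[link.parent].append(link)
    return tree


def branch_points(links):
    """Return residue numbers that carry more than one child monosaccharide."""
    return sorted(p for p, kids in children_of(links).items() if len(kids) > 1)


def render(residues, links, root=1):
    """Return an indented text outline of the oligosaccharide, starting at the root."""
    tree = children_of(links)

    def walk(num, depth, via):
        prefix = f"(1-{via[1:]}) " if via else ""
        lines = ["  " * depth + prefix + f"{residues[num]} {num}"]
        for link in sorted(tree.get(num, []), key=lambda l: l.parent_atom):
            lines.extend(walk(link.child, depth + 1, link.parent_atom))
        return lines

    return "\n".join(walk(root, 0, None))


if __name__ == "__main__":
    print(render(residues, links))
    print("branch points:", branch_points(links))
```

Keeping the residue list and the link list separate is the design choice that makes branching explicit: a branch point is simply a residue that appears as the parent in more than one link.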

SUBMITTER: Shao C 

PROVIDER: S-EPMC8457362 | biostudies-literature | 2021 Sep

REPOSITORIES: biostudies-literature

Publications

Modernized uniform representation of carbohydrate molecules in the Protein Data Bank.

Chenghua Shao, Zukang Feng, John D Westbrook, Ezra Peisach, John Berrisford, Yasuyo Ikegawa, Genji Kurisu, Sameer Velankar, Stephen K Burley, Jasmine Y Young

Glycobiology, 2021 Sep 1, issue 9



Similar Datasets

| S-EPMC3992913 | biostudies-literature
| S-EPMC7139665 | biostudies-literature
| S-EPMC3843158 | biostudies-literature
| S-EPMC2792347 | biostudies-literature
| S-EPMC10693035 | biostudies-literature
| S-EPMC2808887 | biostudies-literature
| S-EPMC3013808 | biostudies-literature
| S-EPMC3965016 | biostudies-literature
| S-EPMC3245096 | biostudies-literature
| S-EPMC4068610 | biostudies-literature