ABSTRACT: PDBsum (http://www.ebi.ac.uk/pdbsum) provides summary information about each experimentally determined structural model in the Protein Data Bank (PDB). Here we describe some of its most recent features, including figures from the structure's key reference, citation data, Pfam domain diagrams, topology diagrams and protein-protein interactions. Furthermore, it now accepts users' own PDB format files and generates a private set of analyses for each uploaded structure.
Project description:PDBsum is a web-based database providing a largely pictorial summary of the key information on each macromolecular structure deposited at the Protein Data Bank (PDB). It includes images of the structure, annotated plots of each protein chain's secondary structure, detailed structural analyses generated by the PROMOTIF program, summary PROCHECK results and schematic diagrams of protein-ligand and protein-DNA interactions. RasMol scripts highlight key aspects of the structure, such as the protein's domains, PROSITE patterns and protein-ligand interactions, for interactive viewing in 3D. Numerous links take the user to related sites. PDBsum is updated whenever any new structures are released by the PDB and is freely accessible via http://www.biochem.ucl.ac.uk/bsm/pdbsum.
Project description:PDBsum, http://www.ebi.ac.uk/pdbsum, is a website providing numerous pictorial analyses of each entry in the Protein Data Bank. It portrays the structural features of all proteins, DNA and ligands in the entry, as well as depicting the interactions between them. The latest features, described here, include annotation of human protein sequences with their naturally occurring amino acid variants, dynamic graphs showing the relationships between related protein domain architectures, analyses of ligand binding clusters across different experimental determinations of the same protein, analyses of tunnels in proteins and new search options.
Project description:PDBsum is a database of mainly pictorial summaries of the 3D structures of proteins and nucleic acids in the Protein Data Bank. Its pages aim to provide an at-a-glance view of the contents of every 3D structure, plus detailed structural analyses of each protein chain, DNA-RNA chain and any bound ligands and metals. In the past year, the database has been significantly improved, in terms of both appearance and new content. Moreover, it has moved to its new address at http://www.ebi.ac.uk/thornton-srv/databases/pdbsum.
Project description:BACKGROUND: The Structural Descriptor Database (SDDB) is a web-based tool that predicts the function of proteins and functional site positions based on the structural properties of related protein families. Structural alignments and functional residues of a known protein set (defined as the training set) are used to build special Hidden Markov Models (HMM) called HMM descriptors. SDDB uses previously calculated and stored HMM descriptors for predicting active sites, binding residues, and protein function. The database integrates biologically relevant data filtered from several databases such as PDB, PDBSUM, CSA and SCOP. It accepts queries in fasta format and predicts functional residue positions, protein-ligand interactions, and protein function, based on the SCOP database. RESULTS: To assess the SDDB performance, we used different data sets. The Trypsion-like Serine protease data set assessed how well SDDB predicts functional sites when curated data is available. The SCOP family data set was used to analyze SDDB performance by using training data extracted from PDBSUM (binding sites) and from CSA (active sites). The ATP-binding experiment was used to compare our approach with the most current method. For all evaluations, significant improvements were obtained with SDDB. CONCLUSION: SDDB performed better when trusty training data was available. SDDB worked better in predicting active sites rather than binding sites because the former are more conserved than the latter. Nevertheless, by using our prediction method we obtained results with precision above 70%.
Project description:HHpred is a fast server for remote protein homology detection and structure prediction and is the first to implement pairwise comparison of profile hidden Markov models (HMMs). It allows to search a wide choice of databases, such as the PDB, SCOP, Pfam, SMART, COGs and CDD. It accepts a single query sequence or a multiple alignment as input. Within only a few minutes it returns the search results in a user-friendly format similar to that of PSI-BLAST. Search options include local or global alignment and scoring secondary structure similarity. HHpred can produce pairwise query-template alignments, multiple alignments of the query with a set of templates selected from the search results, as well as 3D structural models that are calculated by the MODELLER software from these alignments. A detailed help facility is available. As a demonstration, we analyze the sequence of SpoVT, a transcriptional regulator from Bacillus subtilis. HHpred can be accessed at http://protevo.eb.tuebingen.mpg.de/hhpred.
Project description:Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of depreciated services, ensure that the framework remains relevant to today's biological community.
Project description:PROCOGNATE is a database of protein cognate ligands for the domains in enzyme structures as described by CATH, SCOP and Pfam, and is available as an interactive website or a flat file. This article gives an overview of the database and its generation and presents a new website front end, as well as recent increased coverage in our dataset via inclusion of Pfam domains. We also describe navigation of the website and its features. The current version (1.3) of PROCOGNATE covers 4123, 4536, 5876 structures and 377, 326, 695 superfamilies/families in CATH, SCOP and Pfam, respectively. PROCOGNATE can be accessed at: http://www.ebi.ac.uk/thornton-srv/databases/procognate/
Project description:Many entries in the protein data bank (PDB) are annotated to show their component protein domains according to the Pfam classification, as well as their biological function through the enzyme commission (EC) numbering scheme. However, despite the fact that the biological activity of many proteins often arises from specific domain-domain and domain-ligand interactions, current on-line resources rarely provide a direct mapping from structure to function at the domain level. Since the PDB now contains many tens of thousands of protein chains, and since protein sequence databases can dwarf such numbers by orders of magnitude, there is a pressing need to develop automatic structure-function annotation tools which can operate at the domain level.This article presents ECDomainMiner, a novel content-based filtering approach to automatically infer associations between EC numbers and Pfam domains. ECDomainMiner finds a total of 20,728 non-redundant EC-Pfam associations with a F-measure of 0.95 with respect to a "Gold Standard" test set extracted from InterPro. Compared to the 1515 manually curated EC-Pfam associations in InterPro, ECDomainMiner infers a 13-fold increase in the number of EC-Pfam associations.These EC-Pfam associations could be used to annotate some 58,722 protein chains in the PDB which currently lack any EC annotation. The ECDomainMiner database is publicly available at http://ecdm.loria.fr/ .