Project description:The Protein Data Bank (PDB) has been a key resource for learning general rules of sequence-structure relationships in proteins. Quantitative insights have been gained by defining geometric descriptors of structure (e.g., distances, dihedral angles, solvent exposure) and observing their distributions and sequence preferences. Here we argue that as the PDB continues to grow, it may become unnecessary to reduce structure into a set of elementary descriptors. Instead, it could be possible to deduce quantitative sequence-structure relationships in the context of precisely defined complex structural motifs by mining the PDB for closely matching backbone geometries. To validate this idea, we turned to the task of predicting changes in protein stability upon amino-acid substitution, a difficult problem of broad significance. We defined non-contiguous tertiary motifs (TERMs) around a protein site of interest and extracted sequence preferences from ensembles of closely matching substructures in the PDB to predict mutational stability changes at the site, ΔΔGm. We demonstrate that these ensemble statistics predict ΔΔGm on par with state-of-the-art statistical and machine-learning methods on large thermodynamic datasets, and outperform these, along with a leading structure-based modeling approach, when tested in the context of unbiased diverse mutations. Further, we show that the performance of the TERM-based method is directly related to the amount of available relevant structural data, automatically improving with the growing PDB. This relationship also provides a means of estimating prediction accuracy. Our results clearly demonstrate that: 1) statistics of non-contiguous structural motifs in the PDB encode fundamental sequence-structure relationships related to protein thermodynamic stability, and 2) the PDB is now large enough that such statistics are already useful in practice, with their accuracy expected to continue increasing as the database grows.
These observations suggest new ways of using structural data towards addressing problems of computational structural biology.
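The idea of extracting sequence preferences from an ensemble of matching substructures can be illustrated with a minimal sketch. It assumes the matches have already been found and that we only have the amino acid observed at the site of interest in each match. The function name `substitution_score`, the uniform pseudocount, and the omission of background frequencies are all assumptions made for illustration; this is not the scoring used in the study.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def substitution_score(observed, wild_type, mutant, pseudocount=1.0):
    """Log-ratio score from residues observed at one site across matches.

    `observed` is a string or list of one-letter codes, one per matching
    substructure. A positive score means the mutant residue is seen less
    often than the wild type in this structural context (hypothetical
    stand-in for a destabilizing substitution).
    """
    counts = Counter(observed)
    # Pseudocounts keep frequencies non-zero for unseen residues.
    total = len(observed) + pseudocount * len(AMINO_ACIDS)
    freq_wt = (counts[wild_type] + pseudocount) / total
    freq_mut = (counts[mutant] + pseudocount) / total
    return -math.log(freq_mut / freq_wt)

# Made-up ensemble: residues seen at the site in nine matches.
site_residues = "LLLLIVLLA"
print(substitution_score(site_residues, "L", "A"))
```

Because the score depends on observed counts, it becomes less sensitive to the pseudocount as more matches are available, which mirrors the abstract's point that accuracy tracks the amount of relevant structural data.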
Project description:Use of atomic force microscopy (AFM) has recently led to a better understanding of the molecular mechanisms of the unfolding process by mechanical forces; however, the rational design of novel proteins with specific mechanical strength remains challenging. We have approached this problem from a new perspective that generates linear physical-chemical properties (PCP) motifs from a limited AFM data set. Guided by our linear sequence analysis, we designed and analyzed four new mutants of the titin I1 domain with the goal of increasing the domain's mechanical strength. All four mutants could be cloned and expressed as soluble proteins. AFM data indicate that at least two of the mutants have increased molecular mechanical strength. This observation suggests that the PCP method is useful for grafting sequences specific for high mechanical stability onto weak proteins to increase their mechanical stability, and that it represents an additional tool in the design of novel proteins alongside steered molecular dynamics calculations, coarse-grained simulations, and Φ-value analysis of the transition state.
Project description:Representation and analysis of complex biological and engineered systems as directed networks is useful for understanding their global structure/function organization. Enrichment of network motifs, which are over-represented subgraphs in real networks, can be used for topological analysis. Because counting network motifs is computationally expensive, only characterization of 3- to 5-node motifs has been previously reported. In this study we used a supercomputer to analyze cyclic motifs made of 3-20 nodes for 6 biological and 3 technological networks. Using tools from statistical physics, we developed a theoretical framework for characterizing the ensemble of cyclic motifs in real networks. We have identified a generic property of real complex networks, antiferromagnetic organization, which is characterized by minimal directional coherence of edges along cyclic subgraphs, such that consecutive links tend to have opposing direction. As a consequence, we find that the lack of directional coherence in cyclic motifs leads to depletion in feedback loops, where the number of nodes affected by feedback loops appears to be at a local minimum compared with surrogate shuffled networks. This topology provides more dynamic stability in large networks.
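The notion of directional coherence along a cycle can be made concrete with a small Ising-style sketch. Each edge of a cycle in the undirected skeleton gets a spin of +1 if it points along the traversal direction and -1 if it points against it; a feedback loop then has all spins equal, while an "antiferromagnetic" cycle alternates. The function names and the nearest-neighbour coherence measure below are assumptions chosen for illustration, not the framework developed in the study.

```python
def edge_spins(cycle, edges):
    """Assign +1/-1 to each step of `cycle` given a set of directed edges.

    `cycle` lists nodes in traversal order (the last node connects back to
    the first). An edge present in both directions is counted as +1, which
    is a simplifying assumption.
    """
    spins = []
    n = len(cycle)
    for i in range(n):
        u, v = cycle[i], cycle[(i + 1) % n]
        if (u, v) in edges:
            spins.append(1)
        elif (v, u) in edges:
            spins.append(-1)
        else:
            raise ValueError(f"no edge between {u!r} and {v!r}")
    return spins

def directional_coherence(spins):
    """Mean product of consecutive spins: +1 for a feedback loop,
    -1 for a perfectly alternating (antiferromagnetic) even cycle."""
    n = len(spins)
    return sum(spins[i] * spins[(i + 1) % n] for i in range(n)) / n

feedback = {("a", "b"), ("b", "c"), ("c", "a")}
alternating = {("a", "b"), ("c", "b"), ("c", "d"), ("a", "d")}
print(directional_coherence(edge_spins(["a", "b", "c"], feedback)))
print(directional_coherence(edge_spins(["a", "b", "c", "d"], alternating)))
```

Averaging this quantity over all cycles of a given length, and comparing with shuffled surrogate networks, would be one way to probe the depletion of feedback loops described above.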
Project description:Coat protein I (COPI)-coated vesicles mediate retrograde transport from the Golgi to the endoplasmic reticulum (ER), as well as transport within the Golgi. Major progress has been made in defining the structure of COPI coats, in vitro and in vivo, at resolutions as high as 9 Å. Nevertheless, important questions remain unanswered, including what specific interactions stabilize COPI coats, how COPI vesicles recognize their target membranes, and how coat disassembly is coordinated with vesicle fusion and cargo delivery. Here, we use X-ray crystallography to identify a conserved site on the COPI subunit α-COP that binds to flexible, acidic sequences containing a single tryptophan residue. One such sequence, found within α-COP itself, mediates α-COP homo-oligomerization. Another such sequence is contained within the lasso of the ER-resident Dsl1 complex, where it helps mediate the tethering of Golgi-derived COPI vesicles at the ER membrane. Together, our findings suggest that α-COP homo-oligomerization plays a key role in COPI coat stability, with potential implications for the coordination of vesicle tethering, uncoating, and fusion.
Project description:The use of DNA-based nanomaterials in biomedical applications is continuing to grow, yet more emphasis is being put on the need for guaranteed structural stability of DNA nanostructures in physiological conditions. Various methods have been developed to stabilize DNA origami against low concentrations of divalent cations and the presence of nucleases. However, existing strategies typically require the complete encapsulation of nanostructures, which makes accessing the encased DNA strands difficult, or chemical modification, such as covalent crosslinking of DNA strands. We present a stabilization method involving the synthesis of DNA brick nanostructures with dendritic oligonucleotides attached to the outer surface. We find that nanostructures assembled from DNA brick motifs remain stable against denaturation without any chemical modifications. Furthermore, densely coating the outer surface of DNA brick nanostructures with dendritic oligonucleotides prevents nuclease digestion.
Project description:Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposures that disrupt the podocyte foot process morphology result in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3',5'-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element-binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor-driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease.
Project description:In yeast, beta-oxidation of fatty acids (FAs) takes place in the peroxisome, an organelle whose size and number are controlled in response to environmental cues. The expression of genes required for peroxisome assembly and function is controlled by a transcriptional regulatory network that is induced by FAs such as oleate. The core FA-responsive transcriptional network consists of carbon source-sensing transcription factors that regulate key target genes through an overlapping feed-forward network motif (OFFNM). However, a systems-level understanding of the function of this network architecture in regulating dynamic FA-induced gene expression is lacking. The specific role of the OFFNM in regulating the dynamic and cell-population transcriptional response to oleate was investigated using a kinetic model comprising four core transcription factor genes (ADR1, OAF1, PIP2, and OAF3) and two reporter genes (CTA1 and POT1) that are indicative of peroxisome induction. Simulations of the model suggest that 1) the intrinsic Adr1p-driven feed-forward loop reduces the steady-state expression variability of target genes; 2) the parallel Oaf3p-driven inhibitory feed-forward loop modulates the dynamic response of target genes to a transiently varying oleate concentration; and 3) heterodimerization of Oaf1p and Pip2p does not appear to have a noise-reducing function in the context of oleate-dependent expression of target genes. The OFFNM is highly overrepresented in the yeast regulome, suggesting that the specific functions described for the OFFNM, or other properties of this motif, provide a selective advantage.
Project description:Toll-like receptors (TLRs) are essential for host defense. Although several TLRs reside on the cell surface, nucleic acid recognition by TLRs occurs intracellularly. For example, the receptor for CpG-containing bacterial and viral DNA, TLR9, is retained in the endoplasmic reticulum. Recent evidence suggests that the localization of TLR9 is critical for appropriate ligand recognition. Here we have defined which structural features of the TLR9 molecule control its intracellular localization. Both the cytoplasmic domain and the ectodomain of TLR9 contain sufficient information, whereas the transmembrane domain plays no role in intracellular localization. We identify a 14-amino acid stretch that directs TLR9 intracellularly and confers intracellular localization to the normally cell surface-expressed TLR4. Truncation or mutation of the cytoplasmic tail of TLR9 reveals a vesicle localization motif that targets early endosomes. We propose a model whereby modification of the cytoplasmic tail of TLR9 results in trafficking to early endosomes where it encounters CpG DNA.
Project description:RNA-binding proteins (RBPs) regulate splicing according to position-dependent principles, which can be exploited for analysis of regulatory motifs. Here we present RNAmotifs, a method that evaluates the sequence around differentially regulated alternative exons to identify clusters of short and degenerate sequences, referred to as multivalent RNA motifs. We show that diverse RBPs share basic positional principles, but differ in their propensity to enhance or repress exon inclusion. We assess exons differentially spliced between brain and heart, identifying known and new regulatory motifs, and predict the expression pattern of RBPs that bind these motifs. RNAmotifs is available at https://bitbucket.org/rogrro/rna_motifs.
Project description:In multicellular organisms, cell types must be produced and maintained in appropriate proportions. One way this is achieved is through committed progenitor cells that produce specific sets of descendant cell types. However, cell fate commitment is probabilistic in most contexts, making it difficult to infer progenitor states and understand how they establish overall cell type proportions. Here, we introduce Lineage Motif Analysis (LMA), a method that recursively identifies statistically overrepresented patterns of cell fates on lineage trees as potential signatures of committed progenitor states. Applying LMA to published datasets reveals spatial and temporal organization of cell fate commitment in zebrafish and rat retina and early mouse embryo development. Comparative analysis of vertebrate species suggests that lineage motifs facilitate adaptive evolutionary variation of retinal cell type proportions. LMA thus provides insight into complex developmental processes by decomposing them into simpler underlying modules.
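To illustrate what "statistically overrepresented patterns of cell fates on lineage trees" can mean in practice, here is a minimal sketch restricted to the simplest pattern, a pair of sister cells. Trees are nested tuples with fate labels as leaves, and the null model simply shuffles all terminal fates across the trees while keeping their shapes. The data layout, the function names, and this plain shuffling null are assumptions for illustration; the published method is recursive and handles larger patterns.

```python
import random
from collections import Counter

def leaves(tree):
    """Return terminal fates of a nested-tuple binary tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)

def relabel(tree, labels):
    """Rebuild `tree` with the same shape, taking fates from an iterator."""
    if isinstance(tree, str):
        return next(labels)
    left, right = tree
    return (relabel(left, labels), relabel(right, labels))

def doublet_counts(trees):
    """Count unordered sister-cell fate pairs over a list of trees."""
    counts = Counter()
    stack = list(trees)
    while stack:
        node = stack.pop()
        if isinstance(node, str):
            continue
        left, right = node
        if isinstance(left, str) and isinstance(right, str):
            counts[tuple(sorted((left, right)))] += 1
        else:
            stack.extend((left, right))
    return counts

def doublet_zscore(trees, motif, n_resamples=1000, seed=0):
    """z-score of a sister pair against fate-shuffled trees (simple null)."""
    rng = random.Random(seed)
    motif = tuple(sorted(motif))
    observed = doublet_counts(trees)[motif]
    pool = [fate for t in trees for fate in leaves(t)]
    null = []
    for _ in range(n_resamples):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        labels = iter(shuffled)
        null.append(doublet_counts([relabel(t, labels) for t in trees])[motif])
    mean = sum(null) / len(null)
    var = sum((x - mean) ** 2 for x in null) / len(null)
    return (observed - mean) / var ** 0.5 if var > 0 else float("nan")

# Made-up data: ten identical trees whose sisters always share a fate.
example_trees = [(("A", "A"), ("B", "B"))] * 10
print(doublet_zscore(example_trees, ("A", "A"), n_resamples=200))
```

A strongly positive z-score for a pattern would flag it as a candidate lineage motif; extending the same comparison to triplets and larger subtrees is where a recursive scheme becomes necessary.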