AccessFold: predicting RNA-RNA interactions with consideration for competing self-structure.
ABSTRACT: There are numerous examples of RNA-RNA complexes, including microRNA-mRNA and small RNA-mRNA duplexes for regulation of translation, guide RNA interactions with target RNA for post-transcriptional modification and small nuclear RNA duplexes for splicing. Predicting the base pairs formed between two interacting sequences remains difficult, at least in part because of the competition between unimolecular and bimolecular structure. Two algorithms were developed for improved prediction of bimolecular RNA structure that consider the competition between self-structure and bimolecular structure. These algorithms utilize two novel approaches to evaluate accessibility: free energy density minimization and pseudo-energy minimization. Free energy density minimization minimizes the folding free energy change per nucleotide involved in an intermolecular secondary structure. Pseudo-energy minimization (called AccessFold) minimizes the sum of free energy change and a pseudo-free energy penalty for bimolecular pairing of nucleotides that are unlikely to be accessible for bimolecular structure. The pseudo-free energy, derived from unimolecular pairing probabilities, is applied per nucleotide in bimolecular pairs, and this approach is able to predict binding sites that are split by unimolecular structures. A benchmark set of 17 bimolecular RNA structures was assembled to assess structure prediction. Pseudo-energy minimization provides a statistically significant improvement in sensitivity over the previously available method that a benchmark found to be the most accurate, with an improvement from 36.8% to 57.8% in mean sensitivity for base pair prediction. Pseudo-energy minimization is available for download as AccessFold, under an open-source license and as part of the RNAstructure package, at: http://rna.urmc.rochester.edu/RNAstructure. Supplementary data are available at Bioinformatics online.
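The quantities named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AccessFold implementation: the function names are hypothetical, the per-nucleotide penalties are supplied by the caller because the abstract does not give the formula that maps unimolecular pairing probabilities to penalties, and sensitivity is computed with exact pair matching.

```python
def free_energy_density(delta_g, n_nucleotides):
    """Folding free energy change per nucleotide in the intermolecular structure."""
    if n_nucleotides <= 0:
        raise ValueError("n_nucleotides must be positive")
    return delta_g / n_nucleotides


def pseudo_energy(delta_g, bimolecular_nucleotides, penalty):
    """Free energy change plus a per-nucleotide pseudo-free energy penalty.

    `penalty` maps a nucleotide index to its penalty (kcal/mol). How the
    penalty is derived from pairing probabilities is NOT specified in the
    abstract, so the values are treated as given inputs here.
    """
    return delta_g + sum(penalty.get(i, 0.0) for i in bimolecular_nucleotides)


def sensitivity(predicted_pairs, known_pairs):
    """Fraction of known base pairs that appear in the prediction.

    Assumption: a pair counts only on an exact match of both indices.
    """
    known = {tuple(sorted(p)) for p in known_pairs}
    predicted = {tuple(sorted(p)) for p in predicted_pairs}
    if not known:
        raise ValueError("known structure has no base pairs")
    return len(known & predicted) / len(known)
```

For example, `sensitivity([(1, 10), (2, 9)], [(10, 1), (3, 8)])` recovers one of the two known pairs.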
Project description:Almost all RNAs can fold to form extensive base-paired secondary structures. Many of these structures then modulate numerous fundamental elements of gene expression. Deducing these structure-function relationships requires that it be possible to predict RNA secondary structures accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure for a single sequence reliably represents the correct structure, has remained an unsolved problem. Here, we demonstrate that quantitative, nucleotide-resolution information from a SHAPE experiment can be interpreted as a pseudo-free energy change term and used to determine RNA secondary structure with high accuracy. Free energy minimization, by using SHAPE pseudo-free energies, in conjunction with nearest neighbor parameters, predicts the secondary structure of deproteinized Escherichia coli 16S rRNA (>1,300 nt) and a set of smaller RNAs (75-155 nt) with accuracies of up to 96-100%, which are comparable to the best accuracies achievable by comparative sequence analysis.
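A minimal sketch of turning a SHAPE reactivity into a pseudo-free energy term. The logarithmic form ΔG_SHAPE(i) = m ln(reactivity_i + 1) + b, with m = 2.6 and b = -0.8 kcal/mol, is the commonly cited parameterization for this approach; neither the form nor the values are stated in the abstract above, so they are assumptions here. The function name and the handling of missing data are also illustrative choices.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free energy (kcal/mol) for one nucleotide from its SHAPE reactivity.

    Assumed form: m * ln(reactivity + 1) + b. The defaults m = 2.6 and
    b = -0.8 are assumptions, not values taken from the abstract.
    A reactivity of None (no data) contributes nothing.
    """
    if reactivity is None:
        return 0.0
    if reactivity <= -1.0:
        raise ValueError("reactivity must be greater than -1")
    return m * math.log(reactivity + 1.0) + b
```

Under these assumed parameters, an unreactive nucleotide (reactivity 0) receives a favorable term equal to b, while highly reactive nucleotides are penalized for pairing.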
Project description:Trinucleotide bulges in RNA commonly occur in nature. Yet, little data exists concerning the thermodynamic parameters of this motif. Algorithms that predict RNA secondary structure from sequence currently attribute a constant free energy value of 3.2 kcal/mol to all trinucleotide bulges, regardless of bulge sequence. To test the accuracy of this model, RNA duplexes that contain frequent naturally occurring trinucleotide bulges were optically melted, and their thermodynamic parameters (enthalpy, entropy, free energy, and melting temperature) were determined. The thermodynamic data were used to derive a new model to predict the free energy contribution of trinucleotide bulges to RNA duplex stability: $\Delta G^\circ_{37,\text{trint bulge}} = \Delta G^\circ_{37,\text{bulge}} + \Delta G^\circ_{37,\text{AU}} + \Delta G^\circ_{37,\text{GU}}$. The parameter $\Delta G^\circ_{37,\text{bulge}}$ is variable depending upon the purine and pyrimidine composition of the bulge, $\Delta G^\circ_{37,\text{AU}}$ is a 0.49 kcal/mol penalty for an A-U closing pair, and $\Delta G^\circ_{37,\text{GU}}$ is a -0.56 kcal/mol bonus for a G-U closing pair. With both closing pair and bulge sequence taken into account, this new model predicts free energy values within 0.30 kcal/mol of the experimental value. The new model can be used by algorithms that predict RNA free energies as well as algorithms that use free energy minimization to predict RNA secondary structure from sequence.
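The additive trinucleotide bulge model above can be sketched as follows. The abstract gives the closing-pair terms (0.49 and -0.56 kcal/mol) but not the composition-dependent bulge values, so the bulge term is passed in by the caller. Applying the closing-pair terms once per A-U or G-U closing pair is an assumption of this sketch, and the function name is hypothetical.

```python
AU_CLOSING_PENALTY = 0.49   # kcal/mol, from the abstract
GU_CLOSING_BONUS = -0.56    # kcal/mol, from the abstract


def trinucleotide_bulge_dg(bulge_term, n_au_closing=0, n_gu_closing=0):
    """Free energy contribution (kcal/mol) of a trinucleotide bulge.

    bulge_term: composition-dependent bulge value; NOT given in the
        abstract, so the caller must supply it.
    n_au_closing, n_gu_closing: number of A-U and G-U closing pairs
        (assumed here to be counted per pair, at most two in total).
    """
    if n_au_closing < 0 or n_gu_closing < 0 or n_au_closing + n_gu_closing > 2:
        raise ValueError("a bulge has at most two closing pairs")
    return (bulge_term
            + n_au_closing * AU_CLOSING_PENALTY
            + n_gu_closing * GU_CLOSING_BONUS)
```

With a placeholder bulge term of 3.0 kcal/mol, one A-U and one G-U closing pair give 3.0 + 0.49 - 0.56 = 2.93 kcal/mol.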
Project description:Predicting the secondary structure of RNA is an intermediate in predicting RNA three-dimensional structure. Commonly, determining RNA secondary structure from sequence uses free energy minimization and nearest neighbor parameters. Current algorithms utilize a sequence-independent model to predict free energy contributions of dinucleotide bulges. To determine if a sequence-dependent model would be more accurate, short RNA duplexes containing dinucleotide bulges with different sequences and nearest neighbor combinations were optically melted to derive thermodynamic parameters. These data suggested energy contributions of dinucleotide bulges were sequence-dependent, and a sequence-dependent model was derived. This model assigns free energy penalties based on the identity of nucleotides in the bulge (3.06 kcal/mol for two purines, 2.93 kcal/mol for two pyrimidines, 2.71 kcal/mol for 5'-purine-pyrimidine-3', and 2.41 kcal/mol for 5'-pyrimidine-purine-3'). The predictive model also includes a 0.45 kcal/mol penalty for an A-U pair adjacent to the bulge and a -0.28 kcal/mol bonus for a G-U pair adjacent to the bulge. The new sequence-dependent model results in predicted values within, on average, 0.17 kcal/mol of experimental values, a significant improvement over the sequence-independent model. This model and new experimental values can be incorporated into algorithms that predict RNA stability and secondary structure from sequence.
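The sequence-dependent dinucleotide bulge model lends itself to a small lookup. The sketch below uses only the values quoted in the abstract; the function name is hypothetical, and counting the adjacent-pair terms once per adjacent A-U or G-U pair is an assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

# Penalties (kcal/mol) from the abstract, keyed by the 5'->3' bulge composition:
# R = purine, Y = pyrimidine.
BULGE_PENALTY = {"RR": 3.06, "YY": 2.93, "RY": 2.71, "YR": 2.41}
AU_ADJACENT_PENALTY = 0.45
GU_ADJACENT_BONUS = -0.28


def _classify(nucleotide):
    if nucleotide in PURINES:
        return "R"
    if nucleotide in PYRIMIDINES:
        return "Y"
    raise ValueError(f"unknown nucleotide: {nucleotide!r}")


def dinucleotide_bulge_dg(bulge, n_au_adjacent=0, n_gu_adjacent=0):
    """Free energy contribution (kcal/mol) of a dinucleotide bulge.

    bulge: two-letter RNA sequence written 5'->3', e.g. "AG".
    n_au_adjacent, n_gu_adjacent: counts of adjacent A-U and G-U pairs
        (per-pair counting is an assumption of this sketch).
    """
    bulge = bulge.upper()
    if len(bulge) != 2:
        raise ValueError("bulge must contain exactly two nucleotides")
    key = _classify(bulge[0]) + _classify(bulge[1])
    return (BULGE_PENALTY[key]
            + n_au_adjacent * AU_ADJACENT_PENALTY
            + n_gu_adjacent * GU_ADJACENT_BONUS)
```

For instance, an "AG" bulge (two purines) next to one A-U pair gives 3.06 + 0.45 = 3.51 kcal/mol.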
Project description:Thermodynamic processes with free energy parameters are often used in algorithms that solve the free energy minimization problem to predict secondary structures of single RNA sequences. While results from these algorithms are promising, an observation is that single sequence-based methods have moderate accuracy and more information is needed to improve on RNA secondary structure prediction, such as covariance scores obtained from multiple sequence alignments. We present in this paper a new approach to predicting the consensus secondary structure of a set of aligned RNA sequences via pseudo-energy minimization. Our tool, called RSpredict, takes into account sequence covariation and employs effective heuristics for accuracy improvement. RSpredict accepts, as input data, a multiple sequence alignment in FASTA or ClustalW format and outputs the consensus secondary structure of the input sequences in both the Vienna style Dot Bracket format and the Connectivity Table format. Our method was compared with some widely used tools including KNetFold, Pfold and RNAalifold. A comprehensive test on different datasets including Rfam sequence alignments and a multiple sequence alignment obtained from our study on the Drosophila X chromosome reveals that RSpredict is competitive with the existing tools on the tested datasets. RSpredict is freely available online as a web server and also as a jar file for download at http://datalab.njit.edu/biology/RSpredict.
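As an illustration of the Vienna-style dot-bracket output mentioned above, the following sketch converts a dot-bracket string into a list of base pairs. It is a generic parser for the notation, not code from RSpredict; the function name is hypothetical, and only `(`, `)` and `.` are handled (no pseudoknot brackets).

```python
def dot_bracket_to_pairs(structure):
    """Return sorted 1-based (i, j) base pairs encoded by a dot-bracket string."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for position, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {position}")
            pairs.append((stack.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {position}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)
```

A pair list of this kind is also what a Connectivity Table records, with one row per nucleotide and its pairing partner.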
Project description:Non-coding RNAs perform a wide range of functions inside living cells that are related to their structures. Several algorithms have been proposed to predict RNA secondary structure based on minimum free energy. The low prediction accuracy of these algorithms indicates that free energy alone is not sufficient to predict the functional secondary structure. Recently, information obtained from SHAPE experiments has greatly improved the accuracy of RNA secondary structure prediction when added to the thermodynamic free energy as a pseudo-free energy. In this paper, a new method is proposed to predict RNA secondary structure based on both free energy and SHAPE pseudo-free energy. For each RNA sequence, a population of secondary structures is constructed and their SHAPE data are simulated. Then, an evolutionary algorithm is used to improve each structure based on both free and pseudo-free energies. Finally, the structure with the minimum sum of free and pseudo-free energies is taken as the predicted RNA secondary structure. Computationally simulating the SHAPE data for a given RNA sequence requires its secondary structure. Here, we overcome this limitation by employing a population of secondary structures. This helps us to simulate the SHAPE data for any RNA sequence and consequently improves the accuracy of RNA secondary structure prediction, as confirmed by our experiments. The source code and web server of our proposed method are freely available at http://mostafa.ut.ac.ir/ESD-Fold/.
Project description:A computer program, OligoWalk, is reported that predicts the equilibrium affinity of complementary DNA or RNA oligonucleotides to an RNA target. This program considers the predicted stability of the oligonucleotide-target helix and the competition with predicted secondary structure of both the target and the oligonucleotide. Both unimolecular and bimolecular oligonucleotide self structure are considered with a user-defined concentration. The application of OligoWalk is illustrated with three comparisons to experimental results drawn from the literature.
Project description:Arginine kinase belongs to the family of enzymes, including creatine kinase, that catalyze the buffering of ATP in cells with fluctuating energy requirements and that has been a paradigm for classical enzymological studies. The 1.86-Å resolution structure of its transition-state analog complex, reported here, reveals its active site and offers direct evidence for the importance of precise substrate alignment in the catalysis of bimolecular reactions, in contrast to the unimolecular reactions studied previously. In the transition-state analog complex studied here, a nitrate mimics the planar gamma-phosphoryl during associative in-line transfer between ATP and arginine. The active site is unperturbed, and the reactants are not constrained covalently as in a bisubstrate complex, so it is possible to measure how precisely they are pre-aligned by the enzyme. Alignment is exquisite. Entropic effects may contribute to catalysis, but the lone-pair orbitals are also aligned close enough to their optimal trajectories for orbital steering to be a factor during nucleophilic attack. The structure suggests that polarization, strain toward the transition state, and acid-base catalysis also contribute, but, in contrast to unimolecular enzyme reactions, their role appears to be secondary to substrate alignment in this bimolecular reaction.
Project description:CAG trinucleotide repeats are known to cause 10 late-onset progressive neurodegenerative disorders as the repeats expand beyond a threshold, whereas GAC repeats are associated with skeletal dysplasias and expand from the normal five to a maximum of seven repeats. The TR secondary structure is believed to play a role in CAG expansions. We have carried out free energy and molecular dynamics studies to determine the preferred conformations of the A-A noncanonical pairs in (CAG)n and (GAC)n trinucleotide repeats (n = 1, 4) and the consequent changes in the overall structure of the RNA and DNA duplexes. We find that the global free energy minimum corresponds to A-A pairs stacked inside the core of the helix with anti-anti conformations in RNA and (high-anti)-(high-anti) conformations in DNA. The next minimum corresponds to anti-syn conformations, whereas syn-syn conformations are higher in energy. Transition rates of the A-A conformations are higher for RNA than DNA. Mechanisms for these various transitions are identified. Additional structural and dynamical aspects of the helical conformations are explored, with a focus on contrasting CAG and GAC duplexes. The neutralizing ion distribution around the noncanonical pairs is described.
Project description:Mechanochemistry, i.e. the application of forces, F, at the molecular level, has attracted significant interest as a means of controlling chemical reactions. The present study uses quantum chemical calculations to explore the abilities to mechanically eliminate activation energies, ΔE‡, for unimolecular and bimolecular reactions. The results demonstrate that ΔE‡ can be eliminated for unimolecular reactions by applying sufficiently large F along directions that move the reactant and/or transition state (TS) structures parallel to the zero-F reaction coordinate, S0. In contrast, eliminating ΔE‡ for bimolecular reactions requires the reactant to undergo a force-induced shift parallel to S0 irrespective of changes in the TS. Meeting this requirement depends upon the coupling between F and S0 in the reactant. The insights regarding the differences in eliminating ΔE‡ for unimolecular and bimolecular reactions, and the requirements for eliminating ΔE‡, may be useful in practical efforts to control reactions mechanochemically.
Project description:Scorpion primers can be used to detect PCR products in homogeneous solution. Their structure promotes a unimolecular probing mechanism. We compare their performance with that of the same probe sequence forced to act in a bimolecular manner. The data suggest that Scorpions indeed probe by a unimolecular mechanism which is faster and more efficient than the bimolecular mechanism. This mechanism is not dependent on enzymatic cleavage of the probe. A direct comparison between Scorpions, TaqMan and Molecular Beacons on a Roche LightCycler indicates that Scorpions perform better, particularly under fast cycling conditions. Development of a cystic fibrosis mutation detection assay shows that Scorpion primers are selective enough to detect single base mutations and give good sensitivity in all cases. Simultaneous detection of both normal and mutant alleles in a single reaction is possible by combining two Scorpions in a multiplex reaction. Such favourable properties of Scorpion primers should make the technology ideal in numerous applications.