Structure and Biophysics for a Six Letter DNA Alphabet that Includes Imidazo[1,2-a]-1,3,5-triazine-2(8H)-4(3H)-dione (X) and 2,4-Diaminopyrimidine (K).
ABSTRACT: A goal of synthetic biology is to develop new nucleobases that retain the desirable properties of natural nucleobases at the same time as expanding the genetic alphabet. The nonstandard Watson-Crick pair between imidazo[1,2-a]-1,3,5-triazine-2(8H)-4(3H)-dione (X) and 2,4-diaminopyrimidine (K) does exactly this, pairing via complementary arrangements of hydrogen bonding in these two nucleobases, which do not complement any natural nucleobase. Here, we report the crystal structure of a duplex DNA oligonucleotide in B-form including two consecutive X:K pairs in GATCXK DNA determined as a host-guest complex at 1.75 Å resolution. X:K pairs have significant propeller twist angles, similar to those observed for A:T pairs, and a calculated hydrogen bonding pairing energy that is weaker than that of A:T. Thus, although inclusion of X:K pairs results in a duplex DNA structure that is globally similar to that of an analogous G:C structure, the X:K pairs locally and energetically more closely resemble A:T pairs.
Project description:Expanded genetic systems are most likely to work with natural enzymes if the added nucleotides pair with geometries that are similar to those displayed by standard duplex DNA. Here, we present crystal structures of 16-mer duplexes showing this to be the case with two nonstandard nucleobases (Z, 6-amino-5-nitro-2(1H)-pyridone and P, 2-amino-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one) that were designed to form a Z:P pair with a standard "edge on" Watson-Crick geometry, but joined by rearranged hydrogen bond donor and acceptor groups. One duplex, with four Z:P pairs, was crystallized with a reverse transcriptase host and adopts primarily a B-form. Another contained six consecutive Z:P pairs; it crystallized without a host in an A-form. In both structures, Z:P pairs fit canonical nucleobase hydrogen-bonding parameters and known DNA helical forms. Unique features include stacking of the nitro group on Z with the adjacent nucleobase ring in the A-form duplex. In both B- and A-duplexes, major groove widths for the Z:P pairs are approximately 1 Å wider than those of comparable G:C pairs, perhaps to accommodate the large nitro group on Z. Otherwise, ZP-rich DNA had many of the same properties as CG-rich DNA, a conclusion supported by circular dichroism studies in solution. The ability of standard duplexes to accommodate multiple and consecutive Z:P pairs is consistent with the ability of natural polymerases to biosynthesize those pairs. This, in turn, implies that the GACTZP synthetic genetic system can explore the entire expanded sequence space that additional nucleotides create, a major step forward in this area of synthetic biology.
Project description:The base-pairing fidelity of oligonucleotides depends on the identity of the nucleobases involved and the position of matched or mismatched base pairs in the duplex. Nucleobases that form weak base pairs, as well as terminal positions, favor mispairing. We have searched for 5'-appended acylamido caps that enhance the stability and base-pairing fidelity of oligonucleotides with a 5'-terminal 2'-deoxyadenosine residue using combinatorial synthesis and MALDI-monitored nuclease selections. This provided the residue of 4-(pyren-1-yl)butyric acid as a lead. Lead optimization gave (S)-N-(pyren-1-ylmethyl)pyrrolidine-3-phosphate as a cap that increases duplex stability and base-pairing fidelity. For the duplex of 5'-AGGTTGAC-3' with its fully complementary target, this cap gives an increase in the UV melting point T(m) of +10.9 °C. The T(m) is 6.3-8.3 °C lower when a mismatched nucleobase faces the 5'-terminal dA residue. The optimized cap can be introduced via automated DNA synthesis. It was combined with an anthraquinone carboxylic acid residue as a cap for the 3'-terminal residue. A doubly capped dodecamer thus prepared gives a melting point decrease for double-terminal mismatches that is 5.7-5.9 °C greater than that for the unmodified control duplex.
Project description:Expansion of the genetic alphabet has been a long-time goal of chemical biology. A third DNA base pair that is stable and replicable would have a great number of practical applications and would also lay the foundation for a semisynthetic organism. We have reported that DNA base pairs formed between deoxyribonucleotides with large aromatic, predominantly hydrophobic nucleobase analogues, such as propynylisocarbostyril (dPICS), are stable and efficiently synthesized by DNA polymerases. However, once incorporated into the primer, these analogues inhibit continued primer elongation. More recently, we have found that DNA base pairs formed between nucleobase analogues that have minimal aromatic surface area in addition to little or no hydrogen-bonding potential, such as 3-fluorobenzene (d3FB), are synthesized and extended by DNA polymerases with greatly increased efficiency. Here we show that the rate of synthesis and extension of the self-pair formed between two d3FB analogues is sufficient for in vitro DNA replication. To better understand the origins of efficient replication, we examined the structure of DNA duplexes containing either the d3FB or dPICS self-pairs. We find that the large aromatic rings of dPICS pair in an intercalative manner within duplex DNA, while the d3FB nucleobases interact in an edge-on manner, much closer in structure to natural base pairs. We also synthesized duplexes containing the 5-methyl-substituted derivatives of d3FB (d5Me3FB) paired opposite d3FB or the unsubstituted analogue (dBEN). In all, the data suggest that the structure, electrostatics, and dynamics can all contribute to the extension of unnatural primer termini. The results also help explain the replication properties of many previously examined unnatural base pairs and should help design unnatural base pairs that are better replicated.
Project description:The crystal structure of an 8-mer (S)-GNA duplex is presented. As a tool for phasing, the anomalous diffraction of two copper(II) ions within two artificial metallo-base pairs was employed. The duplex structure confirms a canonical Watson-Crick base pairing scheme of GNA with antiparallel strands. The duplex secondary structure is distinct from canonical A- and B-form nucleic acids and can be described as a right-handed helical ribbon wrapped around the helix axis, resulting in a large hollow core. Most intriguingly, neighboring base pairs slide strongly against each other, resulting in extensive interstrand base-base hydrophobic interactions along with unusual hydrophobic intrastrand interactions of nucleobases with their backbone. These results reveal how a minimal nucleic acid backbone can support highly stable Watson-Crick-like duplex formation.
Project description:BACKGROUND:Many synthetic biologists seek to increase the degree of autonomy in the assembly of long DNA (L-DNA) constructs from short synthetic DNA fragments, which are today quite inexpensive because of automated solid-phase synthesis. However, the low information density of DNA built from just four nucleotide "letters", the presence of strong (G:C) and weak (A:T) nucleobase pairs, the non-canonical folded structures that compete with Watson-Crick pairing, and other features intrinsic to natural DNA generally prevent the autonomous assembly of more than a dozen or so short single-stranded oligonucleotides. RESULTS:We describe a new strategy to autonomously assemble L-DNA constructs from fragments of synthetic single-stranded DNA. This strategy uses an artificially expanded genetic information system (AEGIS) that adds nucleotides to the four (G, A, C, and T) found in standard DNA by shuffling hydrogen-bonding units on the nucleobases, all while retaining the overall Watson-Crick base-pairing geometry. The added information density allows larger numbers of synthetic fragments to self-assemble without off-target hybridization, hairpin formation, and non-canonical folding interactions. The AEGIS pairs are then converted into standard pairs to produce a fully natural L-DNA product. Here, we report the autonomous assembly of a gene encoding kanamycin resistance using this strategy. Synthetic fragments were built from a six-letter alphabet having two AEGIS components, 5-methyl-2'-deoxyisocytidine and 2'-deoxyisoguanosine (respectively S and B), at their overlapping ends. Gaps in the overlapped assembly were then filled in using DNA polymerases, and the nicks were sealed by ligase. The S:B pairs in the ligated construct were then converted to T:A pairs during PCR amplification. When cloned into a plasmid, the product was shown to make Escherichia coli resistant to kanamycin. A parallel study that attempted to assemble similarly sized genes with optimally designed standard nucleotides lacking AEGIS components gave successful assemblies of up to 16 fragments, but generally failed when larger autonomous assemblies were attempted. CONCLUSION:AEGIS nucleotides, by increasing the information density of DNA, allow larger numbers of DNA fragments to autonomously self-assemble into large DNA constructs. This technology can therefore increase the size of DNA constructs that might be used in synthetic biology.
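The "information density" argument in this abstract can be illustrated with a minimal Python sketch. It assumes only the pairing rules named above (A:T, G:C, and the AEGIS pair S:B); the function names are hypothetical, and the code is not the design software used in the study.

```python
import math

# Pairing rules for the six-letter alphabet named in the abstract:
# standard A:T and G:C, plus the AEGIS pair S:B.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "S": "B", "B": "S"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs antiparallel with `seq`."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def bits_per_position(alphabet_size: int) -> float:
    """Maximum information per nucleotide position, in bits."""
    return math.log2(alphabet_size)


def distinct_sequences(alphabet_size: int, length: int) -> int:
    """Number of different sequences of a given length."""
    return alphabet_size ** length


if __name__ == "__main__":
    print(reverse_complement("GASB"))   # SBTC
    print(bits_per_position(4))         # 2.0
    print(bits_per_position(6))         # about 2.585
    print(distinct_sequences(4, 10), distinct_sequences(6, 10))
```

For a 10-nucleotide overlap, six letters give 60,466,176 possible sequences against 1,048,576 for four letters. That leaves more room to choose fragment ends that do not cross-hybridize, which is the qualitative point the abstract makes.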
Project description:In 1957, a unique pattern of hydrogen bonding between N3 and O4 on uracil and N7 and N6 on adenine was proposed to explain how poly(rU) strands can associate with poly(rA)-poly(rU) duplexes to form triplexes. Two years later, Karst Hoogsteen visualized such a noncanonical A-T base-pair through X-ray analysis of co-crystals containing 9-methyladenine and 1-methylthymine. Subsequent X-ray analyses of guanine and cytosine derivatives yielded the expected Watson-Crick base-pairing, but those of adenine and thymine (or uridine) did not yield Watson-Crick base-pairs, instead favoring "Hoogsteen" base-pairing. More than two decades ensued without experimental "proof" for A-T Watson-Crick base-pairs, while Hoogsteen base-pairs continued to surface in AT-rich sequences, closing base-pairs of apical loops, in structures of DNA bound to antibiotics and proteins, damaged and chemically modified DNA, and in polymerases that replicate DNA via Hoogsteen pairing. Recently, NMR studies have shown that base-pairs in duplex DNA exist as a dynamic equilibrium between Watson-Crick and Hoogsteen forms. There is now little doubt that Hoogsteen base-pairs exist in significant abundance in genomic DNA, where they can expand the structural and functional versatility of duplex DNA beyond that which can be achieved based only on Watson-Crick base-pairing. Here, we provide a historical account of the discovery and characterization of Hoogsteen base-pairs, hoping that this will inform future studies exploring the occurrence and functional importance of these alternative base-pairs.
Project description:The formation of canonical base pairs through Watson-Crick hydrogen bonding sits at the heart of the genetic apparatus. The specificity of the base pairing of adenine with thymine/uracil and guanine with cytosine preserves accurate information for the biochemical blueprint and replicates the instructions necessary for carrying out biological function. The chemical evolution question of how these five canonical nucleobases were selected over various other possibilities remains intriguing. Since these and alternative nucleobases would have been available for chemical evolution, the reasons for the emergence of this system appear to be primarily functional. While investigating the base-pairing properties of structural nucleic acid analogs, we encountered a relationship between the pK(a) of a series of nonstandard (and canonical) nucleobases and the pH of the aqueous medium. This relationship appeared to correspond with the propensity of these molecules to self-assemble via Watson-Crick-type base-pairing interactions. A simple correlation of the "magnitude of the difference between the pK(a) and pH" (pK(a)-pH correlation) enables a general prediction of which types of heterocyclic recognition elements form hydrogen-bonded base pairs in aqueous media. Using the pK(a)-pH relationship, we can rationalize why nature chose the canonical nucleobases in terms of hydrophobic and hydrophilic interactions, and further extrapolate its significance within the context of chemical evolution. The connection between the physicochemical properties of bioorganic compounds and the interactions with their aqueous environment directly affects structure and function, at both a molecular and a supramolecular level. A general structure-function pattern emerges in biomolecules and biopolymers in aqueous media near neutral pH. 
A |pK(a) - pH| difference of less than 2 generally promotes catalytic functions, central to metabolism, whereas a difference greater than 2 seems to result in the emergence of structure, central to replication. While this general trend is observed throughout extant biology, it could also have been an important factor in chemical evolution.
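The pK(a)-pH rule of thumb stated above can be written as a short Python sketch. It assumes that the "magnitude of the difference" means the absolute value |pK(a) - pH| and that the threshold is 2 units, as the abstract suggests. The function name, the labels, and the treatment of a difference of exactly 2 are hypothetical choices, and the input values in the usage lines are illustrative numbers, not measurements from the study.

```python
def classify_by_pka(pka: float, ph: float = 7.0, threshold: float = 2.0) -> str:
    """Apply the |pKa - pH| rule of thumb from the abstract.

    Returns "catalysis" when |pKa - pH| < threshold and "structure"
    otherwise. Assigning the exact boundary value to "structure" is an
    assumption; the abstract does not say how to treat it.
    """
    difference = abs(pka - ph)
    return "catalysis" if difference < threshold else "structure"


if __name__ == "__main__":
    # Illustrative pKa values only (hypothetical inputs).
    print(classify_by_pka(6.0))   # catalysis: |6.0 - 7.0| = 1.0
    print(classify_by_pka(9.5))   # structure: |9.5 - 7.0| = 2.5
```

The medium's pH is a parameter because the rule concerns the difference between pK(a) and pH, not the pK(a) alone.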
Project description:Structural variations of the well-known guanine quartet (G4) motif in nucleic acid structures, namely substitution of two guanine bases (G) by two adenine (A) nucleobases in mutual trans positions, are discussed and studied by density functional theory (DFT) methods. This work was initiated by three findings: (1) that GA mismatches are compatible with complementary pairing patterns in duplex-DNA structures and can, in principle, be extended to quartet structures, (2) that GA pairs can come in several variations, including with an N1-protonated adeninium moiety (AH), and (3) that cross-linking of the major donor sites of purine nucleobases (N1 and N7) by transition metal ions of linear coordination geometries produces planar purine quartets, as demonstrated by some of us in the past. Here, possible structures of mixed AGAG quartets both in the presence of protons and alkali metal ions are discussed, and in particular, the existence of a putative four-purine, two-metal motif.
Project description:Evolution has yielded biopolymers that are constructed from exactly four building blocks and are able to support Darwinian evolution. Synthetic biology aims to extend this alphabet, and we recently showed that 8-letter (hachimoji) DNA can support rule-based information encoding. One source of replicative error in non-natural DNA-like systems, however, is the occurrence of alternative tautomeric forms, which pair differently. Unfortunately, little is known about how structural modifications impact free-energy differences between tautomers of the non-natural nucleobases used in the hachimoji expanded genetic alphabet. Determining experimental tautomer ratios is technically difficult, and so, strategies for improving hachimoji DNA replication efficiency will benefit from accurate computational predictions of equilibrium tautomeric ratios. We now report that high-level quantum-chemical calculations in aqueous solution by the embedded cluster reference interaction site model, benchmarked against free-energy molecular simulations for solvation thermodynamics, provide useful quantitative information on the tautomer ratios of both Watson-Crick and hachimoji nucleobases. In agreement with previous computational studies, all four Watson-Crick nucleobases adopt essentially only one tautomer in water. This is not the case, however, for non-natural nucleobases and their analogues. For example, although the enols of isoguanine and a series of related purines are not populated in water, these heterocycles possess N1-H and N3-H keto tautomers that are similar in energy, thereby adversely impacting accurate nucleobase pairing. These robust computational strategies offer a firm basis for improving experimental measurements of tautomeric ratios, which are currently limited to studying molecules that exist only as two tautomers in solution.
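The link between tautomer free-energy differences and equilibrium tautomer ratios can be sketched in Python using the standard relation K = exp(-ΔG/RT). This is not the embedded cluster reference interaction site model described above; it only converts free-energy differences, however they were obtained, into populations. The function names and the default temperature of 298.15 K are assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def tautomer_ratio(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Equilibrium ratio [minor]/[major] for a free-energy gap in kcal/mol."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


def populations(relative_g_kcal: list[float], temperature_k: float = 298.15) -> list[float]:
    """Boltzmann populations for any number of tautomers.

    `relative_g_kcal` holds each tautomer's free energy relative to a
    common reference, in kcal/mol.
    """
    rt = R_KCAL * temperature_k
    weights = [math.exp(-g / rt) for g in relative_g_kcal]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two tautomers of equal free energy are equally populated.
    print(populations([0.0, 0.0]))  # [0.5, 0.5]
    # A gap of RT*ln(10), about 1.36 kcal/mol at 298.15 K, gives a 1:10 ratio.
    print(tautomer_ratio(R_KCAL * 298.15 * math.log(10)))
```

Because the dependence is exponential, tautomers that are "similar in energy", such as the N1-H and N3-H keto forms mentioned in the abstract, are both substantially populated, while a gap of a few kcal/mol leaves essentially one form.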
Project description:If two nucleobases are crosslinked by trans-a(2)Pt(II), self-association via H bonding may take place either through individual bases or jointly through both bases. Due to the blockage of an acceptor site by the metal, the number of feasible pairing patterns can be reduced, and the preferred ones altered. If the metalated base pair as a whole undergoes association, base quartets can form. Various scenarios resulting from the application of guanine, hypoxanthine, and cytosine model nucleobases are discussed. Unconventional CH···N hydrogen bonding has been observed in several instances.