Generation of DNA cleavage specificities of type II restriction endonucleases by reassortment of target recognition domains.
ABSTRACT: Type II restriction endonucleases (REases) cleave double-stranded DNA at specific sites within or close to their recognition sequences. Shortly after their discovery in 1970, REases became one of the primary tools in molecular biology. However, the list of available specificities of type II REases is relatively short despite the extensive search for them in natural sources and multiple attempts to artificially change their specificity. In this study, we examined the possibility of generating cleavage specificities of REases by swapping putative target recognition domains (TRDs) between the type IIB enzymes AloI, PpiI, and TstI. Our results demonstrate that individual TRDs recognize distinct parts of the bipartite DNA targets of these enzymes and are interchangeable. Based on these properties, we engineered a functional type IIB REase having previously undescribed DNA specificity. Our study suggests that the TRD-swapping approach may be used as a general technique for the generation of type II enzymes with predetermined specificities.
Project description: For a very long time, Type II restriction enzymes (REases) have been a paradigm of ORFans: proteins with no detectable similarity to each other and to any other protein in the database, despite common cellular and biochemical function. Crystallographic analyses published until January 2008 provided high-resolution structures for only 28 of 1637 Type II REase sequences available in the Restriction Enzyme database (REBASE). Among these structures, all but two possess catalytic domains with the common PD-(D/E)XK nuclease fold. Two structures are unrelated to the others: R.BfiI exhibits the phospholipase D (PLD) fold, while R.PabI has a new fold termed 'half-pipe'. Thus far, bioinformatic studies supported by site-directed mutagenesis have extended the number of tentatively assigned REase folds to five (now also including the GIY-YIG and HNH folds identified earlier in homing endonucleases) and provided structural predictions for dozens of REase sequences without experimentally solved structures. Here, we present a comprehensive study of all Type II REase sequences available in REBASE together with their homologs detectable in the nonredundant and environmental samples databases at the NCBI. We present the summary and critical evaluation of structural assignments and predictions reported earlier, a new classification of all REase sequences into families, domain architecture analysis and new predictions of three-dimensional folds. Among 289 experimentally characterized (not putative) Type II REases, whose apparently full-length sequences are available in REBASE, we assign 199 (69%) to contain the PD-(D/E)XK domain. The HNH domain is the second most common, with 24 (8%) members. When putative REases are taken into account, the fraction of PD-(D/E)XK and HNH folds changes to 48% and 30%, respectively. Fifty-six characterized (and 521 predicted) REases remain unassigned to any of the five REase folds identified so far, and may exhibit new architectures. These enzymes are proposed as the most interesting targets for structure determination by high-resolution experimental methods. Our analysis provides the first comprehensive map of sequence-structure relationships among Type II REases and will help to focus the efforts of structural and functional genomics of this large and biotechnologically important class of enzymes.
Project description: Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered efforts at specificity engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering.
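The idea of a sequence motif carrying optional secondary-structure constraints can be illustrated with a small sketch. This is not the Scan2S algorithm, whose internals are not described here. It is a minimal Python illustration that assumes a hypothetical motif format: each position pairs a one-residue regular expression with the secondary-structure states allowed at that position (H, E or C, or "." for any). The function name `find_motif` and the toy sequence, structure string and motif are invented for the example.

```python
import re
from typing import List, Tuple

# Hypothetical motif format: one (residue_regex, allowed_states) pair per position.
# residue_regex must match exactly one residue, e.g. "D", "[DE]" or ".".
# allowed_states is a string of permitted states such as "E" or "EC"; "." means any.
Motif = List[Tuple[str, str]]


def find_motif(seq: str, ss: str, motif: Motif) -> List[int]:
    """Return 0-based start positions where residues and structure both match."""
    if len(seq) != len(ss):
        raise ValueError("sequence and structure strings must have equal length")
    # A lookahead lets overlapping candidate matches be reported as well.
    residue_re = re.compile("(?=(" + "".join(r for r, _ in motif) + "))")
    hits = []
    for m in residue_re.finditer(seq):
        start = m.start()
        window = ss[start:start + len(motif)]
        if all(allowed == "." or state in allowed
               for state, (_, allowed) in zip(window, motif)):
            hits.append(start)
    return hits


# Toy data, invented for illustration only.
toy_seq = "MADAEGKTTDLEAKQ"
toy_ss = "CCEEEECCHHHHHHC"
toy_motif = [("D", "E"), (".", "."), ("[DE]", "E"), (".", "."), ("K", "EC")]
unconstrained = [(residue, ".") for residue, _ in toy_motif]

if __name__ == "__main__":
    print(find_motif(toy_seq, toy_ss, toy_motif))      # structure-aware search
    print(find_motif(toy_seq, toy_ss, unconstrained))  # sequence-only search
```

In this toy case the sequence pattern occurs twice, but only one occurrence lies in a region whose structure string satisfies the constraints. That is the sense in which structural information can prune sequence-only matches.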
Project description: BACKGROUND: We previously defined a family of restriction endonucleases (REases) from Thermus sp., which share common biochemical and biophysical features, such as the fusion of both the nuclease and methyltransferase (MTase) activities in a single polypeptide, cleavage at a distance from the recognition site, large molecular size, modulation of activity by S-adenosylmethionine (SAM), and incomplete cleavage of the substrate DNA. Members include related thermophilic REases with five distinct specificities: TspGWI, TaqII, Tth111II/TthHB27I, TspDTI and TsoI. RESULTS: TspDTI, TsoI and isoschizomers Tth111II/TthHB27I recognize different, but related sequences: 5'-ATGAA-3', 5'-TARCCA-3' and 5'-CAARCA-3' respectively. Their amino acid sequences are similar, which is unusual among REases of different specificity. To gain insight into this group of REases, TspDTI, the prototype member of the Thermus sp. enzyme family, was cloned and characterized using a recently developed method for partially cleaving REases. CONCLUSIONS: TspDTI, TsoI and isoschizomers Tth111II/TthHB27I are closely related bifunctional enzymes. They comprise a tandem arrangement of Type I-like domains, like other Type IIC enzymes (those with a fusion of REase and MTase domains), e.g. TspGWI, TaqII and MmeI, but their sequences are only remotely similar to these previously characterized enzymes. The characterization of TspDTI, a prototype member of this group, extends our understanding of sequence-function relationships among multifunctional restriction-modification enzymes.
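The recognition sequences quoted above use IUPAC degenerate codes (R stands for A or G). As a minimal sketch of how such asymmetric sites can be located on either strand of a DNA sequence, the Python below expands the codes into regular expressions and also searches for the reverse complement. The helper names and the test sequence are invented for illustration. The sketch reports recognition sites only and does not model the downstream cleavage positions of these enzymes.

```python
import re
from typing import List, Tuple

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYWSKMBDHVN", "TGCAYRWSMKVHDBN")


def site_to_regex(site: str) -> str:
    """Expand a degenerate recognition site into a regex string."""
    return "".join(IUPAC[base] for base in site.upper())


def reverse_complement(site: str) -> str:
    """Reverse complement of a (possibly degenerate) site."""
    return site.upper().translate(COMPLEMENT)[::-1]


def find_sites(dna: str, site: str) -> List[Tuple[int, str]]:
    """Return sorted (position, strand) pairs for every occurrence of the site.

    position is the 0-based index of the leftmost base on the given (top) strand;
    strand is "+" if the site reads 5'->3' on the top strand, "-" otherwise.
    """
    hits = set()
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        regex = re.compile("(?=" + site_to_regex(pattern) + ")")
        for m in regex.finditer(dna.upper()):
            hits.add((m.start(), strand))
    return sorted(hits)


# Invented test sequence containing one site for each of the three specificities.
example_dna = "GGATGAACCTAGCCATTTGTTTGCC"

if __name__ == "__main__":
    for name, site in (("TspDTI", "ATGAA"), ("TsoI", "TARCCA"),
                       ("Tth111II/TthHB27I", "CAARCA")):
        print(name, site, find_sites(example_dna, site))
```

Because these sites are asymmetric, the strand label matters: a hit on the "-" strand means the recognition sequence reads 5' to 3' on the complementary strand.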
Project description: Type II restriction endonucleases (REases) are deoxyribonucleases that cleave DNA sequences with remarkable specificity. Type II REases are highly divergent in sequence as well as in topology, i.e. the connectivity of secondary structure elements. A widely held assumption is that a structural core of five β-strands flanked by two α-helices is common to these enzymes. We introduce a systematic procedure to enumerate secondary structure elements in an unambiguous and reproducible way, and use it to analyze the currently available X-ray structures of Type II REases. Based on this analysis, we propose an alternative definition of the core, which we term the αβα-core. The αβα-core includes the most frequently observed secondary structure elements and is not a sandwich, as it consists of a five-strand β-sheet and two α-helices on the same face of the β-sheet. We use the αβα-core connectivity as a basis for grouping the Type II REases into distinct structural classes. In these new structural classes, the connectivity correlates with the angles between the secondary structure elements and with the cleavage patterns of the REases. We show that there exists a substructure of the αβα-core, namely a common conserved core, ccc, defined here as one α-helix and four β-strands common to all Type II REases of known structure.
Project description: Restriction endonucleases (REases) protect bacteria from invading foreign DNA and are endowed with exquisite sequence specificity. REases have originated from ancestral proteins and evolved new sequence specificities by genetic recombination, gene duplication, replication slippage, and transpositional events. They are also speculated to have evolved from nonspecific endonucleases, attaining a high degree of sequence specificity through point mutations. We describe here an example of the generation of an exquisitely site-specific REase from a highly promiscuous one by a single point mutation.
Project description: BACKGROUND: In continuing our research into the new family of bifunctional restriction endonucleases (REases), we describe the cloning of the tsoIRM gene. Currently, the family includes six thermostable enzymes: TaqII, Tth111II, TthHB27I, TspGWI, TspDTI, TsoI, isolated from various Thermus sp. and two thermolabile enzymes: RpaI and CchII, isolated from mesophilic bacteria Rhodopseudomonas palustris and Chlorobium chlorochromatii, respectively. The enzymes have several properties in common. They are large proteins (molecular size app. 120 kDa), coded by fused genes, with the REase and methyltransferase (MTase) in a single polypeptide, where both activities are affected by S-adenosylmethionine (SAM). They recognize similar asymmetric cognate sites and cleave at a distance of 11/9 nt from the recognition site. Thus far, we have cloned and characterised TaqII, Tth111II, TthHB27I, TspGWI and TspDTI. RESULTS: TsoI REase, which originates from thermophilic Thermus scotoductus RFL4 (T. scotoductus), was cloned in Escherichia coli (E. coli) using two rounds of biochemical selection of the T. scotoductus genomic library for the TsoI methylation phenotype. DNA sequencing of restriction-resistant clones revealed the common open reading frame (ORF) of 3348 bp, coding for a large polypeptide of 1116 amino acid (aa) residues, which exhibited a high level of similarity to Tth111II (50% identity, 60% similarity). The ORF was PCR-amplified, subcloned into a pET21 derivative under the control of a T7 promoter and was subjected to the third round of biochemical selection in order to isolate error-free clones. Induction experiments resulted in synthesis of an app. 125 kDa protein, exhibiting TsoI-specific DNA cleavage. Also, the wild-type (wt) protein was purified and reaction optima were determined. CONCLUSIONS: Previously we identified and cloned the Thermus family RM genes using a specially developed method based on partial proteolysis of thermostable REases. In the case of TsoI the classic biochemical selection method was successful, probably because of the substantially lower optimal reaction temperature of TsoI (app. 10-15°C). That allowed for sufficient MTase activity in vivo in recombinant E. coli. Interestingly, TsoI originates from bacteria with a high optimum growth temperature of 67°C, which indicates that not all bacterial enzymes match an organism's thermophilic nature, and yet remain functional cell components. Besides basic research advances, the cloning and characterisation of the new prototype REase from the Thermus sp. family enzymes is also of practical importance in gene manipulation technology, as it extends the range of available DNA cleavage specificities.
Project description: Thus far, identification of functionally important residues in Type II restriction endonucleases (REases) has been difficult using conventional methods. Even though known REase structures share a fold and a marginally recognizable active site, the overall sequence similarities are statistically insignificant, unless compared among proteins that recognize identical or very similar sequences. Bsp6I is a Type II REase, which recognizes the palindromic DNA sequence 5'-GCNGC and cleaves between the cytosine and the unspecified nucleotide in both strands, generating a double-strand break with 5'-protruding single nucleotides. There are no solved structures of REases that recognize similar DNA targets or generate cleavage products with similar characteristics. In straightforward comparisons, the Bsp6I sequence shows no significant similarity to REases with known structures. However, using a fold-recognition approach, we have identified a remote relationship between Bsp6I and the structure of PvuII. Starting from the sequence-structure alignment between Bsp6I and PvuII, we constructed a homology model of Bsp6I and used it to predict functionally significant regions in Bsp6I. The homology model was supported by site-directed mutagenesis of residues predicted to be important for dimerization, DNA binding and catalysis. Completing the picture of sequence-structure-function relationships in protein superfamilies becomes an essential task in the age of structural genomics and our study may serve as a paradigm for future analyses of superfamilies comprising strongly diverged members with little or no sequence similarity.
Project description: Type II restriction endonucleases (REases) form a large and highly diverse group of enzymes. Even REases specific for a common recognition site often vary in their oligomeric structure, domain organization and DNA cleavage mechanisms. Here we report biochemical and structural characterization of the monomeric restriction endonuclease UbaLAI, specific for the pseudosymmetric DNA sequence 5'-CC/WGG-3' (where W = A/T, and '/' marks the cleavage position). We present a 1.6 Å co-crystal structure of UbaLAI N-terminal domain (UbaLAI-N) and show that it resembles the B3-family domain of EcoRII specific for the 5'-CCWGG-3' sequence. We also find that UbaLAI C-terminal domain (UbaLAI-C) is closely related to the monomeric REase MvaI, another enzyme specific for the 5'-CCWGG-3' sequence. Kinetic studies of UbaLAI revealed that it requires two recognition sites for optimal activity, and, like other type IIE enzymes, uses one copy of a recognition site to stimulate cleavage of a second copy. We propose that during the reaction UbaLAI-N acts as a handle that tethers the monomeric UbaLAI-C domain to the DNA, thereby helping UbaLAI-C to perform two sequential DNA nicking reactions on the second recognition site during a single DNA-binding event. A similar reaction mechanism may be characteristic of other monomeric two-domain REases.
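The notation 5'-CC/WGG-3' encodes both the recognition sequence and the cut position on the strand as written. As a rough sketch of how that notation can be read programmatically, the Python below splits one strand of a linear sequence at every marked position. It assumes complete cleavage of the top strand only and deliberately ignores the enzymology described above (the requirement for two sites, and the nicking of each strand in turn). The function name `digest_top_strand` and the test sequence are invented for illustration.

```python
import re
from typing import List

# Subset of IUPAC codes needed for this example (W = A or T, N = any base).
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def digest_top_strand(dna: str, site_with_cut: str) -> List[str]:
    """Split the top strand at every site; '/' in the site marks the cut position."""
    offset = site_with_cut.index("/")       # bases of the site left of the cut
    site = site_with_cut.replace("/", "")
    regex = re.compile("(?=" + "".join(CODES[base] for base in site) + ")")
    cuts = [m.start() + offset for m in regex.finditer(dna)]
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


# Invented sequence with one CCAGG and one CCTGG site.
example = "AACCAGGTTTTCCTGGAA"

if __name__ == "__main__":
    print(digest_top_strand(example, "CC/WGG"))
```

Joining the returned fragments reproduces the input sequence, which is a convenient sanity check when adapting the sketch to other cut notations.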
Project description: To explore the possibility of using restriction enzymes in a synthetic biology based on artificially expanded genetic information systems (AEGIS), 24 type-II restriction endonucleases (REases) were challenged to digest DNA duplexes containing recognition sites where individual Cs and Gs were replaced by the AEGIS nucleotides Z and P [respectively, 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one]. These AEGIS nucleotides implement complementary hydrogen bond donor-donor-acceptor and acceptor-acceptor-donor patterns. Results allowed us to classify type-II REases into five groups based on their performance, and to infer some specifics of their interactions with functional groups in the major and minor grooves of the target DNA. For three enzymes among these 24 where crystal structures are available (BcnI, EcoO109I and NotI), these interactions were modeled. Further, we applied a type-II REase to quantitate the fidelity of polymerases challenged to maintain in a DNA duplex C:G, T:A and Z:P pairs through repetitive PCR cycles. This work thus adds tools that are able to manipulate this expanded genetic alphabet in vitro, provides some structural insights into the working of restriction enzymes, and offers some preliminary data needed to take the next step in synthetic biology to use an artificial genetic system inside of living bacterial cells.
Project description: BACKGROUND: Restriction-modification (RM) systems appear to play key roles in modulating gene flow among bacteria and archaea. Because the restriction endonuclease (REase) is potentially lethal to unmethylated new host cells, regulation to ensure pre-expression of the protective DNA methyltransferase (MTase) is essential to the spread of RM genes. This is particularly true for Type IIP RM systems, in which the REase and MTase are separate, independently-active proteins. A substantial subset of Type IIP RM systems are controlled by an activator-repressor called C protein. In these systems, C controls the promoter for its own gene, and for the downstream REase gene that lacks its own promoter. Thus MTase is expressed immediately after the RM genes enter a new cell, while expression of REase is delayed until sufficient C protein accumulates. To study the variation in and evolution of this regulatory mechanism, we searched for RM systems closely related to the well-studied C protein-dependent PvuII RM system. Unexpectedly, among those found were several in which the C protein and REase genes were fused. RESULTS: The gene for CR.NsoJS138I fusion protein (nsoJS138ICR, from the bacterium Niabella soli) was cloned, and the fusion protein produced and partially purified. Western blots provided no evidence that, under the conditions tested, anything other than full-length fusion protein is produced. This protein had REase activity in vitro and, as expected from the sequence similarity, its specificity was indistinguishable from that for PvuII REase, though the optimal reaction conditions were different. Furthermore, the fusion was active as a C protein, as revealed by in vivo activation of a lacZ reporter fusion to the promoter region for the nsoJS138ICR gene. CONCLUSIONS: Fusions between C proteins and REases have not previously been characterized, though other fusions have (such as between REases and MTases). These results reinforce the evidence for impressive modularity among RM system proteins, and raise important questions about the implications of the C-REase fusions on expression kinetics of these RM systems.