
Dataset Information

Cas9 target DNA specificity


ABSTRACT: To study target sequence specificity, selectivity, and reaction kinetics of Streptococcus pyogenes Cas9 activity, we challenged libraries of random variant targets with purified Cas9::guide RNA complexes in vitro. Cleavage kinetics were nonlinear, with a burst of initial activity followed by slower sustained cleavage. Consistent with other recent analyses of Cas9 sequence specificity, we observe considerable (albeit incomplete) impairment of cleavage for targets mutated in the PAM sequence or in "seed" sequences matching the proximal 8 bp of the guide. A second target region requiring close homology was located at the other end of the guide::target duplex (positions 13-18 relative to the PAM). Strikingly, a subset of variants which broke homology in the intervening region consistently increased the capacity of Cas9 to cleave in extended reactions. Sequences flanking the guide+PAM region had measurable (albeit modest) effects on cleavage. Taken together, these studies provide both a basis for predicting effective cleavage targets and a basis for potential optimization of guide RNAs to yield efficiency beyond that of the simple perfect-match guides. 118 samples analyzed. Controls have "con" in the sample name. To quantitatively measure the cleavage efficiency of a single gRNA, we created a population of random variant target sequences for each of two gRNA targets. The targets used were "unc-22A", a sequence from the well-characterized unc-22 gene of Caenorhabditis elegans, and "protospacer 4" (ps4), a previously characterized sequence from a natural spacer of S. pyogenes MGAS10750. Using custom mixtures of oligonucleotide precursors for each base during chemical synthesis, a set of polymorphic target libraries ("Random Variant Libraries") was designed to have a baseline variation rate at each position. On each side of the gRNA homology and PAM regions, 6 bp of random sequence were added. The first base of intended gRNA homology is designated base 1.
The entire 35 bp random variant library mixture was cloned into a standard plasmid vector (pHRL-TK). Several thousand colonies from plates were washed in pools and prepared by standard plasmid preparation methods. The complexity of the libraries was estimated from Illumina sequencing of the uncut libraries, filtering for the minimum representation expected from the pooling. Approximately 1500-3000 unique species were obtained in the unc-22A libraries and 5000 unique sequences in the ps4 library (see Materials and Methods). To assay cleavage, purified Cas9 was first incubated with gRNA and then incubated with the variant library for various time points and under various conditions. DNA template is among the conditions varied in the experiments. After protein removal, flanking sequences outside of the target region were used for PCR amplification, and plasmid cleavage was measured through loss of PCR products that span the region of interest. A set of perfectly matched targets and highly mutated versions present in the random variant library served as internal positive and negative controls, respectively. A log retention score for each sequence in each experiment was calculated by quantifying the representation of each sequence before and after addition of the Cas9 protein. Two approaches were used for normalization. In the first, a population of ps4 targets "spiked" into the library served as an uncleaved control. In the second, the control was a population of unc-22A targets with large numbers of variations from the perfect target (between 4 and 7), and hence likely limited, if any, cleavage. Equivalent results are obtained with these two normalization approaches (see Computational Methods for details). Retention scores are expressed as the log2 of the normalized ratio, so that a more negative retention score indicates more efficient cleavage of the substrate, while a less negative score indicates less cleavage. Templates which are uncleaved will yield a retention score at or near zero.
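The retention score described above can be illustrated with a minimal sketch. It assumes read counts per sequence are available as plain dictionaries and that the uncleaved control population (for example, the spiked-in ps4 targets) is summed to give the normalizing ratio. The function name `retention_scores`, the dictionary layout, and the `pseudocount` parameter are illustrative assumptions, not the published pipeline (see Computational Methods in the associated publication for the actual procedure).

```python
import math


def retention_scores(before, after, control_ids, pseudocount=0.5):
    """Return log2 control-normalized after/before ratios per sequence.

    before, after: dicts mapping sequence ID -> read count (assumed layout).
    control_ids:   IDs of sequences treated as uncleaved controls.
    pseudocount:   hypothetical smoothing term to avoid log2(0); it is an
                   assumption of this sketch, not taken from the source.
    """
    # Pool the control population before and after Cas9 addition.
    ctrl_before = sum(before.get(s, 0) for s in control_ids) + pseudocount
    ctrl_after = sum(after.get(s, 0) for s in control_ids) + pseudocount
    control_ratio = ctrl_after / ctrl_before

    scores = {}
    for seq, n_before in before.items():
        ratio = (after.get(seq, 0) + pseudocount) / (n_before + pseudocount)
        # Near zero: uncleaved. More negative: more efficiently cleaved.
        scores[seq] = math.log2(ratio / control_ratio)
    return scores


# Toy counts, invented purely for illustration.
before = {"perfect": 1000, "mutant": 800, "spike": 2000}
after = {"perfect": 125, "mutant": 400, "spike": 1000}
scores = retention_scores(before, after, ["spike"], pseudocount=0.0)
```

In this toy example the control loses half its reads to sampling depth alone, so a sequence that also drops by half scores zero, while the perfect-match target, which drops eightfold, scores -2.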
Comparisons between multiple experiments indicate strong correlation between independent retention measurements.
GSM1410678-GSM1410761: the AF_SOL*.dat files contain the calculated final retentions for each experiment. Each experiment is labeled "AF_SOL_###_t###", where "AF_SOL_###" is the experiment run ID and "t###" is the incubation time of the experiment. For example, AF_SOL_513_t360 corresponds to experiment 513 on the protospacer 4 guide and DNA target, with an incubation time of 360 min. The experimental conditions and IDs can be found in the associated publication.
GSM1544297-GSM1544332: each unc*.dat file is a tab-delimited file of all considered sequences in one experiment. The file names and the AF_SOL_# run numbers can be found in the associated publication (Supplementary Materials), together with the details of the conditions. Each filename starts with the type of gRNA used (either unc-22WT or the mutant version unc22C11G). The next number (#min) indicates the incubation time for the experiment and is followed either by #pcr_AF_SOL_# or by just AF_SOL_#. If #pcr is present, it indicates the number of PCR cycles used in the experiment. Finally, AF_SOL_# denotes the sequencing run ID number.
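As a small illustration of the AF_SOL_###_t### labeling scheme, the sketch below extracts the run ID and incubation time from a label such as AF_SOL_513_t360. The names `ExperimentLabel` and `parse_label` are hypothetical helpers introduced here, and tolerating surrounding text such as a ".dat" suffix is an assumption. The unc*.dat filenames are not handled, because their exact separators are not spelled out in this record.

```python
import re
from typing import NamedTuple


class ExperimentLabel(NamedTuple):
    run_id: int   # the ### in AF_SOL_###
    minutes: int  # the ### in t###, incubation time in minutes


# Pattern follows the "AF_SOL_###_t###" convention described above.
_LABEL = re.compile(r"AF_SOL_(\d+)_t(\d+)")


def parse_label(name):
    """Parse an experiment label; raise ValueError if it does not match."""
    match = _LABEL.search(name)
    if match is None:
        raise ValueError(f"not an AF_SOL_###_t### label: {name!r}")
    return ExperimentLabel(int(match.group(1)), int(match.group(2)))


label = parse_label("AF_SOL_513_t360")
print(label.run_id, label.minutes)
```

Using `search` rather than `fullmatch` is a deliberate choice so that the same helper works on a bare label and on a filename containing it.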

ORGANISM(S): synthetic construct

SUBMITTER: Andrew Fire 

PROVIDER: E-GEOD-58426 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress

Similar Datasets

2016-06-10 | E-GEOD-79667 | biostudies-arrayexpress
2015-03-04 | E-GEOD-61099 | biostudies-arrayexpress
2014-05-18 | E-GEOD-55887 | biostudies-arrayexpress
2022-04-11 | E-MTAB-11597 | biostudies-arrayexpress
2013-07-25 | E-GEOD-47114 | biostudies-arrayexpress
2017-02-28 | E-MTAB-5205 | biostudies-arrayexpress
2015-11-25 | E-GEOD-75189 | biostudies-arrayexpress
2015-05-11 | E-GEOD-66755 | biostudies-arrayexpress
2022-01-24 | GSE167466 | GEO
2022-02-09 | E-MTAB-10924 | biostudies-arrayexpress