Project description:We developed a single-cell massively parallel reporter assay (scMPRA) to measure the activity of libraries of cis-regulatory sequences (CRSs) across multiple cell-types simultaneously. As a proof of concept, we assayed a library of core promoters in a mixture of HEK293 and K562 cells and showed that scMPRA is a reproducible, highly parallel, single-cell reporter gene assay. Our results show that housekeeping promoters and CpG island promoters have lower activity in K562 cells relative to HEK293, which likely reflects developmental differences between the cell lines. Within K562 cells, scMPRA identified a subset of developmental promoters that are upregulated in the CD34+/CD38- sub-state, confirming this state as more “stem-like.” Finally, we deconvolved the intrinsic and extrinsic components of cell-to-cell variability and found that developmental promoters have a higher proportion of extrinsic noise compared to housekeeping promoters. We anticipate scMPRA will be widely applicable for studying the role of CRSs across diverse cell types. Overall design: single-cell massively parallel reporter assay of a core promoter library with 676 members; each sequence is barcoded 3 times with a 16 bp DNA barcode. A second, 25 bp random barcode was added during cloning. The library was sequenced using bulk RNA-seq and single-cell RNA-seq in K562 and HEK293 cells. Each experiment contains 2 replicates.
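The split of cell-to-cell variability into intrinsic and extrinsic components mentioned above can be illustrated with the classic dual-reporter estimator. This is a minimal sketch and not necessarily the procedure used in this project: it assumes that each cell yields two paired measurements of the same promoter (for example, two independently barcoded copies), and the function name `dual_reporter_noise` and the toy values are hypothetical.

```python
import numpy as np

def dual_reporter_noise(a, b):
    """Squared intrinsic and extrinsic noise from paired per-cell measurements.

    a, b: expression of two copies of the same promoter, one pair per cell
    (an assumption of this sketch, not a detail taken from the record above).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = a.mean() * b.mean()
    # Intrinsic noise: how much the two copies disagree within the same cell.
    intrinsic = ((a - b) ** 2).mean() / (2.0 * norm)
    # Extrinsic noise: how much the two copies co-vary across cells.
    extrinsic = ((a * b).mean() - norm) / norm
    return intrinsic, extrinsic

# Toy case: the two copies agree perfectly in every cell, so all
# variability is shared between them and the intrinsic term is zero.
a = [1.0, 2.0, 3.0]
b = [1.0, 2.0, 3.0]
intrinsic, extrinsic = dual_reporter_noise(a, b)
```

In this formulation the two terms sum to the total squared noise, so a promoter class with a higher proportion of extrinsic noise is one whose copies fluctuate together across cells more than they disagree within a cell.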
Project description:A gene's position in the genome can profoundly affect its expression because regional differences in chromatin modulate the activity of locally acting cis-regulatory sequences (CRSs). Here we study how CRSs and regional chromatin act in concert on a genome-wide scale. We present a massively parallel reporter gene assay that measures the activities of hundreds of different CRSs, each integrated at many specific genomic locations. Although genome location strongly affected CRS activity, the relative strengths of CRSs were maintained at all chromosomal locations. The intrinsic activities of CRSs also correlated with their activities in plasmid-based assays. We explain our data with a quantitative model in which expression levels are set by independent contributions from local CRSs and the regional chromatin environment, rather than by more complex sequence- or protein-specific interactions between these two factors. The methods we present will help investigators determine when regulatory information is integrated in a modular fashion and when regulatory sequences interact in more complex ways.
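One way to picture the "independent contributions" model described above is an additive fit in log space, where each measurement is a CRS term plus a location term. The sketch below is illustrative only and is not the authors' implementation: it assumes a complete matrix of log expression values (every CRS measured at every location), and the function name `fit_independent_model` and the toy numbers are hypothetical.

```python
import numpy as np

def fit_independent_model(log_expr):
    """Fit log_expr[i, j] ~ grand_mean + crs_effect[i] + loc_effect[j].

    log_expr: 2D array, rows = CRSs, columns = genomic locations.
    Assumes no missing values (an assumption of this sketch).
    """
    log_expr = np.asarray(log_expr, dtype=float)
    grand_mean = log_expr.mean()
    crs_effect = log_expr.mean(axis=1) - grand_mean
    loc_effect = log_expr.mean(axis=0) - grand_mean
    predicted = grand_mean + crs_effect[:, None] + loc_effect[None, :]
    # Fraction of variance explained by the two independent terms.
    ss_res = ((log_expr - predicted) ** 2).sum()
    ss_tot = ((log_expr - grand_mean) ** 2).sum()
    r_squared = 1.0 - ss_res / ss_tot
    return crs_effect, loc_effect, r_squared

# Toy data built to be purely additive: 3 CRSs at 4 locations.
crs = np.array([0.0, 1.0, 3.0])
loc = np.array([0.0, 2.0, -1.0, 0.5])
toy = crs[:, None] + loc[None, :]
crs_effect, loc_effect, r_squared = fit_independent_model(toy)
```

On real data, an explained-variance value well below 1 would point toward CRS-by-location interactions, which is the alternative the description contrasts with the independent model.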
Project description:Recent large-scale genomics efforts to characterize the cis-regulatory sequences that orchestrate genome-wide expression patterns have produced impressive catalogues of putative regulatory elements. Most of these sequences have not been functionally tested, and our limited understanding of the non-coding genome prevents us from predicting which sequences are bona fide cis-regulatory elements. Recently, massively parallel reporter assays (MPRAs) have been deployed to measure the activity of putative cis-regulatory sequences in several biological contexts, each with specific advantages and distinct limitations. We developed LV-MPRA, a novel lentiviral-based, massively parallel reporter gene assay, to study the function of genome-integrated regulatory elements in any mammalian cell type, thus making it possible to apply MPRAs in more biologically relevant contexts. We measured the activity of 2,600 sequences in U87 glioblastoma cells and human neural progenitor cells (hNPCs) and explored how regulatory activity is encoded in DNA sequence. We demonstrate that LV-MPRA can be applied to estimate the effects of local DNA sequence and regional chromatin on regulatory activity. Our data reveal that primary DNA sequence features, such as GC content and dinucleotide composition, accurately distinguish sequences with high activity from sequences with low activity in a full chromosomal context, and may also function in combination with different transcription factor binding sites to determine cell type specificity. We conclude that LV-MPRA will be an important tool for identifying cis-regulatory elements and stimulating new understanding about how the non-coding genome encodes information.
Project description:We designed 4 oligonucleotide libraries containing either a retained intron, a cassette exon, tandem 5' or tandem 3' splice sites, cloned them into dedicated reporter constructs, transfected and integrated these constructs in the genome of K562 cells, and performed targeted RNA sequencing to determine RNA splicing ratios and a FACSseq approach to determine protein isoform ratios. Overall design: Massively parallel reporter assay for alternative splicing
Project description:Despite extensive research, the sequence features affecting microRNA-mediated regulation are not well understood, limiting our ability to predict gene expression levels in both native and synthetic sequences. Here we employed a massively parallel reporter assay to investigate the effect of over 14,000 rationally designed 3' UTR sequences on reporter construct repression. We found that multiple factors, including microRNA identity, hybridization energy, target accessibility, and target multiplicity, can be manipulated to achieve a predictable, up to 57-fold, change in protein repression. Moreover, we predict protein repression and RNA levels with high accuracy (R = 0.84 and R = 0.80, respectively) using only 3' UTR sequence, as well as the effect of mutation in native 3' UTRs on protein repression (R = 0.63). Taken together, our results elucidate the effect of different sequence features on miRNA-mediated regulation and demonstrate the predictability of their effect on gene expression with applications in regulatory genomics and synthetic biology.
Project description:We designed an oligonucleotide library containing a potential frameshifting site, cloned the library variants into a dedicated reporter construct, transfected and integrated these constructs in the genome of K562 cells, and performed FACSseq (as well as targeted RNA sequencing) to determine if and to what extent frameshifting occurs. Overall design: Massively parallel reporter assay for programmed ribosomal frameshifting
Project description:We employ a massively parallel reporter assay (MPRA) to measure the ex vivo activities of hundreds of K562 and HepG2 enhancers with known transcription factor motif instances. For seven selected motifs that correspond to known or predicted activators and repressors in the two cell types, we make directed modifications of the bases corresponding to these motifs and observe the changes in enhancer activity. Reporter mRNA-seq from HepG2 and K562 cells transfected with a ~55,000-plex MPRA plasmid pool containing 5,418 mutated human enhancer sequences, each linked to 10 distinct 10-nt tags. The reporter mRNA tags facilitate quantitation of their abundances. The same tags were also sequenced from the transfected MPRA plasmid pool to facilitate normalization to plasmid copy numbers.
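The normalization of reporter mRNA tag counts to plasmid copy numbers described above can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the median-of-ratios aggregation across tags, the `min_dna` filter, the function name `enhancer_activity`, and all counts are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

def enhancer_activity(rna_counts, dna_counts, tag_to_enhancer, min_dna=10):
    """Per-enhancer activity as the median RNA/DNA ratio over its tags.

    rna_counts, dna_counts: dicts mapping tag -> read count (mRNA and plasmid).
    tag_to_enhancer: dict mapping tag -> enhancer identifier.
    min_dna: hypothetical threshold; tags with fewer plasmid reads are skipped.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    ratios = defaultdict(list)
    for tag, enhancer in tag_to_enhancer.items():
        dna = dna_counts.get(tag, 0)
        if dna < min_dna:
            continue  # too few plasmid reads for a stable ratio
        # Compare library-size-normalized fractions, not raw counts.
        rna_frac = rna_counts.get(tag, 0) / rna_total
        dna_frac = dna / dna_total
        ratios[enhancer].append(rna_frac / dna_frac)
    return {enh: median(vals) for enh, vals in ratios.items()}

# Toy example: two enhancers with two tags each, equal plasmid abundance.
tag_to_enhancer = {"A1": "enhA", "A2": "enhA", "B1": "enhB", "B2": "enhB"}
dna_counts = {"A1": 100, "A2": 100, "B1": 100, "B2": 100}
rna_counts = {"A1": 300, "A2": 300, "B1": 100, "B2": 100}
activity = enhancer_activity(rna_counts, dna_counts, tag_to_enhancer)
```

Using several tags per enhancer and a robust summary such as the median limits the influence of any single tag whose sequence happens to affect mRNA stability or amplification.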
Project description:We apply a massively parallel reporter assay (MPRA) that relies on mRNA and plasmid tag sequencing (Tag-Seq) to compare the regulatory activities of more than 27,000 distinct variants of two inducible enhancers in human cells: a synthetic cAMP-regulated enhancer and the virus-inducible interferon beta enhancer. The resulting data define accurate maps of functional transcription factor binding sites in both enhancers at single-nucleotide resolution and can be used to train quantitative sequence-activity models (QSAMs). Reporter Tag-Seq from HEK293 cells transfected with each of six MPRA plasmid pools, with and without stimulation (forskolin or Sendai virus). The reporter mRNAs contain unique 10-nucleotide tags that facilitate quantitation of their abundances. The same tags were also sequenced from each transfected plasmid pool to facilitate normalization to plasmid copy numbers. The reporter constructs were designed according to two different mutagenesis strategies: 'single-hit scanning' and 'multi-hit sampling'. The specific variants are included in the processed data files.
Project description:Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.