{"database":"biostudies-arrayexpress","file_versions":[],"scores":null,"additional":{"submitter":["Jernej Ule"],"organism":["synthetic construct"],"full_dataset_link":["https://www.ebi.ac.uk/biostudies/studies/E-MTAB-13330"],"description":["These experiments use a barcoded pool of reporter transcripts, each of which encodes the same mScarlet-PPIG_LCD fusion protein, but using different degrees of GA-multivalency via codon bias, and containing a different number of constitutive introns. To perform experiments using this pool, it was necessary to perform long-read sequencing of the plasmid pool to relate the barcodes in the 3' ends of the reporter to their gene structure. Therefore, we performed long-read sequencing of the plasmid pool (both the original pool used for transfection and the ePB plasmid used for PiggyBac integration). Furthermore, to determine the splicing patterns of the reporter genes, we transfected the plasmid pool into HeLa cells for 16 hours, then performed targeted long-read sequencing of the reporter plasmids via RT-PCR. Note: the Nanopore adapter ligation strategy means that reads can come in either orientation. To determine the gene architectures and barcodes, we used fuzzy string matching. First we matched to various fixed sequences throughout the reporter transcripts to determine the orientation of the read and to confirm that the read spanned the full length of the transcript. Then we used the same string matching strategy to detect the presence of the different intronic or exonic sequences - the gene architecture. Then we extracted the unique plasmid barcode associated with that gene architecture.
Example reporter sequences can be found here: https://benchling.com/faraway/f_/kXCfddtQ-public-reporter-plasmid-maps/ or alternatively, in Supplemental Table 2 of the bioRxiv submission here: https://www.biorxiv.org/content/10.1101/2023.08.21.554177v1.supplementary-material"],"repository":["biostudies-arrayexpress"],"sample_protocol":["Growth Protocol - In the case of the transfected cells, cells were grown in DMEM+GlutaMAX+10% FBS and seeded such that they achieved 70% confluency on the morning of the experiment, with each sample being derived from 2 wells of a 6-well plate.","Sample Collection - Samples were derived from plasmid DNA, or from whole cells transfected for 16 hours with the plasmid pool.","Sample Treatment - In the case of the transfected cells, cells were transfected with the plasmid pool using Lipofectamine 3000 according to the manufacturer's instructions.","Nucleic Acid Extraction - In the case of the transfected cells, RNA extractions were performed from cell pellets using the Maxwell RSC simplyRNA kit (Promega) on a Maxwell RSC Instrument (Promega).","Library Construction - Targeted PCR of the whole transcript, followed by Nanopore adapter ligation. In the case of the RNA sequencing, we performed oligo-dT-based RT using SuperScript IV.","Sequencing - A Nanopore 9.4.1 flow cell was used to sequence the libraries."],"figure_sub":["Organization","MINSEQE Score","Assays and Data","Processed Data","MAGE-TAB Files"],"data_protocol":["Data Transformation - No normalisations were performed on the data provided.","Sequence Alignment - We aligned the reads using our own fuzzy string matching approach in Python.
Nanopore data was basecalled using Guppy 6.3.7."],"omics_type":["Metabolomics","Unknown","Transcriptomics","Genomics","Proteomics"],"instrument_platform":["MinION"],"study_type":["RNA-seq of coding RNA"],"species":["synthetic construct"],"pubmed_authors":["Rupert Faraway","Jernej Ule"],"additional_accession":[]},"is_claimable":false,"name":"Long read sequencing of reporter plasmids and RNA","description":"These experiments use a barcoded pool of reporter transcripts, each of which encodes the same mScarlet-PPIG_LCD fusion protein, but using different degrees of GA-multivalency via codon bias, and containing a different number of constitutive introns. To perform experiments using this pool, it was necessary to perform long-read sequencing of the plasmid pool to relate the barcodes in the 3' ends of the reporter to their gene structure. Therefore, we performed long-read sequencing of the plasmid pool (both the original pool used for transfection and the ePB plasmid used for PiggyBac integration). Furthermore, to determine the splicing patterns of the reporter genes, we transfected the plasmid pool into HeLa cells for 16 hours, then performed targeted long-read sequencing of the reporter plasmids via RT-PCR. Note: the Nanopore adapter ligation strategy means that reads can come in either orientation. To determine the gene architectures and barcodes, we used fuzzy string matching. First we matched to various fixed sequences throughout the reporter transcripts to determine the orientation of the read and to confirm that the read spanned the full length of the transcript. Then we used the same string matching strategy to detect the presence of the different intronic or exonic sequences - the gene architecture. Then we extracted the unique plasmid barcode associated with that gene architecture.
Example reporter sequences can be found here: https://benchling.com/faraway/f_/kXCfddtQ-public-reporter-plasmid-maps/  or alternatively, in Supplemental Table 2 of the bioRxiv submission here: https://www.biorxiv.org/content/10.1101/2023.08.21.554177v1.supplementary-material","dates":{"release":"2025-07-29T00:00:00Z","modification":"2025-07-30T00:01:51.313Z","creation":"2023-09-04T14:38:49.231Z"},"accession":"E-MTAB-13330","cross_references":{"ENA":["ERP150950"],"EFO":["EFO_0002944","EFO_0004170","EFO_0003789","EFO_0004917","EFO_0005518","EFO_0003816","EFO_0003738","EFO_0004184","EFO_0003969"]}}