Project description: Circular RNA (circRNA) is mainly generated when the splice donor of a downstream exon joins an upstream splice acceptor, a phenomenon known as backsplicing. It has been reported that circRNAs can function as microRNA (miRNA) sponges, transcriptional regulators, or potential biomarkers. The availability of massive non-polyadenylated transcriptome data has facilitated the genome-wide identification of thousands of circRNAs. Several circRNA detection tools and pipelines have recently been developed, and users need practical guidelines on them, including a comprehensive and unbiased comparison. Here, we provide an improved and easy-to-use circRNA read simulator that produces simulated backsplicing reads supporting circRNAs deposited in CircBase. Moreover, we compared the performance of 11 circRNA detection tools on both simulated and real datasets. We assessed their performance using precision, sensitivity, F1 score, and area under the curve (AUC). We conclude that no single method dominated on all of these metrics. Among the state-of-the-art tools, CIRI, CIRCexplorer, and KNIFE achieved a better balance between precision and sensitivity and compared favorably to the other methods.
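As a rough illustration of the metrics named above, the following minimal sketch computes precision, sensitivity, and F1 score for one tool's calls against a simulated truth set. It assumes each circRNA is represented by its backsplice junction as a (chromosome, start, end, strand) tuple and that a call counts as correct only on an exact match; the function name and the coordinates are hypothetical and are not taken from the study.

```python
def detection_metrics(predicted, truth):
    """Compare predicted backsplice junctions against a known truth set.

    Both arguments are iterables of hashable junction keys, here
    (chrom, start, end, strand) tuples (an assumed representation).
    """
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # called and truly present
    fp = len(predicted - truth)   # called but not in the truth set
    fn = len(truth - predicted)   # present but missed by the tool

    precision = tp / (tp + fp) if predicted else 0.0
    sensitivity = tp / (tp + fn) if truth else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity, "f1": f1}


# Hypothetical junctions, for illustration only.
truth = {
    ("chr1", 100, 500, "+"),
    ("chr1", 900, 1500, "-"),
    ("chr2", 50, 300, "+"),
    ("chr3", 10, 80, "+"),
}
predicted = {
    ("chr1", 100, 500, "+"),
    ("chr2", 50, 300, "+"),
    ("chr2", 700, 950, "-"),
}

metrics = detection_metrics(predicted, truth)
print(metrics)
```

With two true positives, one false positive, and two false negatives, precision is 2/3, sensitivity is 1/2, and F1 is 4/7. A real benchmark may tolerate small coordinate offsets when matching junctions, which this sketch does not.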
Project description: Background: Circular RNAs (circRNAs) have been shown to play important regulatory roles in a range of both pathological and physiological contexts, but their functions in the context of skin aging remain to be clarified. In the present study, we therefore profiled circRNA expression in four pairs of aged and non-aged skin samples to identify differentially expressed circRNAs that may offer clinical value as biomarkers of the skin aging process. Methods: We used RNA-seq to profile the levels of circRNAs in eyelid tissue samples, qRT-PCR to confirm the RNA-seq results, and bioinformatics approaches to predict downstream target miRNAs for differentially expressed circRNAs. Results: In total, we identified 571 circRNAs that were differentially expressed in aged skin samples compared to young skin samples, of which 348 were upregulated and 223 were downregulated. The top 10 upregulated circRNAs in aged skin samples were hsa_circ_0123543, hsa_circ_0057742, hsa_circ_0088179, hsa_circ_0132428, hsa_circ_0094423, hsa_circ_0008166, hsa_circ_0138184, hsa_circ_0135743, hsa_circ_0114119, and hsa_circ_0131421. The top 10 downregulated circRNAs were hsa_circ_0101479, hsa_circ_0003650, hsa_circ_0004249, hsa_circ_0030345, hsa_circ_0047367, hsa_circ_0055629, hsa_circ_0062955, hsa_circ_0005305, hsa_circ_0001627, and hsa_circ_0008531. Functional enrichment analyses revealed the potential functionality of these differentially expressed circRNAs. The top 3 enriched gene ontology (GO) terms of the host genes of differentially expressed circRNAs were regulation of GTPase activity, positive regulation of GTPase activity, and autophagy. The top 3 enriched KEGG pathways were Lysine degradation, Fatty acid degradation, and Inositol phosphate metabolism. The top 3 enriched Reactome pathways were RAB GEFs exchange GTP for GDP on RABs, Regulation of TP53 Degradation, and Regulation of TP53 Expression and Degradation.
Six circRNAs were selected for qRT-PCR verification, and the results for 5 of them were consistent with the sequencing results. Moreover, target miRNAs such as hsa-miR-588, hsa-miR-612, hsa-miR-4487, hsa-miR-149-5p, and hsa-miR-494-5p were predicted for circRNA-miRNA interaction networks. Conclusion: Overall, these results offer new insights into circRNA expression profiles, potentially highlighting future avenues for research regarding the roles of these circRNAs in the context of skin aging.
Project description: CircRNAs are novel members of the non-coding RNA family. CircRNAs have been known to exist for several decades; however, their widespread abundance has only recently been appreciated. Annotation of circRNAs depends on sequencing reads that span the backsplice junction and therefore map as non-linear reads in the genome. Several pipelines have been developed to specifically identify these non-linear reads and consequently predict the landscape of circRNAs based on deep sequencing datasets. Here, we use common RNA-seq datasets to scrutinize and compare the output from five different algorithms (circRNA_finder, find_circ, CIRCexplorer, CIRI, and MapSplice) and evaluate the levels of bona fide and false positive circRNAs based on RNase R resistance. With this approach, we observe surprisingly dramatic differences between the algorithms, specifically regarding the highly expressed circRNAs and the circRNAs derived from proximal splice sites. Collectively, this study emphasizes that circRNA annotation should be handled with care and that several algorithms should ideally be combined to achieve reliable predictions.
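One simple way to combine several algorithms, shown here only as a hedged sketch, is to keep the circRNAs reported by at least a minimum number of tools. The abstract does not specify how predictions should be combined, so the voting rule, the `min_tools` threshold, the function name, and the junction coordinates below are all illustrative assumptions.

```python
from collections import Counter


def consensus_calls(calls_by_tool, min_tools=2):
    """Return junctions reported by at least `min_tools` tools.

    `calls_by_tool` maps a tool name to an iterable of junction keys,
    here (chrom, start, end) tuples (an assumed representation).
    """
    support = Counter()
    for calls in calls_by_tool.values():
        # set() ensures each tool votes at most once per junction.
        support.update(set(calls))
    return {junction for junction, n in support.items() if n >= min_tools}


# Hypothetical outputs from three of the tools named in the text.
calls_by_tool = {
    "find_circ": [("chr1", 100, 500), ("chr2", 50, 300)],
    "CIRI": [("chr1", 100, 500), ("chr3", 10, 80)],
    "CIRCexplorer": [("chr1", 100, 500), ("chr2", 50, 300)],
}

consensus = consensus_calls(calls_by_tool, min_tools=2)
print(sorted(consensus))
```

Raising `min_tools` trades sensitivity for fewer false positives; the right threshold depends on the dataset and on how the tools' errors overlap.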
Project description: Objectives: The Network on the Coordination and Harmonisation of European Occupational Cohorts (OMEGA-NET) was set up to enable optimization of the use of industrial and general population cohorts across Europe to advance aetiological research. High-quality harmonized exposure assessment is crucial to derive comparable results and to enable pooled analyses. To facilitate a harmonized research strategy, a concerted effort is needed to catalogue available occupational exposure information. We here aim to provide a first comprehensive overview of exposure assessment tools that could be used for occupational epidemiological studies. Methods: An online inventory was set up to collect meta-data on exposure assessment tools. Occupational health researchers were invited via newsletters, editorials, and individual e-mails to provide details of job-exposure matrices (JEMs), exposure databases, and occupational coding systems and their associated crosswalks to translate codes between different systems, with a focus on Europe. Results: Meta-data on 36 general population JEMs, 11 exposure databases, and 29 occupational coding systems from more than 10 countries have been collected up to August 2021. A wide variety of exposures were covered in the JEMs on which data were entered, with dusts and fibres (in 14 JEMs) being the most common types. Fewer JEMs covered organization of work (5) and biological factors (4). Dusts and fibres were also the most common exposures included in the databases (7 out of 11), followed by solvents and pesticides (both in 6 databases). Conclusions: This inventory forms the basis for a searchable web-based database of meta-data on existing occupational exposure information, to support researchers in finding the available tools for assessing occupational exposures in their cohorts, and future efforts for harmonization of exposure assessment.
This inventory remains open for further additions, to enlarge its coverage and include newly developed tools.
Project description: RNA, like DNA and proteins, can undergo modifications. To date, over 170 RNA modifications have been identified, leading to the emergence of a new research area known as epitranscriptomics. RNA editing is the most frequent RNA modification in mammalian transcriptomes, and two types have been identified: (1) the most frequent, adenosine to inosine (A-to-I); and (2) the less frequent, cytidine to uridine (C-to-U) RNA editing. Unlike other epitranscriptomic marks, RNA editing can be readily detected from RNA sequencing (RNA-seq) data without any chemical conversion of RNA before sequencing library preparation. Furthermore, analyzing RNA editing patterns from transcriptomic data provides an additional layer of information about the epitranscriptome. As the significance of epitranscriptomics, particularly RNA editing, gains recognition in various fields of biology and medicine, there is a growing interest in detecting RNA editing sites (RES) by analyzing RNA-seq data. Several bioinformatic tools are available to meet this interest. However, each tool has its advantages and disadvantages, which makes it difficult for bench scientists and clinicians to choose the most appropriate one. Here, we have benchmarked bioinformatic tools that detect RES from RNA-seq data. We provide a comprehensive view of each tool and its performance on previously published RNA-seq data, and we recommend the most appropriate tools for use in future studies.
Project description: Circular RNAs (circRNAs) are a unique class of single-stranded RNA molecules with a closed-loop structure that confers enhanced stability, extended protein expression, and resistance to exonucleases, making them promising candidates for RNA therapeutics. In recent years, several methods have been developed to generate circRNAs, including the traditional, scar-containing Anabaena Permuted Intron-Exon (Ana-PIE) method and newer “scarless” circularization approaches. This study introduces a novel scarless circularization method, Split-Coxsackievirus B3-Anabaena Permuted Intron-Exon (Split-CVB3 Ana-PIE, SCAP). Scarless circular RNAs generated using the SCAP system were systematically compared to their scarred counterparts produced by the Ana-PIE system in terms of circularization efficiency, protein expression, stability, and immune response. The SCAP system achieved circularization efficiencies comparable to those of the Ana-PIE system while significantly enhancing protein expression. Scarless circular RNAs exhibited similar stability to scarred circular RNAs and did not trigger significant immune responses. These findings highlight the potential of scarless circular RNAs in gene therapy and vaccine development, demonstrating that removing extraneous sequences improves translation efficiency without compromising stability or immunogenicity. This study provides a foundation for the rational design of circular RNAs, with future efforts focusing on diverse target genes, optimized delivery platforms, and in vivo validation.
Project description: Background: Structural variations (SVs) are widespread across genomes and have a great impact on evolution, disease, and phenotypic diversity. Numerous bioinformatic tools, commonly referred to as SV callers, have been developed to detect SVs from whole genome sequence (WGS) data using diverse algorithms, but their performance requires rigorous evaluation with real data and validated SVs. Moreover, a considerable proportion of these tools have been primarily designed and optimized using human genome data. Consequently, their applicability and performance in avian species, characterized by smaller genomes and distinct genomic architectures, remain inadequately assessed. Results: We performed a comprehensive assessment of the performance of ten widely used SV callers using population-level real genomic data with five common types of validated SVs. The performance of SV callers varies with the types and sizes of SVs. Compared with other tools, GRIDSS, Lumpy, Wham, and Manta show better detection accuracy. Pindel can detect more small SVs than the others. CNVnator and CNVkit can detect more medium and large copy number variations. Given the poor consistency among different SV callers, the combination calling strategy is not recommended. All tools show poor ability in the detection of insertions (especially those larger than 150 bp). At least 50× read depth is required to detect more than 80% of the SVs for most tools. Conclusions: This study highlights the importance and necessity of using real sequencing data, rather than simulated data only, with validated SVs for SV caller evaluation. Some practical guidance and suggestions are provided for SV detection in future research.
Project description: Genome assembly is typically a two-stage process: contig assembly followed by the use of paired sequencing reads to join contigs into scaffolds. Scaffolds are usually the focus of reported assembly statistics; longer scaffolds greatly facilitate the use of genome sequences in downstream analyses, and it is appealing to present larger numbers as metrics of assembly performance. However, scaffolds are highly prone to errors, especially when generated using short reads, which can directly result in inflated assembly statistics. Here we provide the first independent evaluation of scaffolding tools for second-generation sequencing data. We find large variations in the quality of results depending on the tool and dataset used. Even extremely simple test cases of perfect input, constructed to elucidate the behaviour of each algorithm, produced some surprising results. We further dissect the performance of the scaffolders using real and simulated sequencing data derived from the genomes of Staphylococcus aureus, Rhodobacter sphaeroides, Plasmodium falciparum and Homo sapiens. The results from simulated data are of high quality, with several of the tools producing perfect output. However, at least 10% of joins remain unidentified when using real data. The scaffolders vary in their usability, speed and number of correct and missed joins made between contigs. Results from real data highlight opportunities for further improvements of the tools. Overall, SGA, SOPRA and SSPACE generally outperform the other tools on our datasets. However, the quality of the results is highly dependent on the read mapper and genome complexity.
Project description: Despite the scientific relevance of circular RNAs (circRNAs), the study of these RNAs in non-model organisms, especially in sheep, is still in its infancy. While some studies have focused on sheep circRNA identification in a limited number of tissues, there is a lack of comprehensive analyses that profile circRNA expression patterns across the tissues not yet investigated. In this study, 61 public RNA sequencing datasets from 12 different tissues were uniformly analyzed to identify circRNAs, profile their expression and investigate their various characteristics. We report for the first time a circRNA expression landscape with functional annotation in sheep tissues not yet investigated, including hippocampus, BonMarrowMacrophage, left-ventricle, thymus, ileum, reticulum and 23-day-embryo. A stringent computational pipeline was employed and 8919 high-confidence exon-derived circRNAs were identified, including 88 novel circRNAs. Tissue-specificity analysis revealed that 3059 circRNAs were tissue-specific and that circRNAs were more tissue-specific than linear RNAs. The highest numbers of tissue-specific circRNAs were found in kidney, hippocampus and thymus. Co-expression analysis revealed that expression of circRNAs may not be affected by their host genes. While most of the host genes produced more than one isoform, only one isoform had dominant expression across the tissues. The host genes of the tissue-specific circRNAs were significantly enriched in biological/pathway terms linked to the important functions of their corresponding tissues, suggesting potential roles of circRNAs in modulating the physiological activity of those tissues. Interestingly, functional terms related to regulation and various signaling pathways were significantly enriched in all tissues, suggesting some common regulatory mechanisms by which circRNAs modulate the physiological functions of tissues.
The findings of the present study provide a valuable resource for depicting the complexity of circRNA expression across sheep tissues, which can be useful for sheep genomics and veterinary research.