Project description:Deltamethrin is an important pesticide widely used against ectoparasites. Deltamethrin contamination threatens the healthy breeding of the Chinese mitten crab, Eriocheir sinensis. In this study, we investigated transcriptional responses in the hepatopancreas of E. sinensis exposed to deltamethrin. We obtained 99,087,448, 89,086,478, and 100,117,958 raw sequence reads from the control 1, control 2, and control 3 groups, and 92,094,972, 92,883,894, and 92,500,828 raw sequence reads from the test 1, test 2, and test 3 groups, respectively. After filtering and quality checking of the raw sequence reads, our analysis yielded 79,228,354, 72,336,470, 81,859,826, 77,649,400, 77,194,276, and 75,697,016 clean reads with a mean length of 150 bp from the control and test groups. After deltamethrin treatment, a total of 160 and 167 genes were significantly upregulated and downregulated, respectively. Within the gene ontology categories "biological process," "cellular component," and "molecular function," the enriched terms were cell killing, cellular process, other organism part, cell part, binding, and catalytic activity. Pathway analysis using the Kyoto Encyclopedia of Genes and Genomes showed that the metabolic pathways were significantly enriched. We found that the CYP450 enzyme system, carboxylesterase, glutathione-S-transferase, and material (including carbohydrate, lipid, protein, and other substances) metabolism played important roles in the metabolism of deltamethrin in the hepatopancreas of E. sinensis. This study revealed differentially expressed genes related to insecticide metabolism and detoxification in E. sinensis for the first time and will help in understanding the toxicity and molecular metabolic mechanisms of deltamethrin in E. sinensis.
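The read counts in the description above imply a per-sample retention rate after filtering. The following is a minimal sketch of that arithmetic, assuming the clean-read counts are listed in the same order as the raw counts (control 1 to 3, then test 1 to 3). The sample labels and the function name are illustrative and do not come from the study.

```python
# Raw and clean read counts as reported in the description above.
# Assumption: clean counts follow the same sample order as the raw counts.
RAW_READS = {
    "control_1": 99_087_448,
    "control_2": 89_086_478,
    "control_3": 100_117_958,
    "test_1": 92_094_972,
    "test_2": 92_883_894,
    "test_3": 92_500_828,
}
CLEAN_READS = {
    "control_1": 79_228_354,
    "control_2": 72_336_470,
    "control_3": 81_859_826,
    "test_1": 77_649_400,
    "test_2": 77_194_276,
    "test_3": 75_697_016,
}


def retention_rates(raw, clean):
    """Return the fraction of raw reads kept after filtering, per sample."""
    return {sample: clean[sample] / raw[sample] for sample in raw}


if __name__ == "__main__":
    for sample, rate in retention_rates(RAW_READS, CLEAN_READS).items():
        print(f"{sample}: {rate:.1%} of raw reads retained")
```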
Project description:CXCR5 (C-X-C motif chemokine receptor 5) is a transmembrane chemokine receptor that acts via its ligand CXCL13 and plays a crucial role in controlling the trafficking of inflammatory cells into and from the sub-retinal space, which contributes to the pathogenesis of age-related macular degeneration (AMD). We have previously described that genetic ablation of CXCR5 causes RPE/choroid abnormalities and retinal degeneration (RD) in aged mice. Here we report the transcriptome data (RNA-Seq) of 24-month-old CXCR5 knockout (KO) mice and age-matched C57BL/6 controls (WT). RNA sequencing was performed on the Illumina HiSeq 2500, which provides up to 300 GB of sequence information per flow cell. The quality of the RNA-seq libraries and the RNA intensity were validated with an Agilent Technologies Bioanalyzer 2100. The raw datasets contain on average 29,200,459 reads (28,486,243 after trimming) in retina samples and 27,252,790 reads (26,617,311 after trimming) in choroid samples. Read mapping showed that a total of 1586 genes in retina and 1462 genes in choroid are differentially expressed in this experiment. The raw datasets were deposited in the NCBI Sequence Read Archive (SRA) database and can be accessed via accession number PRJNA588421.
Project description:Monotropa hypopitys (pinesap) is a non-photosynthetic, obligately mycoheterotrophic plant of the family Ericaceae. It obtains carbon and other nutrients from the roots of surrounding autotrophic trees through associated mycorrhizal fungi. To understand the evolutionary changes in the plant genome associated with the transition to a heterotrophic lifestyle, we performed de novo transcriptomic analysis of M. hypopitys using next-generation sequencing. We obtained RNA-Seq data from flowers, flower bracts, and roots with haustoria using the Illumina HiSeq2500 platform. The raw data obtained in this study are available in the NCBI SRA database under accession number SRP069226. A total of 10.3 GB of raw sequence data were obtained, corresponding to 103,357,809 raw reads. A total of 103,025,683 reads remained after removing low-quality reads and trimming the adapter sequences. The Trinity program was used to de novo assemble 98,349 unigenes with an N50 of 1342 bp. Using the TransDecoder program, we predicted 43,505 putative proteins. A total of 38,416 unigenes were annotated against the Swiss-Prot protein sequence database using BLASTX. The obtained transcriptomic data will be useful for further studies of the evolution of plant genomes upon transition to a non-photosynthetic lifestyle and the loss of photosynthesis-related functions.
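The N50 reported for the assembly above is a length-weighted summary statistic. The sketch below shows the usual definition, assuming N50 is taken over the assembled sequence lengths. The toy lengths are made up for illustration and are not data from this study, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example: total is 5500 bases, so half is 2750.
    # 2000 + 1500 = 3500 crosses the halfway mark at the 1500 bp sequence.
    toy_lengths = [2000, 1500, 1000, 500, 300, 200]
    print(n50(toy_lengths))
```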
Project description:To date, biologists have discovered a large amount of valuable information from assembled genomes, but the abundant microbial data hidden in the raw genomic sequence data of plants and animals is usually ignored. In this study, the richness and composition of the fungal community were determined in the raw genomic sequence data of Ceratosolen solmsi (RGSD-CS). To avoid interference from sequences of C. solmsi, the unmapped raw data (about 17.1%) was obtained by excluding the assembled genome of C. solmsi from RGSD-CS. Of the two fungal reference datasets compared, internal transcribed spacer (ITS) and large ribosomal subunit (LSU) of rRNA, the ITS dataset revealed a more diverse fungal community and was therefore selected as the reference dataset for evaluating the fungal community based on the unmapped raw data. A threshold of 95% sequence identity revealed many more matched fungal reads and greater fungal richness in the unmapped raw data than thresholds above 95%. Based on the threshold of 95% sequence identity, the fungal community of RGSD-CS was primarily composed of Saccharomycetes (88.4%) and two other classes (Agaricomycetes and Sordariomycetes, 8.3% in total). Compared with the fungal communities reported for other fig wasps, Agaricomycetes and Eurotiomycetes were found to be unique to C. solmsi. In addition, the ratio of total fungal reads to RGSD-CS was estimated to be at least 4.8 × 10^-3, which indicated that a large amount of fungal data was contained in RGSD-CS. However, the rarefaction measure indicated that deeper sequencing coverage of RGSD-CS was required to discover the entire fungal community of C. solmsi. This study investigated the richness and composition of the fungal community in RGSD-CS and provided new insights into the efficient study of microbial diversity using raw genomic sequence data.
Project description:BACKGROUND: Analyses that use genome assemblies are critically affected by the contiguity, completeness, and accuracy of those assemblies. In recent years single-molecule sequencing techniques generating long-read information have become available and enabled substantial improvement in contig length and genome completeness, especially for large genomes (>100 Mb), although bioinformatic tools for these applications are still limited. FINDINGS: We developed a software tool to close sequence gaps in genome assemblies, TGS-GapCloser, that uses low-depth (~10×) long single-molecule reads. The algorithm extracts reads that bridge gap regions between 2 contigs within a scaffold, error corrects only the candidate reads, and assigns the best sequence data to each gap. As a demonstration, we used TGS-GapCloser to improve the scaftig NG50 value of 3 human genome assemblies by 24-fold on average with only ~10× coverage of Oxford Nanopore or Pacific Biosciences reads, covering up to 94.8% of gaps with sequence data at 97.7% positive predictive value. These improved assemblies achieve 99.998% (Q46) single-base accuracy with final inserted sequences having 99.97% (Q35) accuracy, despite the high raw error rate of single-molecule reads, enabling high-quality downstream analyses, including up to a 31-fold increase in the scaftig NGA50 and up to 13.1% more complete BUSCO genes. Additionally, we show that even in ultra-large genome assemblies, such as the ginkgo (~12 Gb), TGS-GapCloser can cover 71.6% of gaps with sequence data. CONCLUSIONS: TGS-GapCloser can close gaps in large genome assemblies using raw long reads quickly and cost-effectively. The final assemblies generated by TGS-GapCloser have improved contiguity and completeness while maintaining high accuracy. The software is available at https://github.com/BGI-Qingdao/TGS-GapCloser.
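The Q values quoted in parentheses above appear to follow the standard Phred convention, Q = -10 · log10(1 - accuracy), rounded down to a whole number. That reading is an assumption, since the description does not state the formula. A minimal sketch of the conversion, with a hypothetical function name:

```python
import math


def phred_quality(accuracy):
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred score.

    Uses the standard definition Q = -10 * log10(error probability),
    where the error probability is 1 - accuracy.
    """
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    # Accuracies taken from the description above.
    for accuracy in (0.99998, 0.9997):
        q = phred_quality(accuracy)
        print(f"accuracy {accuracy:.5f} -> Q{int(q)} (unrounded {q:.2f})")
```

Under this assumption, truncating the unrounded score reproduces the Q46 and Q35 labels given in the text.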
Project description:Mangosteen (Garcinia mangostana L.) is known for its delectable taste and contains high amounts of xanthones, which have been reported to possess anti-cancer, anti-inflammatory, and other bioactive properties. However, stage-specific regulation of mangosteen fruit ripening has never been studied in detail. We performed a comparative transcriptomic analysis of three ripening stages (Stage 0, 2 and 6) of mangosteen. We obtained raw data from six libraries sequenced on the Illumina HiSeq 4000. A total of ~40 Gb of raw data were generated. After filtering, 650,887,650 (bp) clean reads were obtained from 656,913,570 (bp) raw reads. The raw transcriptome data were deposited in the SRA database under BioProject accession number PRJNA339916. These data will be beneficial for transcriptome profiling to study the regulation of mangosteen fruit ripening. The lack of a complete sequence database for this species impedes protein identification. These data sets provide reference data for the exploration of novel genes or proteins to understand mangosteen fruit ripening behaviour.
Project description:Elettaria cardamomum (L.) Maton, known as the 'queen of spices', is a perennial herbaceous monocot of the family Zingiberaceae, native to southern India. Cardamom is an economically valuable spice crop and is used widely for culinary and medicinal purposes. In the present study, using Ion Proton RNA sequencing technology, we performed transcriptome sequencing and de novo transcriptome assembly of a wild and five cultivar genotypes of cardamom. RNA-seq generated a total of 22,811,983 (92 base) and 24,889,197 (75 base) raw reads, accounting for approximately 8.21 GB and 7.65 GB of sequence data for the wild and cultivar genotypes of cardamom, respectively. The raw data were submitted to the SRA database of NCBI under the accession numbers SRX1141272 (wild) and SRX1141276 (cultivars). The raw reads were quality filtered and assembled using the MIRA assembler, resulting in 112,208 and 264,161 contigs with N50 values of 616 and 664 for wild and cultivar cardamom, respectively. The assembled unigenes were functionally annotated using several databases, including PlantCyc for pathway annotation. This work represents the first report on cardamom transcriptome sequencing. To generate a comprehensive reference transcriptome, we further assembled the raw reads of the wild and cultivar genotypes, which might enrich the plant transcriptome database and trigger advanced research in cardamom genomics.
Project description:Genome sequencing is rapidly being adopted in reference labs and hospitals for bacterial outbreak investigation and diagnostics where time is critical. Seven-gene multi-locus sequence typing is a standard tool for broadly classifying samples into sequence types (STs), allowing, in many cases, a sample to be ruled out of an outbreak or general characteristics of a bacterial strain to be inferred. Long-read sequencing technologies, such as those from Oxford Nanopore, can produce read data within minutes of an experiment starting, unlike short-read sequencing technologies, which require many hours or days. However, the error rates of raw uncorrected long-read data are very high. We present Krocus, which can predict an ST directly from uncorrected long reads and which was designed to consume read data as it is produced, providing results in minutes. It is the only tool which can do this from uncorrected long reads. We tested Krocus on over 700 isolates sequenced using long-read sequencing technologies from Pacific Biosciences and Oxford Nanopore. It provides STs for isolates on average within 90 s, with a sensitivity of 94% and specificity of 97% on real sample data, directly from uncorrected raw sequence reads. The software is written in Python and is available under the open source license GNU GPL version 3.
Project description:We have developed an NGS-based deep bisulfite sequencing protocol for the DNA methylation analysis of genomes. This approach allows the rapid and efficient construction of NGS-ready libraries with a large number of PCR products that have been individually amplified from bisulfite-converted DNA. This approach also employs a bioinformatics strategy to sort the raw sequence reads generated from NGS platforms and subsequently to derive DNA methylation levels for individual loci. The results demonstrated that this NGS-based deep bisulfite sequencing approach provides not only DNA methylation levels but also informative DNA methylation patterns that have not been seen through other existing methods.
• This protocol provides an efficient method for generating NGS-ready libraries from individually amplified PCR products.
• This protocol provides a bioinformatics strategy for sorting NGS-derived raw sequence reads.
• This protocol provides deep bisulfite sequencing results that can measure DNA methylation levels and patterns of individual loci.
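Deriving methylation levels and patterns from bisulfite reads, as described above, can be sketched as follows. This is not the protocol's own pipeline. It assumes the standard bisulfite readout (an unmethylated cytosine reads as T, a methylated cytosine stays C) and that reads are already sorted to one locus, aligned to the forward strand, equal in length to the reference, and free of indels. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def cpg_sites(reference):
    """Return the 0-based positions of the C in every CpG of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Return {CpG position: fraction of reads showing C rather than T}."""
    levels = {}
    for pos in cpg_sites(reference):
        methylated = sum(1 for read in reads if read[pos] == "C")
        unmethylated = sum(1 for read in reads if read[pos] == "T")
        covered = methylated + unmethylated
        if covered:  # skip sites with no informative reads
            levels[pos] = methylated / covered
    return levels


def methylation_patterns(reference, reads):
    """Count per-read patterns, i.e. the base seen at each CpG along a read."""
    sites = cpg_sites(reference)
    return Counter("".join(read[pos] for pos in sites) for read in reads)


if __name__ == "__main__":
    # Toy locus with CpGs at positions 1 and 5 (made-up data).
    reference = "ACGTTCGA"
    reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
    print(methylation_levels(reference, reads))
    print(methylation_patterns(reference, reads))
```

The per-site levels summarize each CpG across reads, while the pattern counts keep the combination of states seen on individual molecules, which is the extra information the description attributes to deep sequencing.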
Project description:Purpose: To understand the functional significance of the sperm transcriptome in stallion fertility, the aim of this study was to generate a detailed body of knowledge about the sperm RNA profile that defines a normal fertile stallion. Methods: The 50 bp single-end ABI SOLiD raw reads were directly aligned with the horse reference sequence EcuCab2 using ABI aligner software (NovoalignCS version 1.00.09, novocraft.com) which uses multiple indexes in the reference genome, identifies candidate alignment locations for each primary read, and allows completion of the alignment. Results: Next-generation sequencing (NGS) of total RNA from the sperm of two reproductively normal stallions generated about 70 million raw reads and more than 3 Gb of sequence per sample; over half of these aligned with the EcuCab2 reference genome. Altogether, 19,257 sequence tags with average coverage ≥1 (normalized number of transcripts) were mapped in the horse genome. Conclusion: The sequence of the stallion sperm transcriptome is an important foundation for the discovery of transcripts of known and novel genes, and non-coding RNAs, thus improving the annotation of the horse genome sequence draft and providing markers for evaluating stallion fertility. Reproductively fertile Stallion sperm transcriptome as revealed by RNA sequencing