Project description:We describe an application of deep sequencing and de novo assembly of short RNA reads to investigate small interfering RNA (siRNA)-mediated immunity in leaf samples from eight tree taxa naturally occurring in Wytham Woods, Oxfordshire, UK. BLAST searches for homologues of the assembled contigs in GenBank identified siRNA populations against a number of RNA viruses and Ty1-copia retrotransposons in these tree species. Small RNA sequencing and de novo assembly.
Project description:Shotgun protein sequencing with meta-contig assembly.
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins, such as antibodies or proteins from organisms with unsequenced genomes, remains a challenging open problem. Conventional algorithms designed to sequence each MS/MS spectrum individually are limited by incomplete peptide fragmentation or low signal-to-noise ratios and tend to produce short de novo sequences with low accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). Although SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, which limits its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.
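The idea of overlapping contigs and assembling them into longer sequences can be illustrated with a minimal string-based sketch. This is not the Meta-SPS algorithm, which operates on MS/MS spectra rather than on finished strings; the function names, the `min_len` threshold and the two example sequences are assumptions made purely for illustration.

```python
def overlap_len(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` (a hypothetical threshold) are ignored.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a, b, min_len=3):
    """Merge `b` onto the end of `a` if they overlap; otherwise return None."""
    n = overlap_len(a, b, min_len)
    return a + b[n:] if n else None


# Illustrative amino-acid strings standing in for two contig sequences.
left = "EVQLVESGGGLVQ"
right = "GGLVQPGGSLRLS"
merged = merge(left, right)
print(merged)
```

A spectrum-based method would also have to tolerate mass errors and ambiguous residues, which exact string matching ignores.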
Project description:In this study, we present a global transcriptome analysis of the medicinal plant Catharanthus roseus. We generated about 343 million high-quality reads from three tissues (leaf, root and flower) using the Illumina platform. We performed an optimized de novo assembly of the reads and estimated transcript abundance in the different tissue samples. Transcriptome dynamics were studied by differential gene expression analyses among the tissue samples. We collected the tissue samples from mature plants. Total RNA isolated from these samples was subjected to Illumina sequencing. The sequence data were filtered using NGS QC Toolkit to obtain high-quality reads. The filtered reads were used for de novo assembly optimization. The reads were then mapped to the Catharanthus transcripts with CLC Genomics Workbench, and differential gene expression analysis was performed using DESeq.
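The read-filtering step can be sketched as a mean-quality check on FASTQ quality strings. This is a minimal illustration and does not reproduce the criteria used by NGS QC Toolkit; the Phred+33 encoding, the cutoff of 20 and the toy records are assumptions.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def passes_filter(quality, min_mean_q=20, offset=33):
    """Keep a read if its mean quality reaches a hypothetical cutoff."""
    return len(quality) > 0 and mean_phred(quality, offset) >= min_mean_q


# Toy records: (name, sequence, quality string).
records = [
    ("read1", "ACGT", "IIII"),  # 'I' encodes Q40 under Phred+33
    ("read2", "ACGT", "!!!!"),  # '!' encodes Q0 under Phred+33
]
kept = [name for name, seq, qual in records if passes_filter(qual)]
print(kept)
```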
Project description:For this project, we sequenced, assembled and annotated the transcriptome of the diploid wheat Triticum urartu, accession PI 428198. The sequencing libraries were prepared from shoot and root tissues harvested from 2-3 week old seedlings. All sequencing was carried out on the Illumina HiSeq platform using the 100 bp paired-end protocol (248.5 million reads). The assembly was constructed using a multiple k-mer approach with the de novo assembly algorithm implemented in CLC Genomics Workbench 5.5, followed by additional redundancy reduction with the CD-HIT and blast2cap3 programs. Open reading frames and proteins were predicted using BLASTX searches and the findorf algorithm.
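A multiple k-mer strategy starts by decomposing the same reads into k-mers at several values of k. The sketch below shows only that decomposition, not the assembler in CLC Genomics Workbench; the k values and the toy reads are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy overlapping reads; real input would be millions of 100 bp reads.
reads = ["ATGGCGTACG", "GGCGTACGTT", "CGTACGTTAA"]

# One k-mer table per k; an assembler would build one graph per table
# and later merge the resulting contig sets.
tables = {k: kmer_counts(reads, k) for k in (4, 6, 8)}
for k, table in tables.items():
    print(k, len(table))
```

Smaller k values tend to connect low-coverage transcripts, while larger k values resolve repeats better, which is the usual motivation for combining several assemblies.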
Project description:We report the first use of next-generation massively parallel sequencing and de novo transcriptome assembly to gain insight into the transcriptome of Hevea brasiliensis. Sequencing generated more than 12 million reads with an average length of 90 nt. In total, 48,768 unigenes (mean size = 488 bp) were assembled de novo, more than three times the number of Hevea brasiliensis sequences deposited in GenBank. The assembled sequences were annotated with gene descriptions, gene ontology terms and clusters of orthologous groups terms. A total of 37,373 unigenes were successfully annotated, and more than 10% of the unigenes aligned to known proteins of Euphorbiaceae. The unigenes contain a nearly complete collection of known rubber-synthesis-related genes. Our data provide the most comprehensive sequence resource available for studying the rubber tree and demonstrate the applicability of Illumina sequencing and de novo transcriptome assembly in a species lacking genome information. The transcriptome of latex and leaf in Hevea brasiliensis.
Project description:RNA-Seq data were targeted for de novo assembly and reconstruction of full-length mouse transcripts. Sequencing of RNA taken from unstimulated DCs.
Project description:Purpose: The goal of this study is to provide comprehensive genomic information for functional genomic studies in Q. mongolica. Methods: Transcriptome data from Quercus mongolica leaves were generated by deep sequencing on the Illumina HiSeq 4000. High-quality reads were obtained by removing reads that contained adaptor contamination, low-quality bases or undetermined bases. The transcriptome was assembled de novo. Results: A total of 52934562 raw reads were obtained from the Illumina sequencing platform. After filtering out the low-quality reads, we obtained 52076914 clean reads, which were assembled with Trinity into 39130 transcripts with a mean length of 742 bp and GC content of 42.12%, and 24196 unigenes with a mean length of 732 bp and GC content of 42.34%. Conclusions: RNA-Seq was applied to polyadenylate-enriched mRNAs from leaves of Q. mongolica to obtain the transcriptome. De novo assembly was then applied, followed by gene annotation and functional classification. SSRs and SNPs were also identified using the assembled transcripts as reference sequences. The results of this study lay the foundation for further research on the genetic diversity of Quercus.
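The summary statistics reported here (fraction of reads retained, mean length, GC content) can be computed as in the following minimal sketch. The read counts come from the description above; the toy sequences are made up, and treating GC content as pooled over all bases is an assumption, since the description does not say how it was calculated.

```python
def mean_length(seqs):
    """Arithmetic mean of sequence lengths."""
    return sum(len(s) for s in seqs) / len(seqs)


def gc_percent(seqs):
    """GC content as a percentage of all bases pooled across sequences."""
    total = sum(len(s) for s in seqs)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in seqs)
    return 100.0 * gc / total


# Read counts taken from the project description.
raw_reads = 52934562
clean_reads = 52076914
retained_percent = 100.0 * clean_reads / raw_reads

# Toy sequences standing in for assembled transcripts.
toy_transcripts = ["ATGC", "GGCCAT", "AT"]

print(f"retained: {retained_percent:.2f}%")
print(f"mean length: {mean_length(toy_transcripts):.1f}")
print(f"GC content: {gc_percent(toy_transcripts):.2f}%")
```

With the reported counts, roughly 98.4% of the raw reads survive filtering.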
Project description:Purpose: The goal of this study is to screen for candidate genes involved in drought avoidance in Q. liaotungensis. Methods: Transcriptome data from Q. liaotungensis leaves were generated by deep sequencing on the Illumina HiSeq 4000. High-quality reads were obtained by removing reads that contained adaptor contamination, low-quality bases or undetermined bases. The transcriptome was assembled de novo. Results: A total of 54153182 raw reads were obtained from the Illumina sequencing platform, and 53021436 clean reads remained after filtering out the low-quality reads. The clean reads were assembled with Trinity into 41207 transcripts with a median length of 704 and GC content of 42.17%, and 25593 unigenes with a median length of 687 and GC content of 42.31%. Conclusions: RNA-Seq was applied to polyadenylate-enriched mRNAs from leaves of Q. liaotungensis to obtain the transcriptome. De novo assembly was then applied, followed by gene annotation and functional classification. SSRs and SNPs were also identified using the assembled transcripts as reference sequences. The results of this study lay the foundation for further research on the genetic diversity of Quercus.