Project description:Most known genetic variation in human genomes has been called by comparing short reads to the reference genome, an approach biased against finding complex variation. We sequenced 150 individuals from 50 parent-offspring trios with multiple insert-size libraries to very high coverage. We show that each genome could be independently de novo assembled into a small number of high-quality scaffolds (median N50 > 21 Mb), each of quality comparable to long-read assemblies while being very cost-effective. We show that our variant call set from comparing de novo assemblies is far more complete in terms of complex variation than those of previous studies. Importantly, even the complex 4-5 Mb extended MHC region was assembled and resolved into haplotypes, revealing >700 kb of novel sequence in this important region of the genome, and major parts of the Y chromosome, including some palindromes, were assembled with high accuracy. Finally, we show that our variant call set allows for the genotyping of many more complex variants when used as a reference panel for imputation into SNP-chip data or into previously resequenced genomes.
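The N50 statistic quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal Python sketch of that standard definition follows; the function name `n50` and the toy lengths are illustrative assumptions, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy example with hypothetical scaffold lengths (not real data).
print(n50([2, 3, 4, 5, 6]))
```

In the toy example the total is 20, and the two longest scaffolds (6 and 5) are the first to reach half of it, so the N50 is 5.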
Project description:Purpose: The goal of this study is to provide comprehensive genomic information for functional genomic studies in Q. mongolica. Methods: Transcriptome data from Quercus mongolica leaves were generated by deep sequencing on an Illumina HiSeq 4000. High-quality reads were obtained by removing reads that contained adaptor contamination, low-quality bases, or undetermined bases. The transcriptome was assembled de novo. Results: A total of 52934562 raw reads were obtained from the Illumina sequencing platform. After filtering out the low-quality reads, we obtained 52076914 clean reads, which were assembled with Trinity into 39130 transcripts with a mean length of 742 bp and GC content of 42.12%, and 24196 unigenes with a mean length of 732 bp and GC content of 42.34%. Conclusions: RNA-Seq was applied to polyadenylate-enriched mRNAs from leaves of Q. mongolica to obtain the transcriptome. De novo assembly was then performed, followed by gene annotation and functional classification. SSRs and SNPs were also identified using the assembled transcripts as reference sequences. The results of this study lay the foundation for further research on the genetic diversity of Quercus.
Project description:Purpose: The goal of this study is to screen for candidate genes involved in drought avoidance in Q. liaotungensis. Methods: Transcriptome data from Q. liaotungensis leaves were generated by deep sequencing on an Illumina HiSeq 4000. High-quality reads were obtained by removing reads that contained adaptor contamination, low-quality bases, or undetermined bases. The transcriptome was assembled de novo. Results: A total of 54153182 raw reads were obtained from the Illumina sequencing platform, and 53021436 clean reads remained after filtering out the low-quality reads. The clean reads were assembled with Trinity into 41207 transcripts with median length 704 and GC content 42.17%, and 25593 unigenes with median length 687 and GC content 42.31%. Conclusions: RNA-Seq was applied to polyadenylate-enriched mRNAs from leaves of Q. liaotungensis to obtain the transcriptome. De novo assembly was then performed, followed by gene annotation and functional classification. SSRs and SNPs were also identified using the assembled transcripts as reference sequences. The results of this study lay the foundation for further research on the genetic diversity of Quercus.
Project description:These data correspond to RNA-Seq assays of body wall samples from farmed Isostichopus badionotus juveniles. The raw reads were used to evaluate differential gene expression between wild and farmed I. badionotus specimens. For this purpose, a de novo transcriptome assembled from wild specimens was used as the reference. Further information about the de novo assembled transcriptome is available in BioProject PRJNA639785.
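The raw and clean read counts reported in the two Quercus entries above can be turned into a read-retention percentage, which makes the effect of the quality filtering easier to compare. The short Python sketch below does only that arithmetic; the helper name `retention_percent` is a hypothetical name chosen for illustration, and the counts are the ones quoted in the descriptions.

```python
def retention_percent(raw_reads, clean_reads):
    """Percentage of raw reads that survive quality filtering."""
    if raw_reads <= 0:
        raise ValueError("raw_reads must be positive")
    if not 0 <= clean_reads <= raw_reads:
        raise ValueError("clean_reads must be between 0 and raw_reads")
    return 100.0 * clean_reads / raw_reads


# Read counts as reported in the two project descriptions.
reported = {
    "Q. mongolica": (52934562, 52076914),
    "Q. liaotungensis": (54153182, 53021436),
}

for species, (raw, clean) in reported.items():
    print(f"{species}: {retention_percent(raw, clean):.2f}% of raw reads retained")
```

Both libraries retain roughly 98% of their raw reads after filtering.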
Project description:We developed a transcriptome resource for Douglas-fir covering key developmental stages of megagametophytes over time: prefertilization, fertilization, embryogenesis, and early, unfertilized abortion. Extracted RNA was sequenced using large-scale sequencing and reads were assembled to generate a de novo reference transcriptome of 105,505 predicted high-confidence transcripts. Expression levels were estimated based on alignment of the original reads to the reference.
2016-06-16 | GSE83425 | GEO
Project description:De novo assembled genomes of Belliella spp. (Cyclobacteriaceae) strains
Project description:Vitis vinifera cv. Tannat is largely cultivated in Uruguay for the production of high-quality red wines. Its most notable characteristic is an elevated content of polyphenolic compounds, which give the wine an intense purple color and remarkable antioxidant properties. To characterize the genetic components encoding this important phenotypic characteristic, the genome of the Uruguayan Tannat clone UY11 was sequenced to 134X coverage using Illumina technology and assembled with a mixed approach of de novo assembly and iterative mapping on the PN40024 reference genome. An approach based on both reference-guided annotation and de novo transcript assembly of RNA-Seq data allowed the definition of 3,673 genes not previously annotated in PN40024 that we consider novel, and the discovery of 2,228 genes not shared with the grapevine reference genome that we consider private to Tannat. Expression analysis showed that private genes contributed substantially (more than 50%) to the overall expression of enzymes involved in phenol and polyphenol biosynthesis, indicating that the dispensable portion of the grapevine genome contains many private genes that are likely to contribute to the peculiar phenotypic characteristics of this grapevine variety.
Project description:Background The Lycophyta species are the extant taxa most similar to the early vascular plants that were once abundant on Earth. However, their distribution has greatly diminished. So far, the absence of chromosome-level lycophyte genome assemblies has hindered our understanding of the evolution and environmental adaptation of lycophytes. Findings We present the reference genome of the tetraploid aquatic quillwort, Isoetes sinensis, a lycophyte. This genome represents the first chromosome-level genome assembly of a tetraploid seed-free plant. Comparison of the genomes of I. sinensis and the diploid I. taiwanensis revealed genomic features and polyploidy of lycophytes. Comparison of the I. sinensis genome with those of other species representing the evolutionary lineages of green plants revealed the inherited genetic tools for transcriptional regulation and most phytohormones in I. sinensis. The presence and absence of key genes related to development and stress responses provide insights into the environmental adaptation of lycophytes. Conclusions The high-quality reference genome and genomic analysis presented in this study are crucial for future genetic research and the conservation of not only I. sinensis but also other lycophytes.