Project description:Parafilaroides decorus, also known as sea lion lungworm, is a metastrongyloid nematode that infects otariid hosts, such as the charismatic California sea lion, Zalophus californianus. P. decorus causes bronchointerstitial pneumonia, respiratory distress, and a reduced ability to swim, dive, and hunt, and as a result increases mortality, particularly in young animals. Respiratory disease is a leading cause of stranding and admission to rehabilitation centers on the Pacific coast. Low-coverage genomic sequencing of four P. decorus individuals, analyzed through Galaxy's RepeatExplorer, identified a novel repeat DNA family that we used to design a sensitive quantitative PCR (qPCR) assay for diagnosing infections from fecal or sputum samples. The assay detects as little as 10 fg of P. decorus DNA, and a linear regression model developed using a standard curve can be used to estimate the concentration of P. decorus DNA in a sample, ± 0.015 ng. This knowledge can be leveraged to estimate the level of parasite burden, which can be used to design improved treatments for animals in rehabilitation. Improved treatment of infections will help more animals to be successfully released back into the wild.
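The standard-curve step described above can be illustrated with a minimal sketch. It assumes the usual qPCR setup in which the cycle threshold (Ct) is regressed on log10 of the known DNA quantity; the slope, intercept, and dilution series below are hypothetical values chosen for illustration, not results from the study.

```python
import math

def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_quantity_ng(ct, slope, intercept):
    """Invert the fitted curve to estimate DNA quantity (ng) from a Ct value."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical ten-fold dilution series from 1 ng down to 10 fg (1e-5 ng).
standards_ng = [1.0, 0.1, 0.01, 0.001, 0.0001, 0.00001]
# Synthetic Ct values generated from an assumed slope and intercept.
assumed_slope, assumed_intercept = -3.32, 20.0
standards_ct = [assumed_intercept + assumed_slope * math.log10(q) for q in standards_ng]

slope, intercept = fit_standard_curve(standards_ng, standards_ct)
unknown_ct = 26.64  # hypothetical Ct measured for an unknown sample
print(estimate_quantity_ng(unknown_ct, slope, intercept))
```

With real data, the Ct values would come from measured standards, and the fit would carry residual error of the kind the abstract reports as ± 0.015 ng.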
Project description:Genomic conflicts may play a central role in the evolution of reproductive barriers. Theory predicts that early-onset hybrid inviability may stem from conflict between parents for resource allocation to offspring. Here, we describe M. decorus: a group of cryptic species within the M. guttatus species complex that are largely reproductively isolated by hybrid seed inviability (HSI). HSI between M. guttatus and M. decorus is common and strong, but populations of M. decorus vary in the magnitude and directionality of HSI with M. guttatus. Patterns of HSI between M. guttatus and M. decorus, as well as within M. decorus, conform to the predictions of parental conflict: first, reciprocal F1s exhibit size differences and parent-of-origin-specific endosperm defects; second, the extent of asymmetry between reciprocal F1 seed size is correlated with asymmetry in HSI; and third, inferred differences in the extent of conflict predict the extent of HSI between populations. We also find that HSI is rapidly evolving, as populations that exhibit the most HSI are each other's closest relatives. Lastly, although all populations appear largely outcrossing, we find that the differences in the inferred strength of conflict scale positively with π, suggesting that demographic or life history factors other than transitions to self-fertilization may influence the rate of parental-conflict-driven evolution. Overall, these patterns suggest the rapid evolution of parent-of-origin-specific resource allocation alleles coincident with HSI within and between M. guttatus and M. decorus. Parental conflict may therefore be an important evolutionary driver of reproductive isolation.
Project description:Third-generation sequencing technologies can sequence long reads of tens of kbp, which are expected to solve various problems. However, they display high error rates, currently capped around 10%. Self-correction is thus regularly used in long-read analysis projects. We introduce CONSENT, a new self-correction method that relies on both multiple sequence alignment and local de Bruijn graphs. To ensure scalability, multiple sequence alignment computation benefits from a new and efficient segmentation strategy, allowing a massive speedup. CONSENT compares well to the state of the art and performs better on real Oxford Nanopore data. Specifically, CONSENT is the only method that efficiently scales to ultra-long reads and can process a full human dataset, containing reads reaching up to 1.5 Mbp, in 10 days. Moreover, our experiments show that error correction with CONSENT improves the quality of Flye assemblies. Additionally, CONSENT implements a polishing feature for correcting raw assemblies. Our experiments show that CONSENT is 2-38x faster than other polishing tools, while providing comparable results. Furthermore, we show that, on a human dataset, assembling the raw data and polishing the assembly is less resource consuming than correcting and then assembling the reads, while providing better results. CONSENT is available at https://github.com/morispi/CONSENT .
Project description:Deltamethrin is an important pesticide widely used against ectoparasites. Deltamethrin contamination has resulted in a threat to the healthy breeding of the Chinese mitten crab, Eriocheir sinensis. In this study, we investigated transcriptional responses in the hepatopancreas of E. sinensis exposed to deltamethrin. We obtained 99,087,448, 89,086,478, and 100,117,958 raw sequence reads from control 1, control 2, and control 3 groups, and 92,094,972, 92,883,894, and 92,500,828 raw sequence reads from test 1, test 2, and test 3 groups, respectively. After filtering and quality checking of the raw sequence reads, our analysis yielded 79,228,354, 72,336,470, 81,859,826, 77,649,400, 77,194,276, and 75,697,016 clean reads with a mean length of 150 bp from the control and test groups. After deltamethrin treatment, a total of 160 and 167 genes were significantly upregulated and downregulated, respectively. Gene ontology terms "biological process," "cellular component," and "molecular function" were enriched with respect to cell killing, cellular process, other organism part, cell part, binding, and catalytic. Pathway analysis using the Kyoto Encyclopedia of Genes and Genomes showed that the metabolic pathways were significantly enriched. We found that the CYP450 enzyme system, carboxylesterase, glutathione-S-transferase, and material (including carbohydrate, lipid, protein, and other substances) metabolism played important roles in the metabolism of deltamethrin in the hepatopancreas of E. sinensis. This study revealed differentially expressed genes related to insecticide metabolism and detoxification in E. sinensis for the first time and will help in understanding the toxicity and molecular metabolic mechanisms of deltamethrin in E. sinensis.
Project description:The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, and example analyses scaffolding a short read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at https://github.com/bioinfologics/sdg.
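To make the GFA import step concrete, here is a minimal sketch of reading segments and links from GFA1 text in plain Python. It does not use or reproduce the SDG Python API; the function name `parse_gfa` and the two-segment example graph are hypothetical, and only the `S` (segment) and `L` (link) record types of the GFA1 format are handled.

```python
from collections import defaultdict

def parse_gfa(text):
    """Parse GFA1 text into a segment dict and an adjacency dict of links.

    Only S (segment) and L (link) records are read; other record types
    (for example the H header) and optional tags are ignored.
    """
    segments = {}
    links = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if fields[0] == "S":
            # S <name> <sequence> [tags...]
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap CIGAR>
            src, src_orient, dst, dst_orient = fields[1:5]
            links[(src, src_orient)].append((dst, dst_orient))
    return segments, dict(links)

# Hypothetical two-segment graph with a 3-base overlap ("TAC").
example_gfa = (
    "H\tVN:Z:1.0\n"
    "S\tseq1\tACGTAC\n"
    "S\tseq2\tTACGGA\n"
    "L\tseq1\t+\tseq2\t+\t3M\n"
)
segments, links = parse_gfa(example_gfa)
print(segments)
print(links)
```

A full framework would additionally track reverse-complement traversal and map reads onto the nodes; this sketch only shows the shape of the imported graph.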
Project description:Oedaleus decorus asiaticus (Bey-Bienko) is a destructive pest in grasslands and adjacent farmland in northern China, Mongolia, and other countries in Asia. This insect pest is thought to migrate long distances and then cause severe damage; however, the migration mechanism remains unknown. The current study uses insect light trap data from Yanqing (Beijing), together with regional meteorological data, to determine how airflow contributes to the long-distance migration of O. decorus asiaticus. Our results indicate that sinking airflow is the main factor leading to the insects' forced landing, and the prevailing northwest wind was associated with O. decorus asiaticus taking off in the northwest and moving southward with the airflow from July to September. Meanwhile, the insects have a strong migratory ability, flying along the airflow for several nights. Thus, when the airflow from the northwest met the northward-moving warm current from the southwest, a large number of insects were dropped due to sinking airflow, resulting in a large outbreak. Our simulations suggest that the source of the grasshoppers involved in these outbreaks during the early 2000s in northern China was probably in Mongolia, and all evidence indicates that there are two important immigration routes for O. decorus asiaticus migration from Mongolia to Beijing. These findings improve our understanding of the factors guiding O. decorus asiaticus migration, providing valuable information to reduce outbreaks in China that originate outside the country.
Project description:Rafflesia cantleyi, known as one of the world's largest flowers, is a specialised holoparasite due to dramatic morphological modifications. It possesses highly reduced vegetative structure and only appears as a flower for sexual reproduction. Moreover, it has an unusual life cycle in that its floral bud development takes up to nine months. In order to fully understand the highly modified floral organ structure and long life cycle of R. cantleyi, we used Illumina sequencing technology (HiSeq) for sequence generation followed by de novo assembly of sequence reads. We obtained the RNA-seq data from three different stages of floral bud, representing the early, mid and advanced developmental stages. These data are available via BioProject accession number PRJNA378435. More than 10.3 Gb raw sequence data were generated, corresponding to 102,203,042 raw reads. Following removal of low-quality reads and trimming of adapter sequences, a total of 91,638,836 reads were obtained. De novo assembly of these sequences using Trinity resulted in 89,690 unique transcripts with an N50 of 1653 bp. The obtained transcriptomic data will be useful for further study to understand the molecular interactions that result in R. cantleyi floral development.
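The N50 statistic quoted above is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are hypothetical and are not the R. cantleyi transcript lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length

# Hypothetical transcript lengths in bp (total 1500, half 750).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # 500 + 400 = 900 >= 750, so N50 is 400
```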
Project description:CXCR5 (C-X-C motif chemokine receptor 5) is a chemokine transmembrane receptor that acts via its ligand CXCL13 and plays a crucial role in controlling the trafficking of inflammatory cells into and from the sub-retinal space, which contributes to the pathogenesis of AMD. We have previously described that genetic ablation of CXCR5 causes RPE/choroid abnormalities and retinal degeneration (RD) in aged mice. Here we report the transcriptome data (RNA-Seq) of 24-month-old CXCR5 knockout (KO) mice and age-matched C57BL/6 controls (WT). RNA sequencing was performed on the Illumina HiSeq 2500, providing up to 300 GB of sequence information per flow cell. The quality of the RNA-seq libraries and the RNA intensity were validated with an Agilent Technologies Bioanalyzer 2100. The raw datasets contain on average 29,200,459 reads (28,486,243 after trimming) in retina samples and 27,252,790 reads (26,617,311 after trimming) in choroid samples. The mapped reads showed that a total of 1586 genes in retina and 1462 genes in choroid are differentially expressed in this experiment. The raw datasets were deposited in the NCBI Sequence Read Archive (SRA) database and can be accessed via accession number PRJNA588421.
Project description:BACKGROUND: Analyses that use genome assemblies are critically affected by the contiguity, completeness, and accuracy of those assemblies. In recent years single-molecule sequencing techniques generating long-read information have become available and enabled substantial improvement in contig length and genome completeness, especially for large genomes (>100 Mb), although bioinformatic tools for these applications are still limited. FINDINGS: We developed a software tool to close sequence gaps in genome assemblies, TGS-GapCloser, that uses low-depth (~10×) long single-molecule reads. The algorithm extracts reads that bridge gap regions between 2 contigs within a scaffold, error corrects only the candidate reads, and assigns the best sequence data to each gap. As a demonstration, we used TGS-GapCloser to improve the scaftig NG50 value of 3 human genome assemblies by 24-fold on average with only ~10× coverage of Oxford Nanopore or Pacific Biosciences reads, covering up to 94.8% of gaps with sequence data at 97.7% positive predictive value. These improved assemblies achieve 99.998% (Q46) single-base accuracy with final inserted sequences having 99.97% (Q35) accuracy, despite the high raw error rate of single-molecule reads, enabling high-quality downstream analyses, including up to a 31-fold increase in the scaftig NGA50 and up to 13.1% more complete BUSCO genes. Additionally, we show that even in ultra-large genome assemblies, such as the ginkgo (~12 Gb), TGS-GapCloser can cover 71.6% of gaps with sequence data. CONCLUSIONS: TGS-GapCloser can close gaps in large genome assemblies using raw long reads quickly and cost-effectively. The final assemblies generated by TGS-GapCloser have improved contiguity and completeness while maintaining high accuracy. The software is available at https://github.com/BGI-Qingdao/TGS-GapCloser.
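The accuracy figures and Q values above can be related through the standard Phred scale, Q = -10 * log10(1 - accuracy). The sketch below assumes the abstract's Q values are Phred-scaled and truncated to whole numbers; that truncation is an assumption made here, not something the abstract states, and the function names are hypothetical.

```python
import math

def phred_from_accuracy(accuracy):
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred quality value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)

def accuracy_from_phred(q):
    """Convert a Phred quality value back to a per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)

# Accuracies quoted in the abstract, truncated to whole Q values.
print(int(phred_from_accuracy(0.99998)))  # single-base accuracy of the assemblies
print(int(phred_from_accuracy(0.9997)))   # accuracy of the inserted sequences
```

Under this assumption, 99.998% maps to roughly Q46.99 and 99.97% to roughly Q35.2, consistent with the Q46 and Q35 labels when truncated.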
Project description:Monotropa hypopitys (pinesap) is a non-photosynthetic, obligately mycoheterotrophic plant of the family Ericaceae. It obtains carbon and other nutrients from the roots of surrounding autotrophic trees through the associated mycorrhizal fungi. In order to understand the evolutionary changes in the plant genome associated with the transition to a heterotrophic lifestyle, we performed de novo transcriptomic analysis of M. hypopitys using next-generation sequencing. We obtained RNA-Seq data from flowers, flower bracts, and roots with haustoria using the Illumina HiSeq 2500 platform. The raw data obtained in this study are available in the NCBI SRA database under accession number SRP069226. A total of 10.3 GB of raw sequence data were obtained, corresponding to 103,357,809 raw reads. A total of 103,025,683 reads remained after removing low-quality reads and trimming the adapter sequences. The Trinity program was used to de novo assemble 98,349 unigenes with an N50 of 1342 bp. Using the TransDecoder program, we predicted 43,505 putative proteins. A total of 38,416 unigenes were annotated against the Swiss-Prot protein sequence database using BLASTX. The obtained transcriptomic data will be useful for further studies of the evolution of plant genomes upon transition to a non-photosynthetic lifestyle and the loss of photosynthesis-related functions.