Project description:Variants with known or possible pathogenicity located in genes that are unrelated to the primary disease condition are defined as secondary findings. Secondary findings are not the primary targets of whole exome and genome sequencing (WES/WGS) assays but can be of great practical value in early disease prevention and intervention. The driving force for this study was to investigate the impact of racial difference and disease background on secondary findings. Here, we analyzed secondary finding frequencies in 421 whole exome-sequenced Chinese children who are phenotypically normal or have congenital heart disease or juvenile obesity. In total, 421 WES datasets were processed for potential deleterious variant screening. A reference gene list was defined according to the American College of Medical Genetics and Genomics (ACMG) recommendations for reporting secondary findings v2.0 (ACMG SF v2.0). Variant classification was performed according to the evidence-based guidelines recommended by the joint consensus of the ACMG and the Association for Molecular Pathology (AMP). Among the 421 WES datasets, we identified 11 known/expected pathogenic variants in 12 individuals, accounting for 2.85% of our samples, which is much higher than the reported frequency in a Caucasian population. In conclusion, secondary findings are not so rare in Chinese children, which means that we should pay more attention to the clinical interpretation of sequencing results.
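The reported carrier frequency follows directly from the counts given in the abstract (12 carriers among 421 individuals). The short sketch below only reproduces that arithmetic; the function name `carrier_frequency_percent` is a hypothetical helper introduced for illustration and is not part of the study's pipeline.

```python
def carrier_frequency_percent(carriers: int, cohort_size: int) -> float:
    """Return the share of individuals carrying a reportable variant, in percent."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100.0 * carriers / cohort_size


# Counts taken from the abstract: 12 carriers in 421 WES datasets.
frequency = carrier_frequency_percent(12, 421)
print(f"{frequency:.2f}% of samples carry a secondary finding")
```

Note that the frequency is counted per individual, not per variant: 11 distinct variants were found in 12 individuals, so at least one variant must be shared by more than one child.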
Project description:Most statistical methods for quantitative trait loci (QTL) mapping focus on a single phenotype. However, multiple phenotypes are commonly measured, and recent technological advances have greatly simplified the automated acquisition of numerous phenotypes, including function-valued phenotypes, such as growth measured over time. While methods exist for QTL mapping with function-valued phenotypes, they are generally computationally intensive and focus on single-QTL models. We propose two simple, fast methods that maintain high power and precision and are amenable to extensions with multiple-QTL models using a penalized likelihood approach. After identifying multiple QTL by these approaches, we can view the function-valued QTL effects to provide a deeper understanding of the underlying processes. Our methods have been implemented as a package for R, funqtl.
Project description:Polyploidy, or whole-genome duplication often with hybridization, is common in eukaryotes and is thought to drive ecological and evolutionary success, especially in plants. The mechanisms of polyploid success in ecologically relevant contexts, however, remain largely unknown. We conducted an extensive test of functional trait divergence and plasticity in conferring polyploid fitness advantage in heterogeneous environments, by growing clonal replicates of a worldwide genotype collection of six allopolyploid and five diploid wild strawberry (Fragaria) taxa in three climatically different common gardens. Among leaf functional traits, we detected divergence in trait means but not plasticities between polyploids and diploids, suggesting that increased genomic redundancy in polyploids does not necessarily translate into greater trait plasticity in response to environmental change. Across the heterogeneous garden environments, however, polyploids exhibited fitness advantage, which was conferred by both trait means and adaptive trait plasticities, supporting a 'jack-and-master' hypothesis for polyploids. Our findings elucidate essential ecological mechanisms underlying polyploid adaptation to heterogeneous environments, and provide an important insight into the prevalence and persistence of polyploid plants.
Project description:Forward genetic screens of induced mutant plant populations are powerful tools to identify genes underlying phenotypes of interest. Using traditional techniques, mapping causative mutations from forward screens is a lengthy, multi-step process, requiring the identification of a broad genetic region followed by candidate gene sequencing to characterize the causal variant. Mapping by whole genome sequencing accelerates the identification of causal mutations by simultaneously defining a mapping region and providing information on the induced genetic variants. In wheat, although the availability of a high-quality draft genome assembly facilitates mapping and mutation calling, whole genome resequencing remains prohibitively expensive because of the large genome size. In the current study, we used exome sequencing as a complexity reduction strategy to detect mutations associated with a target phenotype. In a segregating wheat EMS population, we identified a clear peak region on chromosome arm 4BS associated with increased plant height. Although none of the significant SNPs seemed causative for the mutant phenotype, they were sufficient to identify a linked ~1.9 Mb deletion encompassing nine genes. These genes included Rht-B1, which is known to have a strong effect on plant height and is a strong candidate for the observed phenotype. We performed simulation experiments to determine the impacts of sequencing depth and bulk size and discuss the importance of considering each factor when designing mapping-by-sequencing experiments in wheat. This approach can accelerate the identification of candidate causal point mutations or linked deletions underlying important phenotypes.
Project description:It is still challenging to identify causal genes governing obesity. Pbwg1.5, a quantitative trait locus (QTL) for resistance to obesity, was previously discovered from wild Mus musculus castaneus mice and was fine-mapped to a 2.1-Mb genomic region of mouse chromosome 2, where no known gene with an effect on white adipose tissue (WAT) has been reported. The aim of this study was to identify a strong candidate gene for Pbwg1.5 by an integration approach of transcriptome analysis (RNA-sequencing followed by real-time PCR analysis) and the causal inference test (CIT), a statistical method to infer causal relationships between diplotypes, gene expression and trait values. Body weight, body composition and biochemical traits were measured in F2 mice obtained from an intercross between the C57BL/6JJcl strain and a congenic strain carrying Pbwg1.5 on the C57BL/6JJcl background. The F2 mice showed significant diplotype differences in 12 traits including body weight, WAT weight and serum cholesterol/triglyceride levels. The transcriptome analysis revealed that Ly75, Pla2r1, Fap and Gca genes were differentially expressed in the liver and that Fap, Ifih1 and Grb14 were differentially expressed in WAT. However, CITs indicated statistical evidence that only the liver Ly75 gene mediated between genotype and WAT. Ly75 expression was negatively associated with WAT weight. The results suggested that Ly75 is a putative quantitative trait gene for the obesity-resistant Pbwg1.5 QTL discovered from the wild M. m. castaneus mouse. The finding provides a novel insight into a better understanding of the genetic basis for prevention of obesity.
Project description:The yeast Saccharomyces cerevisiae, widely used for ethanol production, is one of the best-understood biological systems. Diploid strains of S. cerevisiae are preferred for industrial use because of their better fermentation efficiency, in terms of vitality and endurance, compared with haploid strains. Whole-genome duplication is known to promote adaptive mutations in microorganisms, and allelic variations contribute considerably to the product composition in ethanol fermentation. Although fermentation can be regulated using various strains of yeast, it is quite difficult to make fine adjustments to each component of the final products. In this study, we demonstrate the use of polyploids with varying gene dosage (the number of copies of a particular gene present in a genome) in the regulation of ethanol fermentation. Ethyl caproate is one of the major flavouring agents in a Japanese alcoholic beverage called sake. A point mutation in FAS2, which encodes the α subunit of fatty acid synthetase, induces an increase in the amount of caproic acid, a precursor of ethyl caproate. Using FAS2 as a model, we generated and evaluated yeast strains with varying mutant gene dosage. We demonstrated that mutant gene dosage can be increased via loss of heterozygosity in diploid and tetraploid strains. Productivity of ethyl caproate gradually increased with mutant gene dosage among tetraploid strains. This approach can potentially be applied to the development of a variety of yeast strains via growth-based screening.
Project description:Understanding traits underlying colonization and niche breadth of invasive plants is key to developing sustainable management solutions to curtail invasions at the establishment phase, when efforts are often most effective. The aim of this study was to evaluate how two invasive congeners differing in ploidy respond to high and low resource availability following establishment from asexual fragments. Because polyploids are expected to have wider niche breadths than diploid ancestors, we predicted that a decaploid species would have superior ability to maximize resource uptake and use, and would outperform a diploid congener when colonizing environments with contrasting light and nutrient availability. A mesocosm experiment was designed to test the main and interactive effects of ploidy (diploid and decaploid) and soil nutrient availability (low and high) nested within light environments (shade and sun) on two invasive aquatic plant congeners. Counter to our predictions, the diploid congener outperformed the decaploid in the early stage of growth. Although growth was similar and low in the cytotypes at low nutrient availability, the diploid species had a much higher growth rate and biomass accumulation than the polyploid with nutrient enrichment, irrespective of light environment. Our results also revealed extreme differences in time to anthesis between the cytotypes. The rapid growth and earlier flowering of the diploid congener relative to the decaploid congener represent alternate strategies for establishment and success.
Project description:As part of our ongoing efforts to sequence and map the watermelon (Citrullus spp.) genome, we have constructed a high-density genetic linkage map. The map positioned 234 watermelon genome sequence scaffolds (an average size of 1.41 Mb) that cover about 330 Mb and account for 93.5% of the 353 Mb of the assembled genomic sequences of the elite Chinese watermelon line 97103 (Citrullus lanatus var. lanatus). The genetic map was constructed using an F8 population of 103 recombinant inbred lines (RILs). The RILs are derived from a cross between the line 97103 and the United States Plant Introduction (PI) 296341-FR (C. lanatus var. citroides), which carries resistance to fusarium wilt (races 0, 1, and 2). The genetic map consists of eleven linkage groups that include 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel) and 36 structural variation (SV) markers and spans ~800 cM with a mean marker interval of 0.8 cM. Using fluorescent in situ hybridization (FISH) with 11 BACs that produced chromosome-specific signals, we have depicted watermelon chromosomes that correspond to the eleven linkage groups constructed in this study. The high-resolution genetic map developed here should be a useful platform for the assembly of the watermelon genome, for the development of sequence-based markers used in breeding programs, and for the identification of genes associated with important agricultural traits.
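The map statistics quoted in this abstract can be cross-checked against one another. The sketch below recomputes them from the stated figures; it assumes the map span is roughly 800 cM (the source gives it only approximately) and that the mean interval is the span divided by the total marker count. The function `summarize_map` and its field names are hypothetical and used only for illustration.

```python
def summarize_map(marker_counts: dict[str, int], span_cm: float,
                  anchored_mb: float, assembly_mb: float) -> dict[str, float]:
    """Derive summary statistics for a genetic linkage map from reported figures."""
    total_markers = sum(marker_counts.values())
    return {
        "total_markers": total_markers,
        # Assumption: mean interval approximated as span / number of markers.
        "mean_interval_cm": span_cm / total_markers,
        "anchored_percent": 100.0 * anchored_mb / assembly_mb,
    }


# Figures taken from the abstract; the 800 cM span is an approximate value.
stats = summarize_map(
    marker_counts={"SSR": 698, "InDel": 219, "SV": 36},
    span_cm=800.0,
    anchored_mb=330.0,
    assembly_mb=353.0,
)
scaffold_total_mb = 234 * 1.41  # 234 scaffolds at an average of 1.41 Mb

print(f"markers: {stats['total_markers']}")
print(f"mean interval: {stats['mean_interval_cm']:.1f} cM")
print(f"anchored: {stats['anchored_percent']:.1f}% of the assembly")
print(f"scaffold total: {scaffold_total_mb:.0f} Mb")
```

Under these assumptions, the marker total, mean interval, anchored fraction, and scaffold total all agree with the values reported in the text.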
Project description:Genomic prediction (GP) is the procedure whereby the genetic merits of untested candidates are predicted using genome wide marker information. Although numerous examples of GP exist in plants and animals, applications to polyploid organisms are still scarce, partly due to limited genome resources and the complexity of this system. Deep learning (DL) techniques comprise a heterogeneous collection of machine learning algorithms that have excelled at many prediction tasks. A potential advantage of DL for GP over standard linear model methods is that DL can potentially take into account all genetic interactions, including dominance and epistasis, which are expected to be of special relevance in most polyploids. In this study, we evaluated the predictive accuracy of linear and DL techniques in two important small fruits or berries: strawberry and blueberry. The two datasets contained a total of 1,358 allopolyploid strawberry (2n=8x=112) and 1,802 autopolyploid blueberry (2n=4x=48) individuals, genotyped for 9,908 and 73,045 single nucleotide polymorphism (SNP) markers, respectively, and phenotyped for five agronomic traits each. DL depends on numerous parameters that influence performance and optimizing hyperparameter values can be a critical step. Here we show that interactions between hyperparameter combinations should be expected and that the number of convolutional filters and regularization in the first layers can have an important effect on model performance. In terms of genomic prediction, we did not find an advantage of DL over linear model methods, except when the epistasis component was important. Linear Bayesian models were better than convolutional neural networks for the full additive architecture, whereas the opposite was observed under strong epistasis. However, by using a parameterization capable of taking into account these non-linear effects, Bayesian linear models can match or exceed the predictive accuracy of DL. 
A semiautomatic implementation of the DL pipeline is available at https://github.com/lauzingaretti/deepGP/.
Project description:Breast and ovarian cancers harboring homologous recombination deficiency (HRD) are sensitive to PARP inhibitors and platinum chemotherapy. Conventionally, detecting HRD involves screening for defects in BRCA1, BRCA2, and other relevant genes. Recent analyses have shown that HRD cancers exhibit characteristic mutational patterns due to the activities of HRD-associated mutational signatures. At least three machine learning tools exist for detecting HRD based on mutational patterns. Here, using sequencing data from 1,043 breast and 182 ovarian cancers, we trained Homologous Recombination Proficiency Profiler (HRProfiler), a machine learning method for detecting HRD using six mutational features. HRProfiler's performance is assessed against prior approaches using additional independent datasets of 417 breast and 115 ovarian cancers, including retrospective data from a clinical trial involving patients treated with PARP inhibitors. Our results demonstrate that HRProfiler is the only tool that robustly and consistently predicts clinical response from whole-exome sequenced breast and ovarian cancers.