Project description: BACKGROUND: Pre-implantation genetic screening (PGS) has been used in an attempt to determine embryonic aneuploidy. Techniques that use new molecular methods to determine the karyotype of an embryo are expanding the scope of PGS. METHODS: We introduce a new method for PGS, termed “Parental Support” (PS), which leverages microarray measurements from parental DNA to “clean” single-cell microarray measurements on embryonic cells and explicitly computes confidence in each copy number call. The method distinguishes mitotic and meiotic copy errors, and determines the parental source of aneuploidy. RESULTS: Validation with 459 single cells of known karyotype indicated that per-cell false positive and false negative rates are roughly equivalent to those of the “gold standard” metaphase karyotype. The majority of the cells were run in parallel with a clinical commercial PGS service. Computed confidences were conservative and roughly concordant with accuracy. To examine ploidy in human embryos, the method was then applied to 26 disaggregated cryopreserved cleavage-stage embryos, for a total of 134 single blastomeres. Only 23.1% of the embryos were euploid, though 46.2% of embryos were mosaic euploid. Mosaicism affected 57.7% of the embryos. Counts of mitotic and meiotic errors were roughly equivalent. Maternal meiotic trisomy predominated over paternal trisomy, and maternal meiotic trisomies were negatively predictive of mosaic euploid embryos. CONCLUSIONS: We have performed a major preclinical validation of a new method for PGS and found that the technology performs approximately as well as a metaphase karyotype. We also directly measured the mechanism of aneuploidy in cleavage-stage human embryos and found high rates and distinct patterns of mitotic and meiotic aneuploidy.
Project description: In recent years, Next Generation Sequencing (NGS) has become a cornerstone of clinical genetics and diagnostics. Many clinical applications require high precision, especially if rare events such as somatic mutations in cancer or genetic variants causing rare diseases need to be identified. Although random sequencing errors can be modeled statistically and deep sequencing minimizes their impact, systematic errors remain a problem even at high depth of coverage. Understanding their source is crucial to increasing the precision of clinical NGS applications. In this work, we studied the relation between recurrent biases in allele balance (AB), systematic errors, and false positive variant calls across a large cohort of human samples analyzed by whole exome sequencing (WES). We modeled the allele balance distribution for biallelic genotypes in 987 WES samples in order to identify positions that recurrently deviate significantly from the expectation, a phenomenon we termed allele balance bias (ABB). Furthermore, we developed a genotype callability score based on ABB for all positions of the human exome, which detects false positive variant calls that passed state-of-the-art filters. Finally, we demonstrate the use of ABB for detection of false associations proposed by rare variant association studies (RVAS).
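The idea of flagging positions whose allele balance recurrently deviates from expectation can be illustrated with a minimal sketch. This is not the authors' ABB model: it assumes, for illustration only, that alternate-allele read counts at a heterozygous site follow a Binomial(depth, 0.5) distribution, and the function names, the `alpha` cutoff, and the `min_fraction` threshold are all hypothetical.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k.
    Intended for moderate depths (the exact PMF is enumerated)."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are counted despite rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

def recurrent_deviation(het_calls, alpha=0.01, min_fraction=0.5):
    """het_calls: (alt_reads, depth) pairs for heterozygous calls at ONE
    position across samples. Returns True when at least min_fraction of
    the calls deviate from the assumed 50/50 allele balance (hypothetical
    thresholds, chosen only for illustration)."""
    if not het_calls:
        return False
    deviating = sum(
        1 for alt, depth in het_calls
        if binom_two_sided_p(alt, depth) < alpha
    )
    return deviating / len(het_calls) >= min_fraction

# A position where most heterozygous calls are strongly skewed.
skewed = [(2, 40), (3, 40), (20, 40)]
# A position where the calls look balanced.
balanced = [(20, 40), (18, 40), (22, 40)]
print(recurrent_deviation(skewed), recurrent_deviation(balanced))
```

A per-position flag like this is only a crude stand-in for a callability score; the point is that the deviation is assessed across many samples rather than within a single one.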
Project description: Logistic regression classification models were fit to manually classified quality control (QC) LC-MS/MS datasets to develop a model that can predict whether a dataset is in or out of control. Model parameters were optimized by minimizing a loss function that accounts for the tradeoff between false positive and false negative errors. In addition to the 1152 training/testing datasets, we are including 2662 additional datasets, all of the same QC sample (whole cell lysate of Shewanella oneidensis). Datasets originate from six Thermo instrument platforms: Exactive, LTQ, VelosPro, Orbitrap, Q-Exactive, and Velos Orbitrap.
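One way to picture a loss that trades off false positives against false negatives is a cost-weighted log-loss. The sketch below is an assumption-laden illustration, not the loss function or features used in this project: it fits a logistic regression by plain gradient descent, labels "out of control" as 1, and uses hypothetical `fn_cost` and `fp_cost` weights and a made-up single QC feature.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_weighted_logistic(xs, ys, fn_cost=1.0, fp_cost=1.0, lr=0.5, epochs=5000):
    """Gradient descent on a cost-weighted log-loss.
    Errors on positives (y == 1) are scaled by fn_cost and errors on
    negatives by fp_cost (hypothetical weights, for illustration)."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            cost = fn_cost if y == 1 else fp_cost
            err = cost * (p - y)  # gradient of the weighted log-loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b

def predict_proba(w, b, x):
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# Made-up, overlapping data: one scaled QC metric per dataset,
# label 1 = "out of control", 0 = "in control".
xs = [[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]]
ys = [0, 0, 1, 0, 1, 1]

equal = fit_weighted_logistic(xs, ys)
fn_heavy = fit_weighted_logistic(xs, ys, fn_cost=5.0)
print(predict_proba(*equal, [0.5]), predict_proba(*fn_heavy, [0.5]))
```

Raising `fn_cost` pushes borderline datasets toward the "out of control" class, which reduces missed failures at the price of more false alarms.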
Project description: To evaluate the specificity of long dsRNAs used in high-throughput RNAi screens performed at the Drosophila RNAi Screening Center (DRSC), we performed a global analysis of their activity in 30 genome-wide screens completed at our facility. Surprisingly, our analysis predicts that dsRNAs containing perfect matches of ≥19 nucleotides to unintended targets, identified in silico, may contribute to a significant false positive error rate arising from off-target effects. We confirmed experimentally that such sequences in dsRNAs lead to false positives and to the efficient knockdown of a cross-hybridizing transcript, raising a cautionary note when interpreting results based on the use of a single dsRNA per gene. Although a full appreciation of all causes of false positive errors remains to be determined, we suggest simple guidelines to help ensure high-quality information from RNAi high-throughput screens. Keywords: Specificity of long dsRNAs, Drosophila melanogaster, SL2 cell line, custom cDNA arrays, off-target effects, Drosophila RNAi Screening Center (DRSC)
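The in silico check for perfect matches of 19 or more nucleotides can be sketched as a shared k-mer search. This is a simplified illustration, not the DRSC pipeline: it compares sequences on the same strand only, treats any shared 19-mer as a potential off-target hit, and the function names and toy sequences are hypothetical.

```python
def shared_kmers(dsrna, transcript, k=19):
    """Return the set of k-mers from dsrna that occur exactly in transcript.
    Any shared k-mer implies a perfect match of at least k nucleotides."""
    dsrna, transcript = dsrna.upper(), transcript.upper()
    target_kmers = {transcript[i:i + k] for i in range(len(transcript) - k + 1)}
    query_kmers = {dsrna[i:i + k] for i in range(len(dsrna) - k + 1)}
    return query_kmers & target_kmers

def potential_off_targets(dsrna, unintended_transcripts, k=19):
    """unintended_transcripts: dict mapping a transcript name to its sequence.
    Returns the names of transcripts sharing at least one k-mer with dsrna."""
    return {
        name for name, seq in unintended_transcripts.items()
        if shared_kmers(dsrna, seq, k)
    }

# Toy sequences: the dsRNA and transcript_A share one 19-nt stretch.
core = "ACGTACGTACGTACGTACG"  # 19 nucleotides
dsrna = "TTT" + core + "GGG"
transcripts = {
    "transcript_A": "CCCC" + core + "AAAA",
    "transcript_B": "A" * 40,
}
print(potential_off_targets(dsrna, transcripts))
```

Hashing k-mers into a set keeps the check linear in sequence length, which matters when every dsRNA in a library is screened against a whole transcriptome.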