Project description: VCF2CNA is a tool (Linux command-line or web interface) for copy-number alteration (CNA) analysis and tumor purity estimation from paired tumor-normal variant files in VCF format. It operates on whole-genome and whole-exome datasets. To benchmark its performance, we applied it to 46 adult glioblastoma and 146 pediatric neuroblastoma samples sequenced on the Illumina and Complete Genomics (CGI) platforms, respectively. VCF2CNA was highly consistent with a state-of-the-art algorithm using raw sequencing data (mean F1 score = 0.994) in high-quality whole-genome glioblastoma samples and was robust to uneven coverage introduced by library artifacts. In the whole-genome neuroblastoma set, VCF2CNA identified MYCN high-level amplifications in 31 of 32 clinically validated samples, compared to 15 found by CGI's HMM-based CNA model. Moreover, VCF2CNA achieved highly consistent CNA profiles between WGS and WXS platforms (mean F1 score = 0.97 on a set of 15 rhabdomyosarcoma samples). In addition, VCF2CNA provides accurate tumor purity estimates for samples with sufficient CNAs. These results suggest that VCF2CNA is an accurate, efficient and platform-independent tool for CNA and tumor purity analyses without accessing raw sequence data.
Project description: Neuroblastoma is a highly heterogeneous tumor accounting for 15% of all pediatric cancer deaths. Clinical behavior ranges from spontaneous regression of localized, asymptomatic tumors, as well as of metastasized tumors in infants, to rapid progression and resistance to therapy. Genomic amplification of the MYCN oncogene has been used to predict outcome in neuroblastoma for over 30 years. However, recent methodological advances, including miRNA and mRNA profiling, array comparative genomic hybridization (array-CGH), and whole-genome sequencing, have enabled detailed analysis of the neuroblastoma genome, leading to the identification of new prognostic markers and better patient stratification. In this review, we describe the main genetic factors responsible for these diverse clinical phenotypes in neuroblastoma, the chronology of their discovery, and their impact on patient prognosis.
Project description: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
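The side-by-side comparison described above can be illustrated with a minimal sketch. It assumes each call set is a mapping from a variant key to its VAF, and it defines overlap as shared calls divided by the union of calls; that definition, the function name `compare_call_sets`, and the toy variants are assumptions made for illustration, not the consortium's method or data.

```python
from typing import Dict, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def compare_call_sets(wgs: Dict[Variant, float],
                      wes: Dict[Variant, float],
                      vaf_cutoff: float = 0.15) -> Dict[str, float]:
    """Summarise shared and private calls between two variant call sets."""
    shared = wgs.keys() & wes.keys()
    union = wgs.keys() | wes.keys()
    private_wgs = wgs.keys() - wes.keys()
    private_wes = wes.keys() - wgs.keys()

    def low_vaf_fraction(calls: Dict[Variant, float], keys) -> float:
        # Fraction of the given calls whose VAF is below the cutoff.
        if not keys:
            return 0.0
        return sum(1 for k in keys if calls[k] < vaf_cutoff) / len(keys)

    return {
        # Assumed definition of overlap: shared calls / all distinct calls.
        "overlap_fraction": len(shared) / len(union) if union else 0.0,
        "private_wgs": len(private_wgs),
        "private_wes": len(private_wes),
        "low_vaf_private_wgs": low_vaf_fraction(wgs, private_wgs),
        "low_vaf_private_wes": low_vaf_fraction(wes, private_wes),
    }


# Toy data, invented for the example.
wgs_calls = {
    ("chr1", 100, "A", "T"): 0.40,
    ("chr1", 200, "G", "C"): 0.10,
    ("chr2", 300, "C", "T"): 0.30,
}
wes_calls = {
    ("chr1", 100, "A", "T"): 0.35,
    ("chr2", 300, "C", "T"): 0.30,
    ("chr3", 400, "T", "G"): 0.05,
}
summary = compare_call_sets(wgs_calls, wes_calls)
print(summary)
```

A real comparison would first restrict both call sets to regions covered by both assays, which this sketch leaves out.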
Project description: Neuroblastoma is one of the most genomically heterogeneous childhood malignancies studied to date, and the molecular events that occur during the course of the disease are not fully understood. Genomic studies in neuroblastoma have shown only a few recurrent mutations and a low somatic mutation burden. However, none of these studies has examined the mutations arising during the course of disease, nor have they systematically examined the expression of mutant genes. Here we performed genomic analyses on tumors taken during a 3.5-year disease course from a neuroblastoma patient (bone marrow biopsy at diagnosis, adrenal primary tumor taken at surgical resection, and a liver metastasis at autopsy). Whole genome sequencing of the index liver metastasis identified 44 non-synonymous somatic mutations in 42 genes (0.85 mutations/Mb) and a large hemizygous deletion in the ATRX gene, which has recently been reported in neuroblastoma. Of these 45 somatic alterations, 15 were also detected in the primary tumor and bone marrow biopsy, while the other 30 were unique to the index tumor, indicating accumulation of de novo mutations during therapy. Furthermore, transcriptome sequencing of the 3 tumors demonstrated that only 3 of the 15 commonly mutated genes (LPAR1, GATA2, and NUFIP1) had high-level expression of the mutant alleles, suggesting potential oncogenic driver roles for these mutated genes. Among them, the druggable G-protein coupled receptor LPAR1 was highly expressed in all tumors. Cells expressing the LPAR1 R163W mutant showed significantly increased motility through elevated Rho signaling, but the mutation had no effect on growth. Therefore, this study highlights the need for multiple biopsies and sequencing during progression of a cancer, and for a combined DNA and RNA sequencing approach for systematic identification of expressed driver mutations.
Project description: Mycobacterium tuberculosis is the leading cause of death from bacterial infection. Improved rapid diagnosis and antimicrobial resistance determination, such as by whole-genome sequencing, are required. Our aim was to develop a simple, low-cost method of preparing DNA for sequencing direct from M. tuberculosis-positive clinical samples (without culture). Simultaneous sputum liquefaction, bacteria heat inactivation (99°C/30 min), and enrichment for mycobacteria DNA were achieved using an equal volume of thermo-protection buffer (4 M KCl, 0.05 M HEPES buffer, pH 7.5, 0.1% dithiothreitol [DTT]). The buffer emulated intracellular conditions found in hyperthermophiles, thus protecting DNA from rapid thermodegradation, which renders it a poor template for sequencing. Initial validation experiments employed mycobacteria DNA, either extracted or intracellular. Next, mock clinical samples (infection-negative human sputum spiked with 0 to 10^5 Mycobacterium bovis BCG cells/ml) underwent liquefaction in thermo-protection buffer and heat inactivation. DNA was extracted and sequenced. Human DNA degraded faster than mycobacteria DNA, resulting in target enrichment. Four replicate experiments achieved M. tuberculosis detection at 10^1 BCG cells/ml, with 31 to 59 M. tuberculosis complex reads. Maximal genome coverage (>97% at 5× depth) occurred at 10^4 BCG cells/ml; >91% coverage (1× depth) occurred at 10^3 BCG cells/ml. Final validation employed M. tuberculosis-positive clinical samples (n = 20), revealing that initial sample volumes of ≥1 ml typically yielded higher mean depths of M. tuberculosis genome coverage, with an overall range of 0.55 to 81.02. A mean depth of 3 gave >96% 1-fold tuberculosis (TB) genome coverage (in 15/20 clinical samples). A mean depth of 15 achieved >99% 5-fold genome coverage (in 9/20 clinical samples). In summary, direct-from-sample sequencing of M. tuberculosis genomes was facilitated by a low-cost thermo-protection buffer.
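The coverage figures quoted above combine two quantities: mean depth and the fraction of the genome covered at or above a given depth (1-fold, 5-fold). The sketch below shows how both could be computed from a list of per-position read depths. The function name `coverage_summary` and the toy depth values are assumptions for illustration; the study's own pipeline is not described in this abstract.

```python
from typing import Dict, Sequence, Tuple


def coverage_summary(depths: Sequence[int],
                     min_depths: Sequence[int] = (1, 5)) -> Tuple[float, Dict[int, float]]:
    """Return (mean depth, {threshold: fraction of positions with depth >= threshold})."""
    n = len(depths)
    if n == 0:
        raise ValueError("depths must contain at least one position")
    mean_depth = sum(depths) / n
    breadth = {
        threshold: sum(1 for d in depths if d >= threshold) / n
        for threshold in min_depths
    }
    return mean_depth, breadth


# Toy per-position depths for a 10-base "genome", invented for the example.
example_depths = [0, 1, 3, 5, 6, 9, 0, 4, 5, 7]
mean_depth, breadth = coverage_summary(example_depths)
print(f"mean depth: {mean_depth:.2f}")
for threshold, fraction in breadth.items():
    print(f">= {threshold}x: {fraction:.0%} of positions")
```

In practice the depth list would have one entry per reference position, including positions with zero reads, so that uncovered regions count against breadth.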
Project description: BACKGROUND: Observations of recurrent somatic mutations in tumors have led to identification and definition of signaling and other pathways that are important for cancer progression and therapeutic targeting. As tumor cells contain both an individual's inherited genetic variants and somatic mutations, challenges arise in distinguishing these events in massively parallel sequencing datasets. Typically, both a tumor sample and a "normal" sample from the same individual are sequenced and compared; variants observed only in the tumor are considered to be somatic mutations. However, this approach requires two samples for each individual. RESULTS: We evaluate a method of detecting somatic mutations in tumor samples for which only a subset of normal samples is available. We describe tuning of the method for detection of mutations in tumors, filtering to remove inherited variants, and comparison of detected mutations to several matched tumor/normal analysis methods. Filtering steps include the use of population variation datasets to remove inherited variants as well as a subset of normal samples to remove technical artifacts. We then directly compare mutation detection with tumor-only and tumor-normal approaches using the same sets of samples. Comparisons are performed using an internal targeted gene sequencing dataset (n = 3380) as well as whole exome sequencing data from The Cancer Genome Atlas project (n = 250). Tumor-only mutation detection shows similar recall (43-60%) but lower precision (20-21%) than current matched tumor/normal approaches (recall 43-73%, precision 30-82%) when compared to a "gold-standard" tumor/normal approach. The inclusion of a small pool of normal samples improves precision, although many variants are still uniquely detected in the tumor-only analysis. CONCLUSIONS: A detailed method for somatic mutation detection without matched normal samples enables study of larger numbers of tumor samples, as well as tumor samples for which a matched normal is not available. As sensitivity/recall is similar to tumor/normal mutation detection but precision is lower, tumor-only detection is more appropriate for classification of samples based on known mutations. Although matched tumor-normal analysis is preferred due to higher precision, we demonstrate that mutation detection without matched normal samples is possible for certain applications.
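The filtering and evaluation steps described above can be sketched as set operations. This is a simplified illustration, assuming variants are matched by exact key: real pipelines typically apply allele-frequency thresholds to population data rather than exact-match removal. The function names and toy variant identifiers are hypothetical.

```python
from typing import Set, Tuple


def filter_tumor_only(raw_calls: Set[str],
                      population_variants: Set[str],
                      normal_pool_variants: Set[str]) -> Set[str]:
    """Drop calls seen in population data (likely inherited) or in a pool
    of unmatched normals (likely technical artifacts)."""
    return raw_calls - population_variants - normal_pool_variants


def precision_recall(called: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a call set against a gold-standard set."""
    true_positives = len(called & gold)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy identifiers, invented for the example.
raw_calls = {"v1", "v2", "v3", "v4", "v5", "v6"}
population_variants = {"v5"}    # e.g. a known germline polymorphism
normal_pool_variants = {"v6"}   # e.g. a recurrent sequencing artifact
gold_standard = {"v1", "v2", "v7", "v8"}

filtered_calls = filter_tumor_only(raw_calls, population_variants, normal_pool_variants)
precision, recall = precision_recall(filtered_calls, gold_standard)
print(sorted(filtered_calls), precision, recall)
```

Here the remaining false positives (`v3`, `v4`) stand in for private inherited variants that no population dataset contains, which is one reason tumor-only precision stays below matched analysis.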
Project description: Purpose: To gain molecular insight into HBV integration events that may contribute to HCC tumorigenesis, we performed whole transcriptome sequencing and whole genome copy number profiling of hepatocellular carcinoma (HCC) samples from 50 Chinese patients. Conclusions: This is the first report on the molecular basis of the MLL4 integration driving MLL4 over-expression. HBV-MLL4 integration occurred frequently in Chinese HCC patients, representing a unique molecular segment for HCC with HBV infection. We profiled tumors from 50 Chinese hepatocellular carcinoma patients and 14 adjacent tissues using Agilent 244K array CGH technology. The 50 tumor samples were also profiled by RNA-Seq.
Project description: Neuroblastoma is the most common and deadly childhood tumor. Relapsed or refractory neuroblastoma has a very poor prognosis despite recent treatment advances. To investigate genomic alterations associated with relapse and therapy resistance, whole-genome sequencing was performed on diagnostic and relapsed lesions together with constitutional DNA from seven children. Sequencing of relapsed tumors indicates somatic alterations in diverse genes, including genes involved in RAS-MAPK signaling, in promoting cell cycle progression, or in telomere maintenance and immortalization. Among recurrent alterations, CCND1-gain, TERT-rearrangements, and point mutations in POLR2A, CDK5RAP, and MUC16 were found in ≥2 individuals. Our cohort contained examples of converging genomic alterations in primary-relapse tumor pairs, indicating dependencies related to specific genetic lesions. We also detected rare genetic germline variants in DNA repair genes (e.g., BARD1, BRCA2, CHEK2, and WRN) that might cooperate with somatically acquired variants in these patients with highly aggressive recurrent neuroblastoma. Our data indicate the importance of monitoring recurrent neuroblastoma through sequential genomic characterization and that new therapeutic approaches combining the targeting of MAPK signaling, cell cycle progression, and telomere activity are required for this challenging patient group.
Project description: BACKGROUND: Identification of disease susceptibility genes requires access to DNA from numerous well-characterised subjects. Archived residual dried blood spot samples from national newborn screening programs may provide DNA from entire populations, and medical registries the corresponding clinical information. The amount of DNA available in these samples is, however, rarely sufficient for reliable genome-wide scans, and whole-genome amplification may thus be necessary. This study assesses the quality of DNA obtained from different amplification protocols by evaluating the fidelity and robustness of genotyping of 610,000 single nucleotide polymorphisms, using the Illumina Infinium HD Human610-Quad BeadChip. Whole-genome amplified DNA from 24 neonatal dried blood spot samples stored for 15 to 25 years was tested, and high-quality genomic DNA from 8 of the same individuals was used as reference. RESULTS: Using 3.2 mm disks from dried blood spot samples, the optimal DNA-extraction and amplification protocol resulted in call rates between 99.15% and 99.73% (mean 99.56%, N = 16), and conflicts with reference DNA in only three per 10,000 genotype calls. CONCLUSION: Whole-genome amplified DNA from archived neonatal dried blood spot samples can be used for reliable genome-wide scans and is a cost-efficient alternative to collecting new samples.
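The two quality measures reported above, call rate and conflicts per 10,000 genotype calls, can be computed as in the following minimal sketch. It assumes genotypes are stored as strings with `None` marking a no-call, and that conflicts are counted only at markers called in both samples; the function names and toy genotypes are assumptions for illustration, not the study's software.

```python
from typing import Optional, Sequence

Genotype = Optional[str]  # e.g. "AA", "AB", "BB"; None marks a no-call


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of markers that received a genotype call."""
    if not genotypes:
        raise ValueError("no markers supplied")
    return sum(1 for g in genotypes if g is not None) / len(genotypes)


def conflicts_per_10k(test: Sequence[Genotype],
                      reference: Sequence[Genotype]) -> float:
    """Discordant calls per 10,000 markers called in both samples."""
    if len(test) != len(reference):
        raise ValueError("samples must cover the same markers")
    compared = [(t, r) for t, r in zip(test, reference)
                if t is not None and r is not None]
    if not compared:
        raise ValueError("no markers called in both samples")
    conflicts = sum(1 for t, r in compared if t != r)
    return conflicts / len(compared) * 10_000


# Toy genotypes for five markers, invented for the example.
amplified = ["AA", "AB", None, "BB", "AB"]
reference = ["AA", "AB", "BB", "BB", "AA"]

example_call_rate = call_rate(amplified)
example_conflicts = conflicts_per_10k(amplified, reference)
print(f"call rate: {example_call_rate:.2%}, conflicts per 10,000: {example_conflicts:.0f}")
```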
Project description: We developed an enrichment-free, metabolic-based assay for rapid detection of tumor cells in pleural effusion and peripheral blood samples. All nucleated cells are plated on microwell chips that contain 200,000 addressable microwells, and the chips are then screened. After candidate tumor cells are identified, single tumor cells are retrieved with a micromanipulator. To detect and molecularly characterize these circulating tumor cells, we performed single-cell whole genome amplification with multiple displacement amplification (MDA) technology, followed by whole exome sequencing.