Mutational History of a Human Cell Lineage from Somatic to Induced Pluripotent Stem Cells.
ABSTRACT: The accuracy of replicating the genetic code is fundamental. DNA repair mechanisms protect the fidelity of the genome, ensuring a low error rate between generations. This sustains the similarity of individuals whilst providing a repertoire of variants for evolution. The mutation rate in the human genome has recently been measured to be 50-70 de novo single nucleotide variants (SNVs) between generations. During development, mutations accumulate in somatic cells so that an organism is a mosaic. However, variation within a tissue and between tissues has not been analysed. By reprogramming somatic cells into induced pluripotent stem cells (iPSCs), their genomes and the associated mutational history are captured. By sequencing the genomes of polyclonal and monoclonal somatic cells and derived iPSCs, we have determined the mutation rates and show how the patterns change from a somatic lineage in vivo through to iPSCs. Somatic cells have a mutation rate of 14 SNVs per cell per generation, while iPSCs exhibited a ten-fold lower rate. Analyses of mutational signatures suggested that deamination of methylated cytosine may be the major mutagenic source in vivo, whilst oxidative DNA damage becomes dominant in vitro. Our results provide insights for better understanding of mutational processes and lineage relationships between human somatic cells. Furthermore, they provide a foundation for interpretation of elevated mutation rates and patterns in cancer.
Project description:Efficiency of reprogramming of human cells into induced pluripotent stem cells (iPSCs) has remained low. We report that individual adult human CD49f+ long-term hematopoietic stem cells (LT-HSCs) can be reprogrammed into iPSCs at close to 50% efficiency using Sendai virus transduction. This exquisite sensitivity to reprogramming is specific to LT-HSCs, since it progressively decreases in committed progenitors. LT-HSC reprogramming can follow multiple paths and is most efficient when transduction is performed after the cells have exited G0. Sequencing of 75 paired skin fibroblasts/LT-HSC samples collected from nine individuals revealed that LT-HSCs contain a lower load of somatic single-nucleotide variants (SNVs) and indels than skin fibroblasts and accumulate about 12 SNVs/year. Mutation analysis revealed that LT-HSCs and fibroblasts have very different somatic mutation signatures and that somatic mutations in iPSCs generally exist prior to reprogramming. LT-HSCs may become the preferred cell source for the production of clinical-grade iPSCs.
Project description:Knowledge of the rate and pattern of new mutation is critical to the understanding of human disease and evolution. We used extensive autozygosity in a genealogically well-defined population of Hutterites to estimate the human sequence mutation rate over multiple generations. We sequenced whole genomes from 5 parent-offspring trios and identified 44 segments of autozygosity. Using the number of meioses separating each pair of autozygous alleles and the 72 validated heterozygous single-nucleotide variants (SNVs) from 512 Mb of autozygous DNA, we obtained an SNV mutation rate of 1.20 × 10^-8 (95% confidence interval 0.89-1.43 × 10^-8) mutations per base pair per generation. The mutation rate for bases within CpG dinucleotides (9.72 × 10^-8) was 9.5-fold that of non-CpG bases, and there was strong evidence (P = 2.67 × 10^-4) for a paternal bias in the origin of new mutations (85% paternal). We observed a non-uniform distribution of heterozygous SNVs (both newly identified and known) in the autozygous segments (P = 0.001), which is suggestive of mutational hotspots or sites of long-range gene conversion.
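The arithmetic behind the figures in the Hutterite abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper `per_base_rate` is hypothetical and assumes the simplest possible estimator (variants divided by bases times meioses), whereas the published analysis presumably treats each autozygous segment separately and corrects for detection sensitivity. The "implied meioses" value is a back-calculation from the reported numbers and is not stated in the abstract.

```python
def per_base_rate(n_variants: int, n_bases: float, n_meioses: float) -> float:
    """Naive per-base, per-generation rate: variants / (bases * meioses).

    Hypothetical helper for illustration; it ignores false negatives,
    callable-genome corrections, and per-segment weighting.
    """
    if n_bases <= 0 or n_meioses <= 0:
        raise ValueError("n_bases and n_meioses must be positive")
    return n_variants / (n_bases * n_meioses)


# Figures quoted in the abstract.
N_SNVS = 72              # validated heterozygous SNVs
AUTOZYGOUS_BP = 512e6    # 512 Mb of autozygous DNA
OVERALL_RATE = 1.20e-8   # mutations per base pair per generation
CPG_RATE = 9.72e-8       # rate for bases within CpG dinucleotides
CPG_FOLD = 9.5           # CpG rate relative to non-CpG bases

# Non-CpG rate implied by the reported 9.5-fold ratio.
non_cpg_rate = CPG_RATE / CPG_FOLD

# Assumption: under the naive estimator, the average number of meioses
# that would reproduce the reported overall rate (not a reported value).
implied_meioses = N_SNVS / (AUTOZYGOUS_BP * OVERALL_RATE)

print(f"non-CpG rate: {non_cpg_rate:.3e} per bp per generation")
print(f"implied average meioses (naive): {implied_meioses:.2f}")
```

Under these assumptions the non-CpG rate comes out slightly below the genome-wide rate, as expected when a small CpG fraction mutates far faster than the rest of the genome.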
Project description:Genetic integrity of induced pluripotent stem cells (iPSCs) is essential for their validity as disease models and for potential therapeutic use. We describe the comprehensive analysis in the ForIPS consortium: an iPSC collection from donors with neurological diseases and healthy controls. Characterization included pluripotency confirmation, fingerprinting, conventional and molecular karyotyping in all lines. In the majority, somatic copy number variants (CNVs) were identified. A subset with available matched donor DNA was selected for comparative exome sequencing. We identified single nucleotide variants (SNVs) at different allelic frequencies in each clone with high variability in mutational load. Low frequencies of variants in parental fibroblasts highlight the importance of germline samples. Somatic variant number was independent from reprogramming, cell type and passage. Comparison with disease genes and prediction scores suggest biological relevance for some variants. We show that high-throughput sequencing has value beyond SNV detection and the requirement to individually evaluate each clone.
Project description:Next generation sequencing has enabled systematic discovery of mutational spectra in cancer samples. Here, we used whole genome sequencing to characterize somatic mutations and structural variation in a primary acral melanoma and its lymph node metastasis. Our data show that the somatic mutational rates in this acral melanoma sample pair were more comparable to the rates reported in cancer genomes not associated with mutagenic exposure than in the genome of a melanoma cell line or the transcriptome of melanoma short-term cultures. Despite the perception that acral skin is sun-protected, the dominant mutational signature in these samples is compatible with damage due to ultraviolet light exposure. A nonsense mutation in ERCC5 discovered in both the primary and metastatic tumors could also have contributed to the mutational signature through accumulation of unrepaired dipyrimidine lesions. However, evidence of transcription-coupled repair was suggested by the lower mutational rate in the transcribed regions and expressed genes. The primary and the metastasis are highly similar at the level of global gene copy number alterations, loss of heterozygosity and single nucleotide variation (SNV). Furthermore, the majority of the SNVs in the primary tumor were propagated in the metastasis and one nonsynonymous coding SNV and one splice site mutation appeared to arise de novo in the metastatic lesion.
Project description:Autosomal-dominant polycystic kidney disease (ADPKD) is the most common inherited kidney disease, leading to kidney failure in most patients. In approximately 85% of cases, the disease is caused by mutations in PKD1. How dysregulation of PKD1 leads to cyst formation on a molecular level is unknown. Induced pluripotent stem cells (iPSCs) are a powerful tool for in vitro modeling of genetic disorders. Here, we established ADPKD patient-specific iPSCs to study the function of PKD1 in kidney development and cyst formation in vitro. Somatic mutations are proposed to be the initiating event of cyst formation, and therefore, iPSCs were derived from cystic renal epithelial cells rather than fibroblasts. Mutation analysis of the ADPKD iPSCs revealed germline mutations in PKD1 but no additional somatic mutations in PKD1/PKD2. Although several somatic mutations in other genes implicated in ADPKD were identified in cystic renal epithelial cells, only few of these mutations were present in iPSCs, indicating a heterogeneous mutational landscape, and possibly in vitro cell selection before and during the reprogramming process. Whole-genome DNA methylation analysis indicated that iPSCs derived from renal epithelial cells maintain a kidney-specific DNA methylation memory. In addition, comparison of PKD1+/- and control iPSCs revealed differences in DNA methylation associated with the disease history. In conclusion, we generated and characterized iPSCs derived from cystic and healthy control renal epithelial cells, which can be used for in vitro modeling of kidney development in general and cystogenesis in particular.
Project description:In mammals, the first cell fate decision is initialized by cell polarization at the 8- to 16-cell stage of the preimplantation embryo. At this stage, outside cells adopt a trophectoderm (TE) fate, whereas the inside cell population gives rise to the inner cell mass (ICM). Prior to implantation, transcriptional interaction networks and epigenetic modifications divide the extraembryonic and embryonic fate irrevocably. Here, we report that extraembryonic trophoblast stem cell (TSC) lines are converted to induced pluripotent stem cells (TSC-iPSCs) by overexpressing Oct4, Sox2, Klf4, and cMyc. Methylation studies and gene array analyses indicated that TSC-iPSCs had adopted a pluripotent potential. The rate of conversion was lower than those of somatic reprogramming experiments, probably due to the unique genetic network controlling extraembryonic lineage fixation. Both in vitro and in vivo, TSC-iPSCs differentiated into tissues representing all three embryonic germ layers, indicating that somatic cell fate could be induced. Finally, TSC-iPSCs chimerized the embryo proper and contributed to the germ line of mice, indicating that these cells had acquired full somatic differentiation potential. These results lead to a better understanding of the molecular processes that govern the first lineage decision in mammals.
Project description:Human endogenous retroviruses (HERVs) have been investigated for potential links with human cancer. However, the distribution of somatic nucleotide variations in HERV elements has not been explored in detail. This study aims to identify HERV elements with an over-representation of somatic mutations (hot spots) in cancer patients. Four HERV elements with mutation hotspots were identified that overlap with exons of four human protein coding genes. These hotspots were identified based on the significant over-representation (p<8.62e-4) of non-synonymous single-nucleotide variations (nsSNVs). These genes are TNN (HERV-9/LTR12), OR4K15 (HERV-IP10F/LTR10F), ZNF99 (HERV-W/HERV17/LTR17), and KIR2DL1 (MST/MaLR). In an effort to identify mutations that affect survival, all nsSNVs were further evaluated and it was found that kidney cancer patients with mutation C2270G in ZNF99 have a significantly lower survival rate (hazard ratio = 2.6) compared to those without it. Among HERV elements in the human non-protein coding regions, we found 788 HERVs with significantly elevated numbers of somatic single-nucleotide variations (SNVs) (p<1.60e-5). From this category, the top three HERV elements with significantly over-represented SNVs are HERV-H/LTR7, HERV-9/LTR12 and HERV-L/MLT2. The majority of the SNVs in these 788 HERV elements are located in three DNA functional groups: long non-coding RNAs (lncRNAs) (60%), introns (22.2%) and transcription factor binding sites (TFBS) (14.8%). This study provides a list of mutational hotspots in HERVs, which could potentially be used as biomarkers and therapeutic targets.
Project description:To understand the mutational burden of human induced pluripotent stem cells (iPSCs), we sequenced genomes of 18 fibroblast-derived iPSC lines and identified different classes of somatic mutations based on structure, origin, and frequency. Copy-number alterations affected 295 kb in each sample and strongly impacted gene expression. UV-damage mutations were present in ~45% of the iPSCs and accounted for most of the observed heterogeneity in mutation rates across lines. Subclonal mutations (not present in all iPSCs within a line) composed 10% of point mutations and, compared with clonal variants, showed an enrichment in active promoters and increased association with altered gene expression. Our study shows that, by combining WGS, transcriptome, and epigenome data, we can understand the mutational burden of each iPSC line on an individual basis and suggests that this information could be used to prioritize iPSC lines for models of specific human diseases and/or transplantation therapy.
Project description:The generation of personalized induced pluripotent stem cells (iPSCs) followed by targeted genome editing provides an opportunity for developing customized effective cellular therapies for genetic disorders. However, it is critical to ascertain whether edited iPSCs harbor unfavorable genomic variations before their clinical application. To examine the mutation status of the edited iPSC genome and trace the origin of possible mutations at different steps, we have generated virus-free iPSCs from amniotic cells carrying homozygous point mutations in β-hemoglobin gene (HBB) that cause severe β-thalassemia (β-Thal), corrected the mutations in both HBB alleles by zinc finger nuclease-aided gene targeting, and obtained the final HBB gene-corrected iPSCs by excising the exogenous drug resistance gene with Cre recombinase. Through comparative genomic hybridization and whole-exome sequencing, we uncovered seven copy number variations, five small insertions/deletions, and 64 single nucleotide variations (SNVs) in β-Thal iPSCs before the gene targeting step and found a single small copy number variation, 19 insertions/deletions, and 340 single nucleotide variations in the final gene-corrected β-Thal iPSCs. Our data revealed that substantial but different genomic variations occurred at factor-induced somatic cell reprogramming and zinc finger nuclease-aided gene targeting steps, suggesting that stringent genomic monitoring and selection are needed both at the time of iPSC derivation and after gene targeting.
Project description:Numerous non-inherited somatic mutations, distinct from those of germ-line origin, occur in somatic cells during DNA replication at each cell division. These somatic mutations record the unique genetic cell-lineage 'history' of each proliferating normal cell; they are important but remain under-investigated because their ultra-low frequency is hidden in the genetic background of heterogeneous cells. The recent development of single-cell genomics technologies enables the screening and collection of somatic mutations, especially single nucleotide variations (SNVs), occurring in normal cells. Here, we established DSMNC: a database of somatic mutations in normal cells (http://dsmnc.big.ac.cn/), which provides the most comprehensive catalogue of somatic SNVs in single cells from various normal tissues. In the current version, the database contains ~0.8 million SNVs accumulated in ~600 single normal cells (579 human cells and 39 mouse cells). The database interface supports user-friendly browsing and searching of the SNVs and their annotation information. DSMNC, which serves as a timely and valuable collection of somatic mutations in individual normal cells, has made it possible to analyze the burdens and signatures of somatic mutations in various types of heterogeneous normal cells. Therefore, DSMNC will significantly improve our understanding of the characteristics of somatic mutations in normal cells.