Project description:We generated 20 iPSC lines from erythroid precursors (EP) and compared them to 3 lines derived from dermal fibroblasts (DF), H9 hESC, four precursor EP populations and one precursor DF population. This experiment describes the RRBS profiling of those samples, performed to QC the lines. These human samples are not consented for release of identifiable sequencing data, so processed methylation calls over CpG islands are provided instead.
Project description:We identified 1.96 million small insertions and deletions (INDELs) in the genomes of 79 diverse humans. 10,003 of these INDELs were probed on a custom INDEL genotyping array.
Project description:Mutect (SNVs), Platypus (indels), and InfoGenomeR (SVs and CNAs) calls from whole genome sequencing data of five patients with pediatric medulloblastoma.
Project description:We performed a large-scale genome-wide characterisation of indels generated following editing with CRISPR/Cas9. We used pools of sgRNAs and performed targeted capture and sequencing of the edited regions in HepG2 cells.
Project description:Amino acid insertions and deletions (indels) are an abundant class of genetic variants. However, compared to substitutions, the effects of indels are not well understood and are poorly predicted. Here we address this shortcoming by performing deep indel mutagenesis (DIM) of structurally diverse proteins. Indel tolerance is strikingly different to substitution tolerance and varies extensively both between different proteins and within different regions of the same protein. Although state-of-the-art variant effect predictors perform poorly on indels, we show that both experimentally measured and computationally predicted substitution scores can be repurposed as good indel variant effect predictors by incorporating information on protein secondary structures. Quantifying the effects of indels on protein-protein interactions reveals that insertions can be an important class of gain-of-function variants. Our results provide an overview of the impact of indels on proteins and a method to predict their effects genome-wide.
Project description:This is the validation data for candidate de novo CNV calls made in the CEU HapMap by Itsara et al., Genome Research 2010. In this study, de novo CNV calls were initially made with Illumina 1M SNP arrays. Validation of CNV calls was performed with NimbleGen custom array CGH using the extended CEPH pedigrees. A truly de novo CNV would be unobserved in the first generation (CEU trio parents), validated in the second generation (CEU trio children), and, assuming no selective effects, transmitted to approximately half of the individuals in the third generation. We attempted validation of 4 de novo CNVs in 3 extended CEPH pedigrees: 1358, 1408, and 1459.
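The three-generation criterion in the description above can be sketched as a small check. This is only an illustrative sketch, not the pipeline used by Itsara et al.: the `PedigreeCalls` record, the function names, and the boolean "CNV observed" inputs are all hypothetical, and real array CGH validation would work from probe intensities rather than ready-made presence/absence calls.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PedigreeCalls:
    """Hypothetical presence/absence calls for one candidate CNV in one pedigree."""
    parents: List[bool]        # generation 1: CNV observed in each trio parent?
    child: bool                # generation 2: CNV validated in the trio child?
    grandchildren: List[bool]  # generation 3: CNV observed in each offspring of the child?


def classify_candidate(calls: PedigreeCalls) -> str:
    """Apply the criterion from the description: absent in generation 1, present in generation 2."""
    if any(calls.parents):
        # Observed in a parent, so the CNV was inherited rather than de novo.
        return "inherited"
    if not calls.child:
        # The original call in the child did not validate.
        return "not validated"
    return "consistent with de novo"


def transmission_fraction(calls: PedigreeCalls) -> Optional[float]:
    """Fraction of generation-3 individuals carrying the CNV (expected near 0.5 without selection)."""
    if not calls.grandchildren:
        return None
    return sum(calls.grandchildren) / len(calls.grandchildren)


# Made-up example values, not data from the study.
example = PedigreeCalls(
    parents=[False, False],
    child=True,
    grandchildren=[True, False, True, False],
)
print(classify_candidate(example), transmission_fraction(example))
```

With only a handful of third-generation individuals per pedigree, the observed fraction can deviate substantially from one half by chance, so it is better read as a sanity check than as a strict filter.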
Project description:Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalogue of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of seventeen individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "platinum" variant catalogue of 4.7 million single nucleotide variants (SNVs) plus 0.7 million small (1-50bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and eleven children of this pedigree. Platinum genotypes are highly concordant with the current catalogue of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%), and add a validated truth catalogue that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("non-platinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.