Estimating the human mutation rate using autozygosity in a founder population.
ABSTRACT: Knowledge of the rate and pattern of new mutation is critical to the understanding of human disease and evolution. We used extensive autozygosity in a genealogically well-defined population of Hutterites to estimate the human sequence mutation rate over multiple generations. We sequenced whole genomes from 5 parent-offspring trios and identified 44 segments of autozygosity. Using the number of meioses separating each pair of autozygous alleles and the 72 validated heterozygous single-nucleotide variants (SNVs) from 512 Mb of autozygous DNA, we obtained an SNV mutation rate of 1.20 × 10^-8 (95% confidence interval 0.89-1.43 × 10^-8) mutations per base pair per generation. The mutation rate for bases within CpG dinucleotides (9.72 × 10^-8) was 9.5-fold that of non-CpG bases, and there was strong evidence (P = 2.67 × 10^-4) for a paternal bias in the origin of new mutations (85% paternal). We observed a non-uniform distribution of heterozygous SNVs (both newly identified and known) in the autozygous segments (P = 0.001), which is suggestive of mutational hotspots or sites of long-range gene conversion.
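The estimator described in this abstract can be read as the number of heterozygous SNVs found in autozygous DNA divided by the total opportunity for mutation, that is, segment length times the number of meioses separating the two alleles, summed over segments. The Python sketch below works under that reading only. The function name and the two example segments are hypothetical, and the published estimate may include corrections (for example for callable sites or detection sensitivity) that are not modeled here.

```python
def autozygosity_mutation_rate(n_mutations, segments):
    """Per-base-pair, per-generation rate from autozygous segments.

    segments: iterable of (length_bp, n_meioses) pairs, where n_meioses is
    the number of meioses separating the two autozygous alleles.
    This is a simplified illustration, not the authors' exact procedure.
    """
    # Total "base-pair meioses" in which a new mutation could have arisen.
    bp_meioses = sum(length_bp * n_meioses for length_bp, n_meioses in segments)
    if bp_meioses <= 0:
        raise ValueError("segments must cover a positive number of bp-meioses")
    return n_mutations / bp_meioses


# Hypothetical input: two segments of 10 Mb and 5 Mb, separated by
# 10 and 12 meioses, carrying 2 validated heterozygous SNVs in total.
example_segments = [(10_000_000, 10), (5_000_000, 12)]
example_rate = autozygosity_mutation_rate(2, example_segments)
print(f"{example_rate:.3e} mutations per bp per generation")
```

Weighting each segment by its own meiosis count matters because segments inherited through deeper pedigree loops have had more generations in which to accumulate mutations.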
Project description:Inbreeding depression refers to lower fitness among offspring of genetic relatives. This reduced fitness is caused by the inheritance of two identical chromosomal segments (autozygosity) across the genome, which may expose the effects of (partially) recessive deleterious mutations. Even among outbred populations, autozygosity can occur to varying degrees due to cryptic relatedness between parents. Using dense genome-wide single-nucleotide polymorphism (SNP) data, we examined the degree to which autozygosity was associated with measured cognitive ability in an unselected sample of 4854 participants of European ancestry. We used runs of homozygosity (multiple homozygous SNPs in a row) to estimate autozygous tracts across the genome. We found that increased levels of autozygosity predicted lower general cognitive ability, and estimated a drop of 0.6 s.d. among the offspring of first cousins (P=0.003-0.02 depending on the model). This effect came predominantly from long and rare autozygous tracts, which theory predicts as more likely to be deleterious than short and common tracts. Association mapping of autozygous tracts did not reveal any specific regions that were predictive beyond chance after correcting for multiple testing genome wide. The observed effect size is consistent with studies of cognitive decline among offspring of known consanguineous relationships. These findings suggest a role for multiple recessive or partially recessive alleles in general cognitive ability, and that alleles decreasing general cognitive ability have been selected against over evolutionary time.
Project description:Autozygosity occurs when two chromosomal segments that are identical from a common ancestor are inherited from each parent. This occurs at high rates in the offspring of mates who are closely related (inbreeding), but also occurs at lower levels among the offspring of distantly related mates. Here, we use runs of homozygosity in genome-wide SNP data to estimate the proportion of the autosome that exists in autozygous tracts in 9,388 cases with schizophrenia and 12,456 controls. We estimate that the odds of schizophrenia increase by ~17% for every 1% increase in genome-wide autozygosity. This association is not due to one or a few regions, but results from many autozygous segments spread throughout the genome, and is consistent with a role for multiple recessive or partially recessive alleles in the etiology of schizophrenia. Such a bias towards recessivity suggests that alleles that increase the risk of schizophrenia have been selected against over evolutionary time.
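If the ~17% figure is read as an odds ratio of about 1.17 per percentage point of genome-wide autozygosity from a logistic model (an assumption; the abstract does not name the model), the odds ratio for a larger difference compounds multiplicatively rather than additively. The short Python sketch below illustrates that arithmetic; the function name and default value are illustrative.

```python
def odds_ratio_for_difference(delta_percent, or_per_percent=1.17):
    """Odds ratio implied by a difference of `delta_percent` percentage
    points in autozygosity, assuming a constant per-point odds ratio
    (log-linear in the odds). The 1.17 default is the abstract's ~17%."""
    return or_per_percent ** delta_percent


# A 2-point difference compounds to 1.17 * 1.17, not 1 + 2 * 0.17.
two_point_or = odds_ratio_for_difference(2)
print(round(two_point_or, 4))
```

Extrapolating this far beyond the range of autozygosity observed in the sample is not supported by the abstract itself.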
Project description:A central aim for studying runs of homozygosity (ROHs) in genome-wide SNP data is to detect the effects of autozygosity (stretches of the two homologous chromosomes within the same individual that are identical by descent) on phenotypes. However, it is unknown which current ROH detection program, and which set of parameters within a given program, is optimal for differentiating ROHs that are truly autozygous from ROHs that are homozygous at the marker level but vary at unmeasured variants between the markers. We simulated 120 Mb of sequence data in order to know the true state of autozygosity. We then extracted common variants from this sequence to mimic the properties of SNP platforms and performed ROH analyses using three popular ROH detection programs, PLINK, GERMLINE, and BEAGLE. We varied detection thresholds for each program (e.g., prior probabilities, lengths of ROHs) to understand their effects on detecting known autozygosity. Within the optimal thresholds for each program, PLINK outperformed GERMLINE and BEAGLE in detecting autozygosity from distant common ancestors. PLINK's sliding window algorithm worked best when using SNP data pruned for linkage disequilibrium (LD). Our results provide both general and specific recommendations for maximizing autozygosity detection in genome-wide SNP data, and should apply equally well to research on whole-genome autozygosity burden or to research on whether specific autozygous regions are predictive using association mapping methods.
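To make the role of detection thresholds concrete, the Python sketch below calls ROHs on one chromosome as maximal runs of consecutive homozygous markers that pass a minimum SNP count and a minimum physical length. It is a deliberately simple scan and not the sliding-window algorithm used by PLINK, nor the methods of GERMLINE or BEAGLE. The function name, genotype coding (0/1/2 alternate-allele counts, with 1 meaning heterozygous) and default thresholds are assumptions, and missing genotypes or allowed heterozygous calls are not handled.

```python
def find_roh(positions, genotypes, min_snps=50, min_length_bp=1_000_000):
    """Return (start_bp, end_bp, n_snps) for each run of homozygous markers
    that meets both thresholds. `positions` must be sorted ascending and
    aligned with `genotypes` (0 or 2 = homozygous, 1 = heterozygous)."""
    runs = []
    start = None
    # A sentinel heterozygote closes a run that extends to the last marker.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1:
            if start is None:
                start = i  # open a new homozygous run
            continue
        if start is not None:
            end = i - 1
            n_snps = end - start + 1
            length_bp = positions[end] - positions[start]
            if n_snps >= min_snps and length_bp >= min_length_bp:
                runs.append((positions[start], positions[end], n_snps))
            start = None
    return runs


# Toy data with small thresholds so the behaviour is easy to follow.
toy_positions = [100, 200, 300, 400, 500, 600, 700]
toy_genotypes = [0, 0, 2, 1, 2, 2, 0]
toy_runs = find_roh(toy_positions, toy_genotypes, min_snps=3, min_length_bp=150)
print(toy_runs)
```

Raising `min_snps` or `min_length_bp` trades sensitivity for specificity: short runs are more likely to be homozygous at the markers by chance without being truly autozygous.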
Project description:Heterozygous mutations within homozygous sequences descended from a recent common ancestor offer a way to ascertain de novo mutations across multiple generations. Using exome sequences from 3222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10^-8 per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10^-6 per base pair per generation. This is at the lower end of exome mutation rates previously estimated in parent-offspring trios, suggesting that post-zygotic mutations contribute little to the human germ-line mutation rate. We find frequent recurrence of mutations at polymorphic CpG sites, and an increase in C to T mutations in a 5' CCG 3' to 5' CTG 3' context in the Pakistani population compared to Europeans, suggesting that mutational processes have evolved rapidly between human populations. Estimates of human mutation rates differ substantially based on the approach. Here, the authors present a multi-generational estimate from the autozygous segment in a non-European population that gives insight into the contribution of post-zygotic mutations and population-specific mutational processes.
Project description:Hydrallantois is the excessive accumulation of fluid within the allantoic cavity in pregnant animals and is associated with fetal mortality. Although the incidence of hydrallantois is very low in artificial insemination breeding programs in cattle, recently 38 cows with the phenotypic appearance of hydrallantois were reported in a local subpopulation of Japanese Black cattle. Of these, 33 were traced back to the same sire; however, both their parents were reported healthy, suggesting that hydrallantois is a recessive inherited disorder. To identify autozygous chromosome segments shared by individuals with hydrallantois and the causative mutation in Japanese Black cattle, we performed autozygosity mapping using single-nucleotide polymorphism (SNP) array and exome sequencing. Shared haplotypes of the affected fetuses spanned 3.52 Mb on bovine chromosome 10. Exome sequencing identified a SNP (g.62382825G>A, p.Pro372Leu) in exon 10 of solute carrier family 12, member 1 (SLC12A1), the genotype of which was compatible with recessive inheritance. SLC12A1 serves as a reabsorption molecule of Na(+)-K(+)-2Cl(-) in the apical membrane of the thick ascending limb of the loop of Henle in the kidney. We observed that the concentration of Na(+)-Cl(-) increased in allantoic fluid of homozygous SLC12A1 (g.62382825G>A) in a hydrallantois individual. In addition, SLC12A1-positive signals were localized at the apical membrane in the kidneys of unaffected fetuses, whereas they were absent from the apical membrane in the kidneys of affected fetuses. These results suggested that p.Pro372Leu affects the membrane localization of SLC12A1, and in turn, may impair its transporter activity. Surveillance of the risk-allele frequency revealed that the carriers were restricted to the local subpopulation of Japanese Black cattle.
Moreover, we identified a founder individual that carried the mutation (g.62382825G>A). In this study, we mapped the shared haplotypes of affected fetuses using autozygosity mapping and identified a de novo mutation in the SLC12A1 gene that was associated with hydrallantois in Japanese Black cattle. In kidneys of hydrallantois-affected fetuses, the mutation in SLC12A1 impaired the apical membrane localization of SLC12A1 and reabsorption of Na(+)-K(+)-2Cl(-) in the thick ascending limb of the loop of Henle, leading to a defect in the concentration of urine via the countercurrent mechanism. Consequently, the affected fetuses exhibited polyuria that accumulated in the allantoic cavity. Surveillance of the risk-allele frequency indicated that carriers were not widespread throughout the Japanese Black cattle population. Moreover, we identified the founder individual, and thus could effectively manage the disorder in the population.
Project description:Runs of homozygosity (ROH) are continuous homozygous segments of the DNA sequence. They have been applied to quantify individual autozygosity and used as a potential inbreeding measure in livestock species. The aim of the present study was (i) to investigate genome-wide autozygosity to identify and characterize ROH patterns in Gyr dairy cattle genome; (ii) identify ROH islands for gene content and enrichment in segments shared by more than 50% of the samples, and (iii) compare estimates of molecular inbreeding calculated from ROH (FROH), genomic relationship matrix approach (FGRM) and based on the observed versus expected number of homozygous genotypes (FHOM), and from pedigree-based coefficient (FPED). ROH were identified in all animals, with an average number of 55.12 ± 10.37 segments and a mean length of 3.17 Mb. Short segments (ROH of 1-2 Mb) were abundant through the genomes, which accounted for 60% of all segments identified, even though the proportion of the genome covered by them was relatively small. The findings obtained in this study suggest that on average 7.01% (175.28 Mb) of the genome of this population is autozygous. Overlapping ROH were evident across the genomes and 14 regions were identified with ROH frequencies exceeding 50% of the whole population. Genes associated with lactation (TRAPPC9), milk yield and composition (IRS2 and ANG), and heat adaptation (HSF1, HSPB1, and HSPE1), were identified. Inbreeding coefficients were estimated through the application of FROH, FGRM, FHOM, and FPED approaches. FPED estimates ranged from 0.00 to 0.327 and FROH from 0.001 to 0.201. Low to moderate correlations were observed between FPED-FROH and FGRM-FROH, with values ranging from -0.11 to 0.51. Low to high correlations were observed between FROH-FHOM and moderate between FPED-FHOM and FGRM-FHOM.
Correlations between FROH from different lengths and FPED gradually increased with ROH length. Genes inside ROH islands suggest a strong selection for dairy traits and enrichment for Gyr cattle environmental adaptation. Furthermore, low FPED-FROH correlations for small segments indicate that FPED estimates are not the most suitable method to capture ancient inbreeding. The existence of a moderate correlation between larger ROH indicates that FROH can be used as an alternative to inbreeding estimates in the absence of pedigree records.
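FROH is conventionally the summed length of an animal's ROH divided by the length of the genome covered by the markers. The Python sketch below illustrates that ratio. The abstract does not state the genome length used, so the value here is back-calculated from its own figures (175.28 Mb corresponding to 7.01%) and should be treated as an assumption, as should the function name and the example segment lengths.

```python
def f_roh(roh_lengths_mb, covered_genome_mb):
    """ROH-based inbreeding coefficient: total ROH length divided by the
    length of the genome covered by markers (both in Mb)."""
    if covered_genome_mb <= 0:
        raise ValueError("covered_genome_mb must be positive")
    return sum(roh_lengths_mb) / covered_genome_mb


# Assumption: genome length implied by "7.01% (175.28 Mb)", about 2500 Mb.
IMPLIED_GENOME_MB = 175.28 / 0.0701

# Hypothetical animal whose ROH sum to the reported population average.
average_animal = f_roh([100.0, 75.28], IMPLIED_GENOME_MB)
print(round(average_animal, 4))
```

Restricting `roh_lengths_mb` to segments above a length cutoff gives the length-class versions of FROH that the abstract correlates with FPED.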
Project description:Mutation of the DNA molecule is one of the most fundamental processes in biology. In this study, we use 283 parent-offspring trios to estimate the rate of mutation for both single nucleotide variants (SNVs) and short length variants (indels) in humans and examine the mutation process. We found 17812 SNVs, corresponding to a mutation rate of 1.29 × 10^-8 per position per generation (PPPG) and 1282 indels corresponding to a rate of 9.29 × 10^-10 PPPG. We estimate that around 3% of human de novo SNVs are part of a multi-nucleotide mutation (MNM), with 558 (3.1%) of mutations positioned less than 20 kb from another mutation in the same individual (median distance of 525 bp). The rate of de novo mutations is greater in late replicating regions (p = 8.29 × 10^-19) and nearer recombination events (p = 0.0038) than elsewhere in the genome.
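A trio-based rate of this kind is commonly computed as the number of de novo mutations divided by twice the number of callable positions per offspring (one maternal and one paternal haploid genome) summed over trios. The Python sketch below uses that simplified form with a single shared callable-genome size, which is an assumption: the abstract does not report the denominator, so it is back-calculated here from the SNV count and rate, and the function name is illustrative.

```python
def trio_mutation_rate(n_de_novo, n_trios, callable_bp):
    """De novo mutations per position per generation, assuming every trio
    has the same number of callable positions (a simplification)."""
    if n_trios <= 0 or callable_bp <= 0:
        raise ValueError("n_trios and callable_bp must be positive")
    # Factor 2: each offspring carries two transmitted haploid genomes.
    return n_de_novo / (2 * n_trios * callable_bp)


# Assumption: callable size implied by 17812 SNVs at 1.29e-8 in 283 trios.
implied_callable_bp = 17812 / (2 * 283 * 1.29e-8)

# Consistency check: the same denominator applied to the 1282 indels
# should land close to the reported indel rate.
indel_rate = trio_mutation_rate(1282, 283, implied_callable_bp)
print(f"{indel_rate:.2e}")
```

With the same denominator, the indel rate is simply the SNV rate scaled by 1282/17812, which is why the two reported figures can be checked against each other.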
Project description:Alzheimer's disease (AD) is highly prevalent in Wadi Ara despite the low frequency of apolipoprotein E ε4 in this genetically isolated Arab community in northern Israel. We hypothesized that the reduced genetic variability in combination with increased homozygosity would facilitate identification of genetic variants that contribute to the high rate of AD in this community. AD cases (n = 124) and controls (n = 142) from Wadi Ara were genotyped for a genome-wide set of more than 300,000 single-nucleotide polymorphisms (SNPs), which were used to calculate measures of population stratification and inbreeding, and to identify regions of autozygosity. Although a high degree of relatedness was evident in both AD cases and controls, controls were significantly more related and contained more autozygous regions than AD cases (p = 0.004). Eight autozygous regions on seven different chromosomes were more frequent in controls than the AD cases, and 116 SNPs in these regions, primarily on chromosomes 2, 6, and 9, were nominally associated with AD. The association with rs3130283 in AGPAT1 on chromosome 6 was observed in a meta-analysis of seven genome-wide association study (GWAS) datasets. Analysis of the full Wadi Ara GWAS dataset revealed 220 SNP associations with AD at p ? 10??, and seven of these were confirmed in the replication GWAS datasets (p < 0.05). The unique population structure of Wadi Ara enhanced efforts to identify genetic variants that might partially explain the high prevalence of AD in the region. Several of these variants show modest evidence for association in other Caucasian populations.
Project description:FILTUS is a stand-alone tool for working with annotated variant files, e.g. when searching for variants causing Mendelian disease. Very flexible in terms of input file formats, FILTUS offers efficient filtering and a range of downstream utilities, including statistical analysis of gene sharing patterns, detection of de novo mutations in trios, quality control plots and autozygosity mapping. The autozygosity mapping is based on a hidden Markov model and enables accurate detection of autozygous regions directly from exome-scale variant files. FILTUS is written in Python and runs on Windows, Mac and Linux. Binaries and source code are freely available at http://folk.uio.no/magnusv/filtus.html and on GitHub (https://github.com/magnusdv/filtus). Automatic installation is available via PyPI (e.g. pip install filtus). Supplementary data are available at Bioinformatics online.
Project description:Genome-wide runs of homozygosity (ROH) are suitable for understanding population history, calculating genomic inbreeding, deciphering genetic architecture of complex traits and diseases as well as identifying genes linked with agro-economic traits. Autozygosity and ROH islands, genomic regions with elevated ROH frequencies, were characterized in 112 animals of seven Indian native cattle breeds (B. indicus) using the BovineHD BeadChip. In total, 4138 ROH were detected. The average number of ROH per animal was maximum in the draft breed Kangayam (63.62 ± 22.71) and minimum in the dairy breed Sahiwal (24.62 ± 11.03). The mean ROH length was maximum in Vechur (6.97 Mb) and minimum in Hariana (4.04 Mb). Kangayam revealed the highest ROH-based inbreeding (FROH > 1Mb = 0.113 ± 0.059), whereas Hariana (FROH > 1Mb = 0.042 ± 0.031) and Sahiwal (FROH > 1Mb = 0.043 ± 0.048) showed the lowest. The high standard deviation observed in each breed highlights a considerable variability in autozygosity. Out of the total autozygous segments observed in each breed except Vechur, > 80% were of short length (< 8 Mb) and contributed almost 50% of the genome proportion under ROH. However, in Vechur cattle, long ROH contributed 75% of the genome proportion under ROH. ROH patterns revealed Hariana and Sahiwal breeds as less consanguineous, while recent inbreeding was apparent in Vechur. Maximum autozygosity observed in Kangayam is attributable to both recent and ancient inbreeding. The ROH islands harboured a higher proportion of QTLs for production traits (20.68% vs. 14.64%; P? 0.05) but a lower proportion for reproductive traits (11.49% vs. 15.76%; P? 0.05) in dairy breeds compared to the draft breed. In draft cattle, genes associated with disease resistance/higher immunity (LYZL1, SVIL, and GPX4) and stress tolerance (CCT4) were identified in ROH islands, while in dairy breeds, genes for milk production (PTGFR, CSN1S1, CSN2, CSN1S2, and CSN3) were identified.
A significant difference in ROH islands between large and short statured breeds was observed at chromosomes 3 and 5, involving genes like PTGFR and HMGA2 responsible for milk production and stature, respectively. Principal component analysis (PCA) on consensus ROH regions revealed distinct clustering of dairy, draft and short stature cattle breeds.