Project description:The body of human genomic and proteomic evidence continues to grow at an ever-increasing rate, while annotation efforts struggle to keep pace. A surprisingly small fraction of human genes have clear, documented associations with specific functions, and new functions continue to be found for characterized genes. Here we assembled an integrated collection of diverse genomic and proteomic data for 21,341 human genes and made quantitative associations of each to 4333 Gene Ontology terms. We combined guilt-by-profiling and guilt-by-association approaches to exploit features unique to the data types. Performance was evaluated by cross-validation, prospective validation, and manual evaluation against the biological literature. Functional-linkage networks (FLNs) were also constructed, and their utility was demonstrated by identifying candidate genes related to a glioma FLN using a seed network from genome-wide association studies. Our annotations are presented, alongside existing validated annotations, in a publicly accessible and searchable web interface.
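As an illustration of the guilt-by-association idea mentioned above, a gene can be scored for a term by the weighted share of its network neighbors that already carry that term. The sketch below is a generic neighbor-voting scheme, not the authors' method; the function name, the edge weights, and the gene labels `G1` to `G4` are hypothetical.

```python
from typing import Dict, Set, Tuple


def neighbor_vote_scores(
    edges: Dict[Tuple[str, str], float],
    annotated: Set[str],
) -> Dict[str, float]:
    """Score each gene by the weighted fraction of its neighbors that carry a term.

    edges maps an undirected gene pair to a linkage weight (hypothetical values).
    annotated is the set of genes already associated with the term.
    """
    total: Dict[str, float] = {}
    hit: Dict[str, float] = {}
    for (a, b), weight in edges.items():
        # Each undirected edge contributes to both endpoints.
        for gene, neighbor in ((a, b), (b, a)):
            total[gene] = total.get(gene, 0.0) + weight
            if neighbor in annotated:
                hit[gene] = hit.get(gene, 0.0) + weight
    return {g: hit.get(g, 0.0) / t for g, t in total.items() if t > 0}


# Toy network: G1 and G3 are annotated; G2 and G4 are candidates.
toy_edges = {("G1", "G2"): 1.0, ("G2", "G3"): 1.0, ("G3", "G4"): 2.0}
scores = neighbor_vote_scores(toy_edges, {"G1", "G3"})
```

In this toy network every neighbor of `G2` and `G4` is annotated, so both receive the maximum score, while `G1` and `G3` have no annotated neighbors and score zero.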
Project description:The southern African indigenous Khoe-San populations harbor the most divergent lineages of all living peoples. Exploring their genomes is key to understanding deep human history. We sequenced 25 full genomes from five Khoe-San populations, revealing many novel variants and showing that 25% of variants are unique to the Khoe-San and that the Khoe-San group harbors the greatest level of diversity across the globe. In line with previous studies, we found several gene regions with extreme values in genome-wide scans for selection, potentially caused by natural selection in the lineage leading to Homo sapiens and more recently. These gene regions included immunity-, sperm-, brain-, diet-, and muscle-related genes. When accounting for recent admixture, all Khoe-San groups display genetic diversity approaching the levels in other African groups and a reduction in effective population size starting around 100,000 years ago. Hence, all human groups show a reduction in effective population size commencing around the time of the Out-of-Africa migrations, which coincides with changes in the paleoclimate records, changes that potentially impacted all humans at the time.
Project description:Copy-number-variable (CNV) loci differ from single nucleotide polymorphic (SNP) sites in size, mutation rate, and mechanisms of maintenance in natural populations. It is therefore hypothesized that population genetic divergence at CNV loci will differ from that found at SNP sites. Here, we test this hypothesis by analysing 856 CNV loci from the genomes of 1184 healthy individuals from 11 HapMap populations with a wide range of ancestry. The results show that population genetic divergence at the CNV loci is generally more than three times lower than at genome-wide SNP sites. Populations generally exhibit very small genetic divergence (Gst = 0.05 ± 0.049). The smallest divergence is among African populations (Gst = 0.0081 ± 0.0025), with increased divergence among non-African populations (Gst = 0.0217 ± 0.0109) and then among African and non-African populations (Gst = 0.0324 ± 0.0064). Genetic diversity is high in African populations (~0.13), low in Asian populations (~0.11), and intermediate in the remaining 11 populations. Few significant linkage disequilibria (LDs) occur between the genome-wide CNV loci. Patterns of gametic and zygotic LDs indicate the absence of epistasis among CNV loci. Mutation rate is about twice as large as the migration rate in the non-African populations, suggesting that the high mutation rates play dominant roles in producing the low population genetic divergence at CNV loci.
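For readers unfamiliar with the divergence statistic reported above, Nei's Gst compares the expected heterozygosity of the pooled populations (Ht) with the mean within-population heterozygosity (Hs), as Gst = (Ht - Hs) / Ht. The sketch below computes it for a single locus under the simplifying assumption of equal population weights. It is not the authors' pipeline, and the function names and toy allele frequencies are hypothetical.

```python
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def gst(pop_allele_freqs: Sequence[Sequence[float]]) -> float:
    """Nei's Gst = (Ht - Hs) / Ht for one locus, weighting populations equally.

    pop_allele_freqs holds one allele-frequency vector per population;
    all vectors must list the same alleles in the same order.
    """
    n_pops = len(pop_allele_freqs)
    n_alleles = len(pop_allele_freqs[0])
    # Hs: mean within-population expected heterozygosity.
    hs = sum(expected_heterozygosity(f) for f in pop_allele_freqs) / n_pops
    # Ht: expected heterozygosity of the averaged allele frequencies.
    mean_freqs = [
        sum(f[a] for f in pop_allele_freqs) / n_pops for a in range(n_alleles)
    ]
    ht = expected_heterozygosity(mean_freqs)
    if ht == 0.0:
        return 0.0  # locus is monomorphic everywhere; no divergence to measure
    return (ht - hs) / ht


# Toy example (made-up frequencies): two populations, two alleles.
toy_gst = gst([[0.6, 0.4], [0.4, 0.6]])
```

Identical frequencies across populations give Gst = 0, and populations fixed for different alleles give Gst = 1, so values such as those reported above sit close to the "no divergence" end of the scale.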
Project description:The continuous characterization of genome-wide diversity in population and case-cohort samples, allied to the development of new algorithms, is shedding light on the impact of host ancestry and selection events on various infectious diseases. Especially interesting are the long-standing associations between humans and certain bacteria, such as Helicobacter pylori, which could have been strong drivers of adaptation leading to coevolution. Some evidence from admixed gastric cancer cohorts has been suggested as supporting Homo-Helicobacter coevolution, but reliable experimental data that control both the bacterium and the host ancestries are lacking. Here, we conducted the first in vitro coinfection assays with dual human- and bacterium-matched and -mismatched ancestries, in African and European backgrounds, to evaluate the genome-wide gene expression host response to H. pylori. Our results showed that: (1) the host response to H. pylori infection was greatly shaped by the human ancestry, with variability in the innate immune system and metabolism; (2) African human ancestry showed signs of coevolution with H. pylori while European ancestry appeared to be maladapted; and (3) mismatched ancestry did not seem to be an important differentiator of gene expression at the initial stages of infection as assayed here.
Project description:Understanding the timing and character of the expansion of Homo sapiens out of Africa is critical for inferring the colonization and admixture processes that underpin global population history. It has been argued that dispersal out of Africa had an early phase, particularly ~130-90 thousand years ago (ka), that reached only the East Mediterranean Levant, and a later phase, ~60-50 ka, that extended across the diverse environments of Eurasia to Sahul. However, recent findings from East Asia and Sahul challenge this model. Here we show that H. sapiens was in the Arabian Peninsula before 85 ka. We describe the Al Wusta-1 (AW-1) intermediate phalanx from the site of Al Wusta in the Nefud desert, Saudi Arabia. AW-1 is the oldest directly dated fossil of our species outside Africa and the Levant. The palaeoenvironmental context of Al Wusta demonstrates that H. sapiens using Middle Palaeolithic stone tools dispersed into Arabia during a phase of increased precipitation driven by orbital forcing, in association with a primarily African fauna. A Bayesian model incorporating independent chronometric age estimates indicates a chronology for Al Wusta of ~95-86 ka, which we correlate with a humid episode in the later part of Marine Isotope Stage 5 known from various regional records. Al Wusta shows that early dispersals were more spatially and temporally extensive than previously thought. Early H. sapiens dispersals out of Africa were not limited to winter rainfall-fed Levantine Mediterranean woodlands immediately adjacent to Africa, but extended deep into the semi-arid grasslands of Arabia, facilitated by periods of enhanced monsoonal rainfall.
Project description:Background: Protein-coding regions in a genome evolve by sequence divergence and gene gain and loss, altering the gene content of the organism. However, it is not well understood how this has given rise to the enormous diversity of metazoa present today. Results: To obtain a global view of human genomic evolution, we quantify the divergence of proteins by functional category at different evolutionary distances from human. Conclusion: This analysis highlights some general systems-level characteristics of human evolution: regulatory processes, such as signal transducers, transcription factors and receptors, have a high degree of plasticity, while core processes, such as metabolism, transport and protein synthesis, are largely conserved. Additionally, this study reveals a dynamic picture of selective forces at short, medium and long evolutionary timescales. Certain functional categories, such as 'development' and 'organogenesis', exhibit temporal patterns of sequence divergence in eukaryotes relative to human. This framework for a grammar of human evolution supports previously postulated theories of robustness and evolvability.