Project description:The allele frequency spectrum of polymorphisms in DNA sequences can be used to test for signatures of natural selection that depart from the expected frequency spectrum under the neutral theory. We observed a significant (P = 0.001) correlation between the Tajima's D test statistic in full resequencing data and Tajima's D in a dense, genome-wide data set of genotyped polymorphisms for a set of 179 genes. Based on this, we used a sliding window analysis of Tajima's D across the human genome to identify regions putatively subject to strong, recent, selective sweeps. This survey identified seven Contiguous Regions of Tajima's D Reduction (CRTRs) in an African-descent population (AD), 23 in a European-descent population (ED), and 29 in a Chinese-descent population (XD). Only four CRTRs overlapped between populations: three between ED and XD and one between AD and ED. Full resequencing of eight genes within six CRTRs demonstrated frequency spectra inconsistent with neutral expectations for at least one gene within each CRTR. Identification of the functional polymorphism (and/or haplotype) responsible for the selective sweeps within each CRTR may provide interesting insights into the strongest selective pressures experienced by the human genome over recent evolutionary history.
Project description:Peru hosts extremely diverse ecosystems which can be broadly classified into the following three major ecoregions: the Pacific desert coast, the Andean highlands, and the Amazon rainforest. Since its initial peopling approximately 12,000 years ago, the populations inhabiting such ecoregions might have differentially adapted to their contrasting environmental pressures. Previous studies have described several candidate genes underlying adaptation to hypobaric hypoxia among Andean highlanders. However, the adaptive genetic diversity of coastal and rainforest populations has been less studied. Here, we gathered genome-wide single-nucleotide polymorphism-array data from 286 Peruvians living across the three ecoregions and analyzed signals of recent positive selection through population differentiation and haplotype-based selection scans. Among highland populations, we identify candidate genes related to cardiovascular function (TLL1, DUSP27, TBX5, PLXNA4, SGCD), to the Hypoxia-Inducible Factor pathway (TGFA, APIP), to skin pigmentation (MITF), as well as to glucose (GLIS3) and glycogen metabolism (PPP1R3C, GANC). In contrast, most signatures of adaptation in coastal and rainforest populations comprise candidate genes related to the immune system (including SIGLEC8, TRIM21, CD44, and ICAM1 in the coast; CBLB and PRDM1 in the rainforest; and BRD2, HLA-DOA, HLA-DPA1 regions in both), possibly as a result of strong pathogen-driven selection. This study identifies candidate genes related to human adaptation to the diverse environments of South America.
Project description:The Genetic Association Information Network (GAIN) Data Access Committee was established in June 2007 to provide prompt and fair access to data from six genome-wide association studies through the database of Genotypes and Phenotypes (dbGaP). Of 945 project requests received through 2011, 749 (79%) have been approved; median receipt-to-approval time decreased from 14 days in 2007 to 8 days in 2011. Over half (54%) of the proposed research uses were for GAIN-specific phenotypes; other uses were for method development (26%) and adding controls to other studies (17%). Eight data-management incidents, defined as compromises of any of the data-use conditions, occurred among nine approved users; most were procedural violations, and none violated participant confidentiality. Over 5 years of experience with GAIN data access has demonstrated substantial use of GAIN data by investigators from academic, nonprofit, and for-profit institutions with relatively few and contained policy violations. The availability of GAIN data has allowed for advances in both the understanding of the genetic underpinnings of mental-health disorders, diabetes, and psoriasis and the development and refinement of statistical methods for identifying genetic and environmental factors related to complex common diseases.
Project description:In this study, genome-wide single-nucleotide polymorphism (SNP) data (Illumina Infinium® MEGA array) was gathered for 200 individuals distributed across Perú, and then analyzed to investigate signals of recent positive selection specific to populations living in the high-altitude environments of the Andes, the arid Pacific coast and the Amazon rainforest.
Project description:Data access committees (DACs) are critical players in the data sharing ecosystem. DACs review requests for access to data held in one or more repositories and where specific constraints determine how the data may be used and by whom. Our team surveyed DAC members affiliated with genomic data repositories worldwide to understand standard processes and procedures, operational metrics, bottlenecks, and efficiencies, as well as their perspectives on possible improvements to quality review. We found that DAC operations and systemic issues were common across repositories globally. In general, DAC members endeavored to achieve an appropriate balance of review efficiency, quality, and compliance. Our results suggest a similarly proportionate path forward that helps DACs pursue mutual improvements to efficiency and compliance without sacrificing review quality.