Project description:The Genetic Association Information Network (GAIN) Data Access Committee was established in June 2007 to provide prompt and fair access to data from six genome-wide association studies through the database of Genotypes and Phenotypes (dbGaP). Of 945 project requests received through 2011, 749 (79%) have been approved; median receipt-to-approval time decreased from 14 days in 2007 to 8 days in 2011. Over half (54%) of the proposed research uses were for GAIN-specific phenotypes; other uses were for method development (26%) and adding controls to other studies (17%). Eight data-management incidents, defined as compromises of any of the data-use conditions, occurred among nine approved users; most were procedural violations, and none violated participant confidentiality. Over 5 years of experience with GAIN data access has demonstrated substantial use of GAIN data by investigators from academic, nonprofit, and for-profit institutions with relatively few and contained policy violations. The availability of GAIN data has allowed for advances in both the understanding of the genetic underpinnings of mental-health disorders, diabetes, and psoriasis and the development and refinement of statistical methods for identifying genetic and environmental factors related to complex common diseases.
Project description:The epigenome is the dynamic interface between our changing environment and the static genome, and understanding it is a goal of immense importance to human health. We will map reference cell epigenomes of the brain, breast, blood and approved embryonic stem cells, inclusive of males and females and different racial groups. This cooperative work will transform our understanding of the short and long-lasting consequences of environmental impact on human health and disease. We are working cooperatively with other Mapping Centers and the Data Coordination Center (EDACC) to comprehensively map epigenomes of select human cells with significant relevance to complex human disease. Our group, consisting of scientists at UCSF, UC Davis, UCSC and the British Columbia Genome Sciences Centre, will focus on cells relevant to human health and complex disease, including cells from the blood, brain, breast and U.S. Government-approved lines of human embryonic stem cells. We will incorporate high quality, homogeneous cells from males and females, and two predominant racial groups, and biological replicates of each cell type. Production of comprehensive maps will include 6 histone modifications selected for their opposing roles in regulating active and inactive chromatin, DNA methylation and miRNA and gene expression. These epigenetic data, along with genetic and expression data, will be integrated using advanced informatics to address fundamental roles of epigenetics in differentiation, maintenance of cell-type identity and gene expression. Our cell and data production pipeline will incorporate verification and data validation with independent methods, and will operate under a model motivated by increased data production and decreased cost. We summarize the analysis capacity of our instruments and our explicit strategy for data sharing of our proposed REMC-generated resources including biological specimens, protocols, data, software tools and intellectual resources.
We envision that our group, in conjunction with the other REMC teams, the EDACC, ENCODE, future EHHD (Epigenetics of Human Health and Disease) centers and the NIH Roadmap program, will develop methods, tools and reference epigenome maps for the research community that will make the promise of epigenetics in understanding and treating complex human disease a reality. Our reference epigenomes will enable new disciplines including human population epigenetics, comparative epigenomics, neuroepigenetics, and therapeutic epigenetics for tissue regeneration and reversal of disease. Contributor: BCCA Genome Sciences Centre **************** For data usage terms and conditions, please refer to: http://www.drugabuse.gov/funding/funding-opportunities/nih-common-fund/epigenomics-data-access-policies ****************
Project description:Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of few resources that recruits its participants under open consent and makes the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures which were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics.
Project description:Brain tumors are the most common solid tumors of childhood, and the genetic drivers and optimal therapeutic strategies for many of the different subtypes remain unknown. We performed targeted next-generation sequencing of approximately 500 cancer-associated genes on a cohort of 13 pediatric bithalamic diffuse gliomas, a lethal brain tumor of childhood for which the genetic basis is largely unknown. We identified that bithalamic diffuse gliomas harbor frequent mutations in the EGFR oncogene in the absence of accompanying gene amplification and only rare histone H3 mutation. These EGFR mutations were either small in-frame insertions within exon 20 (intracellular tyrosine kinase domain) or missense mutations within exon 7 (extracellular ligand-binding domain). Accompanying alterations included frequent TP53 mutation, CDK6 amplification or CDKN2C mutation, and BCOR and BCORL1 mutation or deletion.
Project description:Data access committees (DACs) are critical players in the data sharing ecosystem. DACs review requests for access to data held in one or more repositories where specific constraints determine how the data may be used and by whom. Our team surveyed DAC members affiliated with genomic data repositories worldwide to understand standard processes and procedures, operational metrics, bottlenecks, and efficiencies, as well as their perspectives on possible improvements to quality review. We found that DAC operations and systemic issues were common across repositories globally. In general, DAC members endeavored to achieve an appropriate balance of review efficiency, quality, and compliance. Our results suggest a similarly proportionate path forward that helps DACs pursue mutual improvements to efficiency and compliance without sacrificing review quality.
Project description:The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.
Project description:The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.
Project description:The Alzheimer's Disease Sequencing Project (ADSP) is a national initiative to understand the genetic architecture of Alzheimer's Disease and Related Dementias (AD/ADRD) by sequencing whole genomes of affected participants and age-matched cognitive controls from diverse populations. In this fourth release (R4), the Genome Center for Alzheimer's Disease (GCAD) processed whole-genome sequencing data from 36,361 ADSP participants across 17 cohorts in 14 countries, including 35,014 genetically unique participants, of whom 45% are of non-European ancestry. This sequencing effort identified 387 million bi-allelic variants, 42 million short insertions/deletions, and 2.2 million structural variants. Annotations and quality control data are available for all variants and samples. Detailed phenotypes from 15,927 participants across 10 domains are also provided. A linkage disequilibrium panel was created using unrelated AD cases and controls. Researchers can access and analyze the genetic data via the NIAGADS Data Sharing Service, the VariXam tool, or NIAGADS GenomicsDB.
Project description:Chordoid glioma is a rare brain tumor thought to arise from specialized glial cells of the lamina terminalis along the anterior wall of the third ventricle. Despite being histologically low-grade, chordoid gliomas are often associated with poor outcome, as their stereotypic location in the third ventricle makes resection challenging and efficacious adjuvant therapies have not been developed. Here we performed genomic profiling on 13 chordoid gliomas and identified a recurrent D463H missense mutation in PRKCA in all tumors, which localizes in the kinase domain of the encoded protein kinase C alpha (PKCα). Expression of mutant PRKCA in immortalized human astrocytes led to increased phospho-ERK and anchorage-independent growth that could be blocked by MEK inhibition. These studies define PRKCA as a recurrently mutated oncogene in human cancer and identify a potential therapeutic vulnerability in this uncommon brain tumor.