Project description:Predicting dairy bull fertility is a current challenge for the dairy industry. The goal of this study was to integrate DNA methylation data with previously published RNA sequencing results in order to identify candidate markers for sire fertility.
Project description:We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter-gene expression assay. As a demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. We found that, in agreement with previous reports, most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify likely causal regulatory haplotypes that contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine-map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses. 104 candidate regulatory elements from 95 individuals were resequenced using Illumina custom amplicon sequencing. We then cloned the resulting DNA fragments into a massively parallel reporter assay to quantify allele-specific regulatory activity from that population. SNP-fdr.txt contains the output of the significance evaluation. haplotype.fasta.gz contains the reference used to generate the alignment files.
Project description:This experiment was conducted to generate targeted resequencing data covering a region associated with osteosarcoma in greyhounds. 8 greyhounds diagnosed with osteosarcoma and 7 greyhounds without tumors were sequenced. DNA from the 15 dogs was used to prepare libraries, and hybrid capture was performed to enrich the region of interest prior to paired-end sequencing on an Illumina Genome Analyzer II. The reads were aligned to the dog genome assembly CanFam2.0 using bwa and pre-processed using Picard and GATK. Variant discovery was performed using GATK. The resulting list of variants was used in the study to fine-map the associated region and look for causal variants. We submit the preprocessed BAM files, which still contain all reads, although some reads are flagged. We also submit the resulting VCF file with called and filtered variants in all individuals.
Project description:Purpose: Screening for the sperm sncRNAs responsible for dairy cattle fertility is of great interest; however, fertility-associated sncRNAs in sperm have not yet been explored or linked with epigenetic inheritance in cattle. In this study, we hypothesized that some sncRNAs in bovine sperm could be linked with direct and immediate bull fertility data and could later influence the embryo, possibly affecting daughter fertility. Methods: Twelve cryopreserved bovine semen samples (high bull fertility, n=3 vs low bull fertility, n=3; high daughter fertility, n=3 vs low daughter fertility, n=3) from a pre-filtered list of 100 bulls (Figure 1) were selected for total sperm RNA extraction. Somatic cell lysis buffer was added during RNA extraction to avoid somatic cell contamination. Maternal and other confounding factors were taken into account when calculating the phenotype criteria index. After library construction, library fragments smaller than 200 base pairs (adapter size around 125 nt) were cut out and sent for next-generation sequencing. Results: Bull fertility-related and daughter fertility-related sncRNAs were identified. Conclusions: These sncRNAs are promising epigenetic biomarkers for future cattle fertility improvement, although they need to be validated in larger sample sizes before being used as biomarkers.
Project description:Transcriptome studies in patients with rare genetic diseases can potentially aid in the interpretation of likely causal genetic variation through identification of altered transcript abundance and/or structure. RNA-Seq is the most sensitive assay for investigating both transcript structure and abundance. The primary aim of this pilot project is to investigate to what degree integrating exome-Seq and RNA-Seq data on the same individual can accelerate the identification of causal alleles for rare genetic diseases. There are two main strands to this: (i) identifying which variants discovered in exome-Seq appear to have a functional impact on transcripts, and (ii) identifying transcript outliers, especially among known causal genes, that may not necessarily have a causal variant identified from exome sequencing. The latter may reveal causal variants that lie far from coding regions (e.g. the formation of cryptic splice sites deep within introns, or loss of long-range regulatory elements), which can be confirmed with further targeted genetic assays. Just over 50% of all disease-causing variants recorded in the Human Gene Mutation Database (HGMD) affect transcript structure and abundance (e.g. nonsense SNVs, essential splice site SNVs, frame-shifting indels, CNVs). This pilot project will study RNA from lymphoblastoid cell lines from 12 patients with primordial dwarfism syndromes; for 10 of these samples we have previously generated exome data as part of our collaboration with the group of Prof Andrew Jackson. The two remaining samples are positive controls where the causal mutation is known, and is known to affect transcript structure and/or abundance. Primordial dwarfism is a prime candidate for these RNA-Seq studies because all known causal mutations to date have key roles in DNA replication and thus, unsurprisingly, the products of the causal genes are typically ubiquitously expressed.
Each RNA will be sequenced with two technical replicates (independent RT-PCR and libraries) per sample, and each replicate will be run in 1/2 of a HiSeq lane using 100 bp paired reads. Sample preparation was as follows: the cells were grown to confluency, then pellets were frozen at -80 °C. RNA samples were prepared using the Qiagen RNeasy kit, then measured on a NanoDrop and analyzed on a Bioanalyzer to determine concentration and purity. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
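Strand (ii) above, identifying transcript outliers, can be illustrated with a minimal sketch. The description does not say how outliers will be called, so the leave-one-out z-score used here, the threshold of 3, the function name and the input layout (gene -> sample -> abundance) are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def expression_outliers(expression, threshold=3.0):
    """Flag (gene, sample) pairs whose abundance is far from the other samples.

    `expression` maps gene name -> {sample name -> abundance}. This layout,
    the leave-one-out z-score and the default threshold are illustrative
    assumptions, not the method of the original project.
    """
    outliers = []
    for gene, by_sample in expression.items():
        for sample, value in by_sample.items():
            # Compare each sample against all the *other* samples so that an
            # extreme value does not inflate its own reference distribution.
            others = [v for s, v in by_sample.items() if s != sample]
            if len(others) < 2:
                continue  # not enough samples to estimate a spread
            spread = stdev(others)
            if spread == 0:
                continue  # identical reference values: z-score undefined
            z = (value - mean(others)) / spread
            if abs(z) >= threshold:
                outliers.append((gene, sample, z))
    return outliers
```

In practice the abundances would usually be normalised and log-transformed first, and a flagged gene would only be a lead for follow-up targeted assays, not evidence of causality on its own.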
Project description:Assisted reproduction technologies (ART) and the high selection pressure observed in the dairy industry are leading towards the use of younger females for reproductive purposes, reducing the interval between generations. This situation might have an impact on embryo quality, which can affect the success rate of the procedures. This study aims to document the effect of donor age on embryo quality at the transcriptomic level, in order to characterize the effect of using young animals for reproductive purposes. 10 young Holstein cows were each used 3 times at different ages for ovarian stimulation protocols and oocyte collections (at 8, 11 and 14 months). These oocytes were then fertilized in vitro with adult bull semen, generating 3 lots of embryos per animal. Semen from the same bull was used for all females, at all times. Each animal was used as its own control for age-effect evaluation. The EmbryoGENE platform allowed whole-genome assessment of gene expression patterns at the blastocyst stage. Embryos from animals at 8 vs 14 months and at 11 vs 14 months were used for microarray hybridization. Validation was performed by RT-qPCR on 7 candidate genes.
Project description:Genome-wide association studies (GWAS) have identified more than 40 loci associated with Alzheimer’s disease (AD), but the causal variants, regulatory elements, genes and pathways remain largely unknown, impeding a mechanistic understanding of AD pathogenesis. Previously, we showed that AD risk alleles are enriched in myeloid-specific epigenomic annotations. Here, we show that they are specifically enriched in active enhancers of monocytes, macrophages and microglia. We integrated AD GWAS with myeloid epigenomic and transcriptomic datasets using analytical approaches to link myeloid enhancer activity to target gene expression regulation and AD risk modification. We identify AD risk enhancers and nominate candidate causal genes among their likely targets (including AP4E1, AP4M1, APBB3, BIN1, MS4A4A, MS4A6A, PILRA, RABEP1, SPI1, TP53INP1, and ZYX) in twenty loci. Fine-mapping of these enhancers nominates candidate functional variants that likely modify AD risk by regulating gene expression in myeloid cells. In the MS4A locus we identified a single candidate functional variant and validated it in human induced pluripotent stem cell (hiPSC)-derived microglia and brain. Taken together, this study integrates AD GWAS with multiple myeloid genomic datasets to investigate the mechanisms of AD risk alleles and nominates candidate functional variants, regulatory elements and genes that likely modulate disease susceptibility.
Project description:Purpose: To identify the genetic basis of posterior polymorphous corneal dystrophy 1 (PPCD1). Methods: Next-generation sequencing was performed on DNA samples from 4 affected and 4 unaffected members of a previously reported family with PPCD1 linked to chromosome 20 between D20S182 and D20S195. Custom capture probes were utilized for targeted region capture of the linked interval. Single nucleotide variants (SNVs) and insertions/deletions (indels) were identified using two bioinformatics pipelines and two annotation databases. Candidate variants met the following criteria: quality score ≥20, read depth ≥5X, heterozygous, novel or rare (minor allele frequency (MAF) ≤ 0.05), present in each affected individual and absent in each unaffected individual. Structural variants were detected with two different microarray platforms to identify indels of varying sizes. Results: Sequencing reads aligned to the linked region on chromosome 20, and high coverage was obtained across the sequenced region. The majority of identified variants were detected with both pipelines and annotation databases, although unique variants were identified. Twelve SNVs in 10 genes (2 synonymous variants and 10 noncoding variants) and 9 indels in 7 genes met the filtering criteria and were considered candidate variants for PPCD1. Conclusions: Next-generation sequencing of the PPCD1 interval has identified 17 genes containing novel or rare SNVs and indels that segregate with the affected phenotype in an affected family previously mapped to the PPCD1 locus. We anticipate that screening of these candidate genes in other families previously mapped to the PPCD1 locus will result in the identification of the genetic basis of PPCD1.
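The candidate-variant criteria listed in the Methods above (quality score ≥20, read depth ≥5X, heterozygous, novel or MAF ≤ 0.05, present in each affected and absent in each unaffected individual) can be sketched as a small filter. The record layout, the names `Call`, `passes_call_filters` and `candidate_variants`, and the choice to treat a missing call in an unaffected individual as "absent" are assumptions for illustration; the study's actual pipelines are not described in enough detail to reproduce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    """One variant call in one individual (hypothetical record layout)."""
    variant_id: str
    sample: str
    quality: float
    depth: int
    genotype: str          # "het", "hom_alt" or "hom_ref"
    maf: Optional[float]   # population minor allele frequency; None if novel


def passes_call_filters(call, min_quality=20, min_depth=5, max_maf=0.05):
    """Per-call criteria: quality, depth, heterozygosity, novel or rare."""
    novel_or_rare = call.maf is None or call.maf <= max_maf
    return (call.quality >= min_quality
            and call.depth >= min_depth
            and call.genotype == "het"
            and novel_or_rare)


def candidate_variants(calls, affected, unaffected):
    """Return IDs of variants that segregate with the affected phenotype."""
    by_variant = {}
    for call in calls:
        by_variant.setdefault(call.variant_id, {})[call.sample] = call

    candidates = []
    for variant_id, per_sample in by_variant.items():
        # Present (and passing every per-call filter) in each affected member.
        in_all_affected = all(
            s in per_sample and passes_call_filters(per_sample[s])
            for s in affected)
        # Absent in each unaffected member; a missing call counts as absent
        # here, which is an assumption of this sketch.
        in_no_unaffected = all(
            s not in per_sample or per_sample[s].genotype == "hom_ref"
            for s in unaffected)
        if in_all_affected and in_no_unaffected:
            candidates.append(variant_id)
    return sorted(candidates)
```

A stricter version would require a confident reference call, rather than a missing one, in every unaffected individual, so that low coverage is not mistaken for absence of the variant.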
Project description:Assisted reproduction technologies (ART) and the high selection pressure observed in the dairy industry are leading towards the use of younger females for reproductive purposes, reducing the interval between generations. This situation might have an impact on embryo quality. 10 young Holstein cows were each used 3 times at different ages for ovarian stimulation protocols and oocyte collections (at 8, 11 and 14 months). These oocytes were then fertilized in vitro with adult bull semen, generating 3 lots of embryos per animal.