Project description: We sequenced and analyzed the genome of a highly inbred miniature Chinese pig strain, the Banna Minipig Inbred Line (BMI). We conducted whole-genome screening using next-generation sequencing (NGS) technology and performed SNP calling against the Sus scrofa genome assembly Sscrofa11.1.
Project description: More than 2 × 10^9 sequence reads generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single nucleotide variants. The database was validated against two sequencing datasets from other laboratories, and high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. We performed reduced representation bisulfite sequencing on the E14 cell line to test our new genome assembly against the mm9 reference genome. After mapping and methylation status calling, we obtained an increase of about 120,000 called CpGs and avoided about 20,000 wrong CpG calls. Genotyping of E14 embryonic stem cells (ESCs) and reduced representation bisulfite sequencing (RRBS) of E14 ESCs.
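The idea of building a cell-line-specific assembly by substituting known variants into the reference, and the effect this has on which CpG sites exist, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0-based coordinate convention, and the toy sequence are all assumptions made for illustration.

```python
def apply_snvs(reference: str, snvs: dict[int, tuple[str, str]]) -> str:
    """Return a copy of `reference` with each SNV substituted.

    `snvs` maps a 0-based position to (ref_base, alt_base); this layout is
    a hypothetical convention, not a format described in the source.
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in snvs.items():
        # Guard against variants that do not match the reference they claim.
        if seq[pos] != ref_base:
            raise ValueError(
                f"reference mismatch at {pos}: expected {ref_base}, found {seq[pos]}"
            )
        seq[pos] = alt_base
    return "".join(seq)


def cpg_positions(sequence: str) -> set[int]:
    """0-based positions of the C in every CpG dinucleotide."""
    return {i for i in range(len(sequence) - 1) if sequence[i:i + 2] == "CG"}


# Toy reference and two toy SNVs (illustrative data only).
reference = "ACGTTCATCGGA"
snvs = {1: ("C", "T"), 6: ("A", "G")}

personal = apply_snvs(reference, snvs)
ref_cpgs = cpg_positions(reference)
new_cpgs = cpg_positions(personal)

lost = ref_cpgs - new_cpgs    # CpGs present only in the reference
gained = new_cpgs - ref_cpgs  # CpGs created by a variant

print(personal, sorted(lost), sorted(gained))
```

In this toy case one variant destroys a reference CpG and another creates a new one. Against the unmodified reference, the first kind of site would be called as a CpG that the cell line does not carry, and the second kind would not be called at all.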
Project description: More than 2 × 10^9 sequence reads generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single nucleotide variants. The database was validated against two sequencing datasets from other laboratories, and high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. We performed reduced representation bisulfite sequencing on the E14 cell line to test our new genome assembly against the mm9 reference genome. After mapping and methylation status calling, we obtained an increase of about 120,000 called CpGs and avoided about 20,000 wrong CpG calls.
Project description: This dataset includes somatic small-variant call files derived from fifteen metastatic cutaneous squamous cell carcinoma samples matched to normal blood samples. The samples were whole-genome sequenced on a HiSeq X Ten, and the resulting reads were mapped against the human genome (hg37) using BWA-MEM 0.7.10-r789. Somatic variant calling was then performed using Strelka 1 (version 2.0.17).
Project description: The study included 15 patients (7 males, 8 females) with JMML. Peripheral blood and/or bone marrow aspirates were collected on EDTA at diagnosis. Non-hematopoietic tissue (fibroblasts) was derived from a skin biopsy for each patient. Exome sequencing was performed in several distinct series between 2012 and 2017, which explains the differences in capture kit versions and reference genome versions. Targeted enrichment and massively parallel sequencing were performed on paired genomic DNA from leukocytes and fibroblasts. Exome capture was carried out using the SureSelect Human All Exon V4+UTRs, V5, V5+UTRs, or SureSelect Clinical Research kit (Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer's instructions and protocols by IntegraGen (Evry, France). Paired-end 75-base sequencing was performed on a HiSeq2000 or HiSeq4000 instrument (Illumina, San Diego, CA, USA). Image analysis and base calling were performed using the Real Time Analysis (RTA) pipeline v. 1.14 (Illumina) with default parameters. Alignment of paired-end reads to the human reference genome (UCSC GRCh37/hg19 or UCSC GRCh38), variant calling, and generation of variant quality scores were carried out using the CASAVA v.1.8 pipeline (Illumina).
Project description: Single-cell sequencing methodologies such as scRNA-seq and scATAC-seq have become widespread and effective tools to interrogate tissue composition. Increasingly, variant callers are being applied to these methodologies to resolve the genetic heterogeneity of a sample, especially when detecting the clonal architecture of a tumor. Typically, traditional bulk DNA variant callers are applied to the pooled reads of a single-cell library to detect candidate mutations. Recently, multiple studies have applied such callers to reads from individual cells, with some citing the ability to detect rare variants with higher sensitivity. Many studies apply these two approaches to the Chromium (10x Genomics) scRNA-seq and scATAC-seq methodologies. However, Chromium-based libraries may pose additional challenges to variant calling compared to existing single-cell methodologies, raising questions about the validity of variants obtained from such a workflow. To determine the merits and challenges of various variant-calling approaches on Chromium scRNA-seq and scATAC-seq libraries, we use sample libraries with matched bulk whole-genome sequencing to evaluate the performance of callers. We review caller performance, finding that bulk callers applied to pooled reads significantly outperform individual-cell approaches. We also evaluate variants unique to scRNA-seq and scATAC-seq methodologies, finding patterns of noise but also potential capture of RNA-editing events. Finally, we review the notion that variant calling at the single-cell level can detect rare somatic variants, providing empirical results that suggest resolving such variants is infeasible in single-cell Chromium libraries.
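Evaluating a caller against matched bulk whole-genome sequencing amounts to comparing two sets of variants. The following is a minimal sketch of that comparison, assuming each variant is keyed by (chromosome, position, ref, alt) and that the bulk WGS calls serve as the truth set. The names `Variant`, `CallerScore`, and `score_calls`, and all variants shown, are hypothetical and are not taken from the study.

```python
from typing import NamedTuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


class CallerScore(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float


def score_calls(calls: set[Variant], truth: set[Variant]) -> CallerScore:
    """Compare a caller's variant set with a truth set (here, bulk WGS calls)."""
    tp = len(calls & truth)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but missed by the caller
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return CallerScore(tp, fp, fn, precision, recall)


# Toy data only; these variants are invented for the example.
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 75, "G", "A"), ("chr3", 10, "T", "C")}
pooled_calls = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
                ("chr2", 75, "G", "A"), ("chr5", 5, "A", "T")}
per_cell_calls = {("chr1", 100, "A", "G"), ("chr7", 42, "C", "G")}

pooled = score_calls(pooled_calls, truth)
per_cell = score_calls(per_cell_calls, truth)
print(pooled)
print(per_cell)
```

Exact-match comparison on the four-field key is the simplest choice. A real benchmark would also need to normalize variant representation and restrict the comparison to regions covered by the single-cell library.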
Project description: Purpose: Recent developments in high-throughput sequencing techniques (RNA-seq) have enabled large-scale analysis of genetic variation and gene expression in different tissues and species, but gene expression patterns and genetic variation in livestock have not yet been well characterized. In this study we used high-throughput transcriptomic sequencing of the Finnish Yorkshire to identify gene expression patterns within the breed in the testis and oviduct. Methods: The SOLiD 4 reads were mapped against the pig genome build 10.2 using the colorspace alignment tool provided by Applied Biosystems and distributed with the instrument (LifeScope v2.1). Reads associated with ribosomal RNA, transfer RNA, repeats and other uninformative reads were filtered out during the process, as were reads with more than 10 potential alignments. After alignment to the reference genome, reads with low mapping quality (mapQV < 10) were discarded, unique reads were associated with known genes based on UCSC annotations, and the number of reads aligned within each gene was counted. FPKM values were calculated with the Cufflinks software v2.0.2 to normalize the data and remove non-biological variation between samples. Results: The analysis of gene expression differences between the testis and oviduct highlighted 1,234 genes up-regulated in the testis and 1,501 in the oviduct. Conclusions: The RNA-seq technology used in this study provides novel information about transcript expression and differential gene expression in the testis and oviduct. The data produced will assist in the identification of candidate genes based on association mapping results within the pig population and provide insights into the expression of genes in the two reproductive organs studied.
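For readers unfamiliar with FPKM, the normalization can be sketched with the standard textbook formula: a gene's read count is scaled by its length in kilobases and by the total number of mapped reads in millions. The snippet below is only an illustration of that formula. Cufflinks itself estimates abundances with a more elaborate statistical model, and the function name, gene names, counts, and lengths are hypothetical.

```python
def fpkm(read_counts: dict[str, int], gene_lengths_bp: dict[str, int]) -> dict[str, float]:
    """Textbook FPKM: count * 1e9 / (gene length in bp * total mapped reads).

    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    total_mapped = sum(read_counts.values())
    if total_mapped == 0:
        raise ValueError("no mapped reads to normalize against")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_mapped)
        for gene, count in read_counts.items()
    }


# Toy counts and lengths (illustrative only): geneB has three times the reads
# of geneA but is also three times as long.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}

values = fpkm(counts, lengths)
print(values)
```

Both toy genes receive the same FPKM, which shows how the length term prevents longer genes from appearing more highly expressed simply because they collect more reads.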