Project description: More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single nucleotide variants. The database was validated against two other sequencing datasets from other laboratories, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. We performed reduced representation bisulfite sequencing on the E14 cell line to test our new genome assembly against the mm9 reference genome. After mapping and methylation status calling, we obtained an increase of about 120,000 called CpGs and avoided about 20,000 wrong CpG calls. Genotyping of E14 embryonic stem cells (ESCs) and reduced representation bisulfite sequencing (RRBS) of E14 ESCs.
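As a rough illustration of why a variant-aware assembly changes CpG calling, the sketch below substitutes single nucleotide variants into a reference sequence and compares the CpG positions before and after. The function names, the 0-based coordinates and the toy sequence are assumptions made for this example; it is not the pipeline used in the project.

```python
def apply_snvs(reference, snvs):
    """Return a copy of `reference` with each SNV substituted in.

    `snvs` maps a 0-based position to a (ref_base, alt_base) pair
    (a hypothetical input format chosen for this sketch).
    """
    seq = list(reference.upper())
    for pos, (ref_base, alt_base) in snvs.items():
        if seq[pos] != ref_base:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[pos] = alt_base
    return "".join(seq)


def cpg_positions(seq):
    """Return the set of 0-based positions of the C in every CG dinucleotide."""
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


# Toy example: one SNV destroys a CpG, another creates one.
reference = "ACGTTCAGCGA"
snvs = {2: ("G", "A"), 6: ("A", "G")}
personalised = apply_snvs(reference, snvs)

lost = cpg_positions(reference) - cpg_positions(personalised)    # CpGs that would be called wrongly
gained = cpg_positions(personalised) - cpg_positions(reference)  # CpGs missed on the unmodified reference
```

In this toy case, a CpG present only in the unmodified reference corresponds to a potentially wrong call, and a CpG present only in the modified sequence corresponds to a site that can be called only on the new assembly.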
Project description: Porcine 60K BeadChip genotyping arrays (Illumina) are increasingly being applied in pig genomics to validate SNPs identified by re-sequencing or assembly-versus-assembly methods. Here we report that more than 98% of the SNPs identified with the porcine 60K BeadChip genotyping array (Illumina) were consistent with the SNPs identified by the assembly-based method. This result demonstrates that whole-genome de novo assembly is a reliable approach to deriving accurate SNP maps. To compare SNPs identified by genotyping arrays and by the de novo assembly method, we genotyped 10 pig breeds with the porcine 60K BeadChip genotyping array (Illumina), including 1 Berkshire pig, 1 Hampshire pig, 1 Landrace pig, 1 Large White pig, 1 Piétrain pig, 1 Bamei pig, 1 Jinhua pig, 1 Meishan pig, 1 Rongchang pig and 1 Tibetan wild boar.
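A consistency figure such as the 98% above can be computed in several ways, and the description does not say which definition was used. The sketch below assumes one plausible definition: the fraction of array SNPs for which the assembly-derived call at the same site has the same unordered pair of alleles. The dictionary layout and the function name are hypothetical.

```python
def snp_concordance(array_calls, assembly_calls):
    """Fraction of array SNPs matched by the assembly-based call set.

    Both arguments map (chromosome, position) to a two-letter genotype
    such as "AG". Genotypes are compared without regard to allele order.
    Array SNPs absent from `assembly_calls` count as inconsistent
    (an assumption of this sketch).
    """
    if not array_calls:
        return 0.0
    consistent = 0
    for site, genotype in array_calls.items():
        other = assembly_calls.get(site)
        if other is not None and sorted(genotype) == sorted(other):
            consistent += 1
    return consistent / len(array_calls)


# Toy data for one animal: three of the four array SNPs agree.
array_calls = {
    ("chr1", 100): "AG",
    ("chr1", 200): "CC",
    ("chr2", 50): "TT",
    ("chr2", 75): "GA",
}
assembly_calls = {
    ("chr1", 100): "GA",
    ("chr1", 200): "CC",
    ("chr2", 50): "TC",
    ("chr2", 75): "AG",
}
rate = snp_concordance(array_calls, assembly_calls)
```

Restricting the denominator to sites present in both call sets would be an equally defensible choice and would give a higher figure whenever the assembly misses array sites.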
Project description: Using RNA sequencing and de novo transcript assembly, we identified 4516 lncRNAs expressed in 8 different stages of B cell development and activation. Chromatin immunoprecipitation sequencing was used to classify a substantial fraction (38%) of these lncRNAs as enhancer-associated or promoter-associated RNAs (eRNAs or pRNAs). A catalogue of lncRNAs expressed in eight murine B cell populations.
Project description: PAPD5 is one of the seven non-canonical poly(A) polymerases in human cells. Previous reports describe polyadenylation-dependent degradation of pre-ribosomal RNAs and uridylation-dependent degradation of histone mRNAs in vivo. In this study, we observed polyadenylation but not polyuridylation activity of PAPD5 in in vitro assays. To identify the genome-wide targets of PAPD5, we used PAR-CLIP and deep sequencing. A recombinant version of PAPD5 was expressed in the HEK293 human cell line, and its genome-wide targets were obtained by PAR-CLIP and deep sequencing in two replicate experiments. The short reads in the deep sequencing libraries of the PAPD5 replicates and of IGF2BP1, a protein unrelated to polymerization taken from a previous study, were aligned to the hg18 human genome assembly. The biological variance of the read counts in overlapping 100-nucleotide windows was estimated between the PAPD5 replicates and then used to estimate differential expression between the 100-nucleotide windows in the PAPD5 replicates and IGF2BP1. The top differentially expressed windows in PAPD5 and IGF2BP1 were further annotated using gene and repeat tracks from UCSC.
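The window-based counting step described above can be sketched as follows. The description gives only the window length (100 nucleotides) and says that windows overlap; the 50-nucleotide step, the use of 0-based read start positions and the function name are assumptions made for this example, and the variance and differential expression estimation are not shown.

```python
from collections import Counter


def window_counts(read_starts, window=100, step=50):
    """Count reads per overlapping window.

    `read_starts` is an iterable of (chromosome, 0-based start) pairs.
    Windows start at multiples of `step` and cover [start, start + window).
    Returns a Counter keyed by (chromosome, window_start).
    """
    counts = Counter()
    for chrom, pos in read_starts:
        # Walk back from the last window starting at or before `pos`
        # through every earlier window that still contains `pos`.
        start = (pos // step) * step
        while start >= 0 and start + window > pos:
            counts[(chrom, start)] += 1
            start -= step
    return counts


# Toy example: three reads on one chromosome.
reads = [("chr1", 30), ("chr1", 120), ("chr1", 130)]
counts = window_counts(reads)
```

Counting each library this way gives one table of windows per sample, which is the kind of input a replicate-based variance estimate would need.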