Project description:Purpose: In order to understand the functional significance of sperm transcriptome in stallion fertility, the aim of this study was to generate a detailed body of knowledge about the sperm RNA profile that defines a normal fertile stallion. Methods: The 50 bp single-end ABI SOLiD raw reads were directly aligned with the horse reference sequence EcuCab2 using ABI aligner software (NovoalignCS version 1.00.09, novocraft.com) which uses multiple indexes in the reference genome, identifies candidate alignment locations for each primary read, and allows completion of the alignment. Results: Next generation sequencing (NGS) of total RNA from the sperm of two reproductively normal stallions generated about 70 million raw reads and more than 3 Gb of sequence per sample; over half of these aligned with the EcuCab2 reference genome. Altogether, 19,257 sequence tags with average coverage ≥1 (normalized number of transcripts) were mapped in the horse genome. Conclusion: The sequence of stallion sperm transcriptome is an important foundation for the discovery of transcripts of known and novel genes, and non-coding RNAs, thus improving the annotation of the horse genome sequence draft and providing markers for evaluating stallion fertility. Reproductively fertile Stallion sperm transcriptome as revealed by RNA sequencing
Project description:Whole exome sequencing of 5 HCLc tumor-germline pairs. Genomic DNA from HCLc tumor cells and T-cells for germline was used. Whole exome enrichment was performed with either Agilent SureSelect (50Mb, samples S3G/T, S5G/T, S9G/T) or Roche Nimblegen (44.1Mb, samples S4G/T and S6G/T). The resulting exome libraries were sequenced on the Illumina HiSeq platform with paired-end 100bp reads to an average depth of 120-134x. Bam files were generated using NovoalignMPI (v3.0) to align the raw fastq files to the reference genome sequence (hg19) and picard tools (v1.34) to flag duplicate reads (optical or pcr), unmapped reads, reads mapping to more than one location, and reads failing vendor QC.
Project description:We use nucleosome maps obtained by high-throughput sequencing to study sequence specificity of intrinsic histone-DNA interactions. In contrast with previous approaches, we employ an analogy between a classical one-dimensional fluid of finite-size particles in an arbitrary external potential and arrays of DNA-bound histone octamers. We derive an analytical solution to infer free energies of nucleosome formation directly from nucleosome occupancies measured in high-throughput experiments. The sequence-specific part of free energies is then captured by fitting them to a sum of energies assigned to individual nucleotide motifs. We have developed hierarchical models of increasing complexity and spatial resolution, establishing that nucleosome occupancies can be explained by systematic differences in mono- and dinucleotide content between nucleosomal and linker DNA sequences, with periodic dinucleotide distributions and longer sequence motifs playing a secondary role. Furthermore, similar sequence signatures are exhibited by control experiments in which genomic DNA is either sonicated or digested with micrococcal nuclease in the absence of nucleosomes, making it possible that current predictions based on high-throughput nucleosome positioning maps are biased by experimental artifacts. Included are raw (eland) and mapped (wig) reads. The mapped reads are provided in eland and wiggle formats, and the raw reads are included in the eland file. This series includes only MNase control data. The sonicated control is part of this already published accession, as is an in vitro nucleosome map: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15188 We also studied data (in vitro and in vivo maps as well as a model) from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE13622 and from: http://www.ncbi.nlm.nih.gov/sra/?term=SRA001023
Project description:Here, we performed deep transcriptome sequencing of the aerial tissues and the roots of S. japonica, generating over 2 billion raw reads with an average length of 101 nt using Illumina paired-end sequencing on the HiSeq2000 platform. Using a combined approach of three popular assemblers, a de novo transcriptome assembly for S. japonica was obtained, yielding 81,729 unigenes with an average length of 884 bp and an N50 value of 1,452 bp; 46,963 unigenes were annotated based on sequence similarity against the NCBI-nr protein database. Transcriptome profiling of the aerial-tissues and the roots of Swertia japonica
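The N50 value quoted for the assembly above is a standard summary statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The following is a minimal sketch of that calculation; the function name `n50` and the toy lengths are illustrative assumptions, not part of the study's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical unigene lengths, not data from the record).
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

Because N50 weights long sequences more heavily, it is normally larger than the mean length, which matches the record's figures (N50 of 1,452 bp against a mean of 884 bp).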
Project description:Purpose: Here we describe the modulation of a gene expression program involved in cell fate. Methods: We depleted U2AF1 in human induced pluripotent stem cells (hiPSCs) to the level found in differentiated cells using an inducible shRNA system, followed by high-throughput RNAseq, revealing a gene expression program involved in cell fate determination. Results: Approximately 85% of the total raw reads were mapped to the human genome sequence (GRCh37), giving an average of 200 million human reads per sample for total RNA and 15 million human reads per sample for small RNA libraries. Conclusions: Our results show that transcriptional control of gene expression in hiPSCs can be set by the CSF U2AF1, establishing a direct link between transcription and AS during cell fate determination. Overall design: hiPSCs were differentiated into the three germ layers following the described protocol in the study (Gifford et al., 2013).
Project description:10^7 HL-60 cells were treated with 10 μM ATRA for two and five days. After genomic DNA was sheared into 200-500 bp fragments by sonication, 10% of the whole cell lysate was saved as input. The rest of the lysate was incubated overnight with 1 μg of IP-grade antibody against CTCF, H3K4me3 or H3K27me3 (CST, Boston, USA), followed by a 2 h incubation with protein-A beads at 4 °C to pull down the target protein. The CTCF-enriched DNA, the H3K4me3- or H3K27me3-modified DNA, and the input DNA were repaired to carry a 3’-dA overhang and ligated to adapters. Unligated adapters were removed from the DNA library, and fragments of the appropriate size were selected for sequencing on an Illumina X Ten platform. Adapters were trimmed from the raw input and IP reads and low-quality reads were filtered out using Cutadapt (v1.9.1) and Trimmomatic (v0.35); the quality of the clean reads was checked with FastQC. Next, clean reads were mapped to the human genome (assembly hg38) using Bowtie 2 (v2.2.6). Peak calling (p<0.01) was performed with MACS2 (v2.1.1); differential binding domains were analyzed at an FDR below 0.05 and annotated with DiffBind. De novo motifs were analyzed using the R language and MEME. Peaks at selected genomic loci were visualized in the Integrative Genomics Viewer (IGV). Gene ontology (GO) analysis was used to interpret the biological function of the genes associated with differential peaks.
Project description:Purpose: The goal of this study is to use RNA-seq to characterize cucumber and Botrytis cinerea transcriptome changes during infection. Methods: mRNA profiles of anti-infection samples and an interaction sample were generated by deep sequencing using an Illumina HiSeq 2500. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA, and TopHat followed by Cufflinks. qRT–PCR validation was performed using SYBR Green assays. Results: Using an optimized data analysis workflow, a total of 248,908,688 raw reads were generated; after removing low-quality reads and those containing adapters or poly-N, 238,341,648 clean reads remained for mapping to the reference genome. There were 3,512 cucumber differentially expressed genes (DEGs) and 1,735 B. cinerea DEGs. GO enrichment and KEGG enrichment analyses were performed on these DEGs to study the interaction between cucumber and B. cinerea. To verify the reliability and accuracy of our transcriptome data, 5 cucumber DEGs and 5 B. cinerea DEGs were chosen for RT-PCR verification. Conclusions: To the best of our knowledge, this is the first analysis of large-scale transcriptome changes of cucumber during infection by Botrytis cinerea. These results will increase our understanding of the molecular mechanisms of cucumber defense against Botrytis cinerea and may be used to protect plants against damage caused by necrotrophic fungal pathogens. mRNA profiles of infection and anti-infection cucumber were generated by deep sequencing using an Illumina HiSeq 2500.
Project description:Purpose: The goal of this study is to identify the putative mRNA targets regulated by the 6C sRNA. We constructed an inducible vector to transiently overexpress the 6C sRNA in M. smegmatis, and then performed RNA-Seq to look for genes that are differentially expressed upon overexpression of the 6C sRNA; we consider these genes potential targets of the 6C sRNA. Overall design: Methods: Purified RNA was used to construct a cDNA library according to the TruSeq Stranded RNA LT Guide from Illumina. High-throughput sequencing was carried out on an Illumina HiSeq 2000 system according to the manufacturer's instructions (Illumina HiSeq 2000 User Guide) and 150-bp paired-end reads were obtained. The raw reads were filtered by Seqtk and then mapped to the M. smegmatis MC2 155 strain reference sequence (GenBank NC_008596) using Bowtie2 (version: 2-2.0.5). Counting of reads per gene was performed using HTSeq followed by TMM (trimmed mean of M-values) normalization. Differentially expressed genes were defined as those with a false discovery rate < 0.05 and fold-change >2 using the edgeR software.
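The final selection step described above (false discovery rate < 0.05 and fold-change > 2) can be sketched as a simple filter over a results table. This is only an illustration under stated assumptions: the record does not say whether the fold-change cutoff applies in both directions, so the sketch assumes it does (|log2 fold-change| > 1); the `GeneResult` type, the `select_degs` function and the gene names are hypothetical and are not edgeR's API.

```python
import math
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    fdr: float


def select_degs(results: List[GeneResult],
                fdr_cutoff: float = 0.05,
                fold_change_cutoff: float = 2.0) -> List[GeneResult]:
    """Keep genes with FDR below the cutoff and fold-change above the cutoff.

    Assumption: the fold-change threshold is applied to both up- and
    down-regulated genes, i.e. |log2 FC| > log2(cutoff).
    """
    min_abs_log2 = math.log2(fold_change_cutoff)
    return [
        r for r in results
        if r.fdr < fdr_cutoff and abs(r.log2_fold_change) > min_abs_log2
    ]


# Made-up rows for illustration only.
table = [
    GeneResult("geneA", 2.3, 0.001),   # strongly up, significant
    GeneResult("geneB", -1.8, 0.020),  # strongly down, significant
    GeneResult("geneC", 0.4, 0.001),   # significant but small change
    GeneResult("geneD", 3.0, 0.200),   # large change but not significant
]
for row in select_degs(table):
    print(row.gene, row.log2_fold_change, row.fdr)
```

Working on the log2 scale keeps the up- and down-regulated thresholds symmetric, so a twofold decrease is treated the same way as a twofold increase.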
Project description:We report the total number of differentially expressed enzyme genes involved in starch biosynthesis of banana fruit using transcriptome profiling analysis. A sequencing depth of over 4.4 billion raw reads for each of six libraries was obtained using RNA-seq analysis.
Project description:Chromatin accessibility captures the binding status of protein factors to chromosomes in vivo, and has been considered a highly informative proxy for functional protein-DNA interactions. Existing DNase I and Tn5 transposase based assays generally require tens of thousands to millions of fresh cells. Applying Tn5 tagmentation to single cells yields very sparse maps. Here we present a transposome hypersensitive sites sequencing assay (THS-seq) for highly sensitive characterization of chromatin accessibility. Validation of the THS-seq method, and comparison of the DNase-seq, ATAC-seq and THS-seq methods for quantitation of chromatin accessibility.
GSE47753, GSM1155957 GM12878_ATACseq_50k_Rep1, data downsampled to 8,351,125 reads for analysis: pub_SRR891268_ATAC-seq_50k_cells_Rep1_downsampled_dfilter_peaks.bed.gz
GSE47753, GSM1155958 GM12878_ATACseq_50k_Rep2, data downsampled to 8,351,125 reads for analysis: pub_SRR891269_ATAC-seq_50k_cells_Rep2_downsampled_dfilter_peaks.bed.gz
GSE47753, GSM1155959 GM12878_ATACseq_50k_Rep3, data downsampled to 8,351,125 reads for analysis: pub_SRR891270_ATAC-seq_50k_cells_Rep3_downsampled_dfilter_peaks.bed.gz
GSE47753, GSM1155960 GM12878_ATACseq_50k_Rep4, data downsampled to 8,351,125 reads for analysis: pub_SRR891271_ATAC-seq_50k_cells_Rep4_downsampled_dfilter_peaks.bed.gz
GSE47753, GSM1155961 GM12878_ATACseq_500_Rep1, data downsampled to 8,351,125 reads for analysis: pub_SRR891272_ATAC-seq_500_cells_Rep1_downsampled_dfilter_peaks.bed.gz
GSE47753, GSM1155962 GM12878_ATACseq_500_Rep2, data downsampled to 8,351,125 reads for analysis: pub_SRR891273_ATAC-seq_500_cells_Rep2_downsampled_dfilter_peaks.bed.gz
Raw data files were merged and aligned with bowtie 1.3 using the parameters: bowtie -n1 -k1 --best --chunkmbs 10240 --strata -l32 -m1 -p4 --nomaqround --sam, followed by clonal read removal to produce the final data file for further analysis.
GSM816665, raw data obtained from UCSC genome browser, https://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeOpenChromDnase: wgEncodeOpenChromDnaseGm12878RawData_merged_unique_dfilter_peaks.bed.gz Raw data files were merged and aligned with bowtie 1.3 using the parameters: bowtie -n1 -k1 --best --chunkmbs 10240 --strata -l32 -m1 -p4 --nomaqround --sam, followed by clonal read removal to produce the final data file for further analysis.
GSM864360, raw data obtained from UCSC genome browser, https://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeOpenChromFaire: wgEncodeOpenChromFaireGm12878RawData_merged_unique_dfilter_peaks.bed.gz Raw data files were merged and aligned with bowtie 1.3 using the parameters: bowtie -n1 -k1 --best --chunkmbs 10240 --strata -l32 -m1 -p4 --nomaqround --sam, followed by clonal read removal to produce the final data file for further analysis.
GSM736496, GSM736620, raw data obtained from UCSC genome browser, https://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeUwDnase: wgEncodeUwDnaseGm12878RawData_merged_unique_dfilter_peaks.bed.gz
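The ATAC-seq entries in this record were downsampled to a fixed number of reads (8,351,125) so that the methods could be compared at equal depth. The record does not state how the downsampling was done; one common way to draw a uniform random subset of a known size from a stream of reads is reservoir sampling, sketched below. The function name `reservoir_sample`, the seed, and the toy read identifiers are assumptions made for illustration.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Uses Algorithm R: the first k items fill the reservoir; item i (0-based,
    i >= k) then replaces a random reservoir slot with probability k / (i + 1).
    Only k items are held in memory at any time.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Toy example: keep 5 of 100 hypothetical read identifiers.
reads = (f"read_{n}" for n in range(100))
subset = reservoir_sample(reads, k=5)
print(len(subset))
```

For paired-end data the unit being sampled would need to be the read pair rather than the single read, so that mates stay together.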