Dataset Information

Improvements to the rice genome annotation through large-scale analysis of RNA-Seq and proteomics data sets

ABSTRACT: We performed a proteogenomics meta-analysis of nine data sets deposited in ProteomeXchange (PXD000265, PXD000313, PXD000923, PXD001030, PXD001058, PXD002291, PXD002739, PXD002740 and PXD003156), together with 29 RNA-Seq data sets on rice (Oryza sativa). We created a search database comprising translated reads that had been mapped onto the rice genome, as well as officially annotated rice protein sequences. The RNA-Seq database was pre-processed to identify “novel transcripts”, i.e. reads not mapping fully to an existing exon, and “novel junctions”, i.e. reads mapped with a gap, implying a potential novel splice site that was not annotated in the official gene set. Confidently identified “novel peptides”, i.e. those mapping to a novel junction or novel transcript, were post-processed to ensure that there was no better explanation for the corresponding spectra, e.g. a peptide from a canonical gene with a modification or amino acid substitution. Data were exported from the pipeline in PSI mzIdentML 1.2 format, containing chromosomal coordinates, and further converted to PSI proBed format for genome visualisation. Novel peptides were searched against other plant databases using BLAST to see whether they had been predicted in genes from other species. A total of 1584 novel peptides were identified, mapping to ~700 genomic loci in which either new genes have been predicted (~100) or updates to existing gene models have been predicted (~600).
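
The read classification described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the half-open coordinate convention, the toy annotation, and the rule that a gapped read is checked for junctions before exon containment are all assumptions made for illustration only.

```python
from typing import List, Set, Tuple

# Assumption: intervals are half-open [start, end) on a single chromosome/strand.
Interval = Tuple[int, int]


def read_junctions(blocks: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive aligned blocks of one read."""
    return [(blocks[i][1], blocks[i + 1][0]) for i in range(len(blocks) - 1)]


def within_any_exon(block: Interval, exons: List[Interval]) -> bool:
    """True if the block lies entirely inside at least one annotated exon."""
    return any(start <= block[0] and block[1] <= end for start, end in exons)


def classify_read(
    blocks: List[Interval],
    exons: List[Interval],
    known_junctions: Set[Interval],
) -> str:
    """Label a mapped read as 'novel_junction', 'novel_transcript' or 'annotated'.

    Hypothetical rule order: an unannotated gap takes precedence over
    incomplete exon containment.
    """
    blocks = sorted(blocks)
    if any(j not in known_junctions for j in read_junctions(blocks)):
        return "novel_junction"
    if not all(within_any_exon(b, exons) for b in blocks):
        return "novel_transcript"
    return "annotated"


# Toy annotation: two exons joined by one annotated intron.
EXONS = [(100, 200), (300, 400)]
KNOWN_JUNCTIONS = {(200, 300)}

if __name__ == "__main__":
    for read in ([(150, 200), (300, 350)], [(150, 200), (320, 350)], [(180, 230)]):
        print(read, classify_read(read, EXONS, KNOWN_JUNCTIONS))
```

In this toy setup a read spanning the annotated intron is labelled as annotated, a read whose gap ends at an unannotated position is labelled as a novel junction, and an ungapped read that runs past an exon boundary is labelled as a novel transcript.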

INSTRUMENT(S):

ORGANISM(S): Oryza sativa (rice)

TISSUE(S): Plant Cell, Callus, Flower, Embryo, Seedling, Egg, Semen, Sperm, Seed, Pollen, Endosperm, Meiotic Cell

SUBMITTER: Da Qi

LAB HEAD: Andrew R Jones

PROVIDER: PXD008960 | Pride | 2018-10-30

REPOSITORIES: Pride

ACCESS DATA

Dataset's files

1 - 5 of 492

Publications

Improvements to the Rice Genome Annotation Through Large-Scale Analysis of RNA-Seq and Proteomics Data Sets.

Ren Z, Qi D, Pugh N, Li K, Wen B, Zhou R, Xu S, Liu S, Jones AR

Molecular & cellular proteomics : MCP, 2018-10-06, issue 1

Rice (Oryza sativa) is one of the most important crops worldwide. The genome has been available for over 10 years and has undergone several rounds of annotation. We created a comprehensive database of transcripts from 29 public RNA sequencing data sets, officially predicted genes from Ensembl Plants, and common contaminants in which to search for protein-level evidence. We re-analyzed nine publicly accessible rice proteomics data sets. In total, we identified 420K peptide spectrum matches ...

PMID: 30293062

Similar Datasets

Project description: Biotrophic plant pathogens have evolved sophisticated strategies to manipulate their host. They derive all of their nutrients from living plant tissues by making intimate contact with their host while avoiding a resistance response. Rice is one of the most important crop plants worldwide and an excellent model system for studying monocotyledonous plants. Estimates of annual yield losses due to plant-parasitic nematodes on this crop range from 10 to 25% worldwide. One of the agronomically most important nematodes attacking rice is the rice root knot nematode Meloidogyne graminicola. Attack of plant roots by sedentary plant-parasitic nematodes, like the root knot nematodes (RKN; Meloidogyne spp.), involves the development of specialized feeding cells in the vascular tissue. The second-stage juvenile of the RKN punctures selected vascular cells with its stylet and injects pharyngeal secretions, which ultimately leads to the reorganisation of these cells into typical feeding structures called giant cells (GCs), from which the nematode feeds for the remainder of its sedentary life cycle (Gheysen & Mitchum, 2011). Morphological and physiological reprogramming of the initial feeding cell leads to nucleus enlargement, proliferation of mitochondria and plastids, metabolic activation, cell cycle alterations and cell wall changes (Gheysen & Mitchum, 2011). The hyperplasia and hypertrophy of the surrounding cells lead to the formation of a root gall, which is typically formed at the root tips in the case of the rice RKN M. graminicola. In comparison with other RKN, M. graminicola has a very fast life cycle, with swelling of the root tips observed as early as 1 day after infection (dai). At 3 dai, terminal hook-like galls are clearly visible (Bridge et al., 2005). After three moults, the nematodes are mature at around 10 dai. The M. graminicola females lay their eggs inside the galls, whereas most other RKN deposit egg masses at the gall surface, and hatched juveniles can reinfect the same or adjacent roots. In well-drained soil at 22-29 °C, the life cycle of M. graminicola is completed in 19 days. Two biological replicates of nematode-infected giant cells and control vascular cells were sampled at two time points: 7 and 14 dai.

Project description: Endogenous small RNAs, including microRNAs (miRNAs) and short-interfering RNAs (siRNAs), function as posttranscriptional or transcriptional regulators in plants. miRNA function is essential for normal development and is therefore likely to be important in the growth of the rice grain. To investigate the likely roles of miRNAs in rice grain development, we carried out deep sequencing of the small RNA populations of rice grains. A total of 96,091 (including 23,867 reads from vegetative tissues) and 5,379,724 small RNA sequences that are longer than 17 nt were generated. Approximately 94% of these small RNAs were 20-24 nt in length. The majority of the small RNAs were singletons, indicating that the rice genome has a very complex small RNA population that is difficult to sequence to saturation. From these small RNA sequences we found representatives of all 20 conserved plant miRNA families and evidence for changes in expression of miRNAs during rice grain development. Using an approach based on the presence of the miRNA and miRNA* sequences, we identified 51 novel, non-conserved rice miRNA families expressed in grains, with functionally diverse predicted target genes. miRNA-guided cleavage was confirmed for a number of target genes, including ones with roles in sugar signalling and restoration of cytoplasmic male sterility. We identified a likely mirtron, indicating that plants can also use spliced introns as a source of miRNAs. Our sequencing results revealed four TAS3 loci; these all contain dual miR390 sites, of which only the 3′ site is cleaved. We also found a miRNA-like long hairpin generating phased 21 nt small RNAs that is strongly expressed in developing grains, and we show that these small RNAs act in trans to cleave target mRNAs.
Keywords: high throughput pyrosequencing, small RNA, microRNA, grain development, rice
Small RNA populations of shoots and roots of 7-day-old seedlings, and of grains at 1-5 and 6-10 days after fertilization, were determined using high-throughput sequencing technology. The abundance of known miRNAs was compared based on the number of sequence reads. To give a complete picture of the rice small RNA populations and to report the sequencing results without bias, small RNAs longer than 17 nt were included in this submitted dataset whether or not they matched the rice genome. The unmatched sequences may be derived from unsequenced regions or sequencing errors.

File	Type
10_1.mgf	Mgf
10_1.pride.mgf.gz	Mgf
10_1.raw	Raw
10_2.mgf	Mgf
10_2.pride.mgf.gz	Mgf