Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics.
ABSTRACT: The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up-to-date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a leading Japanese japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show the alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.
Project description:The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for a comprehensive understanding of rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31,439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/.
Project description:BACKGROUND:Rice research has been enabled by access to the high quality reference genome sequence generated in 2005 by the International Rice Genome Sequencing Project (IRGSP). To further facilitate genomic-enabled research, we have updated and validated the genome assembly and sequence for the Nipponbare cultivar of Oryza sativa (japonica group). RESULTS:The Nipponbare genome assembly was updated by revising and validating the minimal tiling path of clones with the optical map for rice. Sequencing errors in the revised genome assembly were identified by re-sequencing the genome of two different Nipponbare individuals using the Illumina Genome Analyzer II/IIx platform. A total of 4,886 sequencing errors were identified in 321 Mb of the assembled genome indicating an error rate in the original IRGSP assembly of only 0.15 per 10,000 nucleotides. A small number (five) of insertions/deletions were identified using longer reads generated using the Roche 454 pyrosequencing platform. As the re-sequencing data were generated from two different individuals, we were able to identify a number of allelic differences between the original individual used in the IRGSP effort and the two individuals used in the re-sequencing effort. The revised assembly, termed Os-Nipponbare-Reference-IRGSP-1.0, is now being used in updated releases of the Rice Annotation Project and the Michigan State University Rice Genome Annotation Project, thereby providing a unified set of pseudomolecules for the rice community. CONCLUSIONS:A revised, error-corrected, and validated assembly of the Nipponbare cultivar of rice was generated using optical map data, re-sequencing data, and manual curation that will facilitate on-going and future research in rice. Detection of polymorphisms between three different Nipponbare individuals highlights that allelic differences between individuals should be considered in diversity studies.
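The error rate quoted for the original IRGSP assembly can be checked directly from the two figures given in the abstract above (4,886 sequencing errors in 321 Mb). The short sketch below is only an arithmetic check; it assumes that 321 Mb means exactly 321,000,000 bases, and the function name is illustrative rather than taken from the study.

```python
# Figures taken from the abstract above.
SEQUENCING_ERRORS = 4_886
# Assumption: "321 Mb" is treated as exactly 321,000,000 bases.
ASSEMBLED_BASES = 321_000_000


def errors_per_10k(errors: int, bases: int) -> float:
    """Return the number of errors per 10,000 nucleotides."""
    return errors / bases * 10_000


rate = errors_per_10k(SEQUENCING_ERRORS, ASSEMBLED_BASES)
print(f"{rate:.2f} errors per 10,000 nucleotides")
```

Under that assumption the result rounds to 0.15 errors per 10,000 nucleotides, consistent with the value reported in the abstract.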
Project description:Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or 'expressology', thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au).
Project description:Oryza sativa (rice) plays an essential food security role for more than half of the world's population. Obtaining crops with high levels of disease resistance is a major challenge for breeders, especially today, given the urgent need for agriculture to be more sustainable. Plant resistance genes are mainly encoded by three large leucine-rich repeat (LRR)-containing receptor (LRR-CR) families: the LRR-receptor-like kinase (LRR-RLK), LRR-receptor-like protein (LRR-RLP) and nucleotide-binding LRR receptor (NLR). Using lrrprofiler, a pipeline that we developed to annotate and classify these proteins, we compared three publicly available annotations of the rice Nipponbare reference genome. The extended discrepancies that we observed for LRR-CR gene models led us to perform an in-depth manual curation of their annotations while paying special attention to nonsense mutations. We then transferred this manually curated annotation to Kitaake, a cultivar that is closely related to Nipponbare, using an optimized strategy. Here, we discuss the breakthrough achieved by manual curation when comparing genomes and, in addition to 'functional' and 'structural' annotations, we propose that the community adopts this approach, which we call 'comprehensive' annotation. The resulting data are crucial for further studies on the natural variability and evolution of LRR-CR genes in order to promote their use in breeding future resilient varieties.
Project description:Rice has developed several morphological and physiological strategies to adapt to phosphate starvation in the soil. In order to elucidate the molecular basis of response to phosphate starvation, we performed mRNA sequencing of 4 rice cultivars with variation in growth response to Pi starvation as indicated by the shoot/root dry weight ratio. Approximately 254 million sequence reads were mapped onto the IRGSP-1.0 reference rice genome sequence and an average of about 5,000 transcripts from each cultivar were found to be responsive under phosphate starvation. Comparative analysis of the RNA-Seq profiles of the 4 cultivars revealed similarities as well as distinct differences in expression of these responsive transcripts. We elucidated a set of core responsive transcripts including annotated and unannotated transcripts commonly expressed in the 4 cultivars but with different levels of expression. De novo assembly of unmapped reads to the Nipponbare genome generated a set of sequence contigs representing potential new transcripts that may be involved in tolerance to phosphate starvation. This study can be used for identification of genes and gene networks associated with environmental stress and the development of novel strategies for improving tolerance to phosphate starvation in rice and other cereal crops.
Project description:IR64 is a high-yielding rice variety that has been widely cultivated around the world. IR64 has been replaced by modern varieties in most growing areas. Given that modern varieties are mostly progenies or relatives of IR64, genetic analysis of IR64 is valuable for rice functional genomics. However, chromosome-level genome sequences of IR64 have not previously been available. Here, we sequenced the IR64 genome using synthetic long reads obtained by linked-read sequencing and ultra-long reads obtained by nanopore sequencing. We integrated these data and generated a de novo assembly of the IR64 genome of 367 Mb, equivalent to 99% of the estimated size. Continuity of the IR64 genome assembly was improved compared with that of a publicly available IR64 genome assembly generated from short reads only. We annotated 41,458 protein-coding genes, including 657 IR64-specific genes that are missing from other high-quality rice genome assemblies (IRGSP-1.0 of japonica cultivar Nipponbare or R498 of indica cultivar Shuhui498). The IR64 genome assembly will serve as a genome resource for rice functional genomics as well as genomics-driven and/or molecular breeding.
Project description:BACKGROUND:The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species. RESULTS:Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics. CONCLUSIONS:The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
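For readers unfamiliar with the contig N50 statistic quoted in the abstract above, the following is a minimal sketch of how N50 is conventionally computed: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The contig lengths used here are hypothetical toy values, not KitaakeX data, and the function is an illustration rather than the tool the authors used.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half the total length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running total always reaches total")


# Hypothetical toy contig lengths (arbitrary units), not real assembly data.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

With these toy values the total length is 290, and the running total passes half of that (145) at the second-longest contig, so the function returns 70.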
Project description:BACKGROUND:The endoplasmic reticulum (ER) stress response is widely known to function in eukaryotes to maintain the homeostasis of the ER when unfolded or misfolded proteins are overloaded in the ER. To understand the molecular mechanisms of the ER stress response in rice (Oryza sativa L.), we previously analyzed the expression profile of stably transformed rice in which an ER stress sensor/transducer OsIRE1 was knocked-down, using the combination of preliminary microarray and quantitative RT-PCR. In this study, to obtain more detailed expression profiles of genes involved in the initial stages of the ER stress response in rice, we performed RNA sequencing of wild-type and transgenic rice plants produced by homologous recombination in which endogenous genomic OsIRE1 was replaced by missense alleles defective in ribonuclease activity. RESULTS:At least 38,076 transcripts were investigated by RNA sequencing, 380 of which responded to ER stress at a statistically significant level (195 were upregulated and 185 were downregulated). Furthermore, we successfully identified 17 genes from the set of 380 ER stress-responsive genes that were not included in the probe set of the currently available microarray chip in rice. Notably, three of these 17 genes were non-annotated genes, even in the latest version of the Rice Annotation Project Data Base (RAP-DB, version IRGSP-1.0). CONCLUSIONS:Therefore, RNA sequencing-mediated expression profiling provided valuable information about the ER stress response in rice plants and led to the discovery of new genes related to ER stress.
Project description:Purpose: Next-generation sequencing (NGS) has been utilized for systems-based analysis of rice plants. The goal of this study was to compare the transcriptomes of non-transgenic (NT) control and OsTZF8-overexpressing transgenic plants. Methods: Total RNAs were extracted from the whole plants of OsTZF8-overexpressing plants (T4 generation, line number #20) and non-transgenic (NT) plants using the RNeasy plant mini kit (Qiagen, Germany) according to the manufacturer's instructions. cDNA libraries were prepared from total RNAs using the TruSeq RNA sample Prep kit (v2) (Macrogen, Korea). Two biological replicates were analyzed by RNA-sequencing analysis. Single-end sequences were obtained using IRGSP (v 1.0) and raw sequence reads were trimmed to remove adaptor sequences, and those with a quality lower than Q20 were removed using the Trimmomatic 0.32 software (Bolger et al., 2014). To map the reads to the reference genome, all reads were assembled with annotated genes from the Rap-DB database [http://rapdb.dna.affrc.go.jp; IRGSP (v 1.0)] using TopHat software (https://ccb.jhu.edu/software/tophat/index.shtml). Conclusions: Our study has identified downstream candidate genes regulated by overexpression of OsTZF8.
Project description:Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.